nhafer88 / Movies_ETL Public

Performed the Extract, Transform and Load (ETL) process to create a data pipeline on movie datasets using Python, Pandas, Jupyter Notebook and PostgreSQL.

Name
Resources
.gitignore
ETL_clean_kaggle_data.ipynb
ETL_clean_wiki_movies.ipynb
ETL_create_database.ipynb
ETL_function_test.ipynb
Movielens_extract.ipynb
README.md
Wiki_extract.ipynb
re_sets.png
re_types.png

Amazing Prime Hackathon

Project Overview

This analysis project provides a visualization for predicting which newly released low-budget movies will become popular. It draws on data from Wikipedia (movies released since 1990) and MovieLens/Kaggle (movie ratings). The project has three tasks:

Extract the data from the data sources
Transform the data into a clean data set using Python and Pandas
Load the data set into a SQL table
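
The three tasks above can be sketched as a small pipeline. This is a minimal illustration, not the code from the notebooks: the sample records, the column names (`title`, `year`, `budget`), the `movies` table name, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

import pandas as pd


def extract():
    # Hypothetical records standing in for the Wikipedia and Kaggle source files.
    records = [
        {"title": "Movie A", "year": 1995, "budget": "$5 million"},
        {"title": "Movie B", "year": 2001, "budget": None},
        {"title": "Movie A", "year": 1995, "budget": "$5 million"},  # duplicate row
    ]
    return pd.DataFrame(records)


def transform(df):
    # Drop exact duplicates and rows with no budget information.
    df = df.drop_duplicates().dropna(subset=["budget"]).copy()
    # Parse strings such as "$5 million" into a numeric dollar amount.
    amount = df["budget"].str.extract(r"\$(\d+(?:\.\d+)?)\s*million", expand=False)
    df["budget"] = amount.astype(float) * 1_000_000
    return df.reset_index(drop=True)


def load(df, conn, table="movies"):
    # Write the cleaned frame to a SQL table, replacing any earlier version.
    df.to_sql(table, conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of PostgreSQL
movies = transform(extract())
load(movies, conn)
```

For a PostgreSQL target, the same `to_sql` call would typically receive a SQLAlchemy engine in place of the SQLite connection.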

SQL Queries

Row count for Movies Table

Row count for Ratings Table
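
A row-count check like the two listed above might look like the following sketch. The table names `movies` and `ratings`, their columns, and the sample rows are assumptions made for illustration, and SQLite again stands in for PostgreSQL; the `SELECT COUNT(*)` query itself is standard SQL.

```python
import sqlite3

ALLOWED_TABLES = {"movies", "ratings"}  # hypothetical table names


def row_count(conn, table):
    # Table names cannot be bound as query parameters, so check an allow-list first.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE ratings (movie_id INTEGER, rating REAL)")
conn.executemany("INSERT INTO movies (title) VALUES (?)", [("A",), ("B",)])
conn.executemany("INSERT INTO ratings VALUES (?, ?)", [(1, 4.0), (1, 3.5), (2, 5.0)])

counts = {table: row_count(conn, table) for table in ("movies", "ratings")}
```

Comparing these counts with the lengths of the cleaned DataFrames is a quick way to confirm that the load step wrote every row.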

About

python json csv sql postgresql movie-database pandas pgadmin4 etl-pipeline

Languages

Jupyter Notebook 100.0%