Movies-ETL

Purpose

The purpose of this exercise was to perform ETL (extract, transform, load) on several movie datasets so the cleaned data could be used to predict which films would become popular on a streaming service.

Challenge

A function, extract_transform_load, runs through the datasets to extract, transform, and load the information. It extracts data from three files: wikipedia-movies.json, movies_metadata.csv, and ratings.csv. It then transforms the data, which includes cleaning both the Wikipedia and Kaggle datasets. Once cleaned, the Wikipedia and Kaggle movies datasets are merged into one dataset. The ratings dataset is then cleaned and combined with the merged movies data in a new DataFrame. Finally, the transformed datasets are loaded into a SQL database for further analysis.
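
The overall shape of such a function can be sketched as follows. This is a minimal illustration, not the code from the notebooks: it uses only the Python standard library (csv, json, sqlite3) instead of pandas and a database server so that it runs on its own. The Wikipedia key names (imdb_id, director), the single cleaning rule, the rating_count column, and the table names are assumptions made for the example.

```python
import csv
import json
import sqlite3


def extract_transform_load(wiki_path, kaggle_path, ratings_path, conn):
    """Toy ETL: read three files, clean and merge them, load into SQLite."""
    # Extract: one JSON file and two CSV files.
    with open(wiki_path, encoding="utf-8") as f:
        wiki_movies = json.load(f)
    with open(kaggle_path, newline="", encoding="utf-8") as f:
        kaggle_movies = list(csv.DictReader(f))
    with open(ratings_path, newline="", encoding="utf-8") as f:
        ratings = list(csv.DictReader(f))

    # Transform (Wikipedia): keep only entries with an IMDb id.
    # The "imdb_id" and "director" keys are assumed for this sketch.
    wiki_by_id = {m["imdb_id"]: m for m in wiki_movies if m.get("imdb_id")}

    # Merge: inner join of Kaggle rows and Wikipedia rows on the IMDb id.
    movies = []
    for row in kaggle_movies:
        wiki = wiki_by_id.get(row.get("imdb_id"))
        if wiki is None:
            continue
        movies.append(
            (int(row["id"]), row["imdb_id"], row["title"], wiki.get("director"))
        )

    # Transform (ratings): count ratings per movie and attach the count
    # to each merged movie row (a stand-in for the real ratings step).
    counts = {}
    for r in ratings:
        movie_id = int(r["movieId"])
        counts[movie_id] = counts.get(movie_id, 0) + 1
    movies_with_ratings = [m + (counts.get(m[0], 0),) for m in movies]

    # Load: write both tables; "with conn" commits on success.
    with conn:
        conn.execute("DROP TABLE IF EXISTS movies")
        conn.execute(
            "CREATE TABLE movies (kaggle_id INTEGER, imdb_id TEXT, "
            "title TEXT, director TEXT, rating_count INTEGER)"
        )
        conn.executemany(
            "INSERT INTO movies VALUES (?, ?, ?, ?, ?)", movies_with_ratings
        )
        conn.execute("DROP TABLE IF EXISTS ratings")
        conn.execute(
            "CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating REAL)"
        )
        conn.executemany(
            "INSERT INTO ratings VALUES (?, ?, ?)",
            [(int(r["userId"]), int(r["movieId"]), float(r["rating"])) for r in ratings],
        )
    return len(movies_with_ratings)
```

It could be called as extract_transform_load("wikipedia-movies.json", "movies_metadata.csv", "ratings.csv", sqlite3.connect("movies.db")), where movies.db is a hypothetical output file. The real datasets need considerably more cleaning than the single filter shown here.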

jlozano1990/Movies-ETL
