Extracted movie data from Kaggle and Wikipedia, transformed the data to be usable for a hackathon competition using Pandas and Jupyter Notebook, and loaded the data to SQL in PGAdmin 4 and merged the datasets.

mdbinger/Movies-ETL

Movies-ETL

Module 8 of Data Analytics Bootcamp

Overview

A company called Amazing Prime is hosting a hackathon and has asked us to prepare the datasets that the participants will work with. They gathered the data from Wikipedia and Kaggle. The main goal is to write a function that cleans large datasets and merges them together.
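As a rough illustration of what such a clean-and-merge function could look like, here is a minimal sketch in pandas. The function name `clean_and_merge`, the `imdb_id` key column, and the sample rows are all assumptions made for this example; they are not taken from the project's notebook or from the real datasets.

```python
import pandas as pd


def clean_and_merge(wiki_df: pd.DataFrame, kaggle_df: pd.DataFrame,
                    key: str = "imdb_id") -> pd.DataFrame:
    """Drop rows with a missing key, de-duplicate on the key, then inner-join.

    Hypothetical helper: the real cleaning rules are not described in this README.
    """
    cleaned = []
    for df in (wiki_df, kaggle_df):
        # Rows without a key cannot be matched, and duplicate keys would
        # multiply rows in the merge, so both are removed first.
        cleaned.append(df.dropna(subset=[key]).drop_duplicates(subset=key))
    # Columns that exist in both tables get a suffix naming their source.
    return cleaned[0].merge(cleaned[1], on=key, how="inner",
                            suffixes=("_wiki", "_kaggle"))


# Tiny made-up rows for illustration only.
wiki = pd.DataFrame({
    "imdb_id": ["tt001", "tt002", "tt002", None],
    "title": ["A", "B", "B", "C"],
})
kaggle = pd.DataFrame({
    "imdb_id": ["tt001", "tt002", "tt003"],
    "title": ["A", "B (alt)", "D"],
    "budget": [100, 200, 300],
})

movies = clean_and_merge(wiki, kaggle)
print(movies)
```

An inner join keeps only movies present in both sources. Whether that is the right choice depends on how the hackathon data is meant to be used, so treat it as one option rather than the project's actual rule.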

We use Python and pandas in Jupyter Notebook, along with SQL in pgAdmin 4, to clean a large amount of data from Wikipedia and Kaggle. We first read in all of the data and clean it in Jupyter Notebook, then merge the datasets and load them into SQL.
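The final load step can be sketched with `DataFrame.to_sql`. To keep the example self-contained, it writes to an in-memory SQLite database as a stand-in for the database managed through pgAdmin 4. The table name `movies` and the sample rows are assumptions for illustration only.

```python
import sqlite3

import pandas as pd

# Made-up merged data standing in for the cleaned Wikipedia/Kaggle result.
movies = pd.DataFrame({
    "imdb_id": ["tt001", "tt002"],
    "title": ["A", "B"],
    "budget": [100, 200],
})

# In-memory SQLite is an assumption used here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")

# Write the DataFrame to a table, replacing the table if it already exists.
movies.to_sql("movies", conn, if_exists="replace", index=False)

# Read the row count back to confirm the load.
row_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
print(row_count)

conn.close()
```

For a PostgreSQL database, `to_sql` would typically be given a SQLAlchemy engine instead of the SQLite connection; the connection details depend on the local setup and are not specified here.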
