A company known as Amazing Prime is hosting a hackathon and has asked us to prepare the datasets that the coders will be working with. They gathered data from Wikipedia and Kaggle for us to use. The main goal is to write a function that cleans large datasets and merges them together.
We use Python and pandas in Jupyter Notebook, along with SQL in pgAdmin 4, to clean a significant amount of data from Wikipedia and Kaggle. We first read in all the data and clean it in Jupyter Notebook, then merge the datasets and load the result into a SQL database.
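
The read, clean, merge, and load steps can be sketched roughly as follows. This is only a minimal illustration, not the project's actual function: the column names (`id`, `title`, `rating`), the helper `clean_and_merge`, and the table name `merged_data` are all hypothetical, the two small DataFrames stand in for the real Wikipedia and Kaggle data, and an in-memory SQLite database is used in place of the database managed through pgAdmin 4 so that the example runs on its own.

```python
import sqlite3

import pandas as pd


def clean_and_merge(wiki_df, kaggle_df, key="id"):
    """Drop rows missing the key, de-duplicate on it, and inner-join the frames.

    Hypothetical helper: the real cleaning rules depend on the actual data.
    """
    wiki = wiki_df.dropna(subset=[key]).drop_duplicates(subset=key)
    kaggle = kaggle_df.dropna(subset=[key]).drop_duplicates(subset=key)
    return wiki.merge(kaggle, on=key, how="inner")


# Made-up sample rows standing in for the Wikipedia and Kaggle datasets.
wiki_df = pd.DataFrame(
    {
        "id": ["a1", "a1", None, "b2"],
        "title": ["First", "First", "No key", "Second"],
    }
)
kaggle_df = pd.DataFrame(
    {
        "id": ["a1", "b2", "c3"],
        "rating": [7.5, 6.0, 8.2],
    }
)

merged_df = clean_and_merge(wiki_df, kaggle_df)

# Assumption: SQLite in memory replaces the real SQL database for this sketch.
conn = sqlite3.connect(":memory:")
merged_df.to_sql("merged_data", conn, if_exists="replace", index=False)
row_count = conn.execute("SELECT COUNT(*) FROM merged_data").fetchone()[0]
```

With the sample rows above, the duplicate `a1` row and the row with a missing key are dropped, and `c3` is excluded by the inner join because it appears only in the Kaggle frame. Against a real database, the `sqlite3` connection would be swapped for a connection to that database.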