This repository will hold all my data analysis work using SQL (MySQL & PostgreSQL), including my coursework and my work from the DataCamp Data Scientist Certificate & Projects, which will be conducted in Jupyter Notebook.

Project Name | Description | Topic |
---|---|---|
Classify Song Genres from Audio Data | Use a dataset of songs from two music genres (Hip-Hop and Rock) to train a classifier that distinguishes between the two genres based only on track information derived from Echonest. | Data Manipulation, Importing & Cleaning Data, Data Visualization, PCA, Decision Tree, Logistic Regression |
Predicting Credit Card Approvals | Build a machine learning model to predict if a credit card application will get approved. | Data Manipulation, Logistic Regression, Importing & Cleaning Data, Grid Search, Applied Finance |
Find Movie Similarity from Plot Summaries | Use NLP and clustering on movie plot summaries from IMDb and Wikipedia to quantify movie similarity. | Data Manipulation, TF-IDF, K-Means Clustering, Dendrogram, Probability & Statistics |
Disney Movies and Box Office Success | Explore Disney movie data, then build a linear regression model to predict box office success. | Data Manipulation, Data Visualization, Importing & Cleaning Data, Linear Regression, Probability & Statistics |
Mobile Games AB-Testing with Cookie Cats | Analyze the results of an A/B test via player retention from the popular mobile puzzle game, Cookie Cats. | Data Manipulation, A/B Testing, Bootstrapping, Data Visualization |
The Android App Market on Google Play | Load, clean, and visualize scraped Google Play Store data to gain insights into the Android app market. | Data Manipulation, Data Visualization, Sentiment Analysis, Probability & Statistics, Importing & Cleaning Data |
Investigating Netflix Movies and Guest Stars in The Office | Apply foundational Python skills by manipulating and visualizing movie and TV data. | Data Manipulation, Data Visualization, Programming |
A Visual History of Nobel Prize Winners | Explore a dataset from Kaggle containing a century's worth of Nobel Laureates. | Data Manipulation, Data Visualization, Importing & Cleaning Data |
Dr. Semmelweis and the Discovery of Handwashing | Reanalyse the data behind one of the most important discoveries of modern medicine: handwashing. | Data Manipulation, Data Visualization, Bootstrap Analysis, Importing & Cleaning Data |
The GitHub History of the Scala Language | Find the true Scala experts by exploring its development history in Git and GitHub. | Data Manipulation, Data Visualization, Importing & Cleaning Data |