This repository contains the data science projects I completed during my data science practicum.

Project Name | Description | Libraries used |
---|---|---|
Analyzing Borrower Risk | Given data on customers' creditworthiness, I determined whether a customer's marital status and number of children affect the likelihood of defaulting on a loan. | pandas |
Crankshaft List Analyst | Crankshaft List publishes hundreds of free vehicle advertisements on its site every day. As an analyst there, I studied data collected over the last few years and determined which factors influence the price of a vehicle. | pandas, numpy, matplotlib |
Ice Videogame Analyst | I identified patterns that determine whether a game succeeds, which makes it possible to spot potential big winners and plan advertising campaigns. | pandas, matplotlib, numpy, scipy, seaborn, math |
Telecom Plans ML Model | Mobile carrier Megaline has found out that many of its subscribers use legacy plans. The company wants a model that analyzes subscribers' behavior and recommends one of Megaline's newer plans: Smart or Ultra. | pandas, matplotlib, numpy, sklearn |
Beta Bank Customers | Beta Bank customers are leaving: little by little, chipping away every month. Given data on clients' past behavior and termination of contracts with Beta Bank, build a model with the maximum possible F1 score. | pandas, numpy, sklearn |
Oil Well Model | You have data on oil samples from three regions. The parameters of each oil well in each region are already known. Build a model that helps pick the region with the highest profit margin. Analyze potential profit and risks using the bootstrapping technique. | pandas, numpy, scipy, sklearn, matplotlib.pyplot |
IMDB Sentiment Analysis | Given reviews from IMDB, build and train a model to automatically detect negative movie reviews. | tensorflow, sklearn, matplotlib, seaborn, tqdm, lightgbm, nltk, spacy |
Time series Cab Fare Prediction | Sweet Lift Taxi company has collected historical data on taxi orders at airports. To attract more drivers during peak hours, we need to predict the number of taxi orders for the next hour. Build a model for such a prediction. The RMSE metric on the test set should not be more than 48. | Time Series Analysis, Python, scikit-learn, Regression |
Customer Age Detection | The supermarket chain Good Seed would like to explore whether data science can help it comply with alcohol laws by making sure it does not sell alcohol to underage customers. | Computer Vision, Keras, CNN, Python |
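
The bootstrapping step mentioned for the Oil Well Model can be illustrated with a minimal sketch. This is not the code from that project: the function name `bootstrap_profit`, its parameters, and the randomly generated per-well profits are all hypothetical and serve only to show the general idea of resampling with replacement to estimate mean profit, a 95% interval, and the risk of a loss.

```python
import numpy as np


def bootstrap_profit(profits, n_resamples=1000, sample_size=None, seed=0):
    """Resample per-well profits with replacement and summarize total profit.

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    profits = np.asarray(profits, dtype=float)
    size = len(profits) if sample_size is None else sample_size

    # Each row of `idx` is one bootstrap resample of well indices.
    idx = rng.integers(0, len(profits), size=(n_resamples, size))
    totals = profits[idx].sum(axis=1)  # total profit per resample

    lower, upper = np.quantile(totals, [0.025, 0.975])
    return {
        "mean": totals.mean(),
        "ci_95": (lower, upper),
        # Share of resamples in which the total profit is negative.
        "loss_risk": (totals < 0).mean(),
    }


# Hypothetical per-well profits (arbitrary units), generated only for the demo.
demo_rng = np.random.default_rng(42)
example_profits = demo_rng.normal(loc=1.0, scale=5.0, size=500)

summary = bootstrap_profit(example_profits)
print(summary)
```

Running the same function on each region's predicted profits would give comparable summaries, so regions could be ranked by mean profit and filtered by an acceptable loss risk.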