Technical Projects

This repository contains the data science projects I completed during my data science practicum.

| Project Name | Description | Libraries used |
| --- | --- | --- |
| Analyzing Borrower Risk | Given data on customers' creditworthiness, I determined whether a customer's marital status and number of children affect the likelihood of defaulting on a loan. | pandas |
| Crankshaft List Analyst | Hundreds of free vehicle advertisements are published on Crankshaft List every day. As an analyst there, I studied data collected over the last few years and determined which factors influence the price of a vehicle. | pandas, numpy, matplotlib |
| Ice Videogame Analyst | Identify patterns that determine whether a game succeeds. This allows me to spot potential big winners and plan advertising campaigns. | pandas, matplotlib, numpy, scipy, seaborn, math |
| Telecom Plans ML Model | Mobile carrier Megaline has found that many of its subscribers use legacy plans. It wants a model that analyzes subscribers' behavior and recommends one of Megaline's newer plans: Smart or Ultra. | pandas, matplotlib, numpy, sklearn |
| Beta Bank Customers | Beta Bank is losing customers little by little every month. Given data on clients' past behavior and contract terminations with Beta Bank, build a model with the maximum possible F1 score. | pandas, numpy, sklearn |
| Oil Well Model | Given data on oil samples from three regions, where the parameters of each oil well are already known, build a model that helps pick the region with the highest profit margin. Analyze potential profit and risks using bootstrapping. | pandas, numpy, scipy, sklearn, matplotlib.pyplot |
| IMDB Sentiment Analysis | Given reviews from IMDB, build and train a model to automatically detect negative movie reviews. | tensorflow, sklearn, matplotlib, seaborn, tqdm, lightgbm, nltk, spacy |
| Time Series Cab Fare Prediction | Sweet Lift Taxi company has collected historical data on taxi orders at airports. To attract more drivers during peak hours, we need to predict the number of taxi orders for the next hour. Build a model for this prediction. The RMSE on the test set should not exceed 48. | Time Series Analysis, Python, scikit-learn, Regression |
| Customer Age Detection | The supermarket chain Good Seed would like to explore whether data science can help it comply with alcohol laws by making sure it does not sell alcohol to underage customers. | Computer Vision, Keras, CNN, Python |
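
For readers unfamiliar with the bootstrapping step mentioned in the Oil Well Model project, here is a minimal sketch of the general idea. It is not the code from that project: the function name `bootstrap_profit`, all of its parameters, and the profit formula (total volume of the selected wells times a unit revenue, minus a fixed budget) are assumptions made for illustration. It uses only the Python standard library to stay self-contained, whereas the project itself lists pandas, numpy, scipy, and sklearn.

```python
import random


def bootstrap_profit(predicted, actual, n_select, revenue_per_unit, budget,
                     sample_size, n_iter=1000, seed=0):
    """Estimate a profit distribution by resampling wells with replacement.

    All names and the profit formula are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    indices = range(len(predicted))
    profits = []
    for _ in range(n_iter):
        # Draw a bootstrap sample of well indices (with replacement).
        sample = rng.choices(indices, k=sample_size)
        # Keep the wells the model ranks highest within this sample.
        best = sorted(sample, key=lambda i: predicted[i], reverse=True)[:n_select]
        # Profit is computed from the actual volumes, not the predictions.
        profits.append(sum(actual[i] for i in best) * revenue_per_unit - budget)

    profits.sort()
    # Simple empirical quantiles for a 95% interval.
    lower = profits[int(0.025 * (n_iter - 1))]
    upper = profits[int(0.975 * (n_iter - 1))]
    return {
        "mean_profit": sum(profits) / n_iter,
        "ci_95": (lower, upper),
        # Share of bootstrap iterations that ended with a loss.
        "loss_risk": sum(p < 0 for p in profits) / n_iter,
    }
```

Running this once per region, with that region's model predictions and true volumes, would give a mean profit, an interval, and a loss risk that could be compared across regions.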