A collection of projects completed for Practicum's Data Scientist professional training program.
Project Name | Notebook | Description | Libraries |
---|---|---|---|
Python | basic_python.ipynb | Compare music preferences and user behavior between two cities, and test hypotheses about how user activity and music preferences differ between them. | Pandas |
Telecom Classifying Client Churn | classifying_churn.ipynb | Build the best-performing model for predicting client churn, evaluated with the AUC-ROC metric. | NumPy, Pandas, matplotlib, seaborn, math, time, functools, re, IPython.display, sklearn, catboost, lightgbm, xgboost, random, sys |
Computer Vision | computer_vision.ipynb | Build and test a regression model that predicts the age of individuals from supplied photos. | Pandas, seaborn, matplotlib, tensorflow, keras |
Machine Learning for Texts | ml_for_text.ipynb | Train a model to classify reviews as positive or negative (minimum F1 score = 0.85). | NumPy, Pandas, matplotlib, seaborn, re, math, tqdm |
Time Series | time_series.ipynb | Predict peak hours for taxis in the Chicagoland area from historical data, evaluated with the RMSE metric. | NumPy, Pandas, matplotlib, SciPy, seaborn, time, math, statsmodels, sklearn, IPython, sys, catboost, lightgbm, xgboost |
Numerical Methods | numerical_methods.ipynb | Build a model that predicts the value of a car from the vehicle's historical data. | NumPy, Pandas, matplotlib, seaborn, time, math, sklearn, random, sys, catboostregressor, decisiontree |
Linear Algebra | linear_algebra.ipynb | Use machine learning to classify customers and predict whether they are likely to receive insurance benefits, based on characteristics such as income and number of family members. | NumPy, Pandas, math, seaborn, matplotlib, sklearn, IPython, sys |
Machine Learning in Business | ml_in_business.ipynb | Use machine learning and bootstrapping to select the region where a new oil well would generate the highest profit margin. | NumPy, Pandas, math, seaborn, matplotlib, sklearn, scipy, random, sys |
Supervised Machine Learning | supervised_ml.ipynb | Build a model (minimum F1 score = 0.59) that predicts which customers are likely to leave, based on past behavior. | NumPy, Pandas, math, matplotlib, sklearn, random, sys |
Machine Learning | machine_learning.ipynb | Build an ML model (accuracy > 0.75) that predicts the appropriate phone service plan from customer characteristics for Megaline, a new phone company. | NumPy, Pandas, sklearn, sys |
Predicting Peak Hours for Taxis in the Chicagoland Area | predicting_peak_hours.ipynb | Rank the top neighborhoods by most popular drop-off stops for Zuber, an up-and-coming ride-share company. | NumPy, Pandas, matplotlib, seaborn, scipy |
Integrated Project 1 | integrated_project1.ipynb | Analyze characteristics of video games, including platform, genre, and ESRB rating, to identify which most strongly influence sales. | NumPy, Pandas, matplotlib, SciPy, seaborn |
Gold Recovery Prediction | integrated_project2.ipynb | Build a model that accurately predicts gold recovery outcomes, and calculate the final sMAPE (symmetric mean absolute percentage error) value to evaluate model performance. Use cross-validation for the final model evaluation. | NumPy, Pandas, matplotlib, SciPy, seaborn |
Statistical Data Analysis | statistical_data_analysis.ipynb | Analyze different phone plans based on revenue and existing customers for the marketing team of a new phone company. | NumPy, Pandas, matplotlib, SciPy |
Exploratory Data Analysis | exploratory_data_analysis.ipynb | Use exploratory data analysis to visualize the collected data and determine the factors that most strongly affect vehicle price. | NumPy, Pandas, matplotlib |
Credit Score | credit_score.ipynb | Determine whether marital status and number of children affect the likelihood that a customer defaults on a loan; the findings are then used to help determine the customer's credit score. | NumPy, Pandas |
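
For readers unfamiliar with the sMAPE metric named in the Gold Recovery Prediction row, here is a minimal sketch of the standard definition. It is not taken from the notebook: the function name `smape` and the choice to count a pair as zero error when both the true and predicted values are zero are assumptions made for illustration.

```python
import numpy as np


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Hypothetical helper for illustration; the notebook may define it differently.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Absolute error per observation
    diff = np.abs(y_true - y_pred)
    # Mean of the absolute true and predicted values per observation
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2

    # Assumption: when both values are zero, treat the error as zero
    # instead of dividing by zero.
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom != 0)
    return float(ratio.mean() * 100)
```

A lower value indicates a better fit: identical predictions give 0, and the metric is capped at 200.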
Authors
Dina Saadeh