This repository contains data science projects created at Codeup in 2019.
Using data from a study on domestic violence cases in Chicago (Block, C. R., 1995), this project analyzed reported cases to predict the key indicators of recidivism (re-assault) and created a web app with personalized resources for victims of domestic violence.
The Natural Language Processing Project uses Naive Bayes, Linear Support Vector Machine, and Logistic Regression algorithms, along with the Keras library, to predict the programming language of a GitHub repository from its README file. Can we tell what language a repo is written in based on the README file alone? The data used can be found here: Github repo files 1, Github repo files 2, Github repo files 3
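As a rough illustration of the Naive Bayes approach mentioned above, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the project's actual code: the toy README snippets, the labels, and the helper names (`tokenize`, `train_naive_bayes`, `predict`) are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(docs, labels):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy READMEs, not the project's data.
readmes = [
    "Install with pip install and import the package in your script",
    "Run pip install then use pandas and numpy for analysis",
    "Install with npm install and require the module in node",
    "Run npm start to launch the node server in the browser",
]
languages = ["Python", "Python", "JavaScript", "JavaScript"]

model = train_naive_bayes(readmes, languages)
predicted = predict(model, "pip install the package and import pandas")
print(predicted)
```

In practice a library implementation with TF-IDF features would likely be preferable; the sketch only shows how word counts per language drive the prediction.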
The Time Series Analysis Project uses Autoregression (AR), Moving Average (MA), AutoRegressive Integrated Moving Average (ARIMA), and Seasonal AutoRegressive Integrated Moving Average (SARIMA) models to predict a user's information from a limited set of health metrics recorded by a Fitbit. Can we deduce health information, e.g. height, weight, and sex, from Fitbit data alone? The data used can be found here: Fitbit messy data (tar.gz)
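To make the simplest of the models listed above concrete, the following sketch fits a first-order autoregression, y_t = c + phi * y_(t-1), by ordinary least squares and rolls it forward. It is an assumption-laden toy rather than the project's method: the step counts are made up, and `fit_ar1` and `forecast` are hypothetical helper names.

```python
def fit_ar1(series):
    """Fit y_t = c + phi * y_(t-1) by ordinary least squares."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, c, phi, steps):
    """Iterate the fitted recurrence forward from the last observation."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical daily step counts, not taken from the Fitbit data.
daily_steps = [8200.0, 7900.0, 8500.0, 9100.0, 8700.0, 8300.0, 8800.0, 9000.0]

c, phi = fit_ar1(daily_steps)
next_three_days = forecast(daily_steps, c, phi, steps=3)
print(next_three_days)
```

ARIMA and SARIMA extend this idea with differencing, moving-average terms, and seasonal lags, which is why a library such as statsmodels is usually used for them instead of hand-written fitting.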
The Classification Project uses classification algorithms such as Logistic Regression, K-Nearest Neighbors, Decision Trees, and Random Forests to predict customer churn. The data used can be found at the Kaggle competition: here
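As an illustration of one of the algorithms named above, here is a minimal K-Nearest Neighbors sketch in plain Python (3.8+ for `math.dist`). The customer features (tenure in months, monthly charge), the labels, and the `knn_predict` helper are hypothetical and are not drawn from the Kaggle data.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label the query by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical customers: (tenure in months, monthly charge).
customers = [
    (2, 85.0), (4, 90.0), (3, 78.0),     # short tenure, high charge
    (48, 30.0), (60, 25.0), (55, 40.0),  # long tenure, low charge
]
churned = ["churn", "churn", "churn", "stay", "stay", "stay"]

new_customer = (5, 80.0)
prediction = knn_predict(customers, churned, new_customer, k=3)
print(prediction)
```

Because KNN relies on raw distances, real features would normally be scaled to comparable ranges before fitting; the toy values here are left unscaled for brevity.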