This repository contains a portfolio of data science projects I completed for academic (viz. Galvanize, Georgia Tech, UC Davis, Coursera, Udacity, edX), self-learning, and hobby purposes. Projects are presented as IPython notebooks, Python source code, and Markdown files.
Selected case-study projects completed during the 13-week, full-time Galvanize Data Science Immersive program:
- Sale price prediction of heavy equipment at auction based on its usage, equipment type, and configuration
- Churn prediction for a ride-sharing company using logistic regression, random forest, and gradient boosting models
- Classification of spam emails using natural language processing: TF-IDF vectorization and a Naïve Bayes classifier
- Recommendation system using ALS matrix factorization in Spark on a multi-node EC2 cluster on AWS
- Fraud detection system
- Centrality and communities in a social media dataset, analyzed with the Python NetworkX graph library
Project: DocReach - Predicting physician specialty from text data

Overview: A large social media marketing firm wants to target physicians based on their practice area, for example a marketing campaign that targets cardiologists with a heart-related news feed.

Git: https://github.com/krishnatray/galvanize-dsi-capstone
- Linear Regression
- KNN
- KMeans Clustering
- Decision Tree
- Dice Simulation https://github.com/krishnatray/data-science-portfolio/blob/master/60_Simulation/dice_game_simulation.ipynb
- Boston Housing Prices Simple Linear Regression https://github.com/krishnatray/data-science-portfolio/blob/master/02_1_Boston_Housing_Prices_Linear_Regression.ipynb
- Social Network Logistic Regression https://github.com/krishnatray/data-science-portfolio/blob/master/02_2_SocialNetwork_LogisticRegression.ipynb
- Simple: Fibonacci, Calculate Pi, Tower of Hanoi
- Intermediate:
- Advanced:
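
To give a flavor of the "Simple" exercises listed above, here is a minimal sketch of two of them, Fibonacci and Tower of Hanoi, in plain Python. It is an illustration only and not the code stored in this repository; the function names `fibonacci` and `hanoi_moves` and the peg labels are hypothetical.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, with fibonacci(0) == 0 and fibonacci(1) == 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    # After k iterations, a holds F(k) and b holds F(k + 1).
    for _ in range(n):
        a, b = b, a + b
    return a


def hanoi_moves(n_disks: int, source: str = "A", target: str = "C", spare: str = "B"):
    """Return the list of (from_peg, to_peg) moves that solves Tower of Hanoi."""
    if n_disks == 0:
        return []
    # Move n - 1 disks out of the way, move the largest disk, then move the rest on top.
    return (
        hanoi_moves(n_disks - 1, source, spare, target)
        + [(source, target)]
        + hanoi_moves(n_disks - 1, spare, target, source)
    )


if __name__ == "__main__":
    print(fibonacci(10))        # 55
    print(hanoi_moves(2))       # [('A', 'B'), ('A', 'C'), ('B', 'C')]
    print(len(hanoi_moves(3)))  # 7
```

The iterative Fibonacci avoids the exponential cost of the naive recursive version, while the Hanoi solver keeps the recursive form because it mirrors the puzzle's structure and always produces 2^n - 1 moves for n disks.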