ankit-kothari/Data-Science-Journey


Data Science Journey

🔗 LinkedIn: https://www.linkedin.com/in/ankit-kothari-510a9623

📧 Email: ankit256@gmail.com

Data Science Must

Tools: Github, Docker, Pyspark, pandas, plotly

The raw data was downloaded from the USCIS website, which provides a separate CSV file for each year. It contains data on Employers, Initial Approvals, Continuing Approvals, Initial Denials, Continuing Denials, and demographics. The goal of this analysis is to examine H-1B visa trends across employers and states.
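As a rough illustration of the first step described above (stacking one CSV per year into a single dataset and aggregating it), here is a minimal sketch. It uses only the Python standard library so it runs without extra dependencies; with the project's listed tools the same idea would be a concat followed by a group-by. The column names, the `YEARLY_FILES` mapping, and every value in it are assumptions made up for this example, not the real USCIS schema or figures.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for two yearly files. The rows are invented for
# illustration and are NOT real USCIS numbers.
YEARLY_FILES = {
    2018: "Employer,State,Initial Approvals,Initial Denials\n"
          "Acme Corp,CA,10,2\n"
          "Globex,NY,5,1\n",
    2019: "Employer,State,Initial Approvals,Initial Denials\n"
          "Acme Corp,CA,12,3\n"
          "Initech,TX,7,0\n",
}


def load_all_years(files):
    """Read every yearly CSV and tag each row with the year it came from."""
    rows = []
    for year, text in sorted(files.items()):
        for row in csv.DictReader(io.StringIO(text)):
            row["Year"] = year
            rows.append(row)
    return rows


def approvals_by_state(rows):
    """Sum the (assumed) 'Initial Approvals' column per state."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["State"]] += int(row["Initial Approvals"])
    return dict(totals)


rows = load_all_years(YEARLY_FILES)
state_totals = approvals_by_state(rows)
```

Tagging each row with its year before stacking keeps the per-year trend recoverable after the files are merged.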

This project visualizes how the INR changed over the last 20 years under three different Prime Ministers of India.

Tools: pandas, sqlite3, plotly, mapbox, data optimization, DASH, Heroku

Probability and Statistics

Theory: Hypothesis Testing, A/B Testing, Data Distributions, Parametric and Non-Parametric Tests

Tools: Python, Pandas, scipy, plotly, statsmodels
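To make the non-parametric side of the theory list concrete, the sketch below runs a two-sided permutation test on the difference in means of two groups, as one might in a simple A/B test. It is written with the standard library only and is not taken from the repository's notebooks; the function name `permutation_test`, the group names, and the sample values are all hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_test(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Repeatedly shuffles the pooled observations, splits them into two
    groups of the original sizes, and counts how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(_mean(a) - _mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Made-up measurements for a control and a variant group.
control = [12, 15, 11, 14, 13, 12, 16, 14]
variant = [15, 17, 14, 18, 16, 15, 19, 17]
p_value = permutation_test(control, variant)
```

Because the test only relies on reshuffling group labels, it makes no assumption about the shape of the underlying distribution, which is what distinguishes it from a parametric test such as the t-test.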

Machine Learning

  • Bike Rental Prediction: Comparing Decision Tree models with ensemble methods (Random Forest) to predict the number of bike rentals at a given hour of the day
  • Credit Risk Analysis: Exploring and tuning hyperparameters for Logistic Regression, XGBoost and an Artificial Neural Network to predict whether a borrower will pay back their loan. Uses publicly available data from LendingClub.com

ML Algorithms: Linear Regression, Logistic Regression, Decision Tree Model, Random Forest, XGBoost, ANN, Ensemble Models

Feature Extractions: Data Cleaning, Normalizing/Scaling of the data, Binning, Sampling, Correlation Matrix, Hyperparameter Tuning

Tools: Python, Pandas, sklearn, keras

Natural Language Processing

Deep Learning Algorithms: DistilBERT, BERT, LSTM, BiLSTM, 1D-CNN, GRU, Word Embeddings, Sentence Encoders, TF-IDF, LDA, NMF

Text Analysis: Text Cleaning using spacy, NER, POS, Text Classification, Chatbots, Topic Modeling

Tools: Python, Pandas, TF2.0, keras, Pytorch, spacy, pyspark, Slack RTM API, seaborn, plotly
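Of the techniques listed in this section, TF-IDF is small enough to show in full. The sketch below is a from-scratch illustration, not code from the repository: it uses raw term counts for TF and an unsmoothed `log(N / df)` for IDF, whereas libraries such as scikit-learn apply a smoothed variant by default. The function name `tf_idf` and the three toy documents are assumptions made up for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    TF is the raw count of a term in the document; IDF is log(N / df),
    where df is the number of documents containing the term.
    """
    # Naive whitespace tokenization, purely for illustration.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Toy corpus, invented for this example.
docs = ["the cat sat", "the dog barked", "the cat purred"]
weights = tf_idf(docs)
```

A term that appears in every document (here "the") gets a weight of zero, while a term unique to one document gets the highest weight, which is why TF-IDF is a common baseline feature for text classification and topic modeling.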

Machine Vision and OpenCV

Deep Learning Algorithms: CNN, OpenCV, Keras

Image Analysis: Blurring, Thresholding, Edge Detection, Morphological transformations, Contour detection, Affine Transformation, Transfer Learning, VGG19

Tools: Python, Pandas, TF2.0, keras, Pytorch, spacy, pyspark, OpenCV