🔗 LinkedIn: https://www.linkedin.com/in/ankit-kothari-510a9623
📧 Email: ankit256@gmail.com
- Optimization of Large Datasets: Process Pools, Threading, Downcasting, Memory Optimization
- GitHub: Squashing, Rebasing, Merging, fetch, remote
- Docker: Creating a Dockerfile and images, running containers.
- Gradient Descent and Stochastic Gradient Descent From Scratch: Exploring GD and SGD on Linear and Logistic Regression from scratch.
- Shapes in Deep Learning: Exploring the shapes of outputs of different layers like ANN, RNN, LSTM, CNN, BiLSTM, Max Pooling
- Transformer Encoder from Scratch: Building Encoder blocks comprising Multi-Head Attention and Feed-Forward blocks.
- Basics of Spark and Map Reduce: Exploring the basics of PySpark and how to manipulate data using Transformations and Actions
Tools: Github, Docker, Pyspark, pandas, plotly
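As a rough illustration of the "Gradient Descent and Stochastic Gradient Descent From Scratch" entry above, here is a minimal sketch of both update rules for a one-feature linear regression with mean squared error. It is not taken from the project notebook: the function names, learning rates, epoch counts, and toy data are all assumptions chosen for illustration.

```python
import random


def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Full-batch gradient descent for y = w*x + b under mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) over the whole dataset
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def stochastic_gradient_descent(xs, ys, lr=0.01, epochs=1000, seed=0):
    """SGD: update the parameters after every single (shuffled) sample."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            err = w * x + b - y  # prediction error on one sample
            w -= lr * 2 * err * x
            b -= lr * 2 * err
    return w, b


# Hypothetical toy data generated from y = 2x + 1 (no noise)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_gd, b_gd = gradient_descent(xs, ys)
w_sgd, b_sgd = stochastic_gradient_descent(xs, ys)
print(f"GD:  w={w_gd:.4f}, b={b_gd:.4f}")
print(f"SGD: w={w_sgd:.4f}, b={b_sgd:.4f}")
```

The only difference between the two functions is how much data feeds each update: the full dataset for GD, one sample at a time for SGD. Swapping the squared-error gradient for the log-loss gradient would give the logistic regression variant.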
The raw data was downloaded from the USCIS website, which provides an individual CSV file for each year. It has data regarding Employers, Initial Approvals, Continuing Approvals, Initial Denials, Continuing Denials, and demographics. The goal of this analysis is to look at different trends in H1B visas across employers and states.
This project visualizes how the INR changed over the last 20 years under three different Prime Ministers of India.
3. Identifying the customer segments that would increase sales the most and targeting them with ads on social media.
Tools: pandas, sqlite3, plotly, mapbox, data optimization, Dash, Heroku
- All about Normal Distribution with Scipy and Plotly
- How to plan an A/B Test?
- Analyze an A/B test from the popular mobile puzzle game Cookie Cats
Theory: Hypothesis Testing, A/B Testing, Data Distributions, Parametric and Non-Parametric Tests
Tools: Python, Pandas, scipy, plotly, statsmodels
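To make the A/B testing entries above more concrete, here is a minimal sketch of a two-proportion z-test, one common parametric way to compare conversion or retention rates between two groups. It is an assumption that this matches the kind of analysis in the notebooks (they may use a different method, such as bootstrapping); the function name and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: retained users out of total users in each variant
z, p_value = two_proportion_z_test(success_a=460, n_a=1000, success_b=420, n_b=1000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The normal approximation behind this test is only reasonable when each group has enough successes and failures; with small samples a non-parametric or exact test is the safer choice.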
- Bike Rental Prediction: Comparing Decision Tree Models and Ensemble Methods using Random Forest to predict the bike rentals at a given hour of the day
- Credit Risk Analysis: Comparing and exploring Hyperparameters to tune Logistic Regression, XGBoost and Artificial Neural Network to predict whether a borrower will pay their loan back. Uses publicly available data from LendingClub.com
ML Algorithms: Linear Regression, Logistic Regression, Decision Tree Model, Random Forest, XGBoost, ANN, Ensemble Models
Feature Extractions: Data Cleaning, Normalizing/Scaling of the data, Binning, Sampling, Correlation Matrix, Hyperparameter Tuning
Tools: Python, Pandas, sklearn, keras
- Data Cleaning, Extraction and Topic Modeling
- Spacy Playground
- Topic Modeling
- Chatbots using four different architectures (TF-IDF, Word Embeddings, Sentence Embeddings, and TF-Hub Sentence Encoders), comparing the efficiency of these models.
- Pytorch approach to classification
- TF2.0 and Keras approach to classification
- Multi Label classification using distilBERT
- Experiments with Transformers and Hugging Face
Deep Learning Algorithms: distilBERT, BERT, LSTM, BiLSTM, 1D-CNN, GRU, Word Embeddings, Sentence Encoders, TF-IDF, LDA, NMF
Text Analysis: Text Cleaning using spacy, NER, POS, Text Classification, Chatbots, Topic Modeling
Tools: Python, Pandas, TF2.0, keras, Pytorch, spacy, pyspark, Slack RTM API, seaborn, plotly
- Style Transfer using Pytorch
- Basics of opencv
- Identifying digits and predicting digits using opencv and keras
- How to scan a document
Deep Learning Algorithms: CNN, OpenCV, Keras
Image Analysis: Blurring, Thresholding, Edge Detection, Morphological transformations, Contour detection, Affine Transformation, Transfer Learning, VGG19
Tools: Python, Pandas, TF2.0, keras, Pytorch, spacy, pyspark, OpenCV