Machine Learning Engineer

Resume.pdf | LinkedIn | Twitter | GitHub | GitLab | Bento.me | avr13405@gmail.com

Skills

Programming Skills: Python, Bash, LaTeX
Database: SQL, MongoDB
Technical Skills:
- TensorFlow, Keras, Scikit-Learn, Gensim, NLTK, Pytesseract, PyCaret, Pandas, NumPy, Matplotlib, Seaborn, Regex, SciPy, BeautifulSoup, Git, GitHub, Excel, PowerPoint
Familiar Tools:
- XGBoost, LightGBM, Streamlit, Flask, FastAPI, Docker, OpenCV, Plotly, Bokeh, Selenium

Work Experience

SyncMOF | Backend Engineer Intern | (May 2024 – Present)

Skills: Python, Pandas, NumPy, pwlf, Optuna

Extracted features using a piecewise linear fit algorithm and B-spline interpolation.
Built an efficient data analysis and data processing pipeline.
Wrote fast, maintainable code by packaging it and following the official Python PEP 8 style guide.

Wint Wealth | Data Science Intern | (Oct 2023 – Feb 2024)

Skills: Python, Web Scraping, Web Crawling, Beautiful Soup, AWS Lambda, AWS Simple Queue Service, AWS S3, Cron, Regex, Code Refactoring, Team Coordination, Teamwork, Notion

Built an internal Python utility library that centralizes code reused across the ML codebase, reducing duplication and streamlining the codebase. Implemented SSH tunneling into EC2 to connect to DocumentDB locally, enabling faster local testing.
Built a scalable web crawling and scraping pipeline covering 20+ finance news sources, reducing scraping time from 3 days to 4 hours.
Implemented a serverless solution using AWS Lambda, SQS, DocumentDB, and S3 to improve the efficiency and scalability of the scraping pipeline.
Built a dashboard in Appsmith to track the scraping pipeline, fetching data from MongoDB, AWS CloudWatch, and AWS SQS.
Worked in a fast-paced startup environment.

SiviSoft | AI/ML Intern | (Sept 2023 – Oct 2023)

Skills: Python, Code Refactoring, Code Debugging, AWS CLI, AWS S3, NLP, Regex, pdfplumber, Jira, Elasticsearch, Elasticview, Team Coordination, Teamwork

Worked with medical PDF data (extracting patient data, scanned PDF data).
Used Python and NLP, working mostly in Python.
Did extensive code debugging and code refactoring.
Helped other interns and contract-based employees with their Jira tickets and environment setup.
Worked for a little over 5 weeks; left due to mental health reasons and work culture.

Culinda Inc. | Data Science Intern | (Aug 2022 – Jan 2023)

Skills: Python, CyberSecurity, Statistics, Data Analysis, Machine Learning, IoT/IoMT

Created a Python proof of concept (POC) that uses the FAIR and STRIDE models to quantify cyber risk to IoMT/IoT devices.
Wrote Python scripts that analyzed terabytes of data and generated text and Excel reports verifying that data was flowing through the pipeline as expected (Data Validator Tool).
Worked on baselining hospitals' network data to identify malicious behavior.
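
As a rough illustration of FAIR-style quantification, the sketch below treats annual loss as loss event frequency multiplied by loss magnitude and estimates it by Monte Carlo simulation for scenarios labeled with STRIDE categories. It is not the POC itself: the triangular distributions, the function name `simulate_annual_loss`, and every number are assumptions made for this example.

```python
import random
import statistics

# The six threat categories that the STRIDE acronym stands for.
STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
)


def simulate_annual_loss(freq_range, magnitude_range, runs=10_000, seed=0):
    """Monte Carlo estimate of annual loss for one threat scenario.

    freq_range and magnitude_range are (low, most_likely, high) tuples.
    The triangular distribution is an assumption chosen for simplicity.
    """
    rng = random.Random(seed)
    f_low, f_mode, f_high = freq_range
    m_low, m_mode, m_high = magnitude_range
    losses = []
    for _ in range(runs):
        # random.triangular takes its arguments as (low, high, mode).
        frequency = rng.triangular(f_low, f_high, f_mode)
        magnitude = rng.triangular(m_low, m_high, m_mode)
        losses.append(frequency * magnitude)
    losses.sort()
    return {
        "mean": statistics.fmean(losses),
        "p90": losses[int(0.9 * (runs - 1))],
    }


# Hypothetical estimates for a single IoMT device; not real data.
scenarios = {
    "Tampering": ((0.1, 0.5, 2.0), (5_000, 20_000, 100_000)),
    "Denial of service": ((0.5, 1.0, 4.0), (1_000, 5_000, 30_000)),
}
report = {name: simulate_annual_loss(f, m) for name, (f, m) in scenarios.items()}
```

Each entry in `report` holds the simulated mean and 90th-percentile annual loss for one scenario, which is the kind of figure a risk report can rank devices by.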

Articles

Projects

NLP Projects

Fake News Classification | GitHub
- Technologies Used: Python, TensorFlow, scikit-learn, nltk, langdetect, wordcloud, matplotlib, regex, numpy, pandas
- Implemented an LSTM model on the Kaggle Fake News dataset (over 70K news texts), reaching 97% accuracy.
- Along with standard text pre-processing, the langdetect library was used to identify and remove news in other languages (French, German, Arabic, etc.), which improved model performance.
- For EDA, word clouds and plots of bi-grams and tri-grams were used to identify the common words in the corpus.
- The LSTM model was built using TensorFlow along with pre-trained GloVe word embeddings.
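
The bi-gram and tri-gram step of the EDA can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the tokenizer (lowercase, alphabetic tokens only), the helper name `ngram_counts`, and the two-sentence corpus are assumptions.

```python
import re
from collections import Counter


def ngram_counts(texts, n):
    """Count word n-grams across an iterable of documents."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens only (a simplifying assumption).
        tokens = re.findall(r"[a-z]+", text.lower())
        # Zip n shifted copies of the token list to slide a window of size n.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Toy corpus for illustration only; not taken from the Kaggle dataset.
corpus = [
    "Breaking news: the senate passed the bill",
    "The senate passed a new budget",
]
bigrams = ngram_counts(corpus, 2)
trigrams = ngram_counts(corpus, 3)
```

Calling `bigrams.most_common(10)` would give the pairs to plot; n-grams never cross document boundaries because each text is tokenized separately.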
Topic Modeling Using RACE Dataset | GitHub
- Technologies Used: Python, Regex, NLTK, Gensim, Scikit-Learn, tSNE, pyLDAvis, bokeh, Git
- This NLP Project aims to use statistical models to reveal the abstract “topics” present in a large set of text documents, thus trying to classify documents based on different themes they convey.
- Three topic modeling algorithms were used: Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF).
- BERTopic and Top2Vec were also explored and gave quite good results.
Medical Embeddings and Clinical Trial Search Engine | GitHub
- Technologies Used: Python, Gensim, Word2Vec, FastText, Streamlit, Git
- The project trains SkipGram and FastText models on a COVID-19 clinical trials dataset and builds a search engine where a user can type any COVID-19 related keyword and get the top n most similar results from the dataset.
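
The retrieval step of such a search engine can be sketched as a cosine-similarity ranking over embedding vectors. This is an illustration under stated assumptions, not the project's implementation: the helper names `cosine` and `top_n`, the document ids, and the 3-dimensional vectors are all hypothetical, and a trained Word2Vec or FastText model would supply the real vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(query_vec, doc_vecs, n=2):
    """Return the ids of the n documents most similar to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:n]]


# Hypothetical 3-dimensional embeddings; real embeddings have far more dimensions.
doc_vecs = {
    "trial_a": [0.9, 0.1, 0.0],
    "trial_b": [0.1, 0.9, 0.1],
    "trial_c": [0.8, 0.2, 0.1],
}
results = top_n([1.0, 0.0, 0.0], doc_vecs, n=2)
```

Here the query vector points along the first axis, so the two trials whose vectors lean that way rank first.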

Computer Vision Projects

Image Coloring using Autoencoders | GitHub
- Technologies Used: Python, TensorFlow, Keras, scikit-image, matplotlib, numpy
- I tried using autoencoders and transfer learning for this one, with VGG16 and InceptionResNetV2 as the encoder/feature extractor layer and a custom decoder layer.
Multi-class Image Classification Model | GitHub
- Technologies Used: Python, TensorFlow, Keras, matplotlib, Flask, gunicorn, pathlib, numpy
- The project classifies images into three categories (driving license, social security, and others) using a CNN model architecture.
- An accuracy of 96% was achieved on test data of 150 images. Deployment was done using gunicorn and a Flask API.

Machine Learning & Python Projects

Basic Library Management System API | GitHub
- Technologies Used: Python, FastAPI, Pydantic, MongoDB, Docker, GCP
- This project implements a RESTful API for a Library Management System using FastAPI with MongoDB Atlas as the database, deployed as a Docker image on GCP.
Business License Status Prediction | GitHub
- Technologies Used: Python, scikit-learn, h2o, tensorflow, flask, gunicorn
- The project aims to predict if a customer's license should be issued, renewed, or cancelled depending on features in the dataset. The problem statement was presented at ZS Data Science Challenge - 2019.
Medical Data Extraction Project | GitHub
- Technologies Used: Python, Regex, OpenCV, Pytesseract, FastAPI
- The Python backend was built using pytesseract, OpenCV, and regular expressions, with FastAPI as the web serving framework.
- Automatically extracted important fields from patient details and medical prescriptions. Image processing was performed in OpenCV, and pytesseract was then used for image-to-text conversion. The last step used regular expressions (regex) to extract the important fields from the text.
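
The final regex step can be sketched as follows. This is a minimal illustration, not the project's code: the field names, the patterns in `PATTERNS`, and the sample text standing in for OCR output are all hypothetical, and the OpenCV and pytesseract stages are left out.

```python
import re

# Hypothetical field patterns; real forms would need patterns tuned to
# their layout and to typical OCR errors.
PATTERNS = {
    "name": re.compile(r"Name:\s*(.+)"),
    "phone": re.compile(r"Phone:\s*(\(?\d{3}\)?[\s-]?\d{3}-\d{4})"),
    "date": re.compile(r"Date:\s*(\d{1,2}/\d{1,2}/\d{4})"),
}


def extract_fields(text):
    """Return the first match for each field, or None when it is absent."""
    fields = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[field] = match.group(1).strip() if match else None
    return fields


# Made-up text standing in for OCR output.
ocr_text = """Name: Jane Doe
Date: 5/12/2021
Phone: (555) 123-4567
"""
fields = extract_fields(ocr_text)
```

Returning `None` for a missing field, rather than raising, lets the caller decide how to handle scans where OCR dropped a line.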
SQL Project: Provide Insights to Management in Consumer Goods Domain
- Project Github Link & Certificate of Participation
Credit Card Default Prediction | GitHub
- This is a classic credit card default prediction project: based on the customer profile, the goal is to predict whether or not the borrower is likely to default in the next 2 years, with default taken to mean a delinquency of more than 3 months.
- Logistic Regression, Random Forest, XGBoost, LightGBM, and a vanilla neural network were implemented in modeling.
Regression Models for House Price Prediction | GitHub
- House price prediction on a Pune real-estate dataset using different regression models: Linear, Ridge, Lasso, Elastic Net, Random Forest, XGBoost, K-Nearest Neighbours, and Support Vector Regressor.
- A multi-layer perceptron (MLP) was also implemented using TensorFlow.
Kaggle House Price Prediction | Link
- My very first Project.

Knowledge Repo

NLP with TensorFlow

My Notes from the book Natural Language Processing with TensorFlow, 2nd-ed. by Thushan Ganegedara
Things I have become familiar with:
- Word Embeddings
- Project: Sentence Classification using CNN
- RNNs, LSTMs, GRUs
  - Project: NER with RNNs
- Seq2Seq Learning, Language Modelling, Neural Machine Translation(NMT)
  - Project: Neural Machine Translation: English to German
  - Project: Language Modelling: Generating Text using LSTMs
- Currently learning Transformers:
  - Project: QnA with BERT using HuggingFace

Machine Learning with PyTorch and Scikit-Learn

My Notes from Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka.
Things covered so far:
- Perceptron, Gradient Descent
- Logistic Regression, Decision Tree, SVM, KNN
- Feature Selection, Regularization(L1 & L2)
- Dimensionality Reduction: PCA, LDA
- Model Evaluation & HyperParameter Tuning
- Ensemble Learning: Bagging, Boosting
- Sentiment Analysis, Topic Modelling

Deep Learning with TensorFlow and Keras

My Notes from the book Deep Learning with TensorFlow and Keras, 3rd Edition
Will cover selective topics from this book

Machine Learning using Python

Notes from Machine Learning using Python by Manaranjan Pradhan, U Dinesh Kumar
This was the very first ML book I read.

About Me

📖 I'm interested in NLP & ML Engineering and looking forward to building my career there. I document my learning on GitHub and share it with the LinkedIn AI Community.
🕵🏼‍♂️ Besides my studies, I'm interested in learning about myself from a spiritual & psychological perspective.
👀 Looking for my first full-time role as a Machine Learning Engineer, preferably starting with an internship.
👉🏼 Priority For Me: I'm looking for a fun work environment, especially a mentor under whom I can work and learn a lot of stuff, one who is willing to commit to me just as I will, and one who sees my potential.
⭐ Open to Remote Opportunities (both Internationally & within India)
😃 Contact me if you find me interesting. I'm active on LinkedIn🌼

Education

BS in Data Science & Application (CGPA: 8.5) | IIT Madras | 2021-2025 (Expected)
12th Std. CBSE Board (Percentage: 86.8%) | Star International School, Ranchi, JH | 2020
