I am a Software Engineer passionate about all facets of data, including Data Science, Machine Learning, Deep Learning, Data Engineering, and Data Analytics.
- 🌱 I’m currently learning Snowflake and AWS technologies
- 💬 Ask me about Relational Databases, Machine Learning, Deep Learning, LLMs, and NLP
Code Word Detection By Large Language Models
Tech Stack: Python, Pandas, Hugging Face, OpenAI, Vertex AI, FastText
Abstract: Various machine learning and deep learning models have been developed over the years for code word detection, each with its own complexities, such as the need for a large training dataset, intricate preprocessing steps, and substantial hardware resources.
This paper explores the code word detection capabilities of Large Language Models (LLMs) such as GPT-3 and PaLM, among others. Through several experiments, I show that LLMs can achieve state-of-the-art results on code word detection tasks without extensive training or fine-tuning. The LLMs significantly outperform a baseline model trained on the same dataset that was used to test them.
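To give a feel for what detection without fine-tuning can look like, here is a minimal sketch of a zero-shot prompt plus a reply parser. It assumes a "code word" is a term used with a hidden meaning; the prompt wording, the comma-separated reply format, and the helper names `build_prompt` and `parse_response` are illustrative assumptions, not the setup used in the paper. The call to an actual model is deliberately left out.

```python
from typing import List


def build_prompt(sentence: str) -> str:
    """Build a zero-shot prompt asking a model to list code words in a sentence.

    The instruction text is a hypothetical example, not the paper's prompt.
    """
    return (
        "Identify any code words (terms used with a hidden meaning) "
        "in the sentence below.\n"
        "Reply with a comma-separated list of the code words, "
        "or NONE if there are none.\n\n"
        f"Sentence: {sentence}\n"
        "Code words:"
    )


def parse_response(text: str) -> List[str]:
    """Turn a comma-separated model reply into a clean, lowercased list."""
    cleaned = text.strip()
    if not cleaned or cleaned.upper() == "NONE":
        return []
    return [word.strip().lower() for word in cleaned.split(",") if word.strip()]
```

The string returned by `build_prompt` would be sent to whichever LLM is under test, and the raw reply passed through `parse_response` before scoring against the labelled data.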
Tech Stack: Python, Pandas, NumPy, and scikit-learn, used to implement column transformers, pipelines, PCA, hyperparameter tuning, ensemble learning, and model training and testing in ML projects.
Keras and TensorFlow, used in Deep Learning projects to build neural networks with dropout layers, regularization, hyperparameter tuning, weight initialization, and early stopping.
Spotify Song Popularity Prediction
Breast Cancer Survival Prediction
Goodreads Book Review Rating Prediction using BiLSTM (NLP Problem)
Pneumonia Prediction in Chest X-Ray using CNN, VGG16
Dogs vs Cats Classifier using CNN, VGG16 & Data Augmentation Techniques
Leaf Disease Classifier using CNN
Titanic Survival Prediction
Airline Passenger Satisfaction Prediction
Bangalore House Price Prediction Website
Tech Stack: Python, LangChain, OpenAI APIs, Hugging Face libraries, TogetherAI APIs, Chroma Vector DB
Retrieval Augmented Generation (RAG) QA over custom files using Google AI
Retrieval Augmented Generation (RAG) QA over custom files using open-source embeddings and LLM
Fact Search Assistant using LangChain Tools & Agents
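The retrieval step behind the RAG projects above can be sketched roughly as follows. This is a toy, standard-library-only illustration: it swaps in word-count vectors and cosine similarity for the learned embeddings and vector database used in the actual projects, and the names `embed`, `cosine`, `retrieve`, and `build_rag_prompt` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question: str, chunks: List[str], k: int = 2) -> str:
    """Place the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In a real pipeline, `embed` would be replaced by an embedding model and `retrieve` by a vector store query; the prompt assembly stays conceptually the same.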
Tech Stack: Python, R, and R libraries such as GA, quantmod, gramEvol, and TTR
Algorithmic Trading using R
Portfolio Optimisation using R
Test Case Prioritisation
Test Data Generation
Tech Stack: Apache Spark, SparkML
Exploratory Data Analysis of the Customer Order & Fakefriends datasets using PySpark
Tech Stack: Django, FastAPI
To-Do App