A daily practice series where I work on different datasets, apply machine learning techniques, and keep improving my skills in data preprocessing, modeling, and evaluation.
Alongside weekday data practice, every Saturday and Sunday I focus on fine-tuning large language models (LLMs) to strengthen my skills in modern NLP and transfer learning.
The main goal is to stay consistent, learn by doing, and build a portfolio that shows real progress.
- Practice data analysis, cleaning, and visualization every day.
- Experiment with ML/DL models on diverse datasets.
- Perform LLM fine-tuning on weekends (Sat–Sun).
- Build a collection of reproducible notebooks for interviews and learning.
- Develop the habit of hands-on problem solving.
Each day's folder includes:
- Dataset / Source
- Exploratory Data Analysis (EDA)
- Data Cleaning steps (sometimes quick or minimal)
- Modeling & Evaluation
- Key Learnings
Weekend Folders (Sat–Sun):
- Fine-tuning experiments with models such as BERT, Qwen, and other GPT-style models (a minimal example follows this list)
- Training scripts & configs
- Evaluation reports
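As a rough illustration of what these experiments look like, here is a minimal LoRA setup using the `peft` library. The model name, rank, and target modules are illustrative assumptions, not values from any specific weekend run:

```python
# Minimal LoRA sketch with peft; model name and hyperparameters are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # example model

# LoRA trains small low-rank adapter matrices instead of the full model weights.
lora = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

QLoRA experiments follow the same pattern, with the base model loaded in 4-bit before the adapters are attached.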
This repo is about learning and consistency.
Some notebooks may have:
- Rough or rushed cleaning steps.
- Quick experiments instead of polished code.
- Areas marked for future improvement.
The focus is on showing progress, not perfection.
- Data preprocessing (missing values, scaling, encoding; see the workflow sketch after this list)
- ML algorithms (Logistic Regression, Random Forests, SVMs, Neural Nets, etc.)
- Deep learning basics (CNNs, RNNs, Transformers)
- LLM fine-tuning (LoRA, QLoRA, PEFT, TRL)
- Evaluation metrics (Accuracy, Precision, Recall, F1, ROC-AUC, BLEU, Perplexity)
- Iterative experimentation (Kaggle-style approach)
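For context, a typical weekday notebook follows roughly this preprocessing → modeling → evaluation shape. This is a minimal scikit-learn sketch with made-up file and column names, not code from any specific notebook:

```python
# Minimal sketch of a weekday workflow; dataset path and columns are placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("data.csv")                         # placeholder dataset
X, y = df.drop(columns="target"), df["target"]
numeric, categorical = ["age", "income"], ["city"]   # example columns

# Preprocessing: impute missing values, scale numerics, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

clf = Pipeline([("prep", preprocess),
                ("model", RandomForestClassifier(n_estimators=200, random_state=42))])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf.fit(X_train, y_train)

# Evaluation: per-class precision/recall/F1, plus ROC-AUC (assumes a binary target).
print(classification_report(y_test, clf.predict(X_test)))
print("ROC-AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```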
- Shows commitment to daily practice
- Covers a wide variety of datasets and problems
- Adds LLM fine-tuning experience for cutting-edge NLP skills
- Demonstrates growth over time
- Useful for interview discussions on approach and learnings
- Clean up and refine selected projects.
- Add automated evaluation helpers.
- Expand fine-tuning to more advanced LLMs.
- Share selected notebooks as Kaggle kernels or blog posts.
🔗 This repo represents my journey of working with data every day and fine-tuning models every weekend, moving steadily closer to becoming an AI/ML scientist.