Welcome to my Deep Learning Tasks repository! 🚀
This repo contains a collection of completed deep learning exercises and experiments, implemented in Jupyter Notebooks (.ipynb). Each notebook explores a different aspect of deep learning — from fundamentals to advanced architectures.
All tasks were completed as part of the Deep Learning course (minor: Intellectual Data Analysis) and the Deep Learning-2 course at HSE.
Each notebook is self-contained; imports are provided inline, and main libraries are also listed in requirements.txt.
```
├── DL1_introductory_tasks.ipynb
├── DL2_image_classification.ipynb
├── DL3_text_classification.ipynb
├── DL4_transformers_NER.ipynb
├── DL5_image_segmentation.ipynb
├── DL6_diffusion_models.ipynb
├── DL7_NN_CanGen.ipynb
├── requirements.txt
├── LICENSE
└── README.md
```
| Notebook | Topic | Key Concepts | Notes |
|---|---|---|---|
| `DL1_introductory_tasks.ipynb` | 🔰 Introduction to Deep Learning | PyTorch basics, tensors, autograd | Introductory exercises to get familiar with PyTorch and essential DL libraries. |
| `DL2_image_classification.ipynb` | 🌱 Plant Species Classification | CNNs, custom architectures, optimization | Built and trained CNNs in PyTorch for image classification and experimented with model customization and training strategies. |
| `DL3_text_classification.ipynb` | 📰 News & Comment Classification | NLP, text classification, Hugging Face models | Trained models to classify news articles, predicted categories for unseen items, applied sentiment analysis with Hugging Face, and built analytics on the most positive/negative news and comment categories. |
| `DL4_transformers_NER.ipynb` | 🏷️ Named Entity Recognition (NER) with Transformers & LLMs | Tokenizer-independent NER, BIO tagging, HuggingFace token classification, DataCollator, span alignment, LLM-assisted annotation, Optuna tuning | Built a full NER pipeline from scratch: reconstructed BIO labels into tokenizer-independent spans, aligned character-level entities to BPE tokens, and tokenized datasets for model training. Used Qwen-2.5 7B-Instruct to generate synthetic annotations, implemented strict validation, retry logic, and span post-processing, and merged valid LLM-generated samples into the training set. Fine-tuned BAAI/bge-small-en-v1.5 with the HuggingFace Trainer, evaluated token-classification metrics, and achieved strong results even with limited synthetic annotation. (See the span-alignment sketch below the table.) |
| `DL5_image_segmentation.ipynb` | 🧠 Image Segmentation (U-Net, LinkNet) | Encoder–decoder architectures, skip connections, VGG backbones, loss engineering, deep supervision, post-processing, experiment tracking | Implemented U-Net and LinkNet from scratch with a VGG13 encoder. Explored architectural refinements (residual decoder blocks, batch normalization), advanced optimization strategies (BCE + Dice loss scheduling, deep supervision), and Albumentations-based data augmentation. Logged training with TensorBoard, performed systematic ablations, and improved validation IoU from the baseline U-Net to 0.92+ with post-processing via morphological operations. (See the loss sketch below the table.) |
| `DL6_diffusion_models.ipynb` | 🌫️ Diffusion Models | Forward–reverse stochastic processes, noise schedules, score matching, denoising diffusion, conditional generation, variance scheduling, training stability, visualization of trajectories | Implemented and analyzed DDPM-style diffusion from first principles. Visualized forward noising trajectories (Swiss Roll and MNIST) and compared linear, quadratic, and sigmoid noise schedules. (See the forward-noising sketch below the table.) |
| `DL7_NN_CanGen.ipynb` | 🎧 Neural Candidate Generation (SASRec) | Sequential recommendation, causal Transformers, in-batch negative sampling, LogQ debiasing, ranking metrics, temporal splitting, dataset engineering | Built an end-to-end neural candidate generation pipeline on Yandex Yambda. Performed time-based train/val/test splits, item remapping to dense IDs, and autoregressive sequence dataset construction with sliding prefixes. Implemented a SASRec backbone with causal masking, padding-aware attention, and variable-length handling. Added two training objectives: random negatives and in-batch negatives with LogQ correction for popularity bias. Implemented HitRate@k and DCG@k evaluation, batched collation utilities, and efficient masking. Demonstrated how loss choice affects ranking quality and catalog coverage. (See the LogQ loss sketch below the table.) |
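As a taste of the span-alignment step behind `DL4_transformers_NER.ipynb`: a minimal sketch of mapping character-level entity spans onto per-token BIO labels via tokenizer offset mappings. The function name, `IGNORE_INDEX`, and the toy label map are illustrative, not copied from the notebook.

```python
# Minimal sketch: align character-level entity spans to tokenizer tokens.
# Assumes entities come as (start_char, end_char, label) tuples; a token is
# labeled only if it falls entirely inside an entity (a simplification).
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # standard HF convention: exclude these tokens from the loss

def align_spans_to_tokens(text, spans, tokenizer, label2id):
    enc = tokenizer(text, return_offsets_mapping=True, truncation=True)
    labels = []
    for start, end in enc["offset_mapping"]:
        if start == end:                      # special tokens ([CLS], [SEP])
            labels.append(IGNORE_INDEX)
            continue
        tag = "O"
        for s, e, ent in spans:
            if start >= s and end <= e:       # token lies inside the entity
                tag = ("B-" if start == s else "I-") + ent
                break
        labels.append(label2id[tag])
    enc["labels"] = labels
    enc.pop("offset_mapping")
    return enc

tok = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
label2id = {"O": 0, "B-ORG": 1, "I-ORG": 2}
print(align_spans_to_tokens("HSE hosts the course.", [(0, 3, "ORG")], tok, label2id))
```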
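For `DL5_image_segmentation.ipynb`, a minimal sketch of a combined BCE + Dice objective. The fixed `bce_weight` here is an assumption; the notebook schedules the mix over training.

```python
# Minimal sketch of a combined BCE + Dice loss for binary segmentation.
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, targets, bce_weight=0.5, eps=1e-6):
    """logits, targets: (N, 1, H, W); targets are binary masks in {0, 1}."""
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    # Soft Dice per sample: 2|P∩T| / (|P| + |T|), averaged over the batch
    inter = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    dice = 1 - ((2 * inter + eps) / (union + eps)).mean()
    return bce_weight * bce + (1 - bce_weight) * dice

logits = torch.randn(2, 1, 64, 64)
masks = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(bce_dice_loss(logits, masks).item())
```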
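For `DL6_diffusion_models.ipynb`, a minimal sketch of the closed-form forward process q(x_t | x_0) with a linear beta schedule. The schedule bounds are the common DDPM defaults, not necessarily the notebook's settings.

```python
# Minimal sketch of DDPM forward noising with a linear beta schedule.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    if noise is None:
        noise = torch.randn_like(x0)
    abar = alphas_cumprod[t].view(-1, *([1] * (x0.dim() - 1)))
    return abar.sqrt() * x0 + (1 - abar).sqrt() * noise

x0 = torch.randn(8, 2)                           # e.g. Swiss Roll points
xt = q_sample(x0, t=torch.randint(0, T, (8,)))
print(xt.shape)
```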
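For `DL7_NN_CanGen.ipynb`, a minimal sketch of in-batch sampled softmax with LogQ correction. The dot-product similarity and the precomputed `item_log_q` (log of each item's in-batch sampling probability, e.g. empirical popularity) are assumptions, not the notebook's exact implementation.

```python
# Minimal sketch: in-batch negatives with LogQ correction for popularity bias.
# Each row i treats item i as the positive and the other in-batch items as
# negatives; subtracting log q(item) debiases frequently sampled items.
import torch
import torch.nn.functional as F

def inbatch_logq_loss(user_emb, item_emb, item_log_q):
    """user_emb, item_emb: (B, d) matched pairs; item_log_q: (B,)."""
    logits = user_emb @ item_emb.t()              # (B, B) similarity matrix
    logits = logits - item_log_q.unsqueeze(0)     # LogQ correction
    labels = torch.arange(user_emb.size(0))       # positives on the diagonal
    return F.cross_entropy(logits, labels)

B, d = 32, 64
loss = inbatch_logq_loss(torch.randn(B, d), torch.randn(B, d),
                         torch.rand(B).log())
print(loss.item())
```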
To run the notebooks locally, you’ll need Python 3.8+ and the dependencies listed in requirements.txt.
```bash
git clone https://github.com/yourusername/deep-learning-tasks.git
cd deep-learning-tasks
pip install -r requirements.txt
```

Or, open directly in Google Colab.
- Launch Jupyter Notebook or JupyterLab:

  ```bash
  jupyter notebook
  ```

- Open the notebook of interest (e.g., `DL1_introductory_tasks.ipynb`).
- Run cells step by step to explore code, results, and commentary.
Each notebook includes:
- Explanations of the approach
- Training/validation metrics (accuracy, loss curves)
- Key visualizations (e.g., confusion matrices, generated images)
- Reflections on results and limitations
Main libraries used across notebooks:
- PyTorch
- NumPy, Pandas
- Matplotlib, Seaborn
- scikit-learn
See requirements.txt for the full list.
Planned extensions:
- More tasks on NLP and transformers
- Advanced optimization and hyperparameter tuning experiments
- Applied case studies (healthcare, finance, etc.)
This repository is released under the MIT License.
Feel free to fork, explore, and build upon it!
Created by Anastasiia Lapshina.
Feel free to reach out via GitHub Issues if you’d like to collaborate or discuss ideas.