Fields of interest: LLM, NLP, RL, Graphs, Distributed Systems
My Telegram channel: Cat's Shredinger
- Languages: Python, SQL
- DS/ML/DL: SkLearn, PyTorch, Transformers
- Big Data: Hadoop, Spark
- DevOps: Linux, Git, Docker
Job Position | Company | Field | Work Period |
---|---|---|---|
Head of AI Transformation | Social Discovery Group | LLM, Conversational AI | 2024-05 – now |
Research Scientist Lead | SberDevices | LLM, GigaChat | 2023-04 – 2024-05 |
NLP Team Lead | SberDevices | Search, Information Retrieval | 2022-10 – 2023-04 |
NLP Tech Lead | Sber AI Lab | NLP, MLOps, Mentoring | 2021-05 – 2022-10 |
Senior NLP Engineer | Tinkoff AI Lab | Virtual Assistant "Oleg" | 2021-02 – 2021-04 |
Middle NLP Engineer | MTS AI Lab | NER with Pseudo-Labeling | 2020-05 – 2021-02 |
Junior Data Scientist | Sberbank | ML with Tabular Data, CV | 2018-07 – 2020-05 |
- Master's Degree @ Lomonosov Moscow State University (2019 - 2023)
- Bachelor's Degree @ Plekhanov Russian University of Economics (2015 - 2019)
- MUSE TF -> PT - convert Multilingual Universal Sentence Encoder from TensorFlow to PyTorch and ONNX
- QaNER - unofficial implementation of QaNER paper (NER via Extractive Question Answering)
- RLLib - Reinforcement Learning library
- MUSE as Service - REST API for sentence embedding using Multilingual Universal Sentence Encoder
- PyTorch NER - pipeline for training NER models using PyTorch
- Text Classification Baseline - pipeline for building text classification TF-IDF + LogReg baselines
- Graph-Based Clustering - clustering using graph connected components and spanning trees
- From Model to Service: Flask + Gunicorn + Docker @ Sberloga
- QaNER - NER via Extractive QA @ Sberloga
- Git Hooks Is All You Need @ Sberloga
- Web-Service for Sentence Embeddings @ Sberloga
- How to start a career in DS @ REU Data Science Club
- Practical Reinforcement Learning (with honors) @ Coursera
- Introduction to Deep Learning (with honors) @ Coursera
- Bayesian Methods for Machine Learning (with honors) @ Coursera
- Hadoop. System for processing large amounts of data @ Stepik
- deNews @ ETHOnline 2022
- Alzheimer's MRI Analysis @ Synthetic Health Data Hackathon 2020
- Key contributor to GigaChat: Russia's most advanced LLM
- 500+ stars on GitHub and 10 packages on PyPI with 38k+ downloads
- Contributor to PyTorch, Scikit-Learn, SciPy
- Open Data Science Best Contributor 2020
More information on my LinkedIn