GitHub-User228/README.md

Hi there 👋 I am Egor

Here are the tools and technologies that I work with or have worked with:

Ubuntu, MLflow, Apache Airflow, DVC, Apache Spark, Docker, FastAPI, Uvicorn, Redis, Prometheus, Grafana, Python, HTML5, JavaScript, S3, Postgres, Apache Hadoop, NumPy, Pandas, SciPy, Matplotlib, Seaborn, OpenCV, PIL, Pydantic, NVIDIA, cuDF, cuML, Optuna, Flask, scikit-learn, LightGBM, XGBoost, CatBoost, implicit, PyTorch, transformers, Hugging Face

Internal Projects


TNavigator Data Parser and Manager

This repository is a Python library for working with TNavigator data. It lets you parse certain data and manage it. Optionally, you can use the library via a CLI. The repository is closed for now.

Libraries & Tools utilised

  1. Python
  2. NumPy
  3. Pandas
  4. Pydantic
  5. NetworkX
  6. ecl_data_io
  7. json
  8. yaml
  9. click
  10. logging

Pet-projects


Recommendation System in Electronic Commerce

This repository shows how to build a recommendation system and deploy it as an ML microservice. The system works with data from ecommerce-dataset, and the goal is to increase the number of add_to_cart events.

Libraries & Tools utilised

  1. Apache Airflow
  2. MLflow
  3. Python
  4. NetworkX
  5. Pydantic
  6. scikit-learn
  7. CatBoost
  8. LightGBM
  9. XGBoost
  10. implicit
  11. Docker
  12. FastAPI
  13. Redis
  14. Prometheus
  15. Grafana
  16. Uvicorn


Music Recommendation System

This repository shows how to build a recommendation system and deploy it as an ML microservice. The system works with data from Yandex Music, and the goal is to recommend new tracks to users.

Libraries & Tools utilised

  1. MLflow
  2. Spark
  3. Python
  4. Pydantic
  5. scikit-learn
  6. CatBoost
  7. implicit
  8. Docker
  9. FastAPI
  10. Redis
  11. Prometheus
  12. Grafana
  13. Uvicorn


FastAPI ML Microservice Deployment

This repository covers all the steps for deploying an ML microservice using FastAPI, Python, Docker and Redis, and for monitoring it via Prometheus and Grafana. The microservice can reject requests when there are too many of them, validate the input used by the ML model (e.g. a feature must be within a certain range) and return a response that describes the nature of any error that occurs.

Libraries & Tools utilised

  1. Docker
  2. FastAPI
  3. Redis
  4. Prometheus
  5. Grafana
  6. Python
  7. Pydantic
  8. Uvicorn
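
The three behaviours described for this microservice (rejecting excess requests, range-checking a feature, and returning a descriptive error) can be sketched without any framework. This is a minimal, hypothetical illustration using only the standard library, not the code from the repository: the class `SlidingWindowLimiter`, the feature name `area`, its range, the request limits and the status codes are all assumptions chosen for the example. In the actual project, FastAPI, Pydantic and Redis would presumably play these roles.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SlidingWindowLimiter:
    """Allow at most `max_requests` calls per `window_seconds` (hypothetical limits)."""

    max_requests: int = 3
    window_seconds: float = 60.0
    _timestamps: deque = field(default_factory=deque, init=False, repr=False)

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Forget requests that have fallen out of the time window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True


# Hypothetical valid range for a single model feature.
FEATURE_RANGES = {"area": (10.0, 500.0)}


def validate_features(features):
    """Return a list of readable problems; an empty list means the input is valid."""
    errors = []
    for name, (low, high) in FEATURE_RANGES.items():
        if name not in features:
            errors.append(f"missing feature '{name}'")
        elif not low <= features[name] <= high:
            errors.append(f"feature '{name}'={features[name]} is outside [{low}, {high}]")
    return errors


def handle_request(features, limiter, now=None):
    """Return (status_code, body), mimicking an HTTP response."""
    if not limiter.allow(now):
        return 429, {"error": "too many requests"}
    errors = validate_features(features)
    if errors:
        return 422, {"error": "invalid input", "details": errors}
    # Placeholder: a real service would call the ML model here.
    return 200, {"prediction": None}
```

Note that in this sketch an invalid request still counts toward the rate limit; whether that is desirable depends on the service.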


Automatic ETL Project for Realty Data From Yandex

This repository contains a custom Extract, Transform, Load (ETL) pipeline that uses Docker, PostgreSQL, Python and Cron to model an automatic ETL process. Realty data from Yandex is automatically parsed and then processed via the pipeline.

Libraries & Tools utilised

  1. Docker
  2. CronJob
  3. Postgres
  4. Python
  5. Pandas
  6. SQLAlchemy
  7. Requests
  8. BeautifulSoup
  9. Geopy


AutoParser

An example of how to automate the parsing of news RSS feeds (or, with certain modifications, any news site) using cron jobs and, optionally, proxies, which are also parsed dynamically.

Libraries & Tools utilised

  1. CronJob
  2. Python
  3. crontab
  4. feedparser
  5. bs4
  6. proxy_parse
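
The core of such a parser can be sketched as a function that extracts entries from an RSS document and a filter that keeps only entries not seen on earlier runs. This is a hypothetical illustration, not the repository's code: it uses the standard-library `xml.etree.ElementTree` in place of feedparser so that it runs without dependencies, and `SAMPLE_RSS`, `parse_rss` and `new_entries` are names invented for the example. A cron entry such as `*/30 * * * * python3 /path/to/script.py` (path hypothetical) would run it every 30 minutes.

```python
import xml.etree.ElementTree as ET

# Inline sample feed so the sketch runs offline; a real run would download the XML.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/news/1</link>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/news/2</link>
      <pubDate>Mon, 01 Jan 2024 11:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss(xml_text):
    """Extract title, link and publication date from every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append(
            {
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
                "published": (item.findtext("pubDate") or "").strip(),
            }
        )
    return entries


def new_entries(entries, seen_links):
    """Return entries whose link has not been seen yet and record them in `seen_links`."""
    fresh = [entry for entry in entries if entry["link"] not in seen_links]
    seen_links.update(entry["link"] for entry in fresh)
    return fresh


if __name__ == "__main__":
    seen = set()
    for entry in new_entries(parse_rss(SAMPLE_RSS), seen):
        print(entry["published"], entry["title"], entry["link"])
```

Because a scheduled job starts a new process each time, the set of seen links would need to be persisted between runs (for example in a file or database) for the filter to be useful.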


Hybrid Model for Russian News Sentiment Analysis

An implementation of a hybrid model for Russian news sentiment analysis, based on neural networks and a stacking approach. The model can be used to predict the sentiment of news text.

Libraries & Tools utilised

  1. PyPI
  2. Python
  3. PyTorch
  4. transformers
  5. joblib


DL-FastAPI-App

A deep learning application built using FastAPI, Flask and Docker. It allows users to transform images based on textual prompts through a frontend service built with Flask. The application uses state-of-the-art models from Hugging Face to perform the image transformations, making it a useful tool for various image processing tasks.

Libraries & Tools utilised

  1. Docker
  2. FastAPI
  3. Flask
  4. JavaScript
  5. HTML5
  6. Python
  7. PyTorch
  8. transformers
  9. PIL
  10. OpenCV
  11. CUDA


MLE-Airflow

This project gives a simple example of how to use Apache Airflow to manage ML workflows, based on a telecom company churn dataset stored in a PostgreSQL database. Specifically, it covers how to build an ETL pipeline using DAGs, plugins, hooks and callbacks (to Telegram).

Libraries & Tools utilised

  1. Docker
  2. Airflow
  3. S3
  4. Postgres
  5. Python
  6. Pandas
  7. SQLAlchemy
  8. Requests
  9. Telegram


MLE-DVC

A simple example of how to use DVC for logging ML models, based on a telecom company churn dataset stored in a PostgreSQL database.

Libraries & Tools utilised

  1. Docker
  2. DVC
  3. S3
  4. Postgres
  5. Python
  6. Pandas
  7. scikit-learn
  8. CatBoost
  9. joblib
  10. SQLAlchemy


ML Project with Airflow and DVC

This project uses both Airflow and DVC. Airflow is used to automate ETL pipelines, while DVC is used for logging ML models. The dataset is based on realty data from Yandex. The dataset is stored within S3 storage in a PostgreSQL database.

Libraries & Tools utilised

  1. Docker
  2. Airflow
  3. DVC
  4. S3
  5. Postgres
  6. Python
  7. Pandas
  8. scikit-learn
  9. CatBoost
  10. joblib
  11. SQLAlchemy
  12. Telegram


MLE MLflow Project

This project addresses the business problem of improving the key metrics of a model that predicts the value of Yandex Real Estate flats. The goal is to make the training process and other related processes easily repeatable and to improve the key model metrics that affect the company's business metrics, particularly the number of successful transactions. The MLflow framework is used to run a large number of experiments and ensure reproducibility.

Libraries & Tools utilised

  1. MLflow
  2. S3
  3. Postgres
  4. Python
  5. Pandas
  6. scikit-learn
  7. CatBoost
  8. joblib
  9. Pydantic


Special Tools for Minecraft

An implementation of special tools for Minecraft, built on top of mcpi. They can be used to build any photo directly in the game.

Libraries & Tools utilised

  1. PyPI
  2. Python
  3. mcpi
  4. OpenCV


Kaggle competitions

Here are repositories which are related to Kaggle competitions:

  1. Predict Future Sales
  2. Store Sales - Time Series Forecasting

PS: I have competed in many more Kaggle competitions, but the corresponding code is lost somewhere on my local machine. I will probably redo the code in the future and publish new repositories.

Popular repositories

  1. mcpi2 (Python)
  2. AutoParser (Python)
  3. Industrial_ML (Jupyter Notebook)
  4. EvolutionaryComputing (Jupyter Notebook)
  5. ContinuousMathematicalModelling (Jupyter Notebook)
  6. AdvancedBigDataProject (Jupyter Notebook)