- Personal Information
- Summary
- Languages
- References
- Skills
- Professional Experience
- Education
- Certifications
- Publications
- Full Name: Irina Ryndova
- Address: Antalya, Turkey (Remote Only)
- Phone (WhatsApp, Telegram): +905342342398
- Email: ryndovaira@gmail.com
- LinkedIn: ryndova-irina
- GitHub: ryndovaira
- Kaggle: ryndova
Machine Learning Engineer with over 9 years of experience, combining a strong background in software engineering with expertise in AI-driven solutions. Skilled in all stages of machine learning workflows, including data preprocessing, model development, and deployment. Recent projects include retrieval-augmented generation (RAG) systems and applications in healthcare. Equally comfortable working independently or as part of a team, with a collaborative and methodical approach to problem-solving. AWS-certified and open to expanding expertise into new areas.
- English: B2+ (Upper Intermediate)
- Russian: Native
A PDF of the recommendation letter is available via Google Drive. Further details, including referee contact information, can be provided upon request.
- Programming Languages: Python, SQL
- Databases: Relational (MySQL, SQLite), NoSQL (MongoDB), Vector (Pinecone, LlamaIndex, Chroma)
- Libraries & Frameworks:
  - Core ML Libraries: NumPy, Pandas, Scikit-learn, PyTorch, Keras
  - NLP & Specialized Tools: Hugging Face, LangChain, FAISS
  - Visualization: Matplotlib, Seaborn, Plotly
  - Others: FastAPI, MMDetection, lm-evaluation-harness, Supervisely
- Models & APIs: OpenAI API (ChatGPT), LLaMA 2/3, Gemini
- Machine Learning Techniques:
  - Retrieval-Augmented Generation (RAG)
  - Traditional / Deep Machine Learning
  - Natural Language Processing (NLP) / Natural Language Understanding (NLU)
  - Exploratory Data Analysis (EDA)
  - Data Processing & Analysis
  - Model Tuning and Evaluation
  - Data Visualization
- Operating Systems: Linux, Windows
- Development Tools: Docker, Docker Compose, Git, GitHub, GitLab, JupyterLab, Supervisely
- Infrastructure & Platforms: AWS, Azure, Google Cloud, IBM Cloud Pak for Data, Cerebras
- Experiment Tracking & Automation: MLflow, CI/CD Pipelines (GitHub Actions), Automated Testing
- Soft Skills:
  - Approachable and supportive colleague
  - Collaborative and methodical in problem-solving
  - Adaptable to diverse roles and team dynamics
- Domain Knowledge: Electronic Health Records (EHR)
- Programming Languages: C++, Java, R, SPARQL (AnzoGraph DB)
- Libraries & Frameworks: Cython, Qt, pyTelegramBotAPI
- Development Tools: Google Test, Mercurial, TeamCity
Company Name: Quantori LLC
Location: Remote
Dates of Employment: August 2023 – November 2024
Project: Development of a Chatbot System for Oncology Treatment Support
Technologies Used:
- Programming Languages: Python
- Libraries & Frameworks: Pandas, NumPy, Scikit-learn, Hugging Face Transformers, FAISS, Pinecone, LangChain, lm-evaluation-harness
- Machine Learning Models: OpenAI API (GPT-4), LLaMA 2/3, RAG pipeline
- Development Tools: JupyterLab, Pytest, FastAPI, Git, Docker (Docker Compose), Streamlit
- Infrastructure & Platforms: Linux, IBM Cloud Pak for Data, Cerebras
Responsibilities:
- Processed and structured raw datasets, including clinical records, genomic data, pathology reports, and laboratory results. Collaborated with domain experts to clarify medical terminology and resolve data inconsistencies.
- Performed EDA and created visualizations to analyze trends and improve data quality.
- Designed and implemented a RAG pipeline that combined OpenAI API models for public data with LLaMA models for private data to provide evidence-based treatment recommendations.
- Fine-tuned LLaMA models on Cerebras infrastructure for private data handled in compliance with HIPAA, and OpenAI API models for public data, to ensure appropriate tone, relevance, and clinical alignment.
- Deployed the chatbot on-premises using FastAPI and developed a basic interface with Streamlit for user interaction and testing.
- Stored vector embeddings with FAISS during prototyping and transitioned to Pinecone for scalable deployment.
- Configured CI/CD pipelines with GitHub Actions to automate deployment workflows.
- Conducted basic testing of chatbot responses, including tone, relevance, and functionality, refining outputs based on feedback.
- Translated high-level research ideas into actionable technical tasks. Prepared regular progress reports to track milestones and communicate findings effectively.
- Onboarded new team members and supported them during their initial project phases.
Company Name: Quantori LLC
Location: Remote
Dates of Employment: September 2022 – August 2023
Project: Classifying and Scoring Pulmonary Edema (a condition caused by excess fluid in the lungs) on Chest X-Ray Images
Technologies Used:
- Programming Languages: Python
- Libraries & Frameworks: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, MMDetection, Supervisely
- Experiment Tracking: MLflow
- Development Tools: JupyterLab, Pytest, Git, GitHub
- Infrastructure & Platforms: AWS (S3, EC2), Linux
Responsibilities:
- Processed labeled datasets provided by domain experts, converting them into COCO format to enable compatibility with MMDetection workflows.
- Collaborated on configuring MLflow for shared experiment tracking and performance monitoring, supporting reproducibility across the team.
- Assisted in conducting experiments with MMDetection, helping to evaluate and refine models for classifying and scoring edema features.
- Designed visualizations to present results and analyze model outputs, providing insights into model behavior and debugging processes.
- Provided software development support for the project, contributing to the research team’s goal of publishing a peer-reviewed study in Radiology Advances.
Company Name: Quantori LLC
Location: Remote
Dates of Employment: September 2021 – August 2022
Project: Discovering Genetic Patterns in Autoimmune Disease Patients
Technologies Used:
- Programming Languages: Python, SQL
- Libraries & Frameworks: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, HDBSCAN, SHAP
- Development Tools: JupyterLab, Pytest, Git, GitHub
- Infrastructure & Platforms: AWS (Aurora, S3, SageMaker), Linux
Responsibilities:
- Collaborated with researchers to analyze datasets from UK Biobank, supporting their efforts to uncover trends and generate insights into autoimmune diseases.
- Conducted extensive exploratory data analysis (EDA) to identify potential patterns and validate hypotheses generated by researchers.
- Performed patient segmentation through clustering techniques to group individuals with shared biological characteristics, aiding researchers in exploring sub-populations of interest.
- Applied SHAP to investigate feature importance during clustering, enhancing the interpretability and reliability of segmentation results.
- Retrieved and analyzed longitudinal electronic health records (EHR) from AWS Aurora databases to provide researchers with detailed data summaries.
- Automated experimental pipelines and implemented testing processes to ensure reliable workflows and consistent data quality.
- Studied autoimmune diseases, traditional treatments, and ICD-9/ICD-10 classifications to gain the necessary domain knowledge for clustering and data analysis tasks.
- Prepared detailed reports summarizing EDA and clustering results, enabling researchers to identify potential genetic patterns and areas for further study.
- Reviewed the work of junior engineers, provided guidance, and resolved technical challenges to maintain project momentum.
Company Name: Quantori LLC
Location: Remote
Dates of Employment: July 2021 – September 2021
Project: Organizing Bioinformatics Data and Project Resources
Technologies Used:
- Programming Languages: R
- Infrastructure & Platforms: AWS (S3, EC2), Linux
Responsibilities:
- Collaborated with the lead data engineer to structure raw bioinformatics data provided in diverse file formats, ensuring accessibility and usability for analysis.
- Organized and consolidated the client’s project resources, restructuring R scripts and related materials into a clear and centralized repository to enable effective collaboration.
- Reviewed and refined existing R scripts to clean and process data, ensuring compatibility with the project’s requirements.
- Worked with the client to retrieve missing files and resolve inconsistencies in the provided data and resources.
Company Name: LevelUp
Location: Remote (Russia)
Dates of Employment: May 2020 – July 2021
Technologies Used:
- Programming Languages: Python, SQL
- Libraries & Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Keras, XGBoost, LightGBM, Imblearn
- Development Tools: JupyterLab, Pytest, Git, GitHub
Responsibilities:
- Designed and developed comprehensive teaching materials, including lecture slides, coding exercises, and projects tailored for beginner-level students in small groups (10-20 participants).
- Delivered interactive lessons on Python fundamentals, object-oriented programming, and core data science workflows.
- Mentored students in Python scripting, data preprocessing, and feature engineering techniques, ensuring a strong foundation in practical programming skills.
- Evaluated homework assignments and projects to track student progress, providing constructive feedback to support their learning.
- Taught machine learning concepts, including regression models, decision trees, and ensemble techniques, with real-world examples.
- Introduced students to hands-on data manipulation, exploratory data analysis (EDA), and creating end-to-end pipelines using Python libraries.
- Guided students in using JupyterLab for effective coding workflows and Pytest for testing their code to keep results reproducible.
Location: Remote
Dates of Employment: October 2020 – June 2021
Summary:
Worked independently on diverse data science projects, focusing on developing machine learning models, performing data analysis, and extracting insights from complex datasets.
Technologies Used:
- Programming Languages: Python
- Libraries & Frameworks: Pandas, Scikit-learn, NumPy, Matplotlib, Seaborn, XGBoost, LightGBM, Keras, BERT
- Development Tools: JupyterLab, Git, GitHub
Contributions:
- Payment Behavior Analysis: Built machine learning models, such as decision trees, logistic regression, and gradient boosting, to analyze payment behavior and classify users into distinct customer groups.
- Customer Satisfaction Improvement: Conducted sentiment analysis on Google Play reviews using NLP techniques, including tokenization and text vectorization, to identify key improvement areas and enhance user satisfaction.
- Click Prediction Model: Developed a predictive model using algorithms like random forests and logistic regression to determine the likelihood of users clicking on ads in a web browser.
- Data Exploration and Visualization: Conducted exploratory data analysis (EDA) and created visualizations using Matplotlib and Seaborn to effectively communicate insights.
- NLP and Text Analytics: Applied NLP techniques, including BERT-based text classification, to extract insights from unstructured text data.
Company Name: MTS AI
Location: Saint Petersburg, Russia (Remote)
Dates of Employment: June 2020 – October 2020
Project: Sentiment Analysis and Data Clustering
Technologies Used:
- Programming Languages: Python 3
- Libraries & Frameworks: Pandas, Scikit-learn, XGBoost, LightGBM, Imblearn, NLTK, NumPy, Matplotlib, Seaborn, pyTelegramBotAPI
- Development Tools: JupyterLab, Pytest, Git, GitLab
- Operating Systems: Linux
Responsibilities:
- Clustered and classified data from customer reviews and conversations with bots and agents for sentiment analysis.
- Developed supplementary Python scripts and demo projects, including a Telegram Bot for data interaction.
- Performed data preprocessing, feature engineering, and model training using machine learning algorithms.
- Created unit tests using Pytest to ensure robustness and quality of data analysis pipelines.
Company Name: MTS AI
Location: Saint Petersburg, Russia (Remote)
Dates of Employment: March 2019 – October 2020
Project: Development of an Automatic Speech Recognition Application
Technologies Used:
- Programming Languages: C++, Python 3, Cython
- Libraries & Frameworks: Kaldi
- Development Tools: Google Test, Pytest, Docker, Git, GitLab
- Operating Systems: Linux
Responsibilities:
- Created decoders using the Kaldi toolkit for automatic speech recognition (ASR).
- Led a small team of engineers, including task assignment, code review, and mentoring.
- Developed unit tests using Google Test and Pytest to ensure high-quality code.
- Set up and maintained pipelines for build preparation and testing.
- Developed Python function wrappers using Cython to integrate C++ functionality.
Company Name: Speech Technology Center (STC Group)
Location: Saint Petersburg, Russia
Dates of Employment: November 2017 – March 2019
Project: Development of a Large-Scale C++ Project
Technologies Used:
- Programming Languages: C++, Python 3, Java
- Development Tools: Google Test, Pytest, SWIG 3, Docker
- Version Control & CI/CD: Git, GitLab, Mercurial, TeamCity
- Operating Systems: Linux, Windows
Responsibilities:
- Contributed to a large-scale C++ speech SDK that had been in active development for over 10 years.
- Created supplementary scripts in Python 3 to enhance project functionality.
- Took part in demo projects to showcase the product to clients.
- Developed a Java function wrapper using SWIG 3 for the C++ project.
- Managed build configurations in TeamCity for continuous integration.
- Prepared unit tests using Google Test to ensure code quality.
- Participated in pre-release integration testing with Java and Python.
Company Name: Russian Institute of Radio Navigation and Time
Location: Saint Petersburg, Russia
Dates of Employment: March 2015 – November 2017
Project: Software Development and Legacy Code Migration
Technologies Used:
- Programming Languages: C, C++
- Libraries & Frameworks: Qt
Responsibilities:
- Supported and enhanced software solutions for navigation systems using C, C++, and Qt.
- Assisted in migrating legacy code from assembly language to C, improving software stability and maintainability.
- Worked on microcontroller programming for embedded system integration.
- Developed an FTP client-server application for secure data transfer over internal networks, meeting specific technical requirements.
Saint Petersburg State Electrotechnical University "LETI," Russia
- Master's in Computer Engineering and Informatics (Sep 2014 – Jun 2016)
- Bachelor's in Computer Engineering and Informatics (Sep 2010 – Jun 2014)
Credly Profile: irina-ryndova
- AWS Certified Machine Learning – Specialty (Issued: Jan 2023 | Expires: Jan 2026)
- IBM Data Science Specialization (Coursera)
- Applied Data Science Specialization (Coursera)
- Introduction to Data Science Specialization (Coursera)
- Stanford University - Machine Learning (Coursera)
- Other Courses on Coursera:
  - Python for Data Science and AI
  - Data Analysis with Python
  - Data Visualization with Python
  - Databases and SQL for Data Science
"Explainable AI to Identify Radiographic Features of Pulmonary Edema"
- Description: A study developing a deep learning method to identify radiographic features of pulmonary edema, a condition caused by excess fluid in the lungs.
- Contribution: Software development, validation, and manuscript review and editing.
- Published in: Radiology Advances, May 2024.
- DOI: 10.1093/radadv/umae003