Machine Learning Engineer | NLP & Generative AI | Drug Discovery | MLOps
I'm a machine learning engineer with 8+ years of experience building AI systems across healthcare and drug discovery. My work combines NLP, generative models, LLMs, and MLOps to deliver production-grade tools. I've published in peer-reviewed journals, deployed AI tools at scale, and worked with cross-functional teams in fast-paced startups and global R&D orgs.
- LLMs & NLP: Prompt-tuning, retrieval-augmented generation, Transformers, semantic search, QA agents
- Drug Discovery: Molecular generation, retrosynthesis, protein–ligand modeling
- MLOps & Deployment: AzureML, AWS SageMaker, DVC, Docker, CI/CD pipelines
- Data Engineering: SQL/NoSQL, Spark, data pipelines for clinical, biomedical & text data
- AI Research: 1st-author paper in Journal of Chemical Information and Modeling, others in ICML CompBio & Springer
Languages: Python, SQL, Bash
Frameworks: PyTorch, Transformers, XGBoost, scikit-learn, FastAPI
MLOps: DVC, MLflow, Docker, Poetry, AzureML, AWS, Kubernetes (basic)
Tools: LangChain, Hugging Face, Neo4j, Spark, Databricks, FAISS
Databases: PostgreSQL, MongoDB, S3, BigQuery
Infra as Code: Terraform (basic)
- Chem42 Molecule Generator: Built a generative model pipeline (GNN + validity filtering) for real-world synthesizable molecules.
- LLM-powered Retrosynthesis: Cleaned and generated 10M+ reaction entries using OpenAI + Hugging Face. Boosted top-5 retrosynthesis accuracy by 7%.
- Semantic Healthcare Knowledge Graph: Used PyTorch, Spark, and biomedical ontologies to create a 2M+ node graph. Enabled concept-level document retrieval.
- Model for Predicting Protein–Ligand Unbinding Kinetics through Machine Learning. Journal of Chemical Information and Modeling, ACS, 2020. Introduced a static-structure-based approach to predict log(k_off) using RF + structural descriptors.
- High Performance of Gradient Boosting in Binding Affinity Prediction. ICML, CompBio Workshop, 2022. Benchmarked SOTA GNNs vs. GBDTs for protein–ligand binding affinity; showed GBDTs outperform with graph-derived features.
- Others in RSC AI in Chemistry, Springer Lecture Notes
- LinkedIn
- Email: nurlybek.amangeldiev@gmail.com
- Based in Kazakhstan | Open to remote & relocation
“I like building real things with real impact.”