Awesome Agentic ML

Agentic ML refers to autonomous AI systems that can plan, execute, and iterate on machine learning workflows with minimal human intervention, from data preprocessing through model training and evaluation to deployment.

🤖 This resource list is maintained with the help of Claude Opus 4.5.

Frameworks & Platforms

End-to-end platforms and frameworks for building agentic ML systems.

Project	Description	Stars
AutoGluon	Open-source AutoML toolkit by Amazon with foundation models and LLM agents.
Karpathy	Agentic ML Engineer using Claude Code SDK and Google ADK. By K-Dense.
K-Dense Web	Autonomous AI Scientist platform with dual-loop multi-agent system for research, coding, and ML.	-

AutoML Agents

LLM-powered agents for automated machine learning pipelines.

Project	Description	Stars
AIDE	AI-powered data science agent using tree search for solution exploration.
AIRA-dojo	Meta's AI research agents using search policies (Greedy, MCTS, Evolutionary).
AutoGluon Assistant	Multi-agent system for end-to-end multimodal ML automation. Also known as MLZero.
AutoMind	Adaptive agent with expert knowledge base from 455 Kaggle competitions and tree search. By ZJU NLP.
AutoML-Agent	Multi-Agent LLM Framework for Full-Pipeline AutoML.
FM Agent	Baidu's foundation model agent for ML engineering tasks.
InternAgent	ML engineering agent with DeepSeek-R1 integration.
MLE-STAR	Google's ML engineering agent using web search and targeted code block refinement. Built with ADK.	-
ML-Master	AI-for-AI agent integrating exploration and reasoning with adaptive memory. By SJTU SAI.
OpenHands	Open-source AI software development agent adaptable to ML tasks.
R&D-Agent	Microsoft's research & development agent for ML tasks.
SELA	Tree-Search Enhanced LLM Agents for AutoML using MCTS. Part of MetaGPT.

Research Papers

Academic papers on agentic ML, autonomous ML systems, and LLM-based ML agents.

Benchmarks & Evaluation

Papers introducing benchmarks and evaluation methodologies for agentic ML systems.

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering (2024) - Paper | Code
Benchmark by OpenAI with 75 Kaggle competitions for evaluating ML engineering agents.
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline (2025) - Paper
Automated pipeline transforming raw datasets into competition-style MLE challenges.
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation (ICML 2024) - Paper
Benchmark for evaluating LLM agents on ML research tasks including model training and debugging.
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research (2025) - Paper
Benchmark with 201 research tasks from NeurIPS, ICLR, and ICML. Includes MLR-Judge for automated evaluation.
DataSciBench: An LLM Agent Benchmark for Data Science (2025) - Paper | Code
Comprehensive benchmark with Task-Function-Code (TFC) framework for rigorous evaluation of LLMs on data science tasks.

Multi-Agent Systems

Frameworks using multiple specialized agents for end-to-end ML pipelines.

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML (ICML 2025) - Paper | Code
Multi-agent system with data, model, and operation agents for full-pipeline automation.
LightAutoDS-Tab: Multi-AutoML Agentic System for Tabular Data (2025) - Paper | Code
Combines LLM-based code generation with multiple AutoML tools (AutoGluon, LightAutoML, FEDOT).
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation (NeurIPS 2025) - Paper | Code
Transforms raw multimodal data into ML solutions with zero human intervention.
SmartDS-Solver: Agentic AI for Vertical Domain Problem Solving in Data Science (ICLR 2026 Submission) - Paper
Reasoning-centric system with SARTE algorithm for data science problem solving.

Search & Planning Methods

Papers using tree search, MCTS, or structured planning for ML workflow optimization.

AI Research Agents for Machine Learning (2025) - Paper | Code
Formalizes AI research agents as search policies with operators. Compares Greedy, MCTS, and Evolutionary strategies.
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science (2025) - Paper | Code
Features curated expert knowledge base from 455 Kaggle competitions, agentic knowledgeable tree search, and self-adaptive coding strategy.
I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search (2025) - Paper | Code
Introspective node expansion with hybrid LLM-estimated and actual performance rewards.
MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement (2025) - Paper | Blog
Uses web search to retrieve models and targeted code block refinement via ablation studies.
ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning (2025) - Paper | Code
Integrates exploration and reasoning with adaptive memory mechanism.
PiML: Automated Machine Learning Workflow Optimization using LLM Agents (AutoML 2025) - Paper
Persistent iterative framework with adaptive memory and systematic debugging.
SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning (2024) - Paper | Code
Leverages MCTS to expand the search space with insight pools.
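
As a rough illustration of the greedy search policy mentioned in this section, the sketch below always expands the best-scoring candidate found so far. It is a toy under stated assumptions, not the method of any paper listed here: `greedy_search`, `expand`, and `evaluate` are hypothetical names, the "solution" is a single number rather than generated code, and `expand` stands in for an LLM operator that drafts or refines a solution.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(order=True)
class Node:
    # Negated score, so that heapq's min-heap pops the highest-scoring node first.
    neg_score: float
    # Candidate solution (a toy number here; an agent would store code or a plan).
    solution: float = field(compare=False)


def greedy_search(
    root: float,
    expand: Callable[[float], List[float]],
    evaluate: Callable[[float], float],
    budget: int,
) -> Tuple[float, float]:
    """Expand the best unexpanded node until `budget` evaluations are used."""
    start = Node(-evaluate(root), root)
    frontier = [start]
    best = start
    used = 1  # the root evaluation counts against the budget
    while frontier and used < budget:
        node = heapq.heappop(frontier)  # greedy choice: best score so far
        for candidate in expand(node.solution):
            if used >= budget:
                break
            child = Node(-evaluate(candidate), candidate)
            used += 1
            heapq.heappush(frontier, child)
            if child.neg_score < best.neg_score:
                best = child
    return best.solution, -best.neg_score


# Toy problem (assumption): maximize -(x - 3)^2 by stepping x by +/- 0.5.
best_x, best_score = greedy_search(
    root=0.0,
    expand=lambda x: [x - 0.5, x + 0.5],
    evaluate=lambda x: -(x - 3.0) ** 2,
    budget=30,
)
print(best_x, best_score)
```

MCTS and evolutionary policies differ mainly in how the next node to expand is chosen; the evaluate-and-track-best loop can stay largely the same.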

Domain-Specific Agentic ML

Agentic systems tailored for specific ML domains.

AgenticSciML: Collaborative Multi-Agent Systems for Emergent Discovery in Scientific ML (2025) - Paper
Specialized agents propose, critique, and refine SciML solutions.
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research (NeurIPS 2025 Position) - Paper
Position paper on AI automation for scientific discovery, with multi-agent systems that simulate research societies.
ClimateAgent: Multi-Agent Orchestration for Complex Climate Data Science Workflows (TMLR) - Paper
Multi-agent framework for end-to-end climate data analytics with dynamic API awareness and self-correction.
The AI Cosmologist: Agentic System for Automated Data Analysis (2025) - Paper
Automates cosmological data analysis from idea generation to research dissemination.
TS-Agent: Structured Agentic Workflows for Financial Time-Series Modeling (2025) - Paper
Modular framework for financial forecasting with structured knowledge banks.

LLM-Based ML Optimization

Using LLMs for specific ML optimization tasks.

Using Large Language Models for Hyperparameter Optimization (2023) - Paper
Iterative HPO via LLM prompting. Matches or outperforms Bayesian optimization in limited-budget settings.
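
The general shape of iterative HPO via prompting can be sketched as a loop that shows the model the trial history and asks for the next configuration. This is a minimal sketch under stated assumptions, not the paper's implementation: `build_prompt`, `llm_hpo`, and the prompt wording are hypothetical, and `propose` is a stand-in for a real LLM call (the demo uses a deterministic stub so that it runs offline).

```python
import json
from typing import Callable, Dict, List, Tuple

Trial = Tuple[Dict[str, float], float]  # (config, validation loss)


def build_prompt(history: List[Trial]) -> str:
    """Render past trials as text; the wording is illustrative only."""
    lines = ["You are tuning a model. Past trials (config -> validation loss):"]
    for config, loss in history:
        lines.append(f"{json.dumps(config)} -> {loss:.4f}")
    lines.append('Reply with the next config as JSON, e.g. {"lr": 0.01}.')
    return "\n".join(lines)


def llm_hpo(
    propose: Callable[[str], str],
    objective: Callable[[Dict[str, float]], float],
    n_trials: int,
) -> Trial:
    """Ask for a config, evaluate it, append it to the history, repeat."""
    history: List[Trial] = []
    for _ in range(n_trials):
        reply = propose(build_prompt(history))  # an LLM call in a real system
        config = json.loads(reply)
        history.append((config, objective(config)))
    return min(history, key=lambda trial: trial[1])


def stub_propose(prompt: str) -> str:
    """Deterministic stand-in for an LLM: walks through a fixed list of rates."""
    candidates = [1.0, 0.1, 0.01, 0.001]
    n_past = len(prompt.splitlines()) - 2  # subtract header and instruction lines
    return json.dumps({"lr": candidates[n_past % len(candidates)]})


# Toy objective (assumption): loss is the distance from an ideal rate of 0.01.
best_config, best_loss = llm_hpo(
    propose=stub_propose,
    objective=lambda cfg: abs(cfg["lr"] - 0.01),
    n_trials=4,
)
print(best_config, best_loss)
```

A real proposer would also need to handle malformed replies, for example by re-prompting when `json.loads` fails.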

Foundation Models for ML

Pre-trained models that enable rapid ML development.

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second (ICLR 2023) - Paper | Code
Prior-Data Fitted Network using in-context learning for instant tabular classification.
Unlocking the Full Potential of Data Science Requires Tabular Foundation Models, Agents, and Humans (NeurIPS 2025 Position) - Paper
Position paper on collaborative systems integrating agents, tabular foundation models, and human experts for data science.

Datasets & Benchmarks

Benchmarks and datasets for evaluating agentic ML systems.

Benchmark	Description	Link
AutoML-Agent Benchmark	18 diverse datasets across tabular, CV, NLP, time-series, and graph tasks.	Paper
DataSciBench	Comprehensive data science benchmark with TFC framework for LLM evaluation.	Paper | GitHub
GAIA	General AI Assistants benchmark testing real-world reasoning and tool use.	Paper
MLE-bench	Kaggle-based benchmark for ML engineering agents by OpenAI. 75 competitions.	Paper | GitHub
MLAgentBench	Benchmark for LLM agents on ML experimentation tasks.	Paper
MLR-Bench	Open-ended ML research benchmark with 201 tasks from major ML conferences.	Paper

Contributing

Contributions are welcome! To add a project or paper, simply open an issue or submit a PR.

License

To the extent possible under law, the authors have waived all copyright and related rights to this work.
