A curated list of awesome agentic machine learning projects, frameworks, and resources.

jxucoder/awesome-agentic-machine-learning

Awesome Agentic ML

Agentic ML refers to autonomous AI systems that can plan, execute, and iterate on machine learning workflows with minimal human intervention, covering every stage from data preprocessing to model training, evaluation, and deployment.

🤖 This resource list is maintained with the help of Claude Opus 4.5.


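Contents

  • Frameworks & Platforms
  • AutoML Agents
  • Research Papers
  • Datasets & Benchmarks
  • Contributing
  • License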


Frameworks & Platforms

End-to-end platforms and frameworks for building agentic ML systems.

| Project | Description | Stars |
| --- | --- | --- |
| AutoGluon | Open-source AutoML toolkit by Amazon with foundational models and LLM agents. | GitHub stars |
| Karpathy | Agentic ML Engineer using Claude Code SDK and Google ADK. By K-Dense. | GitHub stars |
| K-Dense Web | Autonomous AI Scientist platform with dual-loop multi-agent system for research, coding, and ML. | - |

AutoML Agents

LLM-powered agents for automated machine learning pipelines.

| Project | Description | Stars |
| --- | --- | --- |
| AIDE | AI-powered data science agent using tree search for solution exploration. | GitHub stars |
| AIRA-dojo | Meta's AI research agents using search policies (Greedy, MCTS, Evolutionary). | GitHub stars |
| AutoGluon Assistant | Multi-agent system for end-to-end multimodal ML automation. Also known as MLZero. | GitHub stars |
| AutoMind | Adaptive agent with expert knowledge base from 455 Kaggle competitions and tree search. By ZJU NLP. | GitHub stars |
| AutoML-Agent | Multi-Agent LLM Framework for Full-Pipeline AutoML. | GitHub stars |
| FM Agent | Baidu's foundation model agent for ML engineering tasks. | GitHub stars |
| InternAgent | ML engineering agent with DeepSeek-R1 integration. | GitHub stars |
| MLE-STAR | Google's ML engineering agent using web search and targeted code block refinement. Built with ADK. | - |
| ML-Master | AI-for-AI agent integrating exploration and reasoning with adaptive memory. By SJTU SAI. | GitHub stars |
| OpenHands | Open-source AI software development agent adaptable to ML tasks. | GitHub stars |
| R&D-Agent | Microsoft's research & development agent for ML tasks. | GitHub stars |
| SELA | Tree-Search Enhanced LLM Agents for AutoML using MCTS. Part of MetaGPT. | GitHub stars |

Research Papers

Academic papers on agentic ML, autonomous ML systems, and LLM-based ML agents.

Benchmarks & Evaluation

Papers introducing benchmarks and evaluation methodologies for agentic ML systems.

  • MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering (2024) - Paper | Code
    Benchmark by OpenAI with 75 Kaggle competitions for evaluating ML engineering agents.

  • MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline (2025) - Paper
    Automated pipeline transforming raw datasets into competition-style MLE challenges.

  • MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation (ICML 2024) - Paper
    Benchmark for evaluating LLM agents on ML research tasks including model training and debugging.

  • MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research (2025) - Paper
    Benchmark with 201 research tasks from NeurIPS, ICLR, and ICML. Includes MLR-Judge for automated evaluation.

  • DataSciBench: An LLM Agent Benchmark for Data Science (2025) - Paper | Code
    Comprehensive benchmark with Task-Function-Code (TFC) framework for rigorous evaluation of LLMs on data science tasks.

Multi-Agent Systems

Frameworks using multiple specialized agents for end-to-end ML pipelines.

  • AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML (ICML 2025) - Paper | Code
    Multi-agent system with data, model, and operation agents for full-pipeline automation.

  • LightAutoDS-Tab: Multi-AutoML Agentic System for Tabular Data (2025) - Paper | Code
    Combines LLM-based code generation with multiple AutoML tools (AutoGluon, LightAutoML, FEDOT).

  • MLZero: A Multi-Agent System for End-to-end Machine Learning Automation (NeurIPS 2025) - Paper | Code
    Transforms raw multimodal data into ML solutions with zero human intervention.

  • SmartDS-Solver: Agentic AI for Vertical Domain Problem Solving in Data Science (ICLR 2026 Submission) - Paper
    Reasoning-centric system with SARTE algorithm for data science problem solving.

Search & Planning Methods

Papers using tree search, MCTS, or structured planning for ML workflow optimization.

  • AI Research Agents for Machine Learning (2025) - Paper | Code
    Formalizes AI research agents as search policies with operators. Compares Greedy, MCTS, and Evolutionary strategies.

  • AutoMind: Adaptive Knowledgeable Agent for Automated Data Science (2025) - Paper | Code
    Features curated expert knowledge base from 455 Kaggle competitions, agentic knowledgeable tree search, and self-adaptive coding strategy.

  • I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search (2025) - Paper | Code
    Introspective node expansion with hybrid LLM-estimated and actual performance rewards.

  • MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement (2025) - Paper | Blog
    Uses web search to retrieve models and targeted code block refinement via ablation studies.

  • ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning (2025) - Paper | Code
    Integrates exploration and reasoning with adaptive memory mechanism.

  • PiML: Automated Machine Learning Workflow Optimization using LLM Agents (AutoML 2025) - Paper
    Persistent iterative framework with adaptive memory and systematic debugging.

  • SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning (2024) - Paper | Code
    Leverages MCTS to expand the search space with insight pools.
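
Several of the papers above build on Monte Carlo Tree Search. As a rough illustration only (this is the textbook UCT selection rule, not the implementation from SELA, I-MCTS, or any other listed paper), the sketch below shows how a search over candidate ML solutions might balance exploiting high-scoring branches against exploring rarely visited ones. The `Node` structure, its field names, and the exploration constant `c = 1.4` are assumptions made for this example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate ML solution in the search tree (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0  # sum of validation scores seen through this node
    children: list["Node"] = field(default_factory=list)


def uct_score(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Textbook UCT: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # unvisited children are always tried first
    mean_reward = child.total_reward / child.visits
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return mean_reward + bonus


def select_child(parent: Node) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits))


def backpropagate(path: list[Node], reward: float) -> None:
    """Credit a new validation score to every node on the path from the root."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


# Toy usage: "a" has the better mean score, "b" has been visited far less.
root = Node("root", visits=10)
a = Node("a", visits=8, total_reward=6.4)
b = Node("b", visits=2, total_reward=1.0)
root.children = [a, b]
chosen = select_child(root)
```

In this toy tree the less-visited child `b` is selected, because its exploration bonus outweighs the higher mean reward of `a`. A real agent would then expand the chosen node (for example by asking an LLM for a modified pipeline), run it, and call `backpropagate` with the resulting score.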

Domain-Specific Agentic ML

Agentic systems tailored for specific ML domains.

  • AgenticSciML: Collaborative Multi-Agent Systems for Emergent Discovery in Scientific ML (2025) - Paper
    Specialized agents propose, critique, and refine SciML solutions.

  • AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research (NeurIPS 2025 Position) - Paper
    Position paper on AI automation for scientific discovery with multi-agent systems to simulate research societies.

  • ClimateAgent: Multi-Agent Orchestration for Complex Climate Data Science Workflows (TMLR) - Paper
    Multi-agent framework for end-to-end climate data analytics with dynamic API awareness and self-correction.

  • The AI Cosmologist: Agentic System for Automated Data Analysis (2025) - Paper
    Automates cosmological data analysis from idea generation to research dissemination.

  • TS-Agent: Structured Agentic Workflows for Financial Time-Series Modeling (2025) - Paper
    Modular framework for financial forecasting with structured knowledge banks.

LLM-Based ML Optimization

Using LLMs for specific ML optimization tasks.

  • Using Large Language Models for Hyperparameter Optimization (2023) - Paper
    Iterative HPO via LLM prompting. Matches or outperforms Bayesian optimization in limited-budget settings.
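
The iterative idea can be pictured as a loop in which past trials are rendered into a prompt and the model proposes the next configuration. The sketch below is a minimal illustration under stated assumptions, not the method from the paper: `stub_proposer` is a hypothetical stand-in for an LLM call, `toy_objective` is a made-up loss, and the prompt wording is invented for the example.

```python
from typing import Callable

Config = dict[str, float]
History = list[tuple[Config, float]]


def format_history(history: History) -> str:
    """Render past trials as text that could be placed in an LLM prompt."""
    lines = [f"lr={cfg['lr']:.4g} -> val_loss={loss:.6f}" for cfg, loss in history]
    return "\n".join(lines) if lines else "(no trials yet)"


def stub_proposer(prompt: str, history: History) -> Config:
    """Hypothetical stand-in for an LLM call: halve the best learning rate so far.

    A real proposer would send `prompt` to a model and parse its reply;
    this stub ignores the prompt so the example runs offline.
    """
    if not history:
        return {"lr": 0.1}
    best_cfg, _ = min(history, key=lambda trial: trial[1])
    return {"lr": best_cfg["lr"] / 2}


def toy_objective(cfg: Config) -> float:
    """Made-up validation loss with its minimum at lr = 0.0125."""
    return (cfg["lr"] - 0.0125) ** 2


def run_hpo(
    propose: Callable[[str, History], Config],
    objective: Callable[[Config], float],
    budget: int,
) -> tuple[Config, float]:
    """Run `budget` propose-evaluate rounds and return the best trial."""
    history: History = []
    for _ in range(budget):
        prompt = (
            "Previous trials:\n"
            + format_history(history)
            + "\nPropose the next configuration."
        )
        cfg = propose(prompt, history)
        history.append((cfg, objective(cfg)))
    return min(history, key=lambda trial: trial[1])


best_cfg, best_loss = run_hpo(stub_proposer, toy_objective, budget=5)
```

Swapping `stub_proposer` for a function that actually queries a model is the only change needed to try the loop with a real LLM; the small `budget` reflects the limited-budget setting the paper targets.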

Foundation Models for ML

Pre-trained models that enable rapid ML development.

  • TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second (ICLR 2023) - Paper | Code
    Prior-Data Fitted Network using in-context learning for instant tabular classification.

  • Unlocking the Full Potential of Data Science Requires Tabular Foundation Models, Agents, and Humans (NeurIPS 2025 Position) - Paper
    Position paper on collaborative systems integrating agents, tabular foundation models, and human experts for data science.


Datasets & Benchmarks

Benchmarks and datasets for evaluating agentic ML systems.

| Benchmark | Description | Link |
| --- | --- | --- |
| AutoML-Agent Benchmark | 18 diverse datasets across tabular, CV, NLP, time-series, and graph tasks. | Paper |
| DataSciBench | Comprehensive data science benchmark with TFC framework for LLM evaluation. | Paper, GitHub |
| GAIA | General AI Assistants benchmark testing real-world reasoning and tool use. | Paper |
| MLE-bench | Kaggle-based benchmark for ML engineering agents by OpenAI. 75 competitions. | Paper, GitHub |
| MLAgentBench | Benchmark for LLM agents on ML experimentation tasks. | Paper |
| MLR-Bench | Open-ended ML research benchmark with 201 tasks from major ML conferences. | Paper |

Contributing

Contributions are welcome! To add a project or paper, simply open an issue or submit a PR.


License

CC0

To the extent possible under law, the authors have waived all copyright and related rights to this work.
