A complete guide to start and improve in AI engineering in 2026 without ANY background in the field and stay up-to-date with the latest news and state-of-the-art techniques!
This guide is intended for anyone with zero or a small background in programming, AI, or machine learning who wants to become a strong AI engineer in 2026. It is organized by how you like to learn: videos, articles, books, docs, courses, and real projects.
There is no single correct order to follow, but a classic path is from top to bottom. If you dislike books, skip them. If you do not want to follow an online course, skip that too. With enough motivation, projects, and repetition, you can absolutely learn this field.
Most resources listed here are free. Paid resources are clearly labelled, and some paid course and book links are affiliate links that support this guide at no extra cost to you. Thank you, and have fun learning!
Don't be afraid to repeat videos, learn from multiple sources, and build messy projects. Repetition and debugging are where the real learning happens.
Maintainer: louisfb01, also active on YouTube, the What's AI Podcast, and my personal newsletter if you want to see and hear more about AI.
Tag Louis-François Bouchard on X or LinkedIn if you share this guide, and feel free to suggest additions through pull requests.
If this guide helps you, please star the repo and share it. That is the main way other builders find it.
Watch AI Engineering Foundations: What Developers Actually Need to Know Today first, then subscribe to What's AI for more AI engineering videos.
This guide is updated throughout 2026 as the stack moves.
- Prerequisites and learning path
- Start with short YouTube and video introductions
- Books and long-form reading
- Online courses
- Practice and projects
- Prompting and structured outputs
- Reasoning models and test-time compute
- Context engineering and long context
- Retrieval-Augmented Generation (RAG)
- Embeddings, rerankers, and vector databases
- Tools, MCP, and computer use
- Workflows, agents, and multi-agent systems
- Evaluations, observability, and harnesses
- Fine-tuning and data curation
- Multimodal and document understanding
- Voice agents and realtime AI
- Deployment, inference, and open-weight models
- AI coding agents and developer tools
- AI safety, security, and guardrails
- Communities, subreddits, and Discords
- Newsletters, podcasts, and blogs
- People to follow
- How to find an AI engineering job
- Learn more and build more with AI
Before you start collecting resources, keep the goal clear: this guide is for becoming a better AI engineer, not merely a better agentic coder.
Coding agents like Codex, Claude Code, Cursor, and similar tools can write code, scaffold apps, and speed up almost every step. You should use them. But AI engineering is the judgment layer behind the work: deciding what to build, what architecture fits, how to evaluate it, where it will fail, and whether it is reliable enough to ship.
This guide is not about outsourcing your thinking to an agent. It is about using those tools while building the foundations, taste, and decision-making ability to become a true AI engineer.
In 2026, AI engineering goes well past prompting. You need context engineering, Retrieval-Augmented Generation (RAG), tools and the Model Context Protocol (MCP), workflow and agent design, evaluations, observability, harnesses, deployment, security, and a working understanding of reasoning models.
That is also why this guide, and our courses, prioritize learning by building. I learned AI engineering by building, and I now interview and hire AI engineers for consulting work at Towards AI, so this guide is biased toward the decision-making skills I actually look for. You can learn a lot alone with coding agents, but structure and expert feedback help you turn projects into true expertise instead of a pile of fragile demos.
There is no single correct order. If you want a default path, I would do this:
- Watch a few foundational videos to pick up vocabulary and intuition.
- Pick one free course and one framework whose docs you commit to reading end to end.
- Pick one or two books to build a solid foundation you can return to when the tools change.
- Optionally take one or two advanced applied courses with real projects, especially if you want a structured path before breaking things on your own.
- Build two or three small but real projects that break in interesting ways.
- Add evaluations, tracing, and deployment before you call anything production-ready.
After that, you should have the foundations of a solid AI engineer ready for many entry-level or transition roles. Most importantly, keep learning and keep an open mind. This field changes fast, and the best engineers stay curious instead of getting religious about one model, framework, or workflow.
Resources use compact markers from 1️⃣ to 🔟. 1️⃣ means absolute beginner, like an intro Python course; 3️⃣ is beginner-friendly AI vocabulary; 5️⃣ is practical builder material you can apply in a project; 7️⃣ is production engineering depth; 9️⃣ is advanced systems or research; and 🔟 is the kind of senior-level paper or technique you may want to revisit after you have shipped a few systems. Lower numbers first, scars later.
You can use this guide with your favorite AI agent. Paste the prompt below into Codex, Claude Code, ChatGPT, Cursor, or another assistant, then tell it how you like to learn:
Use this repo as my AI engineering roadmap: https://github.com/louisfb01/start-ai-engineering
Create a personalized learning plan for me. First ask about my background, coding level, available time, budget, preferred learning style, and goals. Then choose the most relevant resources from the repo, explain why you picked them, order them from easiest to hardest, and turn them into a weekly plan with projects, checkpoints, and what I should be able to build after each stage.
- 1️⃣ Learn Python - Free interactive tutorial to learn Python fundamentals if you have never touched the language.
- 1️⃣ AI Python for Beginners - DeepLearning.AI. Free short course from Andrew Ng's team, lighter on-ramp than a full bootcamp.
- 2️⃣ Python Fundamentals + CS Concepts — A One-Stop Starter Class - Louis-François Bouchard, What's AI. Free playlist covering Python fundamentals and core computer science concepts in one place. The right starting point if you want a single resource before jumping into LLM development.
- 2️⃣ Beginner Python for AI Engineering - Towards AI. An LLM-native Python course for people who want to go straight to building with LLMs, not through six months of classical scripting first. (Paid, $149)
If you already know some Python, you can jump into the rest of this guide. You do not need a mathematics PhD or deep research background. You do need basic Python, comfort reading docs, willingness to debug messy systems, and enough curiosity to build things that break. The last point matters more than people expect.
Video is still the fastest way to pick up vocabulary and mental models.
- 4️⃣ AI Engineering Foundations: What Developers Actually Need to Know Today - Louis-François Bouchard. A one-hour webinar on what AI engineers need to know today: how LLMs work, their limitations, when to use prompting, RAG, workflows, or agents, and why evaluations and security matter before production.
- 2️⃣ How AI Works in Super Simple Terms - StatQuest with Josh Starmer. The gentlest possible on-ramp: how AI like ChatGPT works explained through a super simple example with no heavy math. Start here if any of the other foundational videos feel overwhelming.
- 2️⃣ Mastering AI Jargon - Your Guide to OpenAI & LLM Terms - Louis-François Bouchard. A practical glossary for the terms you keep seeing around OpenAI, GPT, LLMs, prompting, and generative AI.
- 3️⃣ Intro to Large Language Models - Andrej Karpathy. One hour. Still the cleanest high-level tour of what an LLM is and how it works.
- 4️⃣ AI Fundamentals for Builders - Understand transformers and fix LLM limitations - Louis-François Bouchard. A builder-focused session on transformer intuition, common LLM limitations, and the techniques used to work around them.
- 5️⃣ A Hackers' Guide to Language Models - Jeremy Howard, fast.ai. 90 minutes, practical and builder-oriented, assumes you can code.
- 6️⃣ Deep Dive into LLMs like ChatGPT - Andrej Karpathy. 2025. Three and a half hours covering the full LLM training and inference stack, free. The single best investment if you only watch one long video this year.
- 2️⃣ StatQuest with Josh Starmer - Josh Starmer. The clearest visual explanations of ML and neural network concepts on YouTube. Ideal for building solid intuition about how transformers, attention, and training actually work before you start building.
- 3️⃣ 3Blue1Brown - Grant Sanderson. Visual math and deep learning intuition. The neural networks and attention series are widely considered the best visual introductions to these concepts.
- 3️⃣ DeepLearning.AI - Andrew Ng's official channel. Free recorded short courses on prompting, RAG, agents, evals, and more. Most of the DeepLearning.AI short courses land here first.
- 3️⃣ IBM Technology - Clear concept explainers on LLMs, RAG, agents, and enterprise AI. Good for quickly getting up to speed on a new concept with no background noise.
- 3️⃣ Tech With Tim - Tim Ruscica. 1.89M subscribers. Beginner-to-intermediate coding and AI projects in Python. Strong for learners who want to build working things (AI games, assistants, chatbots, small ML projects) alongside the theory.
- 4️⃣ What's AI - Practical AI engineering explainers from Louis-François Bouchard. Useful for RAG, agents, MCP, evals, and learning how to reason about the stack instead of only chasing tools.
- 4️⃣ Hugging Face - Official tutorials across the open-source AI ecosystem. Covers fine-tuning, inference, datasets, and new model releases.
- 5️⃣ LangChain - Official channel for LangChain and LangGraph. Tutorial-first videos on agents, workflows, and graph-based orchestration.
- 5️⃣ Jeremy Howard - fast.ai co-founder. Practical, builder-oriented, strong on software craft and AI-assisted coding.
- 5️⃣ Two Minute Papers - Károly Zsolnai-Fehér. Short, enthusiastic summaries of AI research papers. Good for staying aware of what is being published without reading every paper.
- 5️⃣ Bycloud - Weekly video essays on AI news and research, aimed at builders.
- 6️⃣ Andrej Karpathy - Former Tesla AI and OpenAI. Best long-form explanations of how LLMs actually work — essential mental models for anyone building on top of them.
- 7️⃣ Umar Jamil - Line-by-line implementations of transformers, vision-language models, and LoRA. Strong for understanding what is happening inside a model when you are debugging or fine-tuning.
- 8️⃣ Yannic Kilcher - In-depth walkthroughs of new research papers. Essential for staying current with model releases and understanding what papers actually claim vs. what they prove.
Podcasts and longer listening are collected in the Newsletters, podcasts, and blogs section below.
If you prefer reading to watching, this path goes very far, especially with these books focusing on actually coding and building.
- 5️⃣ Building LLMs for Production - Towards AI. 465 pages covering prompting, RAG, fine-tuning, reliability, and shipping. Used as an internal reference manual in many companies. The Academy e-book version is also available. (Paid, $29 e-book)
- 5️⃣ Hands-On Large Language Models - Jay Alammar and Maarten Grootendorst. Visual, code-first companion that pairs well with Chip Huyen's book. (Paid)
- 5️⃣ Prompt Engineering for LLMs - John Berryman and Albert Ziegler. Written by GitHub Copilot engineers, with useful field-tested patterns. (Paid)
- 6️⃣ LLM Engineer's Handbook - Paul Iusztin and Maxime Labonne. Production-focused, built around a real end-to-end project. Pairs with the companion code repo. (Paid)
- 7️⃣ AI Engineering - Chip Huyen. The most-read book on O'Reilly for this space. Strong on system design, evaluation, and when each technique earns its place. (Paid)
- 8️⃣ Build a Large Language Model (From Scratch) - Sebastian Raschka. Foundations and intuition. Code a GPT-style LLM from scratch in PyTorch, no libraries that hide the internals. The right book for developers who want to move past calling APIs and actually understand transformers, tokenization, attention, and fine-tuning. Pairs with the companion LLMs-from-scratch repo. (Paid)
- 4️⃣ The Illustrated Transformer - Jay Alammar. The classic visual reference for the transformer architecture. Worth having open when reading about attention, tokenization, or embedding layers.
- 5️⃣ Prompt Engineering - Lilian Weng. The cleanest overview of prompting techniques from a research perspective.
- 5️⃣ Patterns for Building LLM-based Systems & Products - Eugene Yan. Seven patterns that almost every shipped LLM product ends up using.
- 6️⃣ The Illustrated Retrieval Transformer - Jay Alammar. Useful for intuition on how retrieval-style architectures differ from pure decoder-only models.
- 7️⃣ LLM Powered Autonomous Agents - Lilian Weng, OpenAI. Still the reference post on agent design, planning, memory, and tool use.
- 7️⃣ The State of LLMs 2025 - Sebastian Raschka's year-end synthesis of how the stack actually moved.
- 8️⃣ Why We Think - Lilian Weng on test-time compute and why reasoning models work.
A curated short list of valuable long-form articles from 2025-2026. All are substantial reads (10+ minutes) that reward a full sitting. Topic-specific articles are in their respective sections below.
- 4️⃣ Here's how I use LLMs to help me write code - Simon Willison's personal workflow, written for other practitioners. The most-shared write-up on actually working with coding agents.
- 5️⃣ Your AI Product Needs Evals - Hamel Husain. The canonical starting point for why evals matter and how to begin.
- 6️⃣ A Field Guide to Rapidly Improving AI Products - Hamel Husain. An end-to-end playbook for going from "it kinda works" to a real product. Pairs evals with error analysis and data flywheels. The single best article on improving an AI product once it exists.
- 6️⃣ Building Effective AI Agents - Anthropic. The reference post on when to use a workflow and when autonomy actually pays its way. Widely treated as required reading.
- 6️⃣ Harness Engineering: The Missing Layer Behind AI Agents - Louis-François Bouchard. The layer between prompt engineering and a working agent: tools, permissions, state, retries, checkpoints, guardrails, and evals. Explains why harnesses, not models, separate demos from products.
- 6️⃣ Agents - Chip Huyen. A long-form primer on agent design, planning, and tool use. One of the most-shared agent posts of 2025.
- 7️⃣ Context Engineering for LLMs: Build Reliable, Production-Ready RAG Systems - A full walkthrough of chunking, hybrid retrieval (BM25 + dense), reranking, and token budgeting. Practical enough to take a RAG prototype to production.
- 7️⃣ Effective harnesses for long-running agents - Anthropic. Scaffolding for hour-long agent runs: checkpoints, state, and recovery patterns.
- 7️⃣ Agent Observability and Evaluation: A 2026 Developer's Guide - Divy Yadav's long-form piece on why most teams still have no evals, what to instrument first, and how to close the feedback loop between traces and fixes.
- 7️⃣ 12-Factor Agents - Dex Horthy. Widely-cited production-agent checklist covering state, tools, context, and reliability. Referenced across most 2025-2026 agent engineering discussions.
- 7️⃣ Systematically Improving RAG - Jason Liu. A disciplined iteration playbook for RAG, from evals to metadata to user feedback loops. Still the reference piece for RAG consultants.
- 8️⃣ How we built our multi-agent research system - Anthropic. Real architecture behind a shipped multi-agent product, including the tradeoffs and failure modes you only see in production.
- 8️⃣ The Lethal Trifecta for AI Agents - Simon Willison. Private data, untrusted content, and external communication — the combination every agent builder needs to internalize before shipping.
- 8️⃣ How to Fine-Tune LLMs in 2025 with Hugging Face - Philipp Schmid. The single best recent how-to on modern fine-tuning.
Articles from Anthropic, OpenAI, and individual practitioners (Shreya Shankar, Paul Iusztin, and others) are also referenced in the topic-specific sections below. Start with the topic you care about most and work outward.
For ongoing reading, rotate between practitioner blogs, official engineering posts, the Towards AI publication on Medium, and the Towards AI Newsletter instead of relying on one source.
A common mistake is reading ten articles on the same topic and building nothing. A better loop is: read one conceptual article, read one official docs page, build one tiny version yourself, then reread the article once you have scars. The second pass hits very differently.
If you want more structure, courses are the fastest route through this material.
- 2️⃣ AI for Work - Towards AI. 15 modules for non-developers who want to actually use AI at work. No coding required. (Paid, $399)
- 3️⃣ 10-Hour LLM Fundamentals - Towards AI. Compact video-first crash course covering when to use prompting, RAG, fine-tuning, or agents. Useful before going deep. (Paid, $199)
- 5️⃣ Full Stack AI Engineering - Towards AI's flagship program. 90+ lessons across prompting, RAG, fine-tuning, tools, agents, and deployment, built around one production capstone. Designed for people who want a full developer path to AI engineering. (Paid, $349)
- 7️⃣ Agentic AI Engineering - Towards AI. 34 lessons and two production agents (a research agent and a writing workflow), covering context engineering, evaluations, observability, containers, and deployment. For people who already ship LLM apps and want to specialize. (Paid, $499)
- 4️⃣ Hugging Face LLM Course - Free. The best free structured path through tokenization, fine-tuning, and modern transformers.
- 4️⃣ Anthropic Academy - Free. Includes an Introduction to MCP.
- 5️⃣ Hugging Face Agents Course - Free. Walks through agents, tools, and orchestration using open-source models.
- 5️⃣ Hugging Face MCP Course - Free. Builds both client and server sides of MCP from scratch.
- 5️⃣ LangChain Academy - Free. The official path through LangChain and LangGraph.
- 3️⃣ ChatGPT Prompt Engineering for Developers - Andrew Ng and Isa Fulford. Free. The prompting short course most teams already assume you have done.
- 4️⃣ Building Systems with the ChatGPT API - Free. Multi-step chains, moderation, and evals at a beginner level.
- 5️⃣ Improving Accuracy of LLM Applications - Free. Practical methods for moving from 70% to 95% accuracy.
- 6️⃣ Agent Skills with Anthropic - Free. Agent skills, the Anthropic way.
- 6️⃣ Agent Memory: Building Memory-Aware Agents - Free. Short, focused course on memory architectures.
- 6️⃣ A2A: The Agent2Agent Protocol - Free. Google's Agent2Agent protocol explained by its designers.
- 6️⃣ Semantic Caching for AI Agents - Free. Cutting cost and latency through caching strategies.
- 6️⃣ NVIDIA NeMo Agent Toolkit: Making Agents Reliable - Free. Guardrails and reliability at scale.
- 6️⃣ Building Coding Agents with Tool Execution - Free. The core loop behind modern coding agents.
Several DeepLearning.AI courses are listed in the topic sections below instead of here: AI Agents in LangGraph (under Agents), Automated Testing for LLMOps (under Evaluations), Red Teaming LLM Applications (under AI Safety), Efficient Inference with SGLang (under Deployment), and Document AI: From OCR to Agentic Doc Extraction (under Multimodal).
- 2️⃣ No Python yet: Beginner Python for AI Engineering first.
- 2️⃣ Non-technical and want to use AI at work: AI for Work.
- 3️⃣ Want a quick overview first: start with 10-Hour LLM Fundamentals.
- 4️⃣ Want the whole stack from nothing: start with the Get it all! From Novice to Expert Bundle.
- 5️⃣ Python-comfortable and want the full developer path: start with Full Stack AI Engineering.
- 5️⃣ Want free and docs-heavy: pair the Hugging Face LLM Course with LangChain Academy.
- 7️⃣ Already shipped basic LLM apps and want to specialize: go to Agentic AI Engineering.
Reading and watching will only take you so far. You become an AI engineer by building systems that fail in expensive and educational ways.
Watch What I Look For When Hiring AI Engineers before you start your first serious project. I share how I evaluate AI engineering candidates, why decision-making matters more than polished agent-generated output, and what kinds of practice projects actually teach useful skills.
- 4️⃣ A document question-answering assistant with citations and a real eval set.
- 4️⃣ A customer support workflow with tools and structured outputs.
- 5️⃣ A research assistant that plans, searches, reads, and writes a short brief.
- 5️⃣ A coding helper scoped to one narrow internal task.
- 5️⃣ A multimodal invoice or receipt parser with validation.
- 6️⃣ Designing Real-World AI Agents Workshop - Paul Iusztin's hands-on workshop for building a Deep Research Agent plus a LinkedIn Writing Workflow as MCP servers. It includes code, slides, video, evaluation patterns, and an
implement_yourself/path designed to be rebuilt with agentic coding tools instead of copied. - 6️⃣ A small agent that plans, acts, checks, and retries within a budget.
- 3️⃣ OpenAI Cookbook - Official recipes in notebook form. The quickest path to a working example of most common tasks.
- 4️⃣ Google Gemini Cookbook - Google. Gemini-flavored equivalent covering multimodal, long context, and tool use.
- 4️⃣ LlamaIndex Starter Tutorial and Understanding LlamaIndex - The fastest path from zero to a working RAG pipeline.
- 4️⃣ AI Engineering Cheatsheets - Louis-François Bouchard's decision tables and playbooks for choosing approaches.
- 5️⃣ Pydantic AI docs - Type-safe agent framework from the Pydantic team.
- 6️⃣ DSPy Tutorials - Tutorials for the DSPy approach of compiling prompts as programs.
- 6️⃣ Designing Real-World AI Agents Workshop - Build and run a multi-agent system with MCP servers, evaluator-optimizer loops, grounded search, structured outputs, and LLM-as-judge evaluation.
- 7️⃣ Paul Iusztin's hands-on-llms repo - End-to-end production project with training, serving, and monitoring.
Framework docs for agent-oriented libraries (LangGraph, CrewAI, AutoGen, Agno) live in the Agents section below.
- Why is this prompt, tool, or architecture chosen?
- Where and how will it fail?
- How will I evaluate it, offline and online?
- What will I log and inspect when it misbehaves?
- What is the cheapest design that still clears the bar?
- Is an agent actually the right choice here, or is a workflow enough?
If you cannot answer those, keep building.
Prompting still matters in 2026. The useful version is not clever tricks. It is writing reliable contracts for non-deterministic systems.
Clear task framing, output contracts, structured outputs and JSON schemas, few-shot examples, grounding and citations, verification loops, tool-use instructions, completion criteria, and prompt versioning.
- 3️⃣ OpenAI Prompt Engineering Guide - Official, up-to-date, API-centric.
- 3️⃣ Anthropic Prompt Engineering Overview - Claude-specific advice, generalizes well.
- 3️⃣ Anthropic Prompt Library - Anthropic's curated library of battle-tested prompts.
- 3️⃣ Learn Prompting - Free, community-maintained reference covering beginner to advanced prompting.
- 5️⃣ OpenAI GPT-5 Prompting Guide - Model-specific prompting advice from the OpenAI team.
- 5️⃣ Anthropic: Increase Output Consistency - Techniques for reducing output drift across runs.
- 5️⃣ Instructor: structured outputs with Pydantic - Jason Liu's library for turning free-form LLM outputs into typed Python objects.
- 6️⃣ Structured Data Extraction from Unstructured Content Using LLM Schemas - Simon Willison's approach to schema-first extraction.
Treat prompts as code you version, interfaces you test, and product decisions you revisit. That framing is more useful than any list of prompting tricks.
Reasoning models (OpenAI o-series, Anthropic Claude with extended thinking, Google Gemini Pro with thinking, DeepSeek R-models, Qwen reasoning variants) behave differently from standard chat models. They reward different prompting and break in different ways.
When reasoning models help, when they hurt, how to set thinking budgets, how to structure input for a thinking model, extended thinking and tool use together, and cost/latency tradeoffs.
- 4️⃣ Towards AI Newsletter issues - Weekly coverage of major reasoning model releases with benchmarks and opinion.
- 5️⃣ Anthropic: Prompt caching - Usually where reasoning costs get controlled in production.
- 6️⃣ Anthropic: Building with extended thinking - Official docs on how to use Claude's thinking mode correctly.
- 7️⃣ OpenAI: Run long horizon tasks with Codex - Long-running reasoning workflows in practice.
- 7️⃣ OpenAI: Unrolling the Codex agent loop - Inside the loop that a reasoning agent actually runs.
- 7️⃣ The State of LLMs 2025 - Sebastian Raschka's overview of how reasoning models reshaped the stack.
- 8️⃣ Why We Think - Lilian Weng on the theory behind test-time compute.
Rule of thumb for 2026: reach for a reasoning model when the task genuinely requires multi-step planning, verification, or tool use. For simple classification, extraction, or short answers, a cheaper standard model still wins on cost and latency.
Context engineering is one of the most important 2026 skills. The model is only as good as what you put in its context and how you stage it.
What belongs in context and what does not, context windows and context rot, message history management, memory versus retrieval, compaction and summaries, working files and scratchpads, repo-level instructions such as AGENTS.md or CLAUDE.md, and context handoffs between runs.
- 4️⃣ Anthropic: Context windows - Official docs with practical guidance on context limits and caching.
- 5️⃣ Context engineering and How to Fix Your Context - Simon Willison. The two posts that gave the field its current vocabulary.
- 5️⃣ Lost in the Middle - How attention drops inside long contexts and what it means for your prompt design.
- 6️⃣ Effective context engineering for AI agents - Anthropic. How the Claude team thinks about context as a first-class design surface.
- 6️⃣ Harness Engineering - Louis-François Bouchard on the scaffolding around the model that controls what enters and leaves context.
- 7️⃣ Context engineering for LLMs: Production-Ready RAG Systems - Chunking, retrieval, reranking, and token budgeting for real systems.
- 7️⃣ Jason Liu's Context Engineering Series - Consulting-flavored write-up from enterprise projects.
Most people try to fix bad systems by stuffing more tokens into the prompt. That usually makes results worse. The better habit is to be intentional about which instructions are permanent, which data is retrieved on demand, which state gets externalized into files or tools, and when to reset the context entirely.
RAG is still a core technique. The naive "stuff some chunks into the prompt" version is no longer enough.
Chunking strategies, embeddings, vector search, hybrid search with BM25, reranking, citations and provenance, metadata filtering, query rewriting, corrective RAG, retrieval quality evaluation, agentic retrieval, and knowing when RAG is the wrong answer.
- 4️⃣ Why RAG Is Not Training Your AI - Louis-François Bouchard on the mental model most builders get wrong.
- 4️⃣ LlamaIndex Introduction to RAG - Official docs. The cleanest free path to a working RAG system.
- 4️⃣ Pinecone RAG guide - Vendor-written but solid introduction with diagrams.
- 5️⃣ Is RAG Still Needed in the Era of Long Context LLMs? - Clear framework for when long context replaces RAG and when it does not.
- 6️⃣ Contextual Retrieval in AI Systems - Anthropic's prompt-cached contextual chunking pattern with measured quality gains.
- 6️⃣ Hybrid Search RAG That Actually Works - Production-ready code combining BM25, vectors, and reranking.
- 7️⃣ Context Engineering, Not Retrieval: Why Your Agentic RAG Fails in Production - April 2026. The gap between prototype and production is almost always a context problem, not a retrieval problem. Practical diagnosis for teams that have tuned embeddings for months and still see failures.
- 7️⃣ Why Most RAGs Stay POCs — How to Take Your Data Pipelines to Production - Why prototype RAG systems stall before production, and how to structure data pipelines (Databricks Asset Bundles, Python Wheel artifacts, Clean Architecture) so they actually ship and stay maintainable.
- 7️⃣ Vectorless RAG: Your RAG Pipeline Doesn't Need a Vector Database - For structured documents like contracts and financial reports, building a hierarchical JSON tree and letting the LLM navigate it can beat embeddings-plus-vector-DB. No chunking, no vector DB, fully traceable citations.
- 7️⃣ Systematically Improving RAG - Jason Liu's playbook for RAG iteration.
- 8️⃣ Evolve or perish: The new RAG paradigm - Paul Iusztin on where RAG is heading.
Do not stop at "uploaded PDF, got answer." Build one serious RAG app with citations, retrieval debugging, considered chunking choices, metadata filters, an eval set, and a way to inspect misses. That is where the real learning happens.
Good retrieval depends on the pieces around the model.
- 4️⃣ Cohere Embed and Rerank - Strong general-purpose production choice with multilingual support.
- 4️⃣ Voyage AI - Domain-specific embeddings (finance, legal, medical) plus the
rerank-2reranker. - 4️⃣ Jina Embeddings and Jina Reranker - Competitive multilingual options, strong on long documents.
- 4️⃣ Nomic Embed - Strong open-source option with Apache 2.0 licensing.
- 5️⃣ Hugging Face MTEB Leaderboard - Community leaderboard for picking an embedding model by task.
- 4️⃣ Qdrant docs - Fast, production-ready, open source, free managed tier.
- 4️⃣ Weaviate docs - Open source with built-in hybrid search and RAG modules.
- 4️⃣ LanceDB docs - Embedded, Python-first, no server needed. Great for local RAG prototypes.
- 4️⃣ Pinecone - Managed serverless, the most common enterprise default.
- 4️⃣ pgvector - Vector search inside Postgres. Best choice when you already have Postgres and want to avoid a second system.
- 4️⃣ Chroma - Light, simple, good for prototypes and tutorials.
- 7️⃣ Inside Vector Databases: Engineering High-Dimensional Search - How HNSW and IVF actually work.
If prompting was the first phase of AI apps, and tools the second, then in 2026 MCP and structured tool ecosystems are part of the default stack.
Function and tool calling, tool schemas, tool selection and retries, permissions and safety boundaries, tool result formatting, MCP clients and servers, web search and code execution tools, computer use, and authentication against external systems.
- 5️⃣ Anthropic Tool use overview - The cleanest reference for function calling with Claude.
- 5️⃣ Introducing the Model Context Protocol - Original announcement, still the best one-page summary.
- 5️⃣ Model Context Protocol Getting Started - Official MCP docs.
- 5️⃣ Hugging Face MCP Course - Free course covering the client and server implementation.
- 5️⃣ Introduction to Model Context Protocol - Anthropic's own short course, free.
- 6️⃣ Anthropic Web search tool and Code execution tool - Built-in tools that remove most of the glue you used to write.
- 6️⃣ Anthropic Skills and Agent Skills open standard - The skills primitive: reusable markdown instructions Claude loads at the right moment. First-class in Claude.ai, Claude Code, and the API in 2026, now an open standard used across multiple agent platforms.
- 6️⃣ Anthropic Computer use - Letting a model control a screen and a keyboard inside sandboxed environments.
- 6️⃣ MCP Architecture overview, Server concepts, Build an MCP server, and Build an MCP client - Full reference for both sides of the protocol.
- 7️⃣ Writing effective tools for agents - Anthropic. Practical guide to tool schemas, descriptions, and error handling.
- 7️⃣ Code execution with MCP - Anthropic's pattern for composing MCP servers through code instead of long tool lists.
- 7️⃣ Model Context Protocol (MCP): Why Every AI Developer Needs MCP in 2026 - Why MCP replaces ad-hoc REST integrations: decoupled Host/Client/Server architecture, why it scales better than direct API wiring, and what it means for maintaining AI applications across provider changes.
- 8️⃣ Model Context Protocol has Prompt Injection Security Problems - Simon Willison. Read this before you deploy an MCP server that touches private data.
Agents that need to search the web rarely call raw Google or Bing. These are the APIs most production stacks use:
- 4️⃣ Tavily - Purpose-built search API for LLM agents with content extraction and summarization.
- 4️⃣ Exa - Semantic search API with neural retrieval over the web.
- 4️⃣ Brave Search API - Privacy-focused web search, common choice for agent stacks that need independent indexing.
The model is not your system. The tool layer is where most real capability and most real risk both live.
This is where hype gets loud and engineering judgment becomes valuable.
Workflow versus agent, single agent versus multi-agent, ReAct and tool loops, routing and orchestration, planning and reflection, human-in-the-loop, state and memory, failure modes, and when to avoid autonomy altogether.
- 5️⃣ AI Agents in LangGraph - Harrison Chase and DeepLearning.AI. Free. The cleanest intro to graph-based agents.
- 5️⃣ LangGraph docs - Official graph-based orchestration docs for long-running, stateful agents.
- 5️⃣ LlamaIndex Workflows - LlamaIndex's event-driven workflow system.
- 5️⃣ CrewAI, AutoGen, and Agno - Framework docs for three of the main alternatives.
- 6️⃣ Building Effective AI Agents - Anthropic. The reference post on agent vs workflow design.
- 6️⃣ Stop Building Agent Demos - Louis-François Bouchard on the demo-to-production gap.
- 6️⃣ Agents and Workflows - Louis-François Bouchard on when multi-agent is overengineering.
- 6️⃣ What Makes an AI Agent Actually Agentic? - What separates a real agent from a workflow wearing an LLM hat: autonomy, memory, and resilience. Walks through refactoring a hardcoded LangGraph assistant into a ReAct-based agent with SQLite checkpointing and layered, context-aware error handling.
- 6️⃣ Agent Architecture Guide - Louis-François Bouchard's 13-question decision framework for agent design.
- 7️⃣ LLM Powered Autonomous Agents - Lilian Weng. The reference post that defined the field.
- 7️⃣ Agents - Chip Huyen's long-form primer on agent design, planning, and tool use. One of the most-shared agent posts of 2025.
- 7️⃣ 12-Factor Agents - Dex Horthy's widely-cited production-agent checklist covering state, tools, context, and reliability. Heavily referenced across 2025-2026 agent engineering discussions.
- 7️⃣ Creating an Advanced AI Agent From Scratch with Python in 2026 - Modular architecture over framework lock-in: a flexible tool system, provider-agnostic LLM wrapper, and a ReAct-based agent orchestrator with Pydantic for type-safe tool execution. Lets you swap models and tools without touching the core loop.
- 7️⃣ The Two Things Every Reliable Agent Needs - A framework centered on memory-first design and an anti-Goodhart scoreboard: treat memory as a core system with defined forms, functions, and dynamics, and evaluate with adversarial metrics across full episodes so agents solve the actual problem instead of gaming a proxy.
- 7️⃣ LangChain Middleware: The Missing Layer Between Your Agent and Production - LangChain's new middleware system pulls operational concerns (summarization, human approval, retries, token tracking, dynamic routing, tool monitoring, context injection) out of agent logic and into a dedicated layer. Covers decorator vs class-style hooks, ordering rules, custom state schemas, and five production patterns.
- 7️⃣ Google's A2A Protocol using LangGraph: Build Agent Systems That Actually Communicate - Divy Yadav. Practical deep-dive into Agent2Agent: Agent Cards for discovery, structured task lifecycles, HTTP messaging, and how A2A complements (not competes with) MCP. Covers real production failure modes — timeout handling, context mismatch, authentication drift — with a LangGraph implementation walkthrough.
- 7️⃣ Agentic AI Engineering - Towards AI's deep dive with two shipped agents as capstones. (Paid)
- 8️⃣ How we built our multi-agent research system - Anthropic. Real architecture behind a shipped multi-agent product.
- 8️⃣ Building Production Text-to-SQL for 70,000+ Tables: OpenAI's Data Agent Architecture - How OpenAI built an internal data agent for its own data warehouse. Goes beyond naive query generation: six layers of context (table usage patterns, human annotations, business logic extracted from code), plus a closed-loop validation step where the agent profiles results, catches its own errors, and repairs queries. The real lesson — agent effectiveness depends on the richness of context, not the model.
Most teams should start with a workflow. Add autonomy only where it clearly buys something. That saves token spend, latency, debugging pain, and a lot of regret.
The layer most people skip and rediscover the hard way.
Golden datasets, rule-based checks, LLM-as-a-judge, regression testing, traces and spans, prompt versioning, error analysis, offline evaluations and online monitoring, harness design, and testability of agent behavior.
- 5️⃣ Your job is to deliver code you have proven to work - Simon Willison. Less about tooling, more about the right mental model for this work.
- 5️⃣ Your AI Product Needs Evals and LLM Evals FAQ - Hamel Husain. The canonical starting point.
- 5️⃣ Automated Testing for LLMOps - DeepLearning.AI short course, free. CI-style testing for LLM-powered apps.
- 5️⃣ Ragas - Open-source RAG evaluation library.
- 5️⃣ LangSmith and LangSmith Evaluation - Hosted tracing and eval tooling from LangChain.
- 5️⃣ Braintrust - Commercial eval and observability platform popular with teams that want structured experiment tracking.
- 5️⃣ Arize Phoenix - Open-source observability for LLM applications.
- 5️⃣ Pydantic AI and Logfire - Type-safe agent framework and observability tool from the Pydantic team.
- 6️⃣ Harness Engineering: The Missing Layer Behind AI Agents - Louis-François Bouchard on why harnesses, not models, are what separates production from prototype.
- 6️⃣ Harness engineering - OpenAI's framing of the same layer for coding agents.
- 6️⃣ Testing Agent Skills Systematically with Evals - OpenAI on building evals for agent skills.
- 6️⃣ Effective harnesses for long-running agents - Anthropic. Scaffolding for hour-long agent runs.
- 6️⃣ A Field Guide to Rapidly Improving AI Products - Hamel Husain's end-to-end playbook for going from "it kinda works" to a real product, pairing evals with error analysis and data flywheels.
- 6️⃣ Task-Specific LLM Evals that Do & Don't Work and Evaluating LLM-Evaluators - Eugene Yan on where LLM-as-judge helps and where it misleads.
- 6️⃣ In Defense of AI Evals, for Everyone and Data Flywheels for LLM Applications - Shreya Shankar on why evals are a product skill, not a research skill.
- 7️⃣ Agent Observability and Evaluation: A 2026 Developer's Guide - Divy Yadav. One of the most complete recent write-ups.
- 7️⃣ MLflow Observability for Generative AI: A Deep Dive with Text2SQL + RAG + WebSearch using LangGraph - MLflow's native tracing applied to a real LangGraph e-commerce agent. Every node instrumented with spans, traces, and cost-tracking decorators — shows what hierarchical trace trees actually look like for a production agentic pipeline, not just HTTP latency timestamps.
- 8️⃣ Inspect AI - UK AI Safety Institute's open-source framework for building LLM evals, used in frontier safety research and increasingly in production.
If you cannot tell whether your system is improving, you are not engineering yet, you are moving vibes around.
Fine-tuning still matters, and in 2026 it is no longer the first hammer most teams reach for. Reasoning models, prompt caching, long context, and cheap high-quality base models shifted the tradeoff.
When prompting is enough, when RAG is enough, when supervised fine-tuning helps, synthetic data generation, dataset cleaning and formatting, preference optimization and reinforcement fine-tuning, Low-Rank Adaptation (LoRA) and Decomposed Low-Rank Adaptation (DoRA), domain adaptation, and cost/maintenance tradeoffs.
- 5️⃣ Building LLMs for Production - Towards AI. The fine-tuning chapters alone are worth the price for most teams. The Academy e-book version is also available. (Paid)
- 5️⃣ Hugging Face smol fine-tuning course - Free, code-first walkthrough. Fine-tuning small models hands-on.
- 5️⃣ OpenAI model optimization guide - Official docs for API-level fine-tuning and distillation.
- 6️⃣ Hugging Face PEFT docs - The official library for LoRA and related methods.
- 6️⃣ Using and Finetuning Pretrained Transformers - Sebastian Raschka's reference post.
- 7️⃣ How to Fine-Tune LLMs in 2025 with Hugging Face - Philipp Schmid. Single best recent how-to on modern fine-tuning.
- 7️⃣ LoRA vs Full Fine-Tuning - Florin Andrei's side-by-side comparison on real tasks.
- 7️⃣ What SFT, DPO, RLHF, and RAG Actually Do in an AI Agent - Shenggang Li anchors each technique to a customer-support scenario: SFT for tone and task format, RAG for business facts at inference, DPO for choosing between two valid replies, RLHF when the problem runs deeper than any single answer. A clean decision framework for picking the right fix.
- 8️⃣ Improving LoRA: Implementing DoRA from Scratch - Sebastian Raschka on the LoRA successor.
Only fine-tune after you understand the baseline and have evals. Otherwise you are tuning toward a blurry target.
Many real AI products need to read images, parse PDFs, work with screenshots, or combine text and visuals.
Vision inputs, document layout understanding beyond Optical Character Recognition (OCR), multimodal prompting, image-grounded extraction, and table and chart extraction.
- 4️⃣ Anthropic Vision docs - Claude-specific vision API and prompting guidance.
- 4️⃣ OpenAI vision guide - Official OpenAI vision reference.
- 4️⃣ Google Gemini multimodal capabilities - Gemini native multimodal, strong on long documents and video.
- 5️⃣ Docling - IBM's open-source document extraction toolkit with layout and table reconstruction. Free.
- 5️⃣ Document AI: From OCR to Agentic Doc Extraction - LandingAI short course with Andrew Ng. Free.
- 5️⃣ LlamaIndex Structured Prediction - Schema-first extraction from documents and images.
- 8️⃣ Multimodal Large Language Models: Architectures, Training, and Real-World Applications - Technical overview of MLLMs: modular versus monolithic architectures, alignment and fusion layers between encoders and LLM backbones, the three-stage training pipeline (modality alignment, joint pretraining, instruction tuning), and applications from document understanding to autonomous GUI agents.
Good first project ideas: invoice extraction with validation, a receipt parser with structured outputs, a screenshot-to-action assistant, or a research workflow that extracts and cites figures from PDFs.
Voice became table stakes for many products in 2025-2026. Low-latency turn-taking and realtime multimodal APIs now compete with traditional text chat.
Speech-to-text and text-to-speech selection, turn-taking and barge-in, session management, latency budgeting, tool use inside a voice turn, and when voice beats text.
- 4️⃣ Anthropic voice guidance - Pairs Claude with an external speech pipeline (ElevenLabs, Deepgram, etc.).
- 4️⃣ ElevenLabs docs - Production voice cloning and streaming text-to-speech.
- 4️⃣ Deepgram - Low-latency speech-to-text.
- 5️⃣ OpenAI Realtime API - The primary realtime reference for most teams. Native speech-to-speech with tool use.
- 5️⃣ Gemini Live API - Google's realtime multimodal endpoint.
- 5️⃣ Pipecat - Open-source voice agent framework. Free.
- 5️⃣ LiveKit Agents - Realtime agent infrastructure with strong WebRTC support.
This is where "my notebook works" becomes "my product survives real users and traffic."
Application Programming Interface (API) deployment, containers, concurrency, OpenAI-compatible serving, prompt and KV cache use, vLLM and other inference servers, local models and privacy tradeoffs, cost and latency and throughput tradeoffs, self-hosted versus serverless, and reliability, scaling, and rollbacks.
- 4️⃣ Ollama and Ollama docs - The easiest way to run open models locally.
- 4️⃣ LM Studio - Graphical User Interface (GUI) for local inference, good for non-developers.
- 6️⃣ vLLM docs and vLLM Quickstart - UC Berkeley's high-throughput inference server. De facto standard for self-hosting.
- 6️⃣ SGLang - Structured generation and batching, strong for constrained outputs.
- 6️⃣ Text Generation Inference (TGI) - Hugging Face's production-ready serving stack.
- 6️⃣ llama.cpp - Central Processing Unit (CPU) and edge inference with GGUF quantization. The main path to running models on laptops.
- 6️⃣ Efficient Inference with SGLang - DeepLearning.AI short course, free.
- 4️⃣ RunPod - Low-cost on-demand Graphics Processing Unit (GPU) rental.
- 4️⃣ Together AI - Fast managed inference for open-weight models.
- 4️⃣ Fireworks AI - Another leading managed inference provider.
- 4️⃣ Groq - Language Processing Unit (LPU) hardware for very low-latency serving.
- 4️⃣ Cerebras - Wafer-scale inference, fastest tokens per second on certain models.
- 5️⃣ Modal docs and Developing with LLMs on Modal - Serverless GPU compute with a clean Python interface.
- 7️⃣ BentoML docs, the LLM Inference Handbook, OpenAI-compatible API guide, Serverless vs. self-hosted, and Inference optimization - Thorough free handbook on inference economics.
Most production stacks sit one layer above the provider to handle fallbacks, rate limits, cost tracking, and per-request model selection:
- 5️⃣ LiteLLM - Open-source proxy and Python SDK that lets you call 100+ LLM providers through a unified OpenAI-compatible interface. De facto standard for multi-provider applications.
- 5️⃣ OpenRouter - Hosted router with a single API across hundreds of models, including preview access to models before they hit official APIs.
- 5️⃣ Portkey - AI gateway with caching, observability, and guardrails built on top of the routing layer.
- 4️⃣ Meta Llama and Hugging Face Llama pages - Meta's flagship open-weight family.
- 4️⃣ DeepSeek on Hugging Face and DeepSeek GitHub - The series that reshaped expectations for open-weight reasoning.
- 4️⃣ Qwen on Hugging Face - Alibaba's Qwen family, strong across dense, Mixture-of-Experts, and coding variants.
- 4️⃣ Mistral and Mistral on Hugging Face - European provider with both open and hosted models.
- 4️⃣ GLM (Zhipu) - GLM family of open-weight models with strong multilingual and code performance.
Why you chose an API model or an open-weight model. Why you chose that latency and cost tradeoff. Why the system is safe enough to expose to real users. How you would debug a bad output in production. How the system behaves when a dependency fails. If you can answer those, you are already ahead of many AI app builders.
How AI engineers actually work changed in 2025-2026. Coding agents and agent-native editors are now part of daily practice and part of what teams expect you to have used.
- 3️⃣ Claude Code - Anthropic's Command Line Interface (CLI) agent with the Claude Agent Software Development Kit (SDK) behind it. Strong for long-running, tool-heavy tasks.
- 3️⃣ Cursor - Integrated Development Environment (IDE) with agent-native editing. One of the most widely-used AI IDEs as of 2026.
- 3️⃣ GitHub Copilot - Now includes agent mode and skills. The default for many enterprise teams.
- 3️⃣ Codex CLI - OpenAI's long-horizon coding agent.
- 3️⃣ Gemini CLI - Google's open-source command-line agent.
- 3️⃣ Windsurf - Cognition's (formerly Codeium's) agent-native editor, focused on flow and context handling.
- 4️⃣ Here's how I use LLMs to help me write code - Simon Willison's personal workflow, written for other practitioners.
- 4️⃣ AI-assisted development needs automated tests and Identify, solve, verify - Simon Willison on the core loop.
- 4️⃣ How to Solve It With Code - Jeremy Howard's fast.ai course on AI-assisted problem-solving.
- 5️⃣ What is agentic engineering? - Simon Willison's working definition.
- 6️⃣ Harness engineering: leveraging Codex in an agent-first world and Unlocking the Codex harness: how we built the App Server - OpenAI's two-part series on what the harness layer actually looks like in a shipped product with a million lines of agent-generated code.
- 6️⃣ Claude Code: How to Build, Evaluate, and Tune AI Agent Skills - Rick Hightower. A practical guide to SKILL.md files that extend Claude's behavior for specific workflows. Distinguishes Capability Uplift skills (teach better reasoning, age out as models improve) from Encoded Preference skills (capture team workflows and compound in value). Covers how to benchmark and tune triggers to avoid false-fires as your skill library grows.
Rule of thumb: pick one coding agent, commit to it for a month, and learn its scaffolding well. Rotating between tools is usually slower than mastering one.
This part is not optional. If your AI system can search the web, call tools, touch private data, or send actions into other software, you need to think about risk early.
Prompt injection, sensitive data handling, system prompt leakage, tool permissions, excessive agency, overreliance, output validation, human review thresholds, red teaming, and governance.
- 6️⃣ OWASP Top 10 for LLM Applications 2025 - The canonical list, updated with vector weaknesses and system prompt leakage.
- 6️⃣ OWASP GenAI Security Project and LLM01: Prompt Injection - The LLM Top 10 landing page and the prompt injection entry.
- 7️⃣ NIST AI Risk Management Framework and the Generative AI Profile - The US government reference framework.
- 6️⃣ OpenAI Safety Evaluations Hub - OpenAI's public safety evaluation results.
- 7️⃣ The Lethal Trifecta for AI Agents - Simon Willison on the private-data, untrusted-content, external-communication risk every agent builder should understand.
- 7️⃣ Google's Approach to AI Agent Security - Summary of Google's published security posture.
- 7️⃣ Embrace The Red blog - Johann Rehberger. Ongoing red-teaming write-ups and agent exploits.
- 8️⃣ Design Patterns for Securing LLM Agents against Prompt Injections - Practical defenses.
- 8️⃣ Anthropic Constitutional Classifiers and Mitigate jailbreaks and prompt injections - Anthropic's production mitigations.
- 9️⃣ CaMeL: a promising direction for mitigating prompt injection - One of the stronger research directions on prompt injection defense.
- 6️⃣ Guardrails AI - Validators and schemas for LLM output.
- 6️⃣ Red Teaming LLM Applications - DeepLearning.AI short course, free.
- 7️⃣ NVIDIA NeMo Guardrails - Production-ready programmable guardrails.
- 7️⃣ Meta LlamaFirewall - Meta's open-source agent safety framework.
- 7️⃣ PyRIT - Microsoft's red teaming orchestration tool.
- 7️⃣ Invariant Labs Guardrails - Agent-focused policy and runtime enforcement.
Treat LLM output like work from a fast intern with occasional alien instincts. You do not blindly trust it. You design systems around it.
The social layer where most of the real-time knowledge actually moves.
- 2️⃣ Towards AI Discord - 80,000+ builders, direct access to the Towards AI team, weekly events, channels for RAG, agents, fine-tuning, and job search.
- 2️⃣ Learn AI Together - Louis-François Bouchard's nearly 100,000-member server for AI enthusiasts, study groups, and Kaggle teammates.
- 3️⃣ Hugging Face Discord - Home for the open-source AI ecosystem. Channels for every major model family and library.
- 3️⃣ LangChain Discord - Official community for LangChain and LangGraph users.
- 3️⃣ LlamaIndex Discord - Active channels on RAG, agents, and workflows.
- 4️⃣ MLOps Community - Active Slack community covering production ML and increasingly LLM operations. One of the best places to ask real production questions.
- 5️⃣ Modular (MAX) Discord - Mojo and MAX users, good for inference and performance topics.
- 2️⃣ r/artificial - General AI news and discussion.
- 2️⃣ r/ArtificialInteligence - Broader AI community with a mix of news, opinion, and tutorials.
- 2️⃣ r/learnmachinelearning - Beginner-friendly, good for study questions and roadmap discussions.
- 3️⃣ r/OpenAI - News, API discussion, and model behavior debugging.
- 3️⃣ r/ClaudeAI - Claude Code, Claude.ai, and Anthropic product discussion.
- 3️⃣ r/LangChain - LangChain and LangGraph community troubleshooting.
- 3️⃣ r/Rag - Focused subreddit on retrieval-augmented generation patterns.
- 3️⃣ r/AI_Agents - Agent-specific community with framework debates and build-in-public threads.
- 4️⃣ r/LocalLLaMA - By far the most useful subreddit for open-weight models, inference benchmarks, and quantization tips.
- 4️⃣ r/computervision - Extracting useful information from images and video.
- 4️⃣ r/LatestInML - Curated stream of newer ML developments.
- 5️⃣ r/MachineLearning - The biggest machine learning subreddit, research-heavy.
- 4️⃣ AI Engineering Cheatsheets - Louis-François Bouchard's central collection.
- 4️⃣ AI Engineering Playbook - Decision tables for choosing techniques, models, evaluation approaches, and production optimizations.
- 4️⃣ Agent Architecture Guide - The 13-question decision framework for agent design.
- 4️⃣ Anti-Slop AI Writing Guide - Avoiding the usual LLM-written tells when you use AI to draft.
- 4️⃣ Towards AI Free Resource Library - Free guides and starter kits.
- 3️⃣ Towards AI Newsletter - Weekly "What happened this week in AI" coverage with technical depth, benchmarks, and opinion.
- 3️⃣ Last Week in AI - Andrey Kurenkov and Jeremie Harris. Weekly news roundup.
- 3️⃣ The Batch - Andrew Ng's weekly summary of research and industry.
- 4️⃣ Louis-François Bouchard's Substack - Short essays on harness engineering, agents, and the practice of AI engineering.
- 4️⃣ Latent Space - swyx and Alessio Fanelli. Industry-heavy AI engineering newsletter with interviews.
- 5️⃣ Decoding AI - Paul Iusztin on production machine learning and AI engineering.
- 5️⃣ The Neural Maze - Miguel Otero Pedrido. Practical production ML and AI systems newsletter for builders tired of hype, with end-to-end projects, agent systems, deployment tradeoffs, and lessons from real ML engineering work.
- 5️⃣ Profitable AI Blog - Tobias Zwingmann. Applied AI frameworks and real-world examples focused on profitable business outcomes, use-case selection, and moving beyond prototypes.
- 5️⃣ AI Tidbits - Sahar Mor's technical briefings on new techniques.
- 6️⃣ Interconnects - Nathan Lambert. Post-training, reasoning models, and RLHF explained with research-grade clarity.
- 6️⃣ Import AI - Jack Clark's research-heavy roundup with policy perspective.
- 6️⃣ Ahead of AI - Sebastian Raschka's monthly deep dives.
- 3️⃣ Last Week in AI - Weekly news podcast companion to the newsletter.
- 3️⃣ Lex Fridman Podcast - Occasional AI episodes with researchers and founders.
- 4️⃣ The What's AI Podcast - Louis-François Bouchard. Interviews with AI builders and researchers.
- 4️⃣ Latent Space - swyx and Alessio Fanelli. Deep interviews with practitioners shipping real systems.
- 4️⃣ ThursdAI - Alex Volkov's weekly live show, podcast, and newsletter breaking down major AI news with builders. Strong for model releases, open-source AI, tooling, and practical context on what changed this week.
- 5️⃣ Machine Learning Street Talk - Tim Scarfe. Long-form research conversations.
- 4️⃣ Simon Willison - Near-daily AI engineering posts. The single most useful blog in this space.
- 4️⃣ Louis-François Bouchard - Essays on harness engineering, agents, and hiring.
- 4️⃣ Hamel Husain - Practical evals and consulting notes.
- 4️⃣ Eugene Yan - Patterns, evaluation, and applied ML writing.
- 4️⃣ Chip Huyen - System design for AI products.
- 6️⃣ Sebastian Raschka - Monthly deep dives on LLM research and implementation.
- 6️⃣ Lilian Weng - Longer-form research-style posts on agents, reasoning, and safety.
- 6️⃣ Shreya Shankar - Research-grade posts on evals and data flywheels.
- 6️⃣ Jason Liu - Consulting notes from enterprise RAG and agents work.
- 6️⃣ Philipp Schmid - Staff Engineer at Google DeepMind (formerly Hugging Face). Practical fine-tuning and Gemini-focused tutorials.
- 5️⃣ AI Engineering Field Guide - Alexey Grigorev (DataTalks.Club). Free. Research into AI engineering interview assignments, take-home challenges, hiring practices, and required skills from Q4 2025 / Q1 2026. Grounded in analysis of 51 companies and 100+ GitHub take-home repos. Includes role definitions, skill breakdowns, learning paths by background, and a curated awesome.md of the most-referenced 2025-2026 articles, talks, and interview resources.
- 5️⃣ Agents Towards Production - Nir Diamant. Free. 28+ end-to-end, code-first tutorials for production-grade GenAI agents. Created in 2025 and expanded through 2026, with company-contributed tutorials covering stateful workflows, vector memory, MCP, Docker deployment, FastAPI endpoints, security guardrails, GPU scaling, browser automation, multi-agent coordination, observability, and evaluation. One of the cleanest hands-on resources on shipping agents.
- 6️⃣ Awesome-LLM - Hannibal046. Free. One of the largest and most actively maintained LLM resource indexes on GitHub. Covers milestone papers, frontier models (DeepSeek V3/R1, Qwen 3, Kimi K-2, GPT-5, Claude 4, Gemini 2.5), open LLMs, training frameworks, deployment tools, courses, and specialized sub-lists (RAG, inference, compression, MoE, healthcare, 3D, Japanese). Updated continuously, useful as a broad navigational index when you know a topic exists but do not know where to start.
- 7️⃣ The 2025 AI Engineering Reading List - Latent Space (swyx and Alessio Fanelli). The definitive paper and resource list for AI engineers, organized by topic: agents, evals, RAG, fine-tuning, inference, and coding agents. Dense, opinionated, and updated annually. Required reading if you want to understand where the field came from and where it is heading.
- 3️⃣ OpenAI Developers and Platform docs - First stop for anything OpenAI API-related.
- 4️⃣ Hugging Face Learn - Central hub for the Hugging Face courses.
- 4️⃣ Towards AI publication on Medium - Daily practical posts from 3,000+ contributing writers.
- 5️⃣ Anthropic Docs and Anthropic Academy - Cleanest docs in the industry, plus a free learning hub.
- 5️⃣ Google AI for Developers - Gemini API, long context, and multimodal docs.
- 5️⃣ LlamaIndex docs - Official RAG and agent framework docs.
- 5️⃣ LangChain and LangGraph docs - Central reference for both libraries.
On Twitter/X and LinkedIn, most of the useful real-time signal comes from a relatively small group of practitioners. A good starter list:
- 4️⃣ Louis-François Bouchard - Co-founder and Chief Technology Officer, Towards AI. Harness engineering, agents, AI education.
- 4️⃣ Andrew Ng - DeepLearning.AI founder, weekly Batch newsletter.
- 4️⃣ swyx (Shawn Wang) - Latent Space, AI engineer community builder.
- 4️⃣ Harrison Chase - LangChain founder.
- 4️⃣ Omar Sanseviero - Hugging Face, open-source LLMs.
- 4️⃣ Logan Kilpatrick - Google DeepMind, working on Google AI Studio, the Gemini API, and Kaggle; formerly led developer relations at OpenAI. Useful for Gemini developer ecosystem updates, AI Studio workflows, and fast AI app prototyping.
- 4️⃣ Alex Volkov - Host and curator of ThursdAI, AI Evangelist at Weights & Biases, and a strong real-time source for weekly model releases, open-source AI, tooling, and builder commentary.
- 5️⃣ Simon Willison - Near-daily practical AI engineering posts. Also active on Mastodon and his blog.
- 5️⃣ Hamel Husain - Evals and AI consulting notes.
- 5️⃣ Jason Liu - RAG, consulting, structured outputs.
- 5️⃣ Chip Huyen - Systems thinking for AI products.
- 5️⃣ Philipp Schmid - Google DeepMind DevRel, formerly Hugging Face. Fine-tuning, Gemini, and open-model how-tos.
- 5️⃣ Jeremy Howard - fast.ai co-founder, deep learning and software craft.
- 6️⃣ Andrej Karpathy - Former Tesla AI and OpenAI. Long-form teaching, new architectures, occasional demo releases.
- 6️⃣ Sebastian Raschka - Research and implementation detail on LLMs.
- 6️⃣ Shreya Shankar - Evaluation, data pipelines, and research-to-practice.
- 6️⃣ Aran Komatsuzaki - Fast, curated paper summaries.
- 6️⃣ Jerry Liu - LlamaIndex founder.
- 6️⃣ Jack Clark - Anthropic co-founder, Import AI newsletter.
- 6️⃣ Nathan Lambert - Interconnects newsletter. One of the clearest writers on post-training, RLHF, and reasoning models.
- 6️⃣ Dex Horthy - HumanLayer founder, creator of the 12-Factor Agents reference. Production-agent engineering.
- 6️⃣ Lilian Weng - Former OpenAI research lead. Long-form posts on agents, reasoning, and safety that are cited everywhere.
- 8️⃣ Yann LeCun - Meta Chief AI Scientist, Turing Award laureate.
The market is messy. The signal is clearer than people think.
People who can take a vague problem, make reasonable assumptions, build a baseline, evaluate it, document tradeoffs, and ship something testable. That is closer to real work than trivia-style interviews.
- 4️⃣ AI Engineering Cheatsheets - Decision tables you can reference in interviews.
- 4️⃣ Towards AI Academy - Certificate programs and portfolio projects built for hiring.
- 5️⃣ What I Look For When Hiring AI Engineers - Louis-François Bouchard, lessons from 100+ interviews.
- 5️⃣ How to Work and Compound with AI - Eugene Yan. Not a resume guide, but a strong blueprint for how serious AI engineers work with coding agents: context as infrastructure, taste as configuration, cheap verification, larger delegation, and feedback loops that compound.
- 5️⃣ Identify, solve, verify - Simon Willison on the core skill employers are looking for.
- 5️⃣ Your job is to deliver code you have proven to work - Simon Willison on the shift in what programming jobs actually require.
- 7️⃣ How to Land a Frontier Lab Job - Vlad Feinberg. A practical path for people aiming at frontier labs: build rare skills at the edges of the LLM stack, especially accelerator/performance work below the model and rigorous agent research above it.
Ship two to four public projects that are small but serious. Write short READMEs that explain architecture choices, cost and latency tradeoffs, and failure modes. Include tests and at least one evaluation dataset. Show traces, monitoring, or experiment logs when relevant. Learn to explain why you chose not to use an agent in some places. Be able to compare prompting, RAG, fine-tuning, workflow, and agent approaches for a given problem. Many candidates can now generate code. Far fewer can show judgment.
Use the models themselves to help you learn. That does not mean outsourcing your thinking. It means using them intelligently: ask for alternative architectures, ask them to critique your evaluation plan, ask them to generate synthetic test cases, ask them to explain a docs page you half-understand, ask them to refactor your prompt into a clearer contract, ask them to produce failing tests before you implement a feature, or ask them to compare two designs under cost and latency constraints.
- Give the model your goal.
- Ask it for three plausible approaches.
- Pick one and implement it.
- Make it run end to end.
- Evaluate it on a small golden dataset.
- Ask the model to explain the failure cases you found.
- Repeat.
That loop works frighteningly well when you keep it tight.
AI engineering in 2026 is a systems craft. Learn enough theory to avoid magical thinking. Learn enough tooling to build quickly. Learn enough evaluation to trust what you ship. Learn enough product judgment to avoid building the wrong thing faster. And above all, keep shipping. That is still the shortcut.
If you found this guide useful, please star the repo and share it with one person who could use it. That is how it keeps reaching the right people.
Tag Louis-François Bouchard on X or LinkedIn if you share this guide.
If you'd like to support our work, joining any Towards AI Academy course directly funds more free content like this one.
This guide is updated throughout 2026 as the stack moves. Suggestions and pull requests are welcome.

