Awesome Prompt Engineering π§ββοΈ
A hand-curated collection of resources for Prompt Engineering and Context Engineering β covering papers, tools, models, APIs, benchmarks, courses, and communities for working with Large Language Models.
https://promptslab.github.io
New to prompt engineering? Follow this path:
Learn the basics β ChatGPT Prompt Engineering for Developers (free, ~90 min)
Read the guide β Prompt Engineering Guide by DAIR.AI (open-source, comprehensive)
Study provider docs β OpenAI Prompt Engineering Guide Β· Anthropic Prompt Engineering Guide
Understand where the field is heading β Anthropic: Effective Context Engineering for AI Agents
Read the research β The Prompt Report β taxonomy of 58+ prompting techniques from 1,500+ papers
π
Prompt Optimization and Automatic Prompting
Agentic Prompting and Multi-Agent Systems
Structured Output and Format Control
Prompt Injection and Security
Applications of Prompt Engineering
Text-to-Music/Audio Generation
Foundational Papers (Pre-2024)
These papers established the core concepts that modern prompt engineering builds on:
π§
Prompt Management and Testing
Name
Description
Link
Promptfoo
Open-source CLI for testing, evaluating, and red-teaming LLM prompts. YAML configs, CI/CD integration, adversarial testing. ~9K+ β
GitHub
Promptify
Solve NLP Problems with LLM's & Easily generate different NLP Task prompts for popular generative models like GPT, PaLM, and more with Promptify
[Github]
Agenta
Open-source LLM developer platform for prompt management, evaluation, human feedback, and deployment.
GitHub
PromptLayer
Version, test, and monitor every prompt and agent with robust evals, tracing, and regression sets.
Website
Helicone
Production prompt monitoring and optimization platform.
Website
LangGPT
Framework for structured and meta-prompt design. 10K+ β
GitHub
ChainForge
Visual toolkit for building, testing, and comparing LLM prompt responses without code.
GitHub
LMQL
A query language for LLMs making complex prompt logic programmable.
GitHub
Promptotype
Platform for developing, testing, and managing structured LLM prompts.
Website
PromptPanda
AI-powered prompt management system for streamlining prompt workflows.
Website
Promptimize AI
Browser extension to automatically improve user prompts for any AI model.
Website
PROMPTMETHEUS
Web-based "Prompt Engineering IDE" for iteratively creating and running prompts.
Website
Better Prompt
Test suite for LLM prompts before pushing to production.
GitHub
OpenPrompt
Open-source framework for prompt-learning research.
GitHub
Prompt Source
Toolkit for creating, sharing, and using natural language prompts.
GitHub
Prompt Engine
NPM utility library for creating and maintaining prompts for LLMs (Microsoft).
GitHub
PromptInject
Framework for quantitative analysis of LLM robustness to adversarial prompt attacks.
GitHub
LynxPrompt
Self-hostable platform for managing AI IDE config files (.cursorrules, CLAUDE.md, copilot-instructions.md). Web UI, REST API, CLI, and federated blueprint marketplace for 30+ AI coding assistants.
GitHub
Name
Description
Link
DeepEval
Open-source evaluation framework covering RAG, agents, and conversations with CI/CD integration. ~7K+ β
GitHub
Ragas
RAG evaluation with knowledge-graph-based test set generation and 30+ metrics. ~8K+ β
GitHub
LangSmith
LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications.
Website
Langfuse
Open-source LLM observability with tracing, prompt management, and human annotation. ~7K+ β
GitHub
Braintrust
End-to-end AI evaluation platform, SOC2 Type II certified.
Website
Arize AI / Phoenix
Real-time LLM monitoring with drift detection and tracing.
GitHub
TruLens
Evaluating and explaining LLM apps; tracks hallucinations, relevance, groundedness.
GitHub
InspectAI
Purpose-built for evaluating agents against benchmarks (UK AISI).
GitHub
Opik
Evaluate, test, and ship LLM applications across dev and production lifecycles.
GitHub
Name
Description
Link
LangChain / LangGraph
Most widely adopted LLM app framework; LangGraph adds graph-based multi-step agent workflows. ~100K+ / ~10K+ β
GitHub Β· LangGraph
CrewAI
Role-playing AI agent orchestration with 700+ integrations. ~44K+ β
GitHub
AutoGen (AG2)
Microsoft's multi-agent conversational framework. ~40K+ β
GitHub
DSPy
Stanford's framework for programming LLMs with automatic prompt/weight optimization. ~22K+ β
GitHub
OpenAI Agents SDK
Official agent framework with function calling, guardrails, and handoffs. ~10K+ β
GitHub
Semantic Kernel
Microsoft's AI framework powering M365 Copilot; C#, Python, Java. ~24K+ β
GitHub
LlamaIndex
Data framework for RAG and agent capabilities. ~40K+ β
GitHub
Haystack
Open-source NLP framework with pipeline architecture for RAG and agents. ~20K+ β
GitHub
Agno (formerly Phidata)
Python agent framework with microsecond instantiation. ~20K+ β
GitHub
Smolagents
Hugging Face's minimalist code-centric agent framework (~1000 LOC). ~15K+ β
GitHub
Pydantic AI
Type-safe agent framework using Pydantic for structured validation. ~8K+ β
GitHub
Mastra
TypeScript AI agent framework with assistants, RAG, and observability. ~20K+ β
GitHub
Google ADK
Agent Development Kit deeply integrated with Gemini and Google Cloud.
GitHub
Strands Agents (AWS)
Model-agnostic framework with deep AWS integrations.
GitHub
Langflow
Node-based visual agent builder with drag-and-drop. ~50K+ β
GitHub
n8n
Workflow automation with AI agent capabilities and 400+ integrations. ~60K+ β
GitHub
Dify
All-in-one backend for agentic workflows with tool-using agents and RAG.
GitHub
PraisonAI
Multi-AI Agents framework with 100+ LLM support, MCP integration, and built-in memory.
GitHub
Neurolink
Multi-provider AI agent framework unifying 12+ providers with workflow orchestration.
GitHub
Composio
Connect 100+ tools to AI agents with zero setup.
GitHub
Prompt Optimization Tools
Name
Description
Link
DSPy
Multiple optimizers (MIPROv2, BootstrapFewShot, COPRO) for automatic prompt tuning. ~22K+ β
GitHub
TextGrad
Automatic differentiation via text (Stanford). ~2K+ β
GitHub
OPRO
Google DeepMind's optimization by prompting.
GitHub
Red Teaming and Prompt Security
Name
Description
Link
Garak (NVIDIA)
LLM vulnerability scanner for hallucination, injection, and jailbreaks β the "nmap for LLMs." ~3K+ β
GitHub
PyRIT (Microsoft)
Python Risk Identification Tool for automated red-teaming. ~3K+ β
GitHub
DeepTeam
40+ vulnerabilities, 10+ attack methods, OWASP Top 10 support.
GitHub
LLM Guard
Security toolkit for LLM I/O validation. ~2K+ β
GitHub
NeMo Guardrails (NVIDIA)
Programmable guardrails for conversational systems. ~5K+ β
GitHub
Guardrails AI
Define strict output formats (JSON schemas) to ensure system reliability.
Website
Lakera
AI security platform for real-time prompt injection detection.
Website
Purple Llama (Meta)
Open-source LLM safety evaluation including CyberSecEval.
GitHub
GPTFuzz
Automated jailbreak template generation achieving >90% success rates.
GitHub
Rebuff
Open-source tool for detection and prevention of prompt injection.
GitHub
AgentSeal
"Open-source scanner that runs 150 attack probes to test AI agents for prompt injection and extraction vulnerabilities."
GitHub
MCP (Model Context Protocol)
MCP is an open standard developed by Anthropic (Nov 2024, donated to Linux Foundation Dec 2025) for connecting AI assistants to external data sources and tools through a standardized interface. It has 97M+ monthly SDK downloads and has been adopted by GitHub, Google, and most major AI providers.
Name
Description
Link
MCP Specification
The core protocol specification and SDKs. ~15K+ β
GitHub
MCP Reference Servers
Official implementations: fetch, filesystem, GitHub, Slack, Postgres.
GitHub
FastMCP (Python)
High-level Pythonic framework for building MCP servers. ~5K+ β
GitHub
GitHub MCP Server
GitHub's official MCP server for repo, issue, PR, and Actions interaction. ~15K+ β
GitHub
Awesome MCP Servers
Curated list of 10,000+ community MCP servers. ~30K+ β
GitHub
Context7
MCP server providing version-specific documentation to reduce code hallucination.
GitHub
GitMCP
Creates remote MCP servers for any GitHub repo by changing the domain.
Website
MCP Inspector
Visual testing tool for MCP server development.
GitHub
Vibe Coding and AI Coding Assistants
Name
Description
Link
Claude Code
Anthropic's command-line AI coding tool; widely considered one of the best AI coding assistants (2026).
Docs
Cursor
AI-native code editor; Composer feature generates entire applications from natural language.
Website
Windsurf (Codeium)
"First agentic IDE" with multi-file editing and project-wide context.
Website
GitHub Copilot
AI pair programmer; ~30% of new GitHub code comes from Copilot.
Website
Aider
Open-source terminal AI pair programmer with Git integration. ~25K+ β
GitHub
Cline
Open-source VS Code AI assistant connecting editor and terminal through MCP. ~20K+ β
GitHub
Continue
Open-source IDE extensions for custom AI code assistants. ~22K+ β
GitHub
OpenAI Codex CLI
Lightweight terminal coding agent.
GitHub
Gemini CLI
Google's open-source terminal AI agent.
GitHub
Autohand Code CLI
Self-evolving autonomous terminal coding agent with multi-provider LLM support (OpenRouter, Anthropic, OpenAI, Ollama), 40+ tools, and modular skills system.
GitHub
Bolt.new
Browser-based prompt-to-app generation with one-click deployment.
Website
Lovable
Full-stack apps from natural language descriptions.
Website
v0 (Vercel)
AI assistant for building Next.js frontend components from text.
Website
Firebase Studio
Google's agentic cloud-based development environment.
Website
Other Notable Repositories
Name
Description
Link
Prompt Engineering Guide (DAIR.AI)
The definitive open-source guide and resource hub. 3M+ learners. ~55K+ β
GitHub
Awesome ChatGPT Prompts / Prompts.chat
World's largest open-source prompt library. 1000s of prompts for all major models.
GitHub
12-Factor Agents
Principles for building production-grade LLM-powered software. ~17K+ β
GitHub
NirDiamant/Prompt_Engineering
22 hands-on Jupyter Notebook tutorials. ~3K+ β
GitHub
Context Engineering Repository
First-principles handbook for moving beyond prompt engineering to context design.
GitHub
AI Agent System Prompts Library
Collection of system prompts from production AI coding agents (Claude Code, Gemini CLI, Cline, Aider, Roo Code).
GitHub
Awesome Vibe Coding
Curated list of 245+ tools and resources for building software through natural language prompts.
GitHub
OpenAI Cookbook
Official recipes for prompts, tools, RAG, and evaluations.
GitHub
Embedchain
Framework to create ChatGPT-like bots over your dataset.
GitHub
ThoughtSource
Framework for the science of machine thinking.
GitHub
Promptext
Extracts and formats code context for AI prompts with token counting.
GitHub
Price Per Token
Compare LLM API pricing across 200+ models.
Website
OpenPaw
CLI tool (npx pawmode) that turns Claude Code into a personal assistant by generating system prompts (CLAUDE.md + SOUL.md) with personality, memory, and 38 skill routers.
GitHub
π»
Model
Context
Price (Input/Output per 1M tokens)
Key Feature
GPT-5.2 / 5.2 Thinking
400K
$1.75 / $14
Latest flagship, 90% cached discount, configurable reasoning
GPT-5.1
400K
$1.25 / $10
Previous generation flagship
GPT-4.1 / 4.1 mini / nano
1M
$2 / $8
Best non-reasoning model, 40% faster and 80% cheaper than GPT-4o
o3 / o3-pro
200K
Varies
Reasoning models with native tool use
o4-mini
200K
Cost-efficient
Fast reasoning, best on AIME at its cost class
GPT-OSS-120B / 20B
128K
$0.03 / $0.30
First open-weight models, Apache 2.0
Key features: Responses API, Agents SDK, Structured Outputs, function calling, prompt caching (90% discount), Batch API (50% discount), MCP support. Platform Docs
Model
Context
Price (Input/Output per 1M tokens)
Key Feature
Claude Opus 4.6
1M (beta)
$5 / $25
Most powerful, state-of-the-art coding and agentic tasks
Claude Sonnet 4.5
200K
$3 / $15
Best coding model, 61.4% OSWorld (computer use)
Claude Haiku 4.5
200K
Fast tier
Near-frontier, fastest model class
Claude Opus 4 / Sonnet 4
200K
$15/$75 (Opus)
Opus: 72.5% SWE-bench, Sonnet 4 powers GitHub Copilot
Key features: Extended Thinking with tool use, Computer Use, MCP (originated here), prompt caching, Claude Code CLI, available on AWS Bedrock and Google Vertex AI. API Docs
Model
Context
Price (Input/Output per 1M tokens)
Key Feature
Gemini 3 Pro Preview
1M
$2 / $12
Most intelligent Google model, deployed to 2B+ Search users
Gemini 2.5 Pro
1M
$1.25 / $10
Best for coding/agentic tasks, thinking model
Gemini 2.5 Flash / Flash-Lite
1M
$0.30/$1.50 Β· $0.10/$0.40
Price-performance leaders
Key features: Thinking (all 2.5+ models), Google Search grounding, code execution, Live API (real-time audio/video), context caching. Google AI Studio
Model
Architecture
Context
Key Feature
Llama 4 Scout
109B MoE / 17B active
10M
Fits single H100, multimodal, open-weight
Llama 4 Maverick
400B MoE / 17B active, 128 experts
1M
Beats GPT-4o, open-weight
Llama 3.3 70B
Dense
128K
Matches Llama 3.1 405B
Available on 25+ cloud partners, Hugging Face, and inference APIs. Llama
Provider
Description
Link
Mistral AI
Mistral Large 3 (675B MoE), Devstral 2, Ministral 3. Apache 2.0.
Website
DeepSeek
V3.2 (671B MoE), R1 (reasoning, MIT license). $0.15/$0.75 per 1M tokens.
Website
xAI (Grok)
Grok 4.1 Fast: 2M context, $0.20/$0.50 per 1M tokens.
Website
Cohere
Command A (111B, 256K context), Embed v4, Rerank 4.0. Excels at RAG.
Website
Together AI
200+ open models with sub-100ms latency.
Website
Groq
LPU hardware with ~300+ tokens/sec inference.
Website
Fireworks AI
Fast inference with HIPAA + SOC2 compliance.
Website
OpenRouter
Unified API for 300+ models from all providers.
Website
Cerebras
Wafer-scale chips with best total response time.
Website
Perplexity AI
Search-augmented API with citations.
Website
Amazon Bedrock
Managed multi-model service with Claude, Llama, Mistral, Cohere.
Website
Hugging Face Inference
Access to open models via API.
Website
πΎ
Major Benchmarks (2024β2026)
Name
Description
Link
Chatbot Arena / LM Arena
6M+ user votes for Elo-rated pairwise LLM comparisons. De facto standard for human preference.
Website
MMLU-Pro
12,000+ graduate-level questions across 14 domains. NeurIPS 2024 Spotlight.
GitHub
GPQA
448 "Google-proof" STEM questions; non-expert validators achieve only 34%.
arXiv
SWE-bench Verified
Human-validated 500-task subset for real-world GitHub issue resolution.
Website
SWE-bench Pro
1,865 tasks across 41 professional repos; best models score only ~23%.
Leaderboard
Humanity's Last Exam (HLE)
2,500 expert-vetted questions; top AI scores only ~10β30%.
Website
BigCodeBench
1,140 coding tasks across 7 domains; AI achieves ~35.5% vs. 97% human success.
Leaderboard
LiveBench
Contamination-resistant with frequently updated questions.
Paper
FrontierMath
Research-level math; AI solves only ~2% of problems.
Research
ARC-AGI v2
Abstract reasoning measuring fluid intelligence.
Research
IFEval
Instruction-following evaluation with formatting/content constraints.
arXiv
MLE-bench
OpenAI's ML engineering evaluation via Kaggle-style tasks.
GitHub
PaperBench
Evaluates AI's ability to replicate 20 ICML 2024 papers from scratch.
GitHub
Leaderboards and Meta-Benchmarks
Name
Description
Link
Hugging Face Open LLM Leaderboard v2
Evaluates open models on MMLU-Pro, GPQA, IFEval, MATH.
Leaderboard
Artificial Analysis Intelligence Index v3
Aggregates 10 evaluations.
Website
SEAL by Scale AI
Hosts SWE-bench Pro and agentic evaluations.
Leaderboard
Prompt and Instruction Datasets
Name
Description
Link
P3 (Public Pool of Prompts)
Prompt templates for 270+ NLP tasks used to train T0 and similar models.
HuggingFace
System Prompts Dataset
944 system prompt templates for agent workflows (by Daniel Rosehill, Aug 2025).
HuggingFace
OpenAssistant Conversations (OASST)
161,443 messages in 35 languages with 461,292 quality ratings.
HuggingFace
UltraChat / UltraFeedback
Large-scale synthetic instruction and preference datasets for alignment training.
HuggingFace
SoftAge Prompt Engineering Dataset
1,000 diverse prompts across 10 categories for benchmarking prompt performance.
HuggingFace
Text Transformation Prompt Library
Comprehensive collection of text transformation prompts (May 2025).
HuggingFace
Writing Prompts
~300K human-written stories paired with prompts from r/WritingPrompts.
Kaggle
Midjourney Prompts
Text prompts and image URLs scraped from MidJourney's public Discord.
HuggingFace
CodeAlpaca-20k
20,000 programming instruction-output pairs.
HuggingFace
ProPEX-RAG
Dataset for prompt optimization in RAG workflows.
HuggingFace
NanoBanana Trending Prompts
1,000+ curated AI image prompts from X/Twitter, ranked by engagement.
GitHub
Red Teaming and Adversarial Datasets
Name
Description
Link
HarmBench
510 harmful behaviors across standard, contextual, copyright, and multimodal categories.
Website
JailbreakBench
Open robustness benchmark for jailbreaking with 100 prompts.
Research
AgentHarm
110 malicious agent tasks across 11 harm categories.
arXiv
DecodingTrust
243,877 prompts evaluating trustworthiness across 8 perspectives.
Research
SafetyPrompts.com
Aggregator tracking 50+ safety/red-teaming datasets.
Website
π§
Frontier Models (2025β2026)
Model
Provider
Context
Key Strength
GPT-5.2
OpenAI
400K
General intelligence, 100% AIME 2025
Claude Opus 4.6
Anthropic
1M (beta)
Coding, agentic tasks, extended thinking
Gemini 3 Pro
Google
1M
#1 LMArena (~1500 Elo), multimodal
Grok 4.1
xAI
2M
#2 LMArena (1483 Elo), low hallucination
Mistral Large 3
Mistral AI
256K
Best open-weight (675B MoE/41B active), Apache 2.0
DeepSeek-V3.2
DeepSeek
128K
Best value (671B MoE/37B active), MIT license
Llama 4 Maverick
Meta
1M
Beats GPT-4o (400B MoE/17B active), open-weight
Model
Key Detail
OpenAI o3 / o3-pro
87.7% GPQA Diamond. Native tool use.
OpenAI o4-mini
Best AIME at its cost class with visual reasoning.
DeepSeek-R1 / R1-0528
Open-weight, RL-trained. 87.5% on AIME 2025. MIT license.
QwQ (Qwen with Questions)
32B reasoning model. Apache 2.0. Comparable to R1.
Gemini 2.5 Pro/Flash (Thinking)
Built-in reasoning with configurable thinking budget.
Claude Extended Thinking
Hybrid mode with visible chain-of-thought and tool use.
Phi-4 Reasoning / Plus
14B reasoning models rivaling much larger models. Open-weight.
GPT-OSS-120B
OpenAI's open-weight with CoT. Near-parity with o4-mini. Apache 2.0.
Notable Open-Source Models
Model
Provider
Key Detail
Qwen3-235B-A22B
Alibaba
Flagship MoE. Strong reasoning/code/multilingual. Apache 2.0. Most downloaded family on HuggingFace.
Gemma 3
Google
270M to 27B. Multimodal. 128K context. 140+ languages.
OLMo 2/3
Allen AI
Fully open (data, code, weights, logs). OLMo 2 32B surpasses GPT-3.5. Apache 2.0.
SmolLM3-3B
Hugging Face
Outperforms Llama-3.2-3B. Dual-mode reasoning. 128K context.
Kimi K2
Moonshot AI
32B active. Open-weight. Tailored for coding/agentic use.
Llama 4 Scout
Meta
109B MoE/17B active. 10M token context. Fits single H100.
Model
Key Detail
Qwen3-Coder (480B-A35B)
69.6% SWE-bench β milestone for open-source coding. 256K context. Apache 2.0.
Devstral 2 (123B)
72.2% SWE-bench Verified. 7x more cost-efficient than Claude Sonnet.
Codestral 25.01
Mistral's code model. 80+ languages. Fill-in-the-Middle support.
DeepSeek-Coder-V2
236B MoE / 21B active. 338 programming languages.
Qwen 2.5-Coder
7B/32B. 92 programming languages. 88.4% HumanEval. Apache 2.0.
Foundational Models (Historical Reference)
These models established key concepts but are largely superseded for practical use:
Model
Provider
Significance
BLOOM 176B
BigScience
First major open multilingual LLM (2022)
GLM-130B
Tsinghua
Open bilingual English/Chinese LLM (2023)
Falcon 180B
TII
Large open generative model (2023)
Mixtral 8x7B
Mistral AI
Pioneered MoE architecture for open models (2023)
GPT-NeoX-20B
EleutherAI
Early open autoregressive LLM
GPT-J-6B
EleutherAI
Early open causal language model
π
Leading Commercial Detectors
Name
Accuracy
Key Feature
Link
GPTZero
99% claimed
10M+ users, #1 on G2 (2025). Detects GPT-4/5, Gemini, Claude, Llama. Free tier available.
Website
Originality.ai
98β100% (peer-reviewed)
Consistently rated most accurate. Combines AI detection + plagiarism + fact checking. From $14.95/month.
Website
Turnitin AI Detection
98%+ on unmodified AI text
Dominant in academia. Launched AI bypasser/humanizer detection (Aug 2025). Institutional licensing.
Website
Copyleaks
99%+ claimed
Enterprise tool detecting AI in 30+ languages. LMS integrations.
Website
Winston AI
99.98% claimed
OCR for scanned documents, AI image/deepfake detection. 11 languages.
Website
Pangram Labs
99.3% (COLING 2025)
Highest score in COLING 2025 Shared Task. 100% TPR on "humanized" text. 97.7% adversarial robustness.
Website
Free and Research Detectors
Name
Description
Link
Binoculars
Open-source research detector using cross-perplexity between two LLMs.
arXiv
DetectGPT / Fast-DetectGPT
Statistical method comparing log-probabilities of original text vs. perturbations.
arXiv
Openai Detector
AI classifier for indicating AI-written text (OpenAI Detector Python wrapper)
[GitHub]
Sapling AI Detector
Free browser-based detector (up to 2,000 chars). 97% accuracy in some studies.
Website
QuillBot AI Detector
Free, no sign-up required.
Website
Writer AI Content Detector
Free tool with color-coded results.
Website
ZeroGPT
Popular free detector evaluated in multiple academic studies.
Website
Name
Description
Link
SynthID (Google DeepMind)
Watermarking for AI text, images, and audio via statistical token sampling. Deployed in Google products.
Website
OpenAI Text Watermarking
Developed but still experimental as of 2025. Research shows fragility concerns.
Experimental
Important caveat: No detector claims 100% accuracy. Mixed human/AI text remains hardest to detect (50β70% accuracy). Adversarial robustness varies widely. The AI detection market is projected to grow from ~$2.3B (2025) to $15B by 2035.
π
Title
Author(s)
Publisher
Year
Prompt Engineering for LLMs
John Berryman & Albert Ziegler
O'Reilly
2024
Prompt Engineering for Generative AI
James Phoenix & Mike Taylor
O'Reilly
2024
Prompt Engineering for LLMs
Thomas R. Caldwell
Independent
2025
LLM Application Development
Title
Author(s)
Publisher
Year
AI Engineering: Building Applications with Foundation Models
Chip Huyen
O'Reilly
2025
Build a Large Language Model (From Scratch)
Sebastian Raschka
Manning
2024
Building LLMs for Production
Louis-FranΓ§ois Bouchard & Louie Peters
O'Reilly
2024
LLM Engineer's Handbook
Paul Iusztin & Maxime Labonne
Packt
2024
The Hundred-Page Language Models Book
Andriy Burkov
Self-Published
2025
Title
Author(s)
Publisher
Year
Building Applications with AI Agents
Michael Albada
O'Reilly
2025
AI Agents and Applications
Roberto Infante
Manning
2025
AI Agents in Action
Micheal Lanham
Manning
2025
Production, Reliability, and Security
Title
Author(s)
Publisher
Year
LLMs in Production
Christopher Brousseau & Matthew Sharp
Manning
2025
Building Reliable AI Systems
Rush Shahani
Manning
2025
The Developer's Playbook for LLM Security
Steve Wilson
O'Reilly
2024
π©βπ«
University and Platform Courses
π
OpenAI Prompt Engineering Guide β Comprehensive, covering GPT-4.1/5 prompting, reasoning models, structured outputs, agentic workflows. Continuously updated.
OpenAI GPT-4.1 Prompting Guide [2025] β Structured agent-like prompt design: goal persistence, tool integration, long-context processing.
Anthropic Prompt Engineering Overview β Iterative prompt design, XML tags, chain-of-thought, role assignment. Includes prompt generator.
Anthropic Claude 4 Best Practices [2025β2026] β Parallel tool execution, thinking capabilities, image processing.
Anthropic: Effective Context Engineering for AI Agents [2025] β The evolution from prompt engineering to context engineering: agent state, memory, tools, MCP.
Google Gemini Prompting Strategies β Multimodal prompting for Gemini via Vertex AI and AI Studio.
Microsoft Prompt Engineering in Azure AI Studio β Tool calling, function design, few-shot prompting, prompt chaining.
Community and Independent Guides
Prompt Engineering Guide (DAIR.AI / promptingguide.ai) β Most comprehensive open-source guide. 18+ techniques, model-specific guides, research papers. 3M+ learners. Now includes context engineering.
Learn Prompting (learnprompting.org) β Structured free platform. Beginner to advanced PE, AI security, HackAPrompt competition.
IBM 2026 Guide to Prompt Engineering [2026] β Curated tools, tutorials, real-world examples with Python code.
Anthropic Interactive Tutorial β 9-chapter Jupyter notebook course with hands-on exercises.
Lilian Weng's Prompt Engineering Guide [2023] β Highly respected technical blog from OpenAI researcher.
Google Prompt Engineering Guide (68-page PDF) [2025] β Internal-style best-practice guide for Gemini with concrete patterns.
DigitalOcean: Prompt Engineering Best Practices [2025] β Updated guide summarizing techniques: few-shot, chain-of-thought, role prompting, etc.
Aakash Gupta: Prompt Engineering in 2025 [2025] β Practical guide with wisdom from shipping AI at OpenAI, Shopify, and Google.
Best practices for prompt engineering with OpenAI API β OpenAI's introductory best practices.
OpenAI Cookbook β Official recipes for function calling, RAG, evaluation, and complex workflows.
Microsoft Prompt Engineering Docs β Microsoft's open prompt engineering resources.
DALLE Prompt Book β Visual guide for text-to-image prompting.
Best 100+ Stable Diffusion Prompts β Community-curated image generation prompts.
Vibe Engineering (Manning) β Book by Tomasz Lelek & Artur Skowronski on building software through natural language prompts.
π₯
Andrej Karpathy: "Deep Dive into LLMs" & "How I Use LLMs" [2024β2025] β Two of the most influential AI videos of 2024β2025. Comprehensive technical deep dive followed by practical usage patterns.
Karpathy: "Software in the Era of AI" (YC AI Startup School) [2025] β Coined "vibe coding" (Feb 2025) and championed "context engineering" (Jun 2025).
Karpathy: Neural Networks: Zero to Hero [2023β2024] β Full lecture series building from backpropagation to GPT.
3Blue1Brown: Neural Networks Series [Updated 2024] β Iconic animated visual explanations of transformers and attention mechanisms. 7M+ subscribers.
AI Explained [2024β2025] β Long-form analysis breaking down papers, model capabilities, and PE developments.
Sam Witteveen [2024β2025] β Practical tutorials on prompt engineering, LangChain, RAG, and agents.
Matthew Berman [2024β2025] β Popular channel covering model releases and practical LLM usage. 600K+ subscribers.
DeepLearning.AI YouTube [2024β2026] β Structured lessons, course previews, and Andrew Ng talks on agents and AI careers.
Lex Fridman Podcast (AI Episodes) [2024β2025] β Long-form interviews with Altman, Hinton, Amodei on LLMs, prompting, and safety.
ICSE 2025: AIware Prompt Engineering Tutorial [2025] β Conference tutorial covering prompt patterns, fragility, anti-patterns, and optimization DSLs.
CMU Advanced NLP 2022: Prompting β Foundational academic lecture on prompting methods.
ChatGPT: 5 Prompt Engineering Secrets For Beginners β Accessible intro for beginners.
π€
Learn Prompting β 40,000+ members. Largest PE Discord with courses, hackathons, HackAPrompt competitions.
PromptsLab Discord - Community
Midjourney β 1M+ members. Primary hub for text-to-image prompt sharing.
OpenAI Discord β Official community with channels for GPTs, Sora, DALL-E, and API help.
Anthropic Discord β Official Claude community for AI development collaboration.
Hugging Face Discord β Model discussions, library support, community events.
FlowGPT β 33K+ members. 100K+ prompts across ChatGPT, DALL-E, Stable Diffusion, Claude.
r/PromptEngineering β Dedicated subreddit for prompt crafting techniques and discussions.
r/ChatGPT β 10M+ members. Primary hub for ChatGPT users and prompt sharing.
r/LocalLLaMA β Highly technical community for running open-source LLMs locally.
r/ClaudeAI β Anthropic's Claude community: prompt sharing, API tips, model comparisons.
r/MachineLearning β Academic-oriented ML research discussions.
r/OpenAI β OpenAI product and API discussions.
r/StableDiffusion β 450K+ members for AI art prompting and workflows.
r/ChatGPTPromptGenius β 35K+ members sharing and refining prompts.
LangChain β Open-source LLM app framework. 100K+ stars.
Promptslab β Generative Models | Prompt-Engineering | LLMs
Hugging Face β Central hub: Transformers, Diffusers, Datasets, TRL.
DSPy (Stanford NLP) β Growing community for systematic prompt optimization.
OpenAI β Open-source models, benchmarks, and tools.
We welcome contributions to this list! Before contributing, please take a moment to review our contribution guidelines . These guidelines will help ensure that your contributions align with our objectives and meet our standards for quality and relevance.
What we're looking for:
New high-quality papers, tools, or resources with a brief description of why they matter
Updates to existing entries (broken links, outdated information)
Corrections to star counts, pricing, or model details
Translations and accessibility improvements
Quality standards:
All tools should be actively maintained (updated within the last 6 months)
Papers should be from peer-reviewed venues or have significant community adoption
Datasets should be publicly accessible
Please include a one-line description explaining why the resource is valuable
Thank you for your interest in contributing to this project!
Maintained by PromptsLab Β· Star this repo if you find it useful!