
dspy/README.md at main · stanfordnlp/dspy #734

Open
irthomasthomas opened this issue Mar 16, 2024 · 1 comment
Labels
AI-Chatbots, ai-leaderboards, base-model, chat-templates, few-shot-learning, finetuning, Knowledge-Dataset, llm, llm-applications, llm-benchmarks, llm-completions, llm-evaluation, llm-experiments, llm-function-calling, llm-inference-engines, llm-serving-optimisations, Models, multimodal-llm, openai, Papers, programming-languages, prompt, prompt-engineering

Comments

@irthomasthomas (Owner)

dspy/README.md at main · stanfordnlp/dspy

DSPy: Programming—not prompting—Foundation Models

[Oct'23] DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
[Jan'24] In-Context Learning for Extreme Multi-Label Classification
[Dec'23] DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
[Dec'22] Demonstrate-Search-Predict: Composing Retrieval & Language Models for Knowledge-Intensive NLP

Getting Started:  

Documentation: DSPy Docs


DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline. To use LMs to build a complex system without DSPy, you generally have to: (1) break the problem down into steps, (2) prompt your LM well until each step works well in isolation, (3) tweak the steps to work well together, (4) generate synthetic examples to tune each step, and (5) use these examples to finetune smaller LMs to cut costs. Currently, this is hard and messy: every time you change your pipeline, your LM, or your data, all prompts (or finetuning steps) may need to change.

To make this more systematic and much more powerful, DSPy does two things. First, it separates the flow of your program (modules) from the parameters (LM prompts and weights) of each step. Second, DSPy introduces new optimizers, which are LM-driven algorithms that can tune the prompts and/or the weights of your LM calls, given a metric you want to maximize.
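The first idea can be pictured with a tiny framework-free sketch (all names here are hypothetical, and this is not the real DSPy API): the program's control flow is fixed Python, while each step's instruction is a plain parameter that an optimizer could rewrite without touching the flow.

```python
def stub_lm(prompt: str) -> str:
    # Stand-in for a real LM call: echoes the last line of the prompt, uppercased.
    return prompt.strip().splitlines()[-1].upper()

class Step:
    def __init__(self, instruction: str):
        self.instruction = instruction  # a tunable parameter, not hard-coded flow

    def __call__(self, text: str) -> str:
        return stub_lm(f"{self.instruction}\n{text}")

class Pipeline:
    def __init__(self):
        self.summarize = Step("Summarize the passage.")
        self.answer = Step("Answer using the summary.")

    def __call__(self, passage: str) -> str:
        # The flow (summarize, then answer) never changes...
        return self.answer(self.summarize(passage))

pipeline = Pipeline()
# ...but an optimizer may swap out the instruction of any step:
pipeline.summarize.instruction = "Summarize in one sentence."
print(pipeline("the cat sat"))  # → "THE CAT SAT"
```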

DSPy can routinely teach powerful models like GPT-3.5 or GPT-4 and local models like T5-base or Llama2-13b to be much more reliable at tasks, i.e., having higher quality and/or avoiding specific failure patterns. DSPy optimizers will "compile" the same program into different instructions, few-shot prompts, and/or weight updates (finetunes) for each LM. This is a new paradigm in which LMs and their prompts fade into the background as optimizable pieces of a larger system that can learn from data. tl;dr: less prompting, higher scores, and a more systematic approach to solving hard tasks with LMs.

Table of Contents

If you need help thinking about your task, we recently created a Discord server for the community.

  1. Installation
  2. Tutorials & Documentation
  3. Framework Syntax
  4. Compiling: Two Powerful Concepts
  5. Pydantic Types
  6. FAQ: Is DSPy right for me?

Analogy to Neural Networks

When we build neural networks, we don't write manual for-loops over lists of hand-tuned floats. Instead, we use a framework like PyTorch to compose declarative layers (e.g., Convolution or Dropout) and then use optimizers (e.g., SGD or Adam) to learn the parameters of the network.

Ditto! DSPy gives you the right general-purpose modules (e.g., ChainOfThought, ReAct, etc.), which replace string-based prompting tricks. To replace prompt hacking and one-off synthetic data generators, DSPy also gives you general optimizers (BootstrapFewShotWithRandomSearch or BayesianSignatureOptimizer), which are algorithms that update parameters in your program. Whenever you modify your code, your data, your assertions, or your metric, you can compile your program again and DSPy will create new effective prompts that fit your changes.

Mini-FAQs

What do DSPy optimizers tune? Each optimizer is different, but they all seek to maximize a metric on your program by updating prompts or LM weights. Current DSPy optimizers can inspect your data, simulate traces through your program to generate good/bad examples of each step, propose or refine instructions for each step based on past results, finetune the weights of your LM on self-generated examples, or combine several of these to improve quality or cut cost. We'd love to merge new optimizers that explore a richer space: most manual steps you currently go through for prompt engineering, "synthetic data" generation, or self-improvement can probably be generalized into a DSPy optimizer that acts on arbitrary LM programs.
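The "propose instructions and score them" part of this can be illustrated with a toy search loop (a hedged sketch with hypothetical names and a stubbed LM, not DSPy's actual optimizer API): try candidate instructions and keep the one that maximizes the metric on a small dev set.

```python
def stub_lm(prompt: str, text: str) -> str:
    # Stand-in LM: "answers correctly" only when the instruction asks it to shout.
    return text.upper() if "SHOUT" in prompt else text

def metric(prediction: str, gold: str) -> bool:
    return prediction == gold

def optimize(candidates: list[str], devset: list[tuple[str, str]]) -> str:
    # Score every candidate instruction on the dev set; keep the best scorer.
    scored = []
    for instruction in candidates:
        score = sum(metric(stub_lm(instruction, x), y) for x, y in devset)
        scored.append((score, instruction))
    return max(scored)[1]

devset = [("hi", "HI"), ("ok", "OK")]
best = optimize(["Please answer.", "SHOUT the answer."], devset)
print(best)  # → "SHOUT the answer."
```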

How should I use DSPy for my task? Using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers (modules) to use, giving each layer a signature (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy optimizer to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.
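The "automatic few-shot examples" step above can be pictured with a toy bootstrap loop (hypothetical names and a stubbed LM; not the real DSPy compile API): run the unoptimized program on example inputs, keep the traces your metric accepts, and reuse them as demonstrations.

```python
def stub_lm(prompt: str) -> str:
    # Stand-in LM: "answers" by reversing the last line of the prompt.
    return prompt.strip().splitlines()[-1][::-1]

def program(x: str, demos: list[tuple[str, str]]) -> str:
    # Prepend accepted demonstrations to the prompt, few-shot style.
    demo_text = "\n".join(f"{i} -> {o}" for i, o in demos)
    return stub_lm(f"{demo_text}\n{x}")

def metric(x: str, prediction: str) -> bool:
    return prediction == x[::-1]  # here, "correct" means a proper reversal

# Bootstrap: every input/output trace the metric accepts becomes a demo.
inputs = ["abc", "dspy"]
demos = [(x, program(x, [])) for x in inputs if metric(x, program(x, []))]
print(demos)  # each accepted trace is now a few-shot demonstration
```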

What if I have a better idea for prompting or synthetic data generation? Perfect. We encourage you to think if it's best expressed as a module or an optimizer, and we'd love to merge it in DSPy so everyone can use it. DSPy is not a complete project; it's an ongoing effort to create structure (modules and optimizers) in place of hacky prompt and pipeline engineering tricks.

What does DSPy stand for? It's a long story but the backronym now is Declarative Self-improving Language Programs, pythonically.

1) Installation

All you need is:

pip install dspy-ai

Or open our intro notebook in Google Colab:

By default, DSPy installs the latest openai from pip. However, if you have an older version installed from before OpenAI changed their API (openai~=0.28.1), the library will use that just fine. Both are supported.

For the optional (alphabetically sorted) Chromadb, Marqo, MongoDB, Pinecone, Qdrant, or Weaviate retrieval integration(s), include the extra(s) below:

pip install dspy-ai[chromadb]  # or [qdrant] or [marqo] or [mongodb] or [pinecone] or [weaviate]

2) Documentation

The DSPy documentation is divided into tutorials (step-by-step illustration of solving a task in DSPy), guides (how to use specific parts of the API), and examples (self-contained programs that illustrate usage).

A) Tutorials

| Level | Tutorial | Run in Colab | Description |
| --- | --- | --- | --- |
| Beginner | Getting Started | | Introduces the basic building blocks in DSPy. Tackles the task of complex question answering with HotPotQA. |
| Beginner | Minimal Working Example | N/A | Builds and optimizes a very simple chain-of-thought program in DSPy for math question answering. Very short. |
| Beginner | Compiling for Tricky Tasks | N/A | Teaches LMs to reason about logical statements and negation. Uses GPT-4 to bootstrap few-shot CoT demonstrations for GPT-3.5. Establishes a state-of-the-art result on ScoNe. Contributed by Chris Potts. |
| Beginner | Local Models & Custom Datasets | | Illustrates two different things together: how to use local models (Llama-2-13B in particular) and how to use your own data examples for training and development. |
| Intermediate | The DSPy Paper | N/A | Sections 3, 5, 6, and 7 of the DSPy paper can be consumed as a tutorial. They include explained code snippets, results, and discussions of the abstractions and API. |
| Intermediate | DSPy Assertions | | Introduces an example of applying DSPy Assertions while generating long-form responses to questions with citations. Presents comparative evaluation in both zero-shot and compiled settings. |
| Intermediate | Finetuning for Complex Programs | | Teaches a local T5 model (770M) to do exceptionally well on HotPotQA. Uses only 200 labeled answers. Uses no hand-written prompts, no calls to OpenAI, and no labels for retrieval or reasoning. |
| Advanced | Information Extraction | | Tackles extracting information from long articles (biomedical research papers). Combines in-context learning and retrieval to set SOTA on BioDEX. Contributed by Karel D'Oosterlinck. |

Other resources people find useful:

B) Guides

If you're new to DSPy, it's probably best to go in sequential order. You will probably refer to these guides frequently after that, e.g. to copy/paste snippets that you can edit for your own DSPy programs.

  1. Language Models
  2. Signatures
  3. Modules
  4. Data
  5. Metrics
  6. Optimizers (formerly Teleprompters)
  7. DSPy Assertions

C) Examples

The DSPy team believes complexity has to be justified. We take this seriously: we never release a complex tutorial (above) or example (below) unless we can demonstrate empirically that this complexity has generally led to improved quality or cost. This kind of rule is rarely enforced by other frameworks or docs, but you can count on it in DSPy examples.

There are a bunch of examples in the examples/ directory and in the top-level directory. We welcome contributions!

You can find other examples tweeted by @lateinteraction on Twitter/X.

Some other examples (not exhaustive, feel free to add more via PR):

There are also recent cool examples at Weaviate's DSPy cookbook by Connor Shorten. See tutorial on YouTube.

3) Syntax: You're in charge of the workflow—it's free-form Python code!

DSPy hides tedious prompt engineering, but it cleanly exposes the important decisions you need to make: [1] what's your

@irthomasthomas added the labels above on Mar 16, 2024
@irthomasthomas (Owner, Author)

Related content

- #706 (similarity score: 0.91)
- #660 (similarity score: 0.89)
- #626 (similarity score: 0.89)
- #494 (similarity score: 0.89)
- #546 (similarity score: 0.88)
- #324 (similarity score: 0.88)
