In [2]:
import dspy 




## Setting up the LM client

In [3]:
gpt3_turbo = dspy.OpenAI(model='gpt-3.5-turbo-1106', max_tokens=300)
dspy.configure(lm=gpt3_turbo)

## Directly calling the LM

In [4]:
gpt3_turbo("hello! this is a raw prompt to GPT-3.5")


['Hello! How can I assist you today?']

## Using the LM with DSPy signatures

In [8]:
# Define a module (ChainOfThought) and assign it a signature (return an answer, given a question).
qa = dspy.ChainOfThought('question -> answer')

# Run with the default LM configured with `dspy.configure` above.
response = qa(question="How many floors are in the castle David Gregory inherited?")
print(response.answer)

The castle David Gregory inherited has 5 floors.


## Using multiple LMs at once

In [10]:
# Run with the default LM configured above, i.e. GPT-3.5
response = qa(question="How many floors are in the castle David Gregory inherited?")
print('GPT-3.5:', response.answer)

gpt4_turbo = dspy.OpenAI(model='gpt-4-1106-preview', max_tokens=300)

# Run with GPT-4 instead
with dspy.context(lm=gpt4_turbo):
    response = qa(question="How many floors are in the castle David Gregory inherited?")
    print('GPT-4-turbo:', response.answer)

GPT-3.5: The castle David Gregory inherited has 5 floors.
GPT-4-turbo: The number of floors in the castle David Gregory inherited cannot be determined with the information provided.


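The GPT-3.5 answer is hallucinated: it states a specific number of floors with no information to support it.
The GPT-4-turbo answer is not: it says the number cannot be determined from the information provided.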
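The scoping behavior of `dspy.context` in the cell above can be pictured with a toy settings store. This is only a sketch of the general pattern (a default set once, then overridden temporarily inside a `with` block). The names `_settings`, `configure`, `context`, and `current_lm` are hypothetical stand-ins, and DSPy's actual internals may work differently.

```python
from contextlib import contextmanager

# Hypothetical global settings store; not DSPy's real implementation.
_settings = {"lm": None}


def configure(**kwargs):
    """Set defaults that stay in effect until changed."""
    _settings.update(kwargs)


@contextmanager
def context(**overrides):
    """Temporarily override settings, restoring the old values on exit."""
    previous = dict(_settings)
    _settings.update(overrides)
    try:
        yield
    finally:
        _settings.clear()
        _settings.update(previous)


def current_lm():
    return _settings["lm"]


configure(lm="gpt-3.5")
before = current_lm()
with context(lm="gpt-4-turbo"):
    inside = current_lm()  # the override applies only inside this block
after = current_lm()

print(before, inside, after)  # gpt-3.5 gpt-4-turbo gpt-3.5
```

The `try`/`finally` ensures the default is restored even if the code inside the block raises an exception.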

## Tips and Tricks

In DSPy, all LM calls are cached. If you repeat the same call, you will get the same outputs. (If you change the inputs or configurations, you will get new outputs.)
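As a rough illustration of that behavior, here is a toy cache keyed on both the prompt and the request configuration. It is a sketch under stated assumptions, not DSPy's actual cache: `ToyCachedLM` is a hypothetical class, and the "completion" it returns is a placeholder string rather than model output.

```python
import json


class ToyCachedLM:
    """Hypothetical stand-in for an LM client; not DSPy's real cache."""

    def __init__(self):
        self._cache = {}
        self.backend_calls = 0  # counts simulated (uncached) requests

    def __call__(self, prompt, **config):
        # The cache key covers the prompt and every config value.
        key = json.dumps({"prompt": prompt, "config": config}, sort_keys=True)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = f"completion #{self.backend_calls}"
        return self._cache[key]


lm = ToyCachedLM()
first = lm("How many floors?", temperature=0.7)
repeat = lm("How many floors?", temperature=0.7)      # identical request: cache hit
changed = lm("How many floors?", temperature=0.7001)  # config differs: cache miss

print(first == repeat)   # True
print(first == changed)  # False
print(lm.backend_calls)  # 2
```

Because the key includes the configuration, even a tiny change to a parameter such as `temperature` produces a new request.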

To generate 5 outputs, you can use n=5 in the module constructor, or pass config=dict(n=5) when invoking the module.

In [11]:
qa = dspy.ChainOfThought('question -> answer', n=5)

response = qa(question="How many floors are in the castle David Gregory inherited?")
response.completions.answer

['The castle David Gregory inherited has 5 floors.',
 'The castle David Gregory inherited has 4 floors.',
 'The number of floors in the castle David Gregory inherited is 4.',
 'The castle David Gregory inherited has 4 floors.',
 'The castle David Gregory inherited has 7 floors.']

In [12]:
response.answer


'The castle David Gregory inherited has 5 floors.'

To loop and generate one output at a time with the same input, bypass the cache by making sure each request is (slightly) unique, as below.



In [13]:
for idx in range(5):
    response = qa(question="How many floors are in the castle David Gregory inherited?", config=dict(temperature=0.7+0.0001*idx))
    print(f'{idx+1}.', response.answer)
    

1. The castle David Gregory inherited has 5 floors.
2. The castle David Gregory inherited has 5 floors.
3. The castle David Gregory inherited has 4 floors.
4. The castle David Gregory inherited has 5 floors.
5. The castle David Gregory inherited has 8 floors.


## Remote LMs

In [None]:
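# Template, not runnable as-is: replace {provider_listed_below} with one of the clients listed below.
lm = dspy.{provider_listed_below}(model="your model", model_request_kwargs="...")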


1. dspy.OpenAI for GPT-3.5 and GPT-4.

2. dspy.Cohere

3. dspy.Anyscale for hosted Llama2 models.

4. dspy.Together for various hosted open-source models.

## Local LMs

1. dspy.HFClientTGI: for HuggingFace models through the Text Generation Inference (TGI) system. Tutorial: How do I install and launch the TGI server?

In [None]:
tgi_llama2 = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")

2. dspy.HFClientVLLM: for HuggingFace models through vLLM. Tutorial: How do I install and launch the vLLM server?

In [None]:
vllm_llama2 = dspy.HFClientVLLM(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")


3. dspy.HFModel (experimental). Tutorial: How do I initialize models using HFModel?

In [None]:
llama = dspy.HFModel(model='meta-llama/Llama-2-7b-hf')


4. dspy.OllamaLocal (experimental): for open-source models through Ollama. Tutorial: How do I install and use Ollama on a local computer?

In [None]:
mistral_ollama = dspy.OllamaLocal(model='mistral')


5. dspy.ChatModuleClient (experimental): for models through MLC. Tutorial: How do I install and use MLC?

In [None]:
model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1'
model_path = 'dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so'

llama = dspy.ChatModuleClient(model=model, model_path=model_path)