# DSPy + MLflow
    
Let's walk through a quick example of **basic question answering** with and without **retrieval-augmented generation** (RAG) in [DSPy](https://dspy.ai/). Specifically, let's build **a system for answering tech questions**, e.g. about Linux or iPhone apps.

In [0]:
%run ./00_config

# Step 1: Configuration

## Configuring the DSPy environment

Let's configure DSPy to use the `meta-llama-3-1-3b-instruct` model we deployed during the setup. Our config file contains the `CHAT_MODEL_NAME` variable pointing to the deployed model. 


In [0]:
import dspy

lm = dspy.LM(CHAT_MODEL_NAME)
dspy.configure(lm=lm)

## GenAI Tracing via MLflow Autologging

[MLflow Tracing](https://mlflow.org/docs/latest/llms/tracing/index.html) is a feature that enhances observability in Generative AI applications by capturing detailed information about the execution of services. It records inputs, outputs, and metadata for each step of a request, aiding in debugging and performance monitoring. Tracing can be automated with libraries like LangChain, OpenAI, and recently, DSPy! By simply calling `mlflow.<library>.autolog()`, you get detailed execution traces for your GenAI applications.

In [0]:
import mlflow 
mlflow.dspy.autolog()

# Step 2: Example Usage

## Exploring basic DSPy Modules

There are several ways to call an LLM using DSPy. The simplest is via `lm(prompt="prompt")` or `lm(messages=[...])`, but this just sends the prompt or messages to the model and returns the response.

DSPy also gives you `Modules` as a more robust way to define your LM functions.

The simplest module is `dspy.Predict`. It takes a [DSPy Signature](https://dspy.ai/learn/programming/signatures/?h=signature), which is simply a structured input and output schema, and returns a callable for the behavior you specified. Let's use the "in-line" notation for signatures to declare a module that takes a `question` (`str`) and returns a `response` (`str`).

TODO: add details here on why dspy uses signatures instead of prompts  

In [0]:
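# Define the prediction strategy (Predict) and the signature (question in, response out)
qa = dspy.Predict('question: str -> response: str')
# Run inference; the keyword argument must match the input field name in the signature
response = qa(question="what are high memory and low memory on linux?")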

print(response.response)

As shown above in the MLflow trace, the variable names `question` and `response` in the signature define our input and output, respectively.
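
To make the split between inputs and outputs concrete, here is a toy parser for the in-line notation. It is only a sketch and not DSPy's implementation: `parse_fields` and `parse_inline_signature` are hypothetical helpers, and treating untyped fields as `str` is an assumption of this example.

```python
# Illustrative only: a toy parser for the in-line signature notation.
# This is NOT DSPy's parser; both helpers below are hypothetical.

def parse_fields(spec: str) -> dict[str, str]:
    """Turn 'a: str, b: int' into {'a': 'str', 'b': 'int'}."""
    fields = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, type_name = part.partition(":")
        # Assumption of this sketch: a field without a type is treated as 'str'.
        fields[name.strip()] = type_name.strip() or "str"
    return fields


def parse_inline_signature(signature: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split 'inputs -> outputs' into two name-to-type mappings."""
    if signature.count("->") != 1:
        raise ValueError("expected exactly one '->' in the signature")
    inputs, outputs = signature.split("->")
    return parse_fields(inputs), parse_fields(outputs)


inputs, outputs = parse_inline_signature("question: str -> response: str")
print(inputs)   # {'question': 'str'}
print(outputs)  # {'response': 'str'}
```

Everything to the left of `->` becomes an input the caller must supply, and everything to the right becomes an attribute on the returned prediction. Splitting on commas is a simplification that would break on types such as `dict[str, int]`.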

Now, what did DSPy do to build this `qa` module? Nothing fancy in this example, yet. The module passed your signature, LM, and inputs to an Adapter, which is a layer that handles structuring the inputs and parsing structured outputs to fit your plaintext signature.
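
The adapter's two jobs, formatting inputs and parsing outputs, can be sketched in a few lines of plain Python. This is a toy under stated assumptions and not DSPy's Adapter class: the `[[ ## name ## ]]` delimiter, the function names, and the hand-written completion are all chosen for illustration.

```python
import re

# Illustrative only: a toy adapter, not DSPy's actual Adapter class.
# The "[[ ## name ## ]]" delimiter and both function names are assumptions.

def format_prompt(input_names, output_names, inputs):
    """Render the input values and describe the expected output layout."""
    lines = ["Respond with these fields, each under its own header:"]
    lines += [f"[[ ## {name} ## ]]" for name in output_names]
    lines.append("")
    for name in input_names:
        lines.append(f"[[ ## {name} ## ]]")
        lines.append(str(inputs[name]))
    return "\n".join(lines)


def parse_completion(output_names, completion):
    """Pull each declared output field out of the raw completion text."""
    # re.split with one capture group yields [preamble, name1, body1, name2, body2, ...]
    sections = re.split(r"\[\[ ## (\w+) ## \]\]", completion)
    found = {name: body.strip() for name, body in zip(sections[1::2], sections[2::2])}
    missing = [name for name in output_names if name not in found]
    if missing:
        raise ValueError(f"completion is missing fields: {missing}")
    return {name: found[name] for name in output_names}


prompt = format_prompt(["question"], ["response"], {"question": "what is Linux?"})
# A hand-written stand-in for a model reply (not real model output).
fake_completion = "[[ ## response ## ]]\nLinux is an open-source operating system kernel."
print(parse_completion(["response"], fake_completion))
```

Because the field names come from the signature, the same two functions work for any task. This is why a module can stay generic while the signature carries everything task-specific.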

### Other DSPy Predictors (prediction strategies)
DSPy has various built-in modules, e.g. `dspy.ChainOfThought`, `dspy.ProgramOfThought`, and `dspy.ReAct`. These are interchangeable with basic `dspy.Predict`: they take your signature, which is specific to your task, and they apply general-purpose prompting techniques and inference-time strategies.

For example, `dspy.ChainOfThought` is an easy way to elicit _reasoning_ from your language model before it outputs the fields requested in your signature.

In [0]:
cot = dspy.ChainOfThought('question: str -> response: str')
cot(question="what is Linux?")
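
Conceptually, chain-of-thought prompting asks the model for one extra free-text field ahead of the outputs you declared. The sketch below mimics that idea on an in-line signature string. It is not how DSPy implements `dspy.ChainOfThought`: `add_reasoning_field` is a hypothetical helper, and the field name `reasoning` is an assumption of this example.

```python
# Illustrative only: mimics the idea behind chain-of-thought prompting on an
# in-line signature string. add_reasoning_field is a hypothetical helper, and
# the field name "reasoning" is an assumption of this sketch.

def add_reasoning_field(signature: str, field: str = "reasoning") -> str:
    """Insert a free-text reasoning field ahead of the declared outputs."""
    inputs, arrow, outputs = signature.partition("->")
    if not arrow:
        raise ValueError("expected '->' in the signature")
    return f"{inputs.strip()} -> {field}: str, {outputs.strip()}"


print(add_reasoning_field("question: str -> response: str"))
# question: str -> reasoning: str, response: str
```

Placing the reasoning field first matters: the model generates it before the final answer, so the answer can build on it.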