# LangChain and Azure AI Foundry

This notebook explains how to use the `langchain-azure-ai` package with the capabilities in Azure AI Foundry.

## 1. Prerequisites

To run this tutorial, you need one of the following:

1. Using GitHub Models:

    1. You can use the [GitHub Models](https://github.com/marketplace/models) endpoint, including its free tier.
    2. Use the endpoint `https://models.inference.ai.azure.com` along with your GitHub token.

1. Using Azure AI Foundry:

    1. Create an [Azure subscription](https://azure.microsoft.com).
    2. Create an Azure AI hub resource as explained at [How to create and manage an Azure AI Studio hub](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/create-azure-ai-resource).
    3. Deploy a model that supports the [Azure AI model inference API](https://aka.ms/azureai/modelinference). This example uses a `Mistral-Large-2407` and a `Mistral-Small` deployment.

        * You can follow the instructions at [Add and configure models to Azure AI model inference service](https://learn.microsoft.com/azure/ai-studio/ai-services/how-to/create-model-deployments).

Install the following packages:

```bash
pip install -U langchain-core langchain-azure-ai
```
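
The code cells in this notebook read the endpoint and the credential from the environment variables `AZURE_INFERENCE_ENDPOINT` and `AZURE_INFERENCE_CREDENTIAL`. The following is a minimal sketch of a pre-flight check that reports which of them are missing. The helper `missing_settings` is hypothetical and not part of `langchain-azure-ai`.

```python
import os

# Variable names used by the code cells in this notebook.
REQUIRED_VARS = ("AZURE_INFERENCE_ENDPOINT", "AZURE_INFERENCE_CREDENTIAL")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with a hand-built mapping instead of the real environment.
example_env = {"AZURE_INFERENCE_ENDPOINT": "https://models.inference.ai.azure.com"}
print(missing_settings(example_env))  # ['AZURE_INFERENCE_CREDENTIAL']
```

Calling `missing_settings()` without arguments checks the real process environment, which gives a clearer message than the `KeyError` raised by `os.environ[...]`.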

## 2. Use chat completions models

Create a client to connect to the endpoint. Because we are working with a chat completions model, we import the class `AzureAIChatCompletionsModel`.

In [None]:
import os
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

model = AzureAIChatCompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model="mistral-large-2407",
)

> The previous example sets the `model` parameter because the endpoint has multiple models deployed on it. If your endpoint has only one model deployed, as with Serverless API Endpoints, you don't need to set `model`.

Let's first use the model directly. Chat models implement the LangChain `Runnable` interface, which means they expose a standard interface for interacting with them. To call the model, pass a list of messages to the `invoke` method.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="Translate the following from English into Italian"),
    HumanMessage(content="hi!"),
]

model.invoke(messages)
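
To make the message list more concrete, here is a rough sketch of the role-based structure that chat completions APIs generally expect. The helper `to_chat_payload` is hypothetical and uses only the standard library. The request the integration actually sends may carry additional fields, so treat this as an illustration of the shape, not of the exact wire format.

```python
import json

# LangChain message types mapped to chat-completions role names.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


def to_chat_payload(messages):
    """Build a chat-completions style body from (type, text) pairs."""
    return {
        "messages": [
            {"role": ROLE_BY_TYPE[kind], "content": text}
            for kind, text in messages
        ]
    }


payload = to_chat_payload(
    [
        ("system", "Translate the following from English into Italian"),
        ("human", "hi!"),
    ]
)
print(json.dumps(payload, indent=2))
```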

### Using tools

Certain models support the use of tools, either built-in or user-defined. LangChain lets you specify tools in different ways. The following example uses Python functions to define the schemas. The functions must have docstrings.

In [None]:
def add(a: int, b: int) -> int:
    """Add two integers.

    Args:
        a: First integer
        b: Second integer
    """
    return a + b


def multiply(a: int, b: int) -> int:
    """Multiply two integers.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b


tools = [add, multiply]
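
The docstring and type hints matter because the tool schema is derived from them. The following is a simplified sketch of that idea using only the standard library. The helper `describe_tool` is hypothetical, and the schema LangChain actually generates is richer than this dictionary.

```python
import inspect


def add(a: int, b: int) -> int:
    """Add two integers.

    Args:
        a: First integer
        b: Second integer
    """
    return a + b


def describe_tool(func):
    """Collect the name, summary line, and parameter types of a function."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # The first docstring line serves as the tool description.
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


print(describe_tool(add))
```

Without a docstring, the description would be empty, and the model would have little to go on when deciding whether to call the tool.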

To bind those schemas to a chat model, use the `.bind_tools()` method. It converts the `add` and `multiply` schemas to the proper format for the model. The tool schemas are then passed in each time the model is invoked.

In [None]:
llm_with_tools = model.bind_tools(tools)

Let's see how it works:

In [None]:
llm_with_tools.invoke("What is 3 * 12?")
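
Binding tools lets the model request a call; it does not run the function for you. In LangChain, the returned message typically exposes the requested calls in its `tool_calls` attribute as dictionaries with a name, arguments, and an id. The sketch below dispatches such requests to the Python functions. The `tool_calls` list here is hand-written for illustration and is not real model output, and `run_tool_calls` is a hypothetical helper.

```python
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


TOOLS_BY_NAME = {"add": add, "multiply": multiply}

# Hand-written stand-in for the calls a model might request (assumed shape).
tool_calls = [{"name": "multiply", "args": {"a": 3, "b": 12}, "id": "call_1"}]


def run_tool_calls(calls, registry):
    """Execute each requested call and pair the result with its call id."""
    results = []
    for call in calls:
        func = registry[call["name"]]
        results.append({"id": call["id"], "result": func(**call["args"])})
    return results


print(run_tool_calls(tool_calls, TOOLS_BY_NAME))  # [{'id': 'call_1', 'result': 36}]
```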

### Using multiple models in a chain

Models deployed to Azure AI Foundry support the Azure AI model inference API, which is standard across all the models. You can therefore chain multiple LLM operations and pick the right model for each step based on its capabilities.

In the following example, we create two model clients: a producer and a verifier. To make the distinction clear, we use a multi-model endpoint such as the Azure AI model inference service, and pass the `model` parameter to select a Mistral-Large model and a Mistral-Small model, on the premise that producing content is more complex than verifying it.

In [None]:
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

producer = AzureAIChatCompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model="mistral-large-2407",
)

verifier = AzureAIChatCompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model="mistral-small",
)

The following prompt templates ask the producer to write verses as an urban poet and the verifier to check them:

In [None]:
from langchain_core.prompts import PromptTemplate

# Adjacent string literals are joined at compile time, so the source
# indentation does not leak into the prompt text.
producer_template = PromptTemplate(
    template=(
        "You are an urban poet, your job is to come up with "
        "verses based on a given topic.\n"
        "Here is the topic you have been asked to generate a verse on:\n"
        "{topic}"
    ),
    input_variables=["topic"],
)

verifier_template = PromptTemplate(
    template=(
        "You are a verifier of poems, you are tasked with "
        "inspecting the verses of a poem. If they contain violence or abusive language, "
        "report it. Your response should be only one word, either True or False.\n"
        "Here are the lyrics submitted to you:\n"
        "{input}"
    ),
    input_variables=["input"],
)
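
When a prompt is split across source lines, the way the lines are joined affects the text the model receives. The following standard-library sketch compares a backslash continuation inside a string literal with adjacent string literals, and fills a `{topic}` placeholder with `str.format` as an approximation of what a prompt template does. The variable names are illustrative only.

```python
# A backslash inside a string literal joins the lines, but the next
# line's indentation becomes part of the string.
continued = "come up with \
    verses about {topic}"

# Adjacent literals are concatenated without any extra whitespace.
joined = (
    "come up with "
    "verses about {topic}"
)

print(repr(continued.format(topic="rain")))
print(repr(joined.format(topic="rain")))
```

The first string carries a run of spaces in the middle of the sentence, while the second reads `'come up with verses about rain'`.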

Now, let's create an output parser:

In [None]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

We can now combine the template, model, and the output parser from above using the pipe (`|`) operator:

In [None]:
chain = producer_template | producer | parser | verifier_template | verifier | parser
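
To build intuition for the pipe operator, the toy sketch below composes steps by overloading `|` on a small class. It is not LangChain's implementation. The `Step` class and the fake model step are hypothetical and only show how the output of one step can become the input of the next.

```python
class Step:
    """Wrap a function so that steps can be composed with `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # The combined step runs `self` first and feeds its output to `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))


template_step = Step(lambda inputs: f"  Write a verse about {inputs['topic']}  ")
fake_model_step = Step(lambda prompt: prompt.upper())  # placeholder for a model call
strip_step = Step(str.strip)

toy_chain = template_step | fake_model_step | strip_step
print(toy_chain.invoke({"topic": "rain"}))  # WRITE A VERSE ABOUT RAIN
```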

The previous chain returns only the output of the `verifier` step. To also access the intermediate result generated by the `producer`, use a `RunnablePassthrough` object so that the intermediate step is included in the output. The following code shows how to do it:

In [None]:
generate_poem = producer_template | producer | parser
verify_poem = verifier_template | verifier | parser

In [None]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

chain = generate_poem | RunnableParallel(poem=RunnablePassthrough(), abuse=RunnablePassthrough() | verify_poem)
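
Conceptually, the parallel step sends the same input to every branch and collects the results in a dictionary keyed by branch name. The following standard-library sketch mimics that behavior. The functions `fan_out`, `passthrough`, and `fake_verify` are hypothetical stand-ins, not LangChain APIs.

```python
def fan_out(value, /, **branches):
    """Call every branch with the same input and collect results by name."""
    return {name: branch(value) for name, branch in branches.items()}


def passthrough(value):
    """Return the input unchanged, so it appears in the output as-is."""
    return value


def fake_verify(poem):
    # Placeholder for the verifier step; a real chain would call a model here.
    return "True" if "fight" in poem.lower() else "False"


verse = "Streetlights hum a quiet tune"
outputs = fan_out(verse, poem=passthrough, abuse=fake_verify)
print(outputs)  # {'poem': 'Streetlights hum a quiet tune', 'abuse': 'False'}
```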

To invoke the chain, identify the inputs required and provide values using the `invoke` method:

In [None]:
chain.invoke({"topic": "living in a foreign country"})
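
The chain returns a dictionary with the keys `poem` and `abuse`, where `abuse` is the verifier's one-word answer as text. Models do not always follow such format instructions exactly, so here is a minimal sketch of a tolerant conversion to a boolean. The helper `parse_verdict` is hypothetical, and the `result` dictionary is hand-written, not real model output.

```python
def parse_verdict(text):
    """Convert a 'True' or 'False' answer, with stray whitespace or punctuation, to a bool."""
    word = text.strip().strip(".!\"'").lower()
    if word == "true":
        return True
    if word == "false":
        return False
    raise ValueError(f"Unexpected verifier answer: {text!r}")


# Hand-written stand-in for the dictionary returned by the chain.
result = {"poem": "placeholder verse", "abuse": " False.\n"}
flagged = parse_verdict(result["abuse"])
print(flagged)  # False
```

Raising an error on anything other than the two expected words makes a misbehaving verifier visible instead of silently treating its answer as safe.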

## 3. Debugging and troubleshooting

If you need to debug your application and understand which parameters are being sent to the models in Azure AI Foundry, you can use the debug capabilities of the integration as follows:

First, configure logging to the level you are interested in:

In [None]:
import sys
import logging

# Acquire the logger for this client library. Use 'azure' to affect both
# the 'azure.core' and 'azure.ai.inference' libraries.
logger = logging.getLogger("azure")

# Set the desired logging level. logging.INFO or logging.DEBUG are good options.
logger.setLevel(logging.DEBUG)

# Direct logging output to stdout:
handler = logging.StreamHandler(stream=sys.stdout)
# Or direct logging output to a file:
# handler = logging.FileHandler(filename="sample.log")
logger.addHandler(handler)

# Optional: change the default logging format. Here we add a timestamp.
formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
handler.setFormatter(formatter)
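
In a notebook, re-running the logging cell adds another handler each time, so every message is printed more than once. The sketch below, using only the standard library, attaches a handler only if one with the same name is not present yet. It also shows that records from child loggers reach a handler attached to the parent. The logger name `demo` stands in for `azure`, and `attach_named_handler` is a hypothetical helper.

```python
import io
import logging


def attach_named_handler(logger, handler, name):
    """Attach the handler unless one with the same name is already attached."""
    if any(existing.get_name() == name for existing in logger.handlers):
        return False
    handler.set_name(name)
    logger.addHandler(handler)
    return True


buffer = io.StringIO()
demo_logger = logging.getLogger("demo")  # stand-in for the "azure" logger
demo_logger.setLevel(logging.DEBUG)

first = attach_named_handler(demo_logger, logging.StreamHandler(buffer), "notebook-debug")
second = attach_named_handler(demo_logger, logging.StreamHandler(buffer), "notebook-debug")

# A child logger inherits the level and propagates records to the parent's handler.
logging.getLogger("demo.core").debug("hello")
print(first, second, repr(buffer.getvalue()))
```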

To see the request payloads, pass `logging_enable=True` in `client_kwargs` when instantiating the client:

In [None]:
import os
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

model = AzureAIChatCompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model="mistral-large-2407",
    client_kwargs={"logging_enable": True},
)

Use the client as usual in your code.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="Translate the following from English into Italian"),
    HumanMessage(content="hi!"),
]

model.invoke(messages)