# Simple chain 

Several key libraries provide frameworks for building a language model pipeline and are already integrated in Databricks. 
Let's describe a few of them:

1. **LangChain** is a library that lets you integrate pre-trained models and other open-source libraries into your workflow. Its main strengths are its flexibility and its compatibility with large language models. You can build powerful, complex workflows with **LangGraph**. It is currently one of the most widely used GenAI workflow frameworks on the market, if not the most used. That is the one I will use in this notebook.

2. **DSPy** automates prompt tuning by translating user-defined natural language signatures into complete instructions and few-shot examples. It is very useful if you want to optimize your prompts automatically.

3. **Hugging Face Transformers**: This is a state-of-the-art library for Natural Language Processing (NLP). It provides thousands of pre-trained models to perform tasks on texts such as classification, information extraction, summarization, translation, text generation, etc. The reason to choose this library is the wide variety of pre-trained models it offers, its active community, and its seamless integration with PyTorch and TensorFlow. Many of the popular NLP models work best on GPU hardware, so you might get the best performance using recent GPU hardware unless you use a model specifically optimized for use on CPUs.

The choice of these libraries depends on the specific requirements of your project. They all provide different functionalities that can be useful in different stages of a language model pipeline, from data preprocessing to model training and deployment.


## Deploy Your LLM Chatbots with Mosaic AI Agent Framework

In this tutorial, you will learn how to build your own Chatbot Assistant to answer your customers' questions about Databricks. 
Before using this first notebook, make sure you have completed the prerequisites. Because of the limitations of this free version of Databricks, the number of serving models at our disposal is limited, so we'll use gpt-4o-mini for chat:
- Create a secret scope that stores your OPENAI_API_KEY as a secret.
- Create the LLM endpoint.
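
The second prerequisite can be sketched with the MLflow deployments client. This is a minimal sketch under assumptions, not the setup actually used for this notebook: the scope name openai and the key name OPENAI_API_KEY are hypothetical placeholders, the secret is assumed to exist already, and the client argument is expected to be the object returned by get_deploy_client("databricks"). The helper names are illustrative only.

```python
def build_llm_endpoint_config(scope: str, key: str, model: str = "gpt-4o-mini") -> dict:
    """Build the config of an external-model serving endpoint backed by OpenAI."""
    return {
        "served_entities": [
            {
                "name": model,
                "external_model": {
                    "name": model,
                    "provider": "openai",
                    "task": "llm/v1/chat",
                    "openai_config": {
                        # Secret reference, resolved by Databricks when the endpoint runs
                        "openai_api_key": "{{secrets/" + scope + "/" + key + "}}",
                    },
                },
            }
        ]
    }


def create_llm_endpoint(client, endpoint_name: str, scope: str, key: str):
    """Create the endpoint; `client` is assumed to be get_deploy_client("databricks")."""
    return client.create_endpoint(
        name=endpoint_name,
        config=build_llm_endpoint_config(scope, key),
    )


# Hypothetical scope and key names, for illustration only
config = build_llm_endpoint_config(scope="openai", key="OPENAI_API_KEY")
print(config["served_entities"][0]["external_model"]["openai_config"])
```

In a Databricks notebook, the call would look like create_llm_endpoint(get_deploy_client("databricks"), "chat_gpt_4o_mini", "openai", "OPENAI_API_KEY"), where chat_gpt_4o_mini is the endpoint name used in the chain config.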


Load the libraries and the config.

In [0]:
%pip install -U --quiet databricks-langchain==0.6.0 mlflow[databricks]==3.1.0  langchain==0.3.27 langchain_core==0.3.74
dbutils.library.restartPython()

This config defines the settings of the experiments.

One experiment is set in MLflow with the name /Users/oliver@mlops-media.com/langchain_chain_demo

In [0]:
%run ../_config/config_0

## 1- Build the chain 
First, we will use a simple chain made of a prompt, an LLM, and an output parser.

Prepare the chain config:
- First, we define the model endpoint, llm_model_serving_endpoint_name.
- Then we define the template with the system instructions, llm_system_message_template.

In [0]:
chain_config = {
    "llm_model_serving_endpoint_name": "chat_gpt_4o_mini",  # the foundation model we want to use
    "llm_system_message_template": "You are a Databricks PySpark developer who answers the user's questions.",
}


### 1-1 MLflow integration 

The workflow library is LangChain.

It is also the model flavor, which tells MLflow which integrated library the model relies on.

An MLflow flavor is a standardized way to package, save, and load machine learning models in MLflow. 

It defines the interface and conventions for how different types of models can be stored and deployed consistently within the MLflow ecosystem.

Built-in flavors include OpenAI, LangChain, LlamaIndex, Hugging Face Transformers, Sentence Transformers (SBERT), and DSPy
([built-in flavors](https://mlflow.org/docs/2.21.3/model/#models_built-in-model-flavors)).

MLflow autolog is a feature that automatically tracks metrics, parameters, artifacts, and models during machine learning experiments without requiring manual logging code.

mlflow.models.ModelConfig reads the chain configuration so that it can be logged with the model package.





In [0]:
import mlflow

# Enable MLflow Tracing for LangChain
mlflow.langchain.autolog()

# Load the chain's configuration
model_config = mlflow.models.ModelConfig(development_config=chain_config)


### 1-2 Chain creation

Creation of the chain: 
- Prompt / chat message = system instructions + user query.
- ChatDatabricks is the LangChain chat integration for Databricks. It is fully integrated in Databricks, with advanced properties that improve performance, automate authentication, and give access to serving endpoints.
- An output parser finishes the chain and presents the final answer in a format the user can read easily.

Chain: the pipe operator (|) links the components into a pipeline of sequential operations. 
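
The pipe composition described above can be pictured with a toy example. This is only an illustration of the idea, written in plain Python with operator overloading. It is not LangChain's implementation, and the Step class and the toy_ names are hypothetical.

```python
class Step:
    """Toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` returns a new step that feeds the output of a into b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Three toy steps mirroring prompt -> model -> output parser
toy_prompt = Step(lambda inputs: f"system: be helpful\nuser: {inputs['question']}")
toy_model = Step(lambda text: {"content": text.upper()})  # fake "LLM" that uppercases its input
toy_parser = Step(lambda message: message["content"])  # keeps only the text of the answer

toy_chain = toy_prompt | toy_model | toy_parser
print(toy_chain.invoke({"question": "hi"}))
```

Each step only needs to accept the output of the previous one, which is why the prompt, the chat model, and the parser can be swapped independently.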



In [0]:
from langchain_core.prompts import ChatPromptTemplate
from databricks_langchain.chat_models import ChatDatabricks
from langchain_core.output_parsers import StrOutputParser

# 1- Chat message creation
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", model_config.get("llm_system_message_template")),  # Instructions from the configuration
        ("user", "{question}"),  # The user's question
    ]
)

# 2- Integration of the model endpoint as the chat model
model = ChatDatabricks(
    endpoint=model_config.get("llm_model_serving_endpoint_name"),
    extra_params={"temperature": 0.7, "max_tokens": 500}
)

# 3- Gather the components into the chain; the parser post-processes the model output to extract the answer text for the user
chain = prompt | model | StrOutputParser()

# Test of the chain
query = 'How to start a Databricks cluster?'
answer = chain.invoke({'question':query})
print(answer)

In [0]:
model.extra_params

#### Observation : 
In the left menu, go to Experiments and choose the "langchain_chain_demo" one.

This experiment gives us a first look at the MLflow traces feature.
Each row corresponds to one invocation of the model, where you can read:
- the user query or request
- the model response
- the total number of tokens
- the duration of the workflow
- the time of the request
- the status of the workflow (OK, ERROR, ...)

If you click on the Columns button, you can choose the parameters you want to display, such as the user.


Then click on a value in the Request column to get more information about the workflow. 
You get the details and the timeline of the workflow traces.
Each task of the workflow (prompt, chat, and output parser) is described.

If you show the execution timeline, the duration of each task appears, which is very helpful for spotting a bottleneck.
In our case, unsurprisingly, the call to the OpenAI API is the longest task of the workflow.
For each step of the workflow, the inputs and outputs are recorded. You can add any attributes you choose, and for ChatDatabricks some are already included.
Events is the column where you can find information about any issues that occur, which is very useful.
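
Custom attributes like the ones mentioned above can be attached by wrapping the chain call in a span of your own. The sketch below assumes the tracing argument is the mlflow module (passed in so that the helper can be exercised outside Databricks) and relies on its start_span context manager and on the span's set_inputs, set_attributes, and set_outputs methods. The function name, the span name, and the user_id attribute are hypothetical.

```python
def invoke_with_attributes(chain, question, tracing, attributes):
    """Run the chain inside a custom span and attach extra attributes to it.

    `tracing` is assumed to be the `mlflow` module; `chain` is any object
    exposing `invoke`, such as the chain built above.
    """
    with tracing.start_span(name="simple_chain_call") as span:
        span.set_inputs({"question": question})
        span.set_attributes(attributes)  # e.g. {"user_id": "demo"}
        answer = chain.invoke({"question": question})
        span.set_outputs({"answer": answer})
    return answer
```

In the notebook, this would be called as invoke_with_attributes(chain, query, tracing=mlflow, attributes={"user_id": "demo"}). With autologging enabled, the LangChain steps should then appear as children of the custom span.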



## 2- Chain log and deploy

mlflow.langchain.log_model logs a LangChain model as an MLflow artifact for the current run.

It accepts a LangChain model, which can be a chain, an agent, or a retriever, or a path containing the LangChain model.

### 2-1 Model signature

In [0]:
from mlflow.models import infer_signature
# Define the input
input_example = {"question": "How to start a Databricks cluster?"}

# Infer the signature from an input example and the answer. It is needed to create the model package
signature = infer_signature(input_example, answer)

### 2-2 Create a model package
Because we use the MLflow log_model method, the model is stored as a package.

It is ready to be deployed in any target environment (production, Docker, ...).

In [0]:
with mlflow.start_run() as run:
    logged_model = mlflow.langchain.log_model(
        lc_model=chain,
        name="simple_chain",
        input_example=input_example,
        signature=signature,
        params=model.extra_params
    )

#### OBSERVATION : 
Click on the link to go to where the model is logged.
- Overview: general information about the model. There are no metrics because none have been defined yet, but the parameters are stored.
- Artifacts: the package is generic. If the model was logged correctly, you can install it from this package in any environment. For example, requirements.txt lists the libraries the package needs.



### 2-3 Load the model
Loading the logged model lets us test it before registering it in Unity Catalog (UC), where you can apply governance and share the models with other developers.
You can find the model information in the MLflow overview table.

In [0]:
# Test the chain locally
# Get the URI of the logged model and load it
model_uri = logged_model.model_uri
chain = mlflow.pyfunc.load_model(model_uri)


Test the result

In [0]:
chain.predict(input_example)

### 2-4 Register the model on UC

In [0]:
%sql
CREATE CATALOG IF NOT EXISTS DEMO; 
USE CATALOG DEMO;
CREATE SCHEMA IF NOT EXISTS DEMO; 
USE SCHEMA DEMO;

In [0]:
catalog_name="demo"
schema_name="demo"

In [0]:
# Unity Catalog expects a three-level name: catalog.schema.model
model_name = f"{catalog_name}.{schema_name}.simple_chain"
uc_registered_model = mlflow.register_model(model_uri=model_uri, name=model_name)


We can check the model in Unity Catalog.
- Overview: first information about the registered model. You can create any tags, descriptions, or aliases needed.
- Lineage: information about the notebook used to create the workflow.
- Artifacts: the package of the model.
- Traces: traces of the model calls.

You also get access through the UI: 
- Overview: first information about the model. You can create any tags, descriptions, or aliases needed.
- Description: information about the model, date of creation, model_id.
- Permissions: you can grant or revoke privileges for users.

You can also serve the model as an endpoint. 
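
The aliases mentioned above can also be set from code. The sketch below assumes client is an mlflow.MlflowClient connected to the Unity Catalog registry and uses its set_registered_model_alias method. The helper names and the alias champion are hypothetical choices for illustration.

```python
def uc_model_name(catalog: str, schema: str, model: str) -> str:
    """Return the three-level Unity Catalog name `catalog.schema.model`."""
    parts = [catalog, schema, model]
    if any(not part or "." in part for part in parts):
        raise ValueError("each part must be non-empty and contain no dots")
    return ".".join(parts)


def tag_version_with_alias(client, full_name: str, version, alias: str = "champion") -> str:
    """Point `alias` at `version`; `client` is assumed to be an mlflow.MlflowClient."""
    client.set_registered_model_alias(name=full_name, alias=alias, version=str(version))
    # The aliased version can then be loaded with a `models:/<name>@<alias>` URI
    return f"models:/{full_name}@{alias}"


print(uc_model_name("demo", "demo", "simple_chain"))
```

An alias lets downstream code load the model without hard-coding a version number, which helps when a new version replaces the current one.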

In [0]:
# Get a deployment client to manage the serving endpoints
from mlflow.deployments import get_deploy_client

# List the existing endpoints
mlflow_client = get_deploy_client("databricks")
endpoints = mlflow_client.list_endpoints()
print("list_endpoints : ", len(endpoints))

In [0]:
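# Delete an endpoint that is no longer needed (the free version limits the number of serving endpoints)
mlflow_client.delete_endpoint(endpoint="text_embedding_3_large")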

### 2-5 Create the endpoint for the chain

In [0]:

endpoint_name = "simple_chain"
model_name = f"{catalog_name}.{schema_name}.{endpoint_name}"
model_version = "1"  # Version of the model

# Create the endpoint
endpoint = mlflow_client.create_endpoint(
    name=endpoint_name,
    config={
        "served_entities": [{
                "name": f"{endpoint_name}-{model_version}",
                "entity_name": model_name,
                "entity_version": model_version,
                "type": "UC_MODEL",
                "workload_size": "Small",
                "scale_to_zero_enabled": True
        }]
    }
)




### Check the endpoint created

In the returned config, the state is not ready yet.

The config_update is in progress.

Let's check the Serving menu to verify that the new endpoint has been created correctly and that its deployment is in progress.
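
Instead of refreshing the Serving menu by hand, the state can be polled from code. This is a minimal sketch, assuming that get_endpoint behaves like mlflow_client.get_endpoint and that the response exposes state.ready and state.config_update with the values READY, IN_PROGRESS, and UPDATE_FAILED. The helper name, the timeout, and the polling interval are hypothetical.

```python
import time


def wait_until_ready(get_endpoint, endpoint_name, timeout_s=1200, poll_s=30, sleep=time.sleep):
    """Poll an endpoint until its state is READY, or raise on failure or timeout.

    `get_endpoint` is assumed to behave like `mlflow_client.get_endpoint`.
    """
    waited = 0
    while True:
        state = get_endpoint(endpoint=endpoint_name).get("state", {})
        if state.get("config_update") == "UPDATE_FAILED":
            raise RuntimeError(f"Deployment of {endpoint_name} failed: {state}")
        if state.get("ready") == "READY" and state.get("config_update") != "IN_PROGRESS":
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"{endpoint_name} not ready after {timeout_s}s: {state}")
        sleep(poll_s)
        waited += poll_s


# Usage in the notebook (not executed here):
# wait_until_ready(mlflow_client.get_endpoint, endpoint_name)
```

The sleep function is a parameter only so that the loop can be tried quickly with a stand-in client. In the notebook, the default time.sleep is enough.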

In [0]:
endpoints = mlflow_client.list_endpoints()
endpoints