
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>



# Building a Multi-stage Reasoning Chain in Databricks

In this demo, we will build a multi-stage reasoning system using Databricks features and LangChain. Before building the chain, we will review the components that are commonly used in multi-stage chaining systems.

In the main section of the demo, we will build a multi-stage system from three chains. One chain searches the DAIS-2023 talks for titles relevant to the user's question and turns them into a search query. Another chain uses that query to find the corresponding videos on YouTube. A third chain answers the user's question using the `llama-3` model. The final, complete chain links them together to answer the question and recommend videos to the user.

**Learning Objectives:**

*By the end of this demo, you will be able to:*

* Identify that LangChain can include stages/tasks that are not LLMs.

* Create basic LLM chains to connect prompts and LLMs.

* Use tools to complete various tasks in the complete system.

* Construct sequential chains of multiple LLMChains to perform multi-stage reasoning analysis.

## Requirements

Please review the following requirements before starting the lesson:

* To run this notebook, you need to use one of the following Databricks runtime(s): **15.4.x-cpu-ml-scala2.12**



## Classroom Setup

Before starting the demo, run the provided classroom setup script. This script will define configuration variables necessary for the demo. Execute the following cell:

In [0]:
%pip install -U -qq databricks-sdk databricks-vectorsearch langchain-databricks langchain==0.3.7 langchain-community==0.3.7 youtube_search Wikipedia grandalf

dbutils.library.restartPython()

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyter-server 1.23.4 requires anyio<4,>=3.1.0, but you have anyio 4.9.0 which is incompatible.
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
%run ../Includes/Classroom-Setup-02



The examples and models presented in this course are intended solely for demonstration and educational purposes.
 Please note that the models and prompt examples may sometimes contain offensive, inaccurate, biased, or harmful content.



Dataset is created successfully.


**Other Conventions:**

Throughout this demo, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

Username:          labuser10152510_1746020930@vocareum.com
Catalog Name:      dbacademy
Schema Name:       labuser10152510_1746020930
Working Directory: /Volumes/dbacademy/ops/labuser10152510_1746020930@vocareum_com
Dataset Location:  NestedNamespace (dais='/Volumes/dbacademy_dais/v01')


## Using LLMs and Prompts without an External Library

Many libraries exist for building chains, and we will use one in this demo, but a simple prompt does not require a third-party library. We can use `databricks-sdk` to query a **Foundation Model API endpoint** directly.


In [0]:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage

w = WorkspaceClient()

genre = "romance"
actor = "Brad Pitt"

prompt = f"Tell me about a {genre} movie which {actor} is one of the actors."

messages = [
    { 
        "role": "user", 
        "content": prompt 
    }
]
messages = [ChatMessage.from_dict(message) for message in messages]
llm_response = w.serving_endpoints.query(
    name="databricks-meta-llama-3-3-70b-instruct",
    messages=messages,
    temperature=0.2,
    max_tokens=128
)

print(llm_response.as_dict()["choices"][0]["message"]["content"])

A great choice! One of the most iconic romance movies featuring Brad Pitt is "Meet Joe Black" (1998). In this film, Brad Pitt plays the role of Joe Black, a personification of Death who takes the form of a young man to learn about human life.

The movie follows the story of Bill Parrish (played by Anthony Hopkins), a wealthy and successful businessman who is about to turn 65. Joe Black (Brad Pitt) appears to Bill and informs him that he has come to collect his soul. However, Joe is fascinated by human life and asks Bill to teach him about it.

As Joe learns about human emotions
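
The request and response above are plain chat-style dictionaries. As a minimal sketch, assuming the response keeps the `choices[0].message.content` shape used in the `print` call above, the helpers below build the message list from a template and pull the text out of a response dictionary. The names `build_messages` and `extract_content` and the sample `fake_response` are hypothetical and only illustrate the shape; nothing here calls an endpoint.

```python
from typing import Any


def build_messages(template: str, **variables: str) -> list[dict[str, str]]:
    """Fill a prompt template and wrap it as a single user chat message."""
    return [{"role": "user", "content": template.format(**variables)}]


def extract_content(response: dict[str, Any]) -> str:
    """Return the text of the first choice, or an empty string if none exists."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("message", {}).get("content", "")


messages = build_messages(
    "Tell me about a {genre} movie which {actor} is one of the actors.",
    genre="romance",
    actor="Brad Pitt",
)

# Hypothetical response dictionary shaped like llm_response.as_dict()
fake_response = {
    "choices": [{"message": {"role": "assistant", "content": "Meet Joe Black (1998)"}}]
}

print(messages[0]["content"])
print(extract_content(fake_response))
```

Returning an empty string for a response without choices is one possible design choice; raising an error may be preferable when a missing answer should stop the pipeline.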



## LangChain Basics

As demonstrated in the previous section, **it is not necessary to use a third-party chaining library** to construct a multi-chain AI system. However, composition libraries like **LangChain** can simplify some of the steps by providing a generic interface for supported large language models (LLMs).

Before we begin building a multi-stage chain, let's review the main LangChain components that we will use in this module.


### Prompt

Prompts are one of the basic building blocks when interacting with GenAI models. They may include instructions, examples, and specific context information related to the given task. Let's create a very basic prompt.


In [0]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me about a {genre} movie which {actor} is one of the actors.")
prompt_template.format(genre="romance", actor="Brad Pitt")

'Tell me about a romance movie which Brad Pitt is one of the actors.'
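
Conceptually, a prompt template is a format string plus the list of variables it expects. The sketch below mimics that idea with the standard library only. `MiniPromptTemplate` is a hypothetical stand-in written for illustration and is not LangChain's implementation.

```python
from string import Formatter


class MiniPromptTemplate:
    """Hypothetical stand-in: a format string that knows its input variables."""

    def __init__(self, template: str):
        self.template = template
        # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **kwargs: str) -> str:
        # Fail early with a clear message when a variable is missing
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"Missing prompt variables: {missing}")
        return self.template.format(**kwargs)


mini_template = MiniPromptTemplate(
    "Tell me about a {genre} movie which {actor} is one of the actors."
)
print(mini_template.input_variables)
print(mini_template.format(genre="romance", actor="Brad Pitt"))
```

Tracking the expected variables up front is what lets a template report a missing input before any model call is made.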

### LLMs

LLMs are the core component when building compound AI systems. They are the **brain** of the system, responsible for reasoning and generating the response. The [list of supported LLMs](https://python.langchain.com/v0.2/docs/integrations/chat/) can be found in the LangChain documentation.

Let's see how to interact with the **Llama-3** model.


In [0]:
from langchain_databricks import ChatDatabricks

# You can play with max_tokens to define the length of the response
llm_llama = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct", max_tokens = 500)

for chunk in llm_llama.stream("Who is Brad Pitt?"):
    # Print chunks back to back so the streamed text reads as continuous prose
    print(chunk.content, end="", flush=True)

Brad Pitt is a renowned American actor and film producer. He was born on December 18, 1963, in Shawnee, Oklahoma, and grew up in Springfield, Missouri. Pitt has become one of the most successful and influential actors in Hollywood, known for his versatility, range, and dedication to his craft.

Throughout his career, Pitt has appeared in a wide range of films, including dramas, comedies, and action movies. Some of his most notable roles include:

1. Thelma & Louise (1991) - Pitt's breakout role as the charming and handsome J.D.
2. A River Runs Through It (1992) - Pitt played the lead role of Paul Maclean, a young fly fisherman.
3. Interview with the Vampire (1994) - Pitt portrayed the brooding and charismatic vampire Louis de Pointe du Lac.
4. Fight Club (1999) - Pitt starred as Tyler Durden, a mysterious and subversive figure.
5. Ocean's Ele

### Retriever

Retrievers accept a string as input and return a list of documents. They are mainly used to retrieve external data and pass it to the model for generating a response. Retrievers typically do not use LLMs; instead, they rely on similarity search under the hood to find the most similar documents. There are various types of retrievers, such as *document retrievers* and *vector store retrievers*. The [list of supported retrievers](https://python.langchain.com/v0.2/docs/integrations/retrievers/) can be found in the LangChain documentation.

In the next section of the demo, we will use **Databricks Vector Search** as a retriever to fetch documents for an input query.

For now, let's try a simple **Wikipedia retriever**.

In [0]:
from langchain_community.retrievers import WikipediaRetriever
retriever = WikipediaRetriever()
docs = retriever.invoke(input="Brad Pitt")
print(docs[0])

page_content='William Bradley Pitt (born December 18, 1963) is an American actor and film producer. He is the recipient of various accolades, including two Academy Awards, two British Academy Film Awards, two Golden Globe Awards, and a Primetime Emmy Award. One of the most influential celebrities, Pitt appeared on Forbes' annual Celebrity 100 list  from 2006 to 2008, and the Time 100 list in 2007. His films as a leading actor have grossed over $6.9 billion worldwide.
Pitt first gained recognition as a cowboy hitchhiker in the Ridley Scott road film Thelma & Louise (1991). Pitt emerged as a star taking on leading man roles in films such as the drama A River Runs Through It (1992), the western Legends of the Fall (1994), the horror film Interview with the Vampire (1994), the crime thriller Seven (1995), and the cult film Fight Club (1999). Pitt found greater commercial success starring in Steven Soderbergh's heist film Ocean's Eleven (2001), and reprised his role in its sequels. He cemen
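
To make the "similarity search under the hood" idea concrete, here is a minimal sketch of a retriever that ranks documents by cosine similarity. It assumes a toy word-count embedding, whereas real retrievers such as Databricks Vector Search use learned embedding models. The functions `embed`, `cosine_similarity`, and `retrieve` are hypothetical names for this illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real retrievers use learned vectors)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


# Talk titles taken from the DAIS dataset used later in this demo
titles = [
    "Monitoring Delta Live Tables",
    "Data Quality: Fast and Slow",
    "The Future of Data Orchestration: Asset-Based Orchestration",
]
print(retrieve("monitoring delta live tables pipelines", titles, k=1))
```

The parameter `k` plays the same role as the `k` passed in `search_kwargs` when a vector store is turned into a retriever: it caps how many documents are handed to the next stage.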

### Tools

Tools are functions that can be invoked in the chain. A tool has *input parameters* and a *function* to run. The [list of supported tools](https://python.langchain.com/v0.2/docs/integrations/tools/) can be found in the LangChain documentation.

Here, we have a YouTube search tool. The tool's `description` explains what the tool is for, and `args` defines which input arguments can be passed to it.

In [0]:
from langchain_community.tools import YouTubeSearchTool
tool = YouTubeSearchTool()
tool.run("Brad Pitt movie trailer")

"['https://www.youtube.com/watch?v=5muQK7CuFtY&pp=ygUXQnJhZCBQaXR0IG1vdmllIHRyYWlsZXI%3D', 'https://www.youtube.com/watch?v=Jlp94-C31cY&pp=ygUXQnJhZCBQaXR0IG1vdmllIHRyYWlsZXLSBwkJhAkBhyohjO8%3D']"

In [0]:
print(tool.description)
print(tool.args)

search for youtube videos associated with a person. the input to this tool should be a comma separated list, the first part contains a person name and the second a number that is the maximum number of video results to return aka num_results. the second part is optional
{'query': {'title': 'Query', 'type': 'string'}}
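
The pairing of a description with an argument schema can be sketched in plain Python. The `MiniTool` class below is a hypothetical stand-in, not LangChain's tool class. It assumes the wrapped function has type annotations, which it reads to report the argument names and types.

```python
import inspect
from dataclasses import dataclass
from typing import Callable


@dataclass
class MiniTool:
    """Hypothetical stand-in for a tool: a named, described function."""

    name: str
    description: str
    func: Callable[..., str]

    @property
    def args(self) -> dict[str, str]:
        """Map each parameter name to the name of its annotated type."""
        signature = inspect.signature(self.func)
        return {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        }

    def run(self, *args, **kwargs) -> str:
        return self.func(*args, **kwargs)


def shout(text: str) -> str:
    """Toy function standing in for a real action such as a web search."""
    return text.upper()


shout_tool = MiniTool(name="shout", description="Upper-case the input text.", func=shout)
print(shout_tool.description)
print(shout_tool.args)
print(shout_tool.run("brad pitt"))
```

The description and argument schema matter most when a model, rather than the developer, decides which tool to call and with what input.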


### Chaining

One of the important features of these components is the ability to **chain** them together. Let's connect the LLM with the prompt.

In [0]:
from langchain_core.output_parsers import StrOutputParser

chain = prompt_template | llm_llama | StrOutputParser()
print(chain.invoke({"genre":"romance", "actor":"Brad Pitt"}))

A great choice! One of the most iconic romance movies featuring Brad Pitt is "Meet Joe Black" (1998). In this film, Brad Pitt plays the role of Joe Black, a personification of Death who takes the form of a young man to learn about human life.

The movie follows the story of Bill Parrish (played by Anthony Hopkins), a wealthy and successful businessman who is about to turn 65. Joe Black (Brad Pitt) appears to Bill and informs him that he has come to collect his soul. However, Bill is not ready to die and makes a deal with Joe: in exchange for a few more days of life, Bill will teach Joe about the human experience.

As Joe navigates the world of humans, he falls in love with Bill's daughter, Susan (played by Claire Forlani). The romance between Joe and Susan is a beautiful and poignant aspect of the film, as Joe experiences the joys and sorrows of human emotions for the first time.

The movie explores themes of love, mortality, and the meaning of life, and features a strong performance f
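
The `|` operator in the chain above composes stages so that each output becomes the next input. The sketch below shows how such an operator can be built in plain Python with `__or__`. The `Step` class and the fake LLM are hypothetical stand-ins for illustration, not LangChain's runnable implementation.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy stages standing in for a prompt template, an LLM, and an output parser
prompt_step = Step(lambda d: f"Tell me about a {d['genre']} movie with {d['actor']}.")
fake_llm_step = Step(lambda prompt: {"content": f"(echo) {prompt}"})
parser_step = Step(lambda message: message["content"])

mini_chain = prompt_step | fake_llm_step | parser_step
print(mini_chain.invoke({"genre": "romance", "actor": "Brad Pitt"}))
```

Because every stage exposes the same `invoke` method, a composed chain is itself a stage and can be piped into further chains.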

## Build a Multi-stage Chain

### Enable MLflow Auto-Log

MLflow supports auto-logging for LangChain models. Auto-logging is enabled by default. If it is not enabled in your environment, you can turn it on as shown below.

In [0]:
import mlflow
mlflow.langchain.autolog()

### Create a Vector Store

This course does not cover vector stores in depth, as they fall outside its scope. For a comprehensive understanding of vector stores, including Databricks Vector Search, please refer to the **Generative AI Solution Development** course.

**🚨IMPORTANT: Vector Search endpoints must be created before running the rest of the demo. These are already created for you in the Databricks lab environment.**

In [0]:
# Assign the Vector Search endpoint by username
vs_endpoint_prefix = "vs_endpoint_"

vs_endpoint_name = vs_endpoint_prefix+str(get_fixed_integer(DA.unique_name("_")))
print(f"Assigned Vector Search endpoint name: {vs_endpoint_name}.")

# Source table and VS index table names
vs_index_table_fullname = f"{DA.catalog_name}.{DA.schema_name}.dais_embeddings"
source_table_fullname = f"{DA.catalog_name}.{DA.schema_name}.dais_text"

Assigned Vector Search endpoint name: vs_endpoint_2.


The dataset that we will be using in this module is already created in the classroom setup script. Let's have a quick look at the dataset.

Then, we will create a vector store index and store embeddings there.

In [0]:
display(spark.sql(f"SELECT * FROM {source_table_fullname}"))

Title,Abstract,id
Nebula: The Journey of Scaling Instacart’s Data Pipelines with Apache Spark™ and Lakehouse,"Instacart has gone through immense growth during the pandemic and the trend continues. Instacart ads is no exception in this growth story. We have launched many new product lines including display and video ads covering the full advertising funnel to address the increasing demand of our retail partners. We have built advanced models to auto-suggest optimal bidding to increase the ROI for our CPG partners. Advertisers’ trust is the utmost priority and thus the quest to build a top-class ads measurement platform.  Ads data processing requires complex data verifications to update ads serving stats. In ETL pipelines these were implemented through files containing thousands of lines of raw SQL which were hard to scale, test, and iterate upon. Our data engineers used to spend hours testing small changes due to a lack of local testing mechanisms. These pain points stress our need for better tools. After some research, we chose Apache Spark™ as our preferred tool to rebuild ETLs, and the Databricks platform made this move easier. In this presentation, I will share our journey to move our pipelines to Spark and Delta Lake on Databricks. With spark, scala, and delta we solved many problems which were slowing the team’s productivity. Some key areas I will cover include:  - Modular and composable code  - Unit testing framework  - Incremental event processing with spark structured streaming - Granular resource tuning for better performance and cost efficacy  Other than the domain business logic, the problems discussed here are quite common for performing data processing at scale. I would be glad to have the opportunity to share our learning and am hopeful that it will benefit others who are going through similar growth challenges or migrating to Lakehouse.",0
Satellite Imaginary Data Processing Using Apache Spark™ and H3 Geospatial Indexing System,"Agriculture is a complex ecosystem. Understanding Ag data around soil metrics, weather and historical crop production is a key to adopt sustainable practices which help farmers to enhance their profitability and soil health. As these datasets are huge and disparate in nature, finding a standard unit of analysis and bringing all the data to a common granularity is challenging. By leveraging the distributed data processing capabilities of Apache Spark™ and h3 geospatial indexing system, created a hexagonal grid and mapped all the data sets using h3 index. This gave us an ability to join all the datasets together and helped us in deriving more insights. This session will share our learnings and experiences with the Spark community.",1
From Snowflake to Enterprise-Scale Apache Spark™,"Akamai mPulse is a real user monitoring (RUM) solution that delivers real-time web performance analytics to Akamai customers through dashboards, alerting, reporting and data science. The architecture of mPulse relies on a combination of public and private cloud-based services, such as Amazon AWS, Microsoft Azure and the Snowflake data warehouse. Snowflake has provided the core data warehousing needs as the product has grown at scale along with Akamai’s customers.  The engineering team at mPulse has been re-architecting the system to migrate away from Snowflake to an internal enterprise-scale Apache Spark™ solution that Akamai has been developing in-house to improve performance and save on cost. In the first half of the talk, we’ll discuss how the mPulse team made the decision to migrate, the challenges we’ve seen and how Spark is suiting the product's needs.  In the second half of the talk, we’ll discuss the details of the Spark-based infrastructure. Akamai data warehouse (a.k.a Asgard) is a Spark-based solution running on the Azure cloud. We will describe the internal and unique technologies and characteristics of the solution that enable it to outperform Snowflake's offering both from a cost and performance perspective. We will share our experience on how to:  * Run Spark on K8s at scale while supporting multi-tenancy and resource isolation  * Handle X100 queries per second on a single Spark application with sub-second query latency  * Protect Spark application from misbehaving users * Optimize SQL-based queries",2
The Future of Data Orchestration: Asset-Based Orchestration,"Data orchestration is a core component for any batch data processing platform and we’ve been using patterns that haven't changed since the 1980s. In this session, I’ll be introducing a new pattern and way of thinking for data orchestration known as asset-based orchestration, with data freshness sensors to trigger pipelines. I will demo this new pattern using popular tools of the modern data stack - dbt, airbyte, dagster, and databricks.",3
Photon for Dummies: How Does this New Execution Engine Actually Work?,"Did you finish the Photon whitepaper and think, wait, what? I know I did. Unfortunately, it’s my job to understand it, explain it, and then use it.  If your role involves using Apache Spark™ on Databricks, then you need to know about Photon and where to use it. Join me, chief dummy, nay *supreme* dummy, as I break down this whitepaper into easy to understand explanations that don’t require a computer science degree. Together we will unravel mysteries such as:  - Why is a Java Virtual Machine the current bottleneck for spark enhancements?  - What does vectorized even mean? And how was it done before?  - Why is the relationship status between Spark and Photon 'It’s complicated'?  In this seession, we’ll start with the basics of Apache Spark, the details we pretend to know and where those performance cracks were starting to show through. Only then will we start to look at Photon, how it’s different, where the clever design choices are and how you can make the most of this in your own workloads. I’ve spent over 50 hours going over the paper in excruciating detail, every reference, and in some instances, the references of the references so that you don’t have to.",4
Monitoring Delta Live Tables,"In this session we will share how Volvo Group monitors Databricks jobs and DLTs to stay ahead of the curve. Volvo Group uses Databricks for - amongst others - material planning. Based on live input from warehouse systems, the system predicts which orders to place with the suppliers to ensure the availability of spare parts and keep the trucks running. Since this is a critical component, with reverse ETL feeding the recommendation back into the warehouse systems, strict monitoring to avoid pipeline congestion is required. In this talk I will:  * Explain a vision of data quality (deliver trustworthy data in time); which metrics to set up to monitor data quality  * How it helps Volvo Group; make the SLA, jointly use with FinOps to improve ingestion pipelines, avoid congestion  * Setup; use Jobs API and Databricks SQL workspace to build and consult the monitoring dashboard  * Demo  * Get notebook from GitHub",5
Data Quality: Fast and Slow,"Data quality: the topic du jour. Gartner estimates the average business will lose $10M annually due to data quality problems, and every week we hear another cautionary tale of an AI model gone awry. As a result, startups and thought leaders are rushing to solve the visibility problem around data quality. Technology that unifies batch and streaming has massive but overlooked implications for our ability to trust existing data. In this session, I will demonstrate how architectures that can move between batch and incremental processing without changing the storage and API allow us to solve common data trust problems, such as stale data, as well as production ML risks, such as concept drift.  This session is for data architects and practitioners. You don't need to be an ML or ETL expert to attend, but an interest in data architecture is critical.",6
Taking Your Cloud Vendor to the Next Level: Solving Complex Challenges with Azure Databricks,"Akamai's content delivery network (CDN) processes about 30% of the internet's daily traffic, resulting in a massive amount of data that presents engineering challenges, both internally and with cloud vendors. In this session, we will discuss the barriers faced while building a data infrastructure on Azure, Databricks, and Kafka to meet strict SLAs, hitting the limits of some of our cloud vendors’ services. We will describe the iterative process of re-architecting a massive-scale data platform using the aforementioned technologies. We will also delve into how today, Akamai is able to quickly ingest and make available to customers TBs of data, as well as efficiently query PBs of data and return results within 10 seconds for most queries. This discussion will provide valuable insights for attendees and organizations seeking to effectively process and analyze large amounts of data.",7
Unleashing the Power of Interactive Analytics at Scale with Databricks and Delta Lake: Lessons Learned from Building Akamai's Web Security Analytics Product,"Akamai is a leading content delivery network (CDN) and cybersecurity company, operating hundreds of thousands of servers in more than 135 countries worldwide.  In this talk, we will share our experiences and lessons learned from building and maintaining the Web Security Analytics (WSA) product, an interactive analytics platform powered by Databricks and Delta Lake, that enables customers to efficiently analyze and take informed action on a high volume of streaming security events. The WSA platform must be able to serve hundreds of queries per minute, scanning hundreds of terabytes of data from a six petabyte data lake, with most queries returning results within ten seconds (for both aggregation queries and needle in a haystack queries).  The talk will cover how to use Databricks SQL warehouses and job clusters cost-effectively, and how to improve query performance through the use of tools and techniques such as Delta Lake, Databricks Photon, and partitioning. This talk will be valuable for anyone looking to build and operate a high-performance analytics platform.",8
ABN Story: Migrating to Future Proof Data Platform,"ABN AMRO Bank is one of the top leading banks in the Netherlands; it is the 3rd largest bank in the Netherlands by revenue and number when it comes to mortgages within the Netherlands. We have an objective to become a fully data-driven bank and this goal is well supported by our top management.  ABN AMRO started its data journey almost seven years ago and has built a data platform off-premises with Hadoop technologies. This data platform has been used by more than 200 data providers, 150 data consumers, and overall more than 3000 Datasets.  To become a fully digital bank and address the limitation of the on-premises platform, we needed a future-proof data platform DIAL (digital integration and access layer.) ABN AMRO decided to build an Azure cloud-native data platform with the help of Microsoft and Databricks. Last year this cloud-native platform was ready for our data providers and data consumers. Six months ago we started the journey of migrating all the content from the on-premises data platform to the Azure data platform, this was a very large-scale migration and was achieved in 6 months.  In this session, we will focus on two things :  1. Share our strategy for migration from on-premises to a cloud-native platform. 2. Share how we used Databricks products in our data platform and how the Databricks team helped us in the overall migration.",9


In [0]:
# Store embeddings in vector store
create_vs_index(vs_endpoint_name, vs_index_table_fullname, source_table_fullname, "Title")

[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True to VectorSearchClient().




Endpoint named vs_endpoint_2 is ready.
Creating index dbacademy.labuser10152510_1746020930.dais_embeddings on endpoint vs_endpoint_2...
Waiting for index to be ready, this can take a few min... {'detailed_state': 'PROVISIONING_INDEX', 'message': 'Delta sync Index creation is pending. Check latest status: https://dbc-7aad3b7d-2c13.cloud.databricks.com/explore/data/dbacademy/labuser10152510_1746020930/dais_embeddings', 'indexed_row_count': 0, 'ready': False, 'index_url': 'dbc-7aad3b7d-2c13.cloud.databricks.com/api/2.0/vector-search/indexes/dbacademy.labuser10152510_1746020930.dais_embeddings'} - pipeline url:dbc-7aad3b7d-2c13.cloud.databricks.com/api/2.0/vector-search/indexes/dbacademy.labuser10152510_1746020930.dais_embeddings


### Define Common Objects

In [0]:
from langchain_databricks import ChatDatabricks
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from IPython.display import display, HTML

llm_llama = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct", max_tokens = 1000)

### Build First Chain

The **first chain** produces a YouTube search query for videos relevant to the user's question. To do so, it first searches the DAIS-2023 talks that are already stored in a Vector Search index and then asks the LLM to turn the retrieved titles into a search query. In later stages, the YouTube search tool uses this query to find the videos, and the videos are passed to the final chain to generate a response for the user.

This chain consists of a `prompt template`, a `retriever`, an `LLM`, and an `output parser`.

In [0]:
from databricks.vector_search.client import VectorSearchClient
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_databricks import DatabricksVectorSearch

# Prompt to format the video titles for YouTube search
prompt_template_1 = PromptTemplate.from_template(
    """
    Construct a search query for YouTube based on the titles below. Make sure to add DAIS 2023 as part of the query. Remove all quotes from the query and return only the query.

    <video_titles>
    {context}
    </video_titles>

    Answer:
    """
)

# Create a vector store client and retrieve documents
def get_retriever(persist_dir=None):
    vsc = VectorSearchClient(disable_notice=True)
    vs_index = vsc.get_index(vs_endpoint_name, vs_index_table_fullname)
    vectorstore = DatabricksVectorSearch(vs_index_table_fullname)
    return vectorstore.as_retriever(search_kwargs={"k": 2})


# First chain
chain_video = (
    {"context": get_retriever(), "input": RunnablePassthrough()}
    | prompt_template_1
    | llm_llama
    | StrOutputParser()
)

# Test the chain
chain_video.invoke("How machine learning models are stored in Unity Catalog?")



[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True to VectorSearchClient().


'Unity Catalog Delta Sharing and Data Mesh on Databricks Lakehouse DAIS 2023'

Trace(request_id=tr-fafb83e391904fdaa03954bbc229ca72)

### Build Second Chain

This chain will use the video title-based query from the previous chain to search YouTube. The response from this chain will consist of YouTube video links.

In [0]:
from langchain_community.tools import YouTubeSearchTool
from langchain_core.runnables import RunnableLambda

# Search YouTube with the query generated by the first chain
def get_videos(query):
    tool_yt = YouTubeSearchTool()
    video_urls = tool_yt.run(query)
    return video_urls

chain_youtube = RunnableLambda(get_videos) | StrOutputParser()

# Get the video URLs
response = chain_youtube.invoke("DAIS 2023 Streamlining API Deployment for ML Models Across Multiple Brands Ahold Delhaize's Experience on Serverless OR Building a Real-Time Model Monitoring Pipeline on Databricks")

print(response)

['https://www.youtube.com/watch?v=GSJFyoBiCXk&pp=ygW0AURBSVMgMjAyMyBTdHJlYW1saW5pbmcgQVBJIERlcGxveW1lbnQgZm9yIE1MIE1vZGVscyBBY3Jvc3MgTXVsdGlwbGUgQnJhbmRzIEFob2xkIERlbGhhaXplJ3MgRXhwZXJpZW5jZSBvbiBTZXJ2ZXJsZXNzIE9SIEJ1aWxkaW5nIGEgUmVhbC1UaW1lIE1vZGVsIE1vbml0b3JpbmcgUGlwZWxpbmUgb24gRGF0YWJyaWNrcw%3D%3D', 'https://www.youtube.com/watch?v=3dOePPkwEJc&pp=ygW0AURBSVMgMjAyMyBTdHJlYW1saW5pbmcgQVBJIERlcGxveW1lbnQgZm9yIE1MIE1vZGVscyBBY3Jvc3MgTXVsdGlwbGUgQnJhbmRzIEFob2xkIERlbGhhaXplJ3MgRXhwZXJpZW5jZSBvbiBTZXJ2ZXJsZXNzIE9SIEJ1aWxkaW5nIGEgUmVhbC1UaW1lIE1vZGVsIE1vbml0b3JpbmcgUGlwZWxpbmUgb24gRGF0YWJyaWNrcw%3D%3D']


Trace(request_id=tr-5b4792bcafc14ffe94634928cc7434ab)
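
The tool returns its results as a list rendered as a single string, and each URL carries a `&pp=` suffix. The prompt in the next chain asks the model to strip that suffix. As an alternative sketch, the cleanup can be done deterministically in Python before the links reach the model. `clean_video_urls` is a hypothetical helper, and it assumes the tool output is always a valid Python list literal.

```python
import ast


def clean_video_urls(raw: str) -> list[str]:
    """Parse the tool's string output and drop everything from '&pp=' onward."""
    urls = ast.literal_eval(raw)  # the tool returns a list rendered as a string
    return [url.split("&pp=")[0] for url in urls]


# Sample value copied from the YouTube tool output earlier in this demo
raw_output = (
    "['https://www.youtube.com/watch?v=5muQK7CuFtY&pp=ygUXQnJhZCBQaXR0IG1vdmllIHRyYWlsZXI%3D', "
    "'https://www.youtube.com/watch?v=Jlp94-C31cY&pp=ygUXQnJhZCBQaXR0IG1vdmllIHRyYWlsZXLSBwkJhAkBhyohjO8%3D']"
)
print(clean_video_urls(raw_output))
```

A helper like this could be wrapped in a `RunnableLambda` and piped after `chain_youtube`, which would leave the model less formatting work and less room for altering the links.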

### Build Third Chain

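The **third chain** is a simple question-answer prompt using **Meta's Llama-3** model. The chain also uses the video links to make recommendations. It consists of a `prompt template`, an `LLM`, and an `output parser`. In the test below, `videos` is passed as an empty string, so any links in the response are generated by the model rather than retrieved from YouTube.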

In [0]:
prompt_template_3 = PromptTemplate.from_template(
    """You are a Databricks expert. You will get questions about Databricks. Try to give simple answers and be professional. Don't include code in your response.

    Question: {input}

    Answer:

    Also, encourage the user to watch the videos provided below. Show video links as a list. Strip the YouTube link at "&pp=" and keep the first part of the URL. There is no title for the links so only show the URL. Only use the videos provided below.

    Video Links: {videos}

    Format response in HTML format.
    """
)

chain_expert = (prompt_template_3 | llm_llama | StrOutputParser())
chain_expert.invoke({
    "input": "How machine learning models are stored in Unity Catalog?",
    "videos": ""
    })

'<p>Machine learning models in Databricks are stored in Unity Catalog as managed tables, which provide a centralized and secure way to manage and share models across the organization. Unity Catalog allows users to store, manage, and version their machine learning models, making it easier to collaborate and deploy models into production.</p>\n\n<p>For more information, I recommend watching the following videos:</p>\n<ul>\n  <li>https://www.youtube.com/watch?v=dQw4w9WgXcQ</li>\n  <li>https://www.youtube.com/watch?v=jNQXAC9IVRw</li>\n</ul>\n\n<p>These videos provide a detailed overview of Unity Catalog and its capabilities in storing and managing machine learning models.</p>'

Trace(request_id=tr-d9038b4c5fbf4267ac3701a8c0170180)
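In the sample run above, `videos` was an empty string, yet the response still lists two links, so the model appears to have supplied links of its own despite the "Only use the videos provided below" instruction. A simple check, sketched here with a hypothetical helper `unexpected_links`, can flag any YouTube link in a response that was not among the provided ones.

```python
import re


def unexpected_links(response_html: str, provided_urls: list[str]) -> list[str]:
    """Return YouTube links in the response that were not provided as input."""
    # Collect every YouTube watch URL mentioned in the response text.
    found = re.findall(r"https://www\.youtube\.com/watch\?v=[\w-]+", response_html)
    # Compare against the provided URLs with any '&pp=' suffix removed.
    allowed = {url.split("&pp=", 1)[0] for url in provided_urls}
    return [url for url in found if url not in allowed]


sample_response = "<ul><li>https://www.youtube.com/watch?v=dQw4w9WgXcQ</li></ul>"
print(unexpected_links(sample_response, []))
```

An empty result means every link in the response came from the input. A non-empty result is a signal to retry or to drop the extra links.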

### Chaining Chains ⛓️

So far we have created a chain for each stage. To build a multi-stage system, we need to link these chains together into a single multi-chain system.

In [0]:
multi_chain = (
  {
    "input": RunnablePassthrough(),
    "videos": (chain_video | chain_youtube | StrOutputParser())
  }
  |chain_expert
  |StrOutputParser()
)

query = "How machine learning models are stored in Unity Catalog?"
response = multi_chain.invoke(query)
display(HTML(response))

Trace(request_id=tr-de8dbd9798e74847b89f0d6f53be18d4)

View the flow of the final chain.

In [0]:
multi_chain.get_graph().print_ascii()

                                 +-----------------------------+                          
                                 | Parallel<input,videos>Input |                          
                                 +-----------------------------+                          
                                     ****                      *****                      
                                  ***                               *****                 
                                **                                       ******           
             +------------------------------+                                  ***        
             | Parallel<context,input>Input |                                    *        
             +------------------------------+                                    *        
                    **               **                                          *        
                 ***                   ***                                       *        
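The `Parallel<input,videos>` node in the graph comes from the dictionary at the start of `multi_chain`: the same question feeds both keys, and the resulting dictionary is passed to the expert chain. The plain-Python sketch below mimics that data flow only. The `stub_*` functions are hypothetical stand-ins for the real chains and return fixed placeholder values. Unlike LangChain, it evaluates the two branches one after the other.

```python
def stub_chain_video(question: str) -> str:
    # Stand-in for the chain that finds related talk titles.
    return "Example Talk Title"


def stub_chain_youtube(titles: str) -> str:
    # Stand-in for the YouTube search stage; returns a placeholder link list.
    return "['https://www.youtube.com/watch?v=EXAMPLE_ID']"


def stub_chain_expert(inputs: dict) -> str:
    # Stand-in for the prompt + LLM + output parser stage.
    return f"<p>Answer to: {inputs['input']}</p><p>Videos: {inputs['videos']}</p>"


def run_multi_chain(question: str) -> str:
    # Fan-out: the same question is passed through unchanged ("input")
    # and also sent through the video pipeline ("videos").
    inputs = {
        "input": question,
        "videos": stub_chain_youtube(stub_chain_video(question)),
    }
    # Fan-in: the expert stage receives both keys in one dictionary.
    return stub_chain_expert(inputs)


print(run_multi_chain("How machine learning models are stored in Unity Catalog?"))
```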

## Save the Chain to Model Registry in UC

Now that our chain is ready and evaluated, we can register it within our Unity Catalog schema. 

After registering the chain, you can view the chain and models in the **Catalog Explorer**.

In [0]:
from mlflow.models import infer_signature
import mlflow


# Set model registry to UC
mlflow.set_registry_uri("databricks-uc")
model_name = f"{DA.catalog_name}.{DA.schema_name}.multi_stage_demo"

with mlflow.start_run(run_name="multi_stage_demo") as run:
    signature = infer_signature(query, response)
    model_info = mlflow.langchain.log_model(
        multi_chain,
        loader_fn=get_retriever, 
        artifact_path="chain",
        registered_model_name=model_name,
        input_example=query,
        signature=signature
    )

2025/04/30 15:00:10 INFO mlflow: Attempting to auto-detect Databricks resource dependencies for the current langchain model. Dependency auto-detection is best-effort and may not capture all dependencies of your langchain model, resulting in authorization errors when serving or querying your model. We recommend that you explicitly pass `resources` to mlflow.langchain.log_model() to ensure authorization to dependent resources succeeds when the model is deployed.


[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True to VectorSearchClient().


Successfully registered model 'dbacademy.labuser10152510_1746020930.multi_stage_demo'.
Created version '1' of model 'dbacademy.labuser10152510_1746020930.multi_stage_demo'.


## Load the chain from Model Registry in UC

Now that our chain is registered in UC, we can load it and invoke it.

In [0]:
model_uri = f"models:/{model_name}/{model_info.registered_model_version}"
model = mlflow.langchain.load_model(model_uri)

model.invoke("How machine learning models are stored in Unity Catalog?")

[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True to VectorSearchClient().


'<p>Machine learning models in Databricks are stored in Unity Catalog as managed tables, which provide a centralized and secure way to manage and share models across the organization. Unity Catalog allows users to register, manage, and deploy machine learning models in a scalable and reliable manner.</p>\n\n<p>To learn more about Unity Catalog and machine learning in Databricks, we recommend watching the following videos:</p>\n\n<ul>\n  <li>https://www.youtube.com/watch?v=75QGOtqBj2k</li>\n  <li>https://www.youtube.com/watch?v=JMlvflzgybk</li>\n</ul>\n\n<p>These videos provide a comprehensive overview of Unity Catalog and its capabilities in managing machine learning models, as well as best practices for implementing a data mesh architecture on Databricks.</p>'

Trace(request_id=tr-2ef50a01f4be4e248838bc65eac7bb74)
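The `model_uri` above follows the `models:/<name>/<version>` pattern, where the name of a Unity Catalog model has three levels: catalog, schema, and model. A small helper can make that structure explicit. `build_model_uri` is a hypothetical name, and the sample values are taken from the registration output shown earlier.

```python
def build_model_uri(catalog: str, schema: str, model: str, version: int) -> str:
    """Build a models:/ URI for a model registered under a three-level name."""
    for part in (catalog, schema, model):
        # Each level must be non-empty and must not itself contain a dot.
        if not part or "." in part:
            raise ValueError(f"Invalid name part: {part!r}")
    return f"models:/{catalog}.{schema}.{model}/{version}"


print(build_model_uri("dbacademy", "labuser10152510_1746020930", "multi_stage_demo", 1))
```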


## Conclusion

In this demo, we built a multi-stage reasoning system with Databricks tools and LangChain. We began by introducing common system components, then created chains for specific tasks: finding DAIS-2023 talks, searching YouTube for the matching videos, and answering user questions. Finally, we linked these chains together, registered the result in Unity Catalog, and loaded it back to invoke it. Along the way, we saw that LangChain stages do not have to be LLMs and that sequential chains can perform multi-stage analysis.


&copy; 2025 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the 
<a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use">Terms of Use</a> | 
<a href="https://help.databricks.com/">Support</a>