
## Primer: LangChain & Simplifying LLM Apps

**LangChain** is an open-source framework for building applications on top of large language models (LLMs). It provides:

- **Integrations**: Connect LLMs (OpenAI, HuggingFace, Databricks, etc.) with data sources, APIs, and tools.
- **Chains**: Orchestrate sequences of LLM calls and surrounding logic, so multi-step workflows can be composed from small pieces.
- **Retrieval-Augmented Generation (RAG)**: Combines LLMs with external knowledge (e.g., documents, databases) for more accurate, context-aware responses.
- **Vector Search**: Enables fast similarity search over embedded data, crucial for RAG.
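
To make the vector search idea concrete, here is a minimal sketch in plain Python. It is not how Databricks Vector Search is implemented; the three-dimensional vectors, the `toy_index` dictionary, and the `top_k` helper are all made up for illustration. It only shows the core idea: rank stored embeddings by cosine similarity to the query embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the ids of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
toy_index = {
    "disaster-recovery": [0.9, 0.1, 0.0],
    "billing": [0.0, 0.2, 0.9],
    "backups": [0.7, 0.3, 0.1],
}

nearest = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(nearest)
```

A production index avoids this brute-force scan by using approximate nearest-neighbor structures, but the ranking criterion is the same in spirit.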

### Why LangChain for LLM Apps?

- **Rapid Prototyping**: Prebuilt components for prompts, memory, agents, and retrieval.
- **Extensibility**: Easily plug in new models, data sources, or tools.
- **Production-Ready**: Integrates with MLflow for experiment tracking and Databricks for scalable deployment.

### Databricks Vector Search + LangChain

- Store and index document embeddings in Databricks.
- Use LangChain to retrieve relevant context for user queries via vector search.
- Pass retrieved context to LLMs for accurate, up-to-date answers.

**Example Workflow:**
1. Ingest and embed documents.
2. Store embeddings in Databricks Vector Search.
3. On user query, retrieve similar documents.
4. Use LLM (via LangChain) to generate a response using retrieved context.

LangChain abstracts the complexity, letting you focus on building intelligent, data-aware LLM applications.
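
The four workflow steps above can be sketched end to end without any external service. This is a toy under stated assumptions: `embed` uses word counts instead of a real embedding model, `store` is an in-memory list instead of a Databricks Vector Search index, and the final LLM call is left out, so the sketch stops at the prompt that would be sent. All names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real pipeline calls an embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


documents = [
    "Disaster recovery requires a secondary workspace in another region.",
    "Cluster policies limit the instance types users can choose.",
    "Replicate data and notebooks to the secondary region for recovery.",
]

# Steps 1-2: embed each document and keep the embeddings in a store.
store = [(doc, embed(doc)) for doc in documents]


def retrieve(query, k=2):
    """Step 3: return the k stored documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: similarity(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context_docs):
    """Step 4 (first half): place the retrieved context ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"


query = "How do I set up disaster recovery?"
hits = retrieve(query)
rag_prompt = build_prompt(query, hits)
print(rag_prompt)
```

The notebook cells that follow replace each toy piece with the real component: `DatabricksVectorSearch` for the store and retrieval, and `ChatDatabricks` for generation.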

In [0]:
%sql
select * from workspace.llm_rag.rag_input_summary

In [0]:
%sql
select distinct url from workspace.llm_rag.rag_input_summary

In [0]:
%pip install --upgrade databricks-langchain langchain-community langchain databricks-sql-connector

In [0]:
dbutils.library.restartPython() 

# Call LangChain without RAG Context

In [0]:

from databricks_langchain import ChatDatabricks

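# Chat model served from a Databricks model serving endpoint
chat_model = ChatDatabricks(
    endpoint="databricks-meta-llama-3-1-405b-instruct",
    temperature=0.1,  # low temperature for more deterministic answers
    max_tokens=250,   # short answers only; longer responses are cut off
)

# Baseline: ask the question with no retrieved context
resp = chat_model.invoke("Provide a step-by-step process for disaster recovery in Databricks")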

In [0]:
print(resp.content)

In [0]:
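from databricks_langchain import DatabricksVectorSearch

# Wrap the existing vector search index as a LangChain retriever
vector_store = DatabricksVectorSearch(index_name="workspace.llm_rag.databricks_vector_index")
retriever = vector_store.as_retriever(search_kwargs={"k": 5})  # return the 5 closest matches

# Returns a list of Document objects ranked by similarity to the query
response = retriever.invoke("Provide a step-by-step process for disaster recovery in Databricks")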

In [0]:
print(response[1])
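
The retriever returns a list of document objects, and the prompt needs plain text. One common pattern is a small helper that joins the text of each hit into a single context string. The sketch below assumes each hit exposes a `page_content` attribute, as LangChain documents do; the `Doc` dataclass is a hypothetical stand-in so the example runs without a live index.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Number each document's text and join them with blank lines."""
    return "\n\n".join(f"[{i}] {doc.page_content}" for i, doc in enumerate(docs, start=1))


sample = [Doc("Set up a secondary workspace."), Doc("Replicate data regularly.")]
context_text = format_docs(sample)
print(context_text)
```

Numbering the passages makes it easier to ask the model to cite which passage supported each part of its answer.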

# Using RAG: Passing Additional Context to the LLM

In [0]:
from langchain_core.prompts import ChatPromptTemplate

# Define a prompt template that injects the retrieved context
system_prompt = (
    "Use the following pieces of context to answer the user's question. "
    "If you don't know the answer based *only* on the context, say that you don't know."
    "\n\nContext: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

# Pass the text of the retrieved documents rather than the raw Document objects
context_text = "\n\n".join(doc.page_content for doc in response)
curated_response = chat_model.invoke(
    prompt.format_messages(
        input="Provide a step-by-step process for disaster recovery in Databricks",
        context=context_text,
    )
)

In [0]:
print(curated_response.content)

# Log LLM Experiments with MLflow

In [0]:
import mlflow

# Turn on MLflow autologging for LangChain; calls made after this line are recorded
mlflow.langchain.autolog()

# Going Beyond Summaries: Providing Full Content

In [0]:
from databricks_langchain import DatabricksVectorSearch

# Request the "url" and "content" columns so each hit carries the full text
vector_store = DatabricksVectorSearch(
    index_name="workspace.llm_rag.databricks_vector_index",
    columns=["url", "content"],
)
retriever = vector_store.as_retriever(search_kwargs={"k": 5})
response = retriever.invoke("Provide a step-by-step process for disaster recovery in Databricks")

In [0]:
response[0]

In [0]:
# Collect [url, content] pairs from the metadata of each retrieved document
combined_docs = [[r.metadata["url"], r.metadata["content"]] for r in response]
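
Full page content can be much longer than a summary, so it may help to cap how much text goes into the prompt. The sketch below is one possible approach, not part of the original notebook: `build_context` and its `max_chars` budget are hypothetical, and the URLs and contents are placeholder data shaped like `combined_docs`.

```python
def build_context(pairs, max_chars=200):
    """Join (url, content) pairs into one string, stopping before the budget is exceeded.

    The budget counts only the blocks themselves, not the blank-line separators,
    so it is an approximate cap.
    """
    blocks, used = [], 0
    for url, content in pairs:
        block = f"Source: {url}\n{content}"
        if used + len(block) > max_chars:
            break
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)


# Placeholder data in the same [url, content] shape as combined_docs
pairs = [
    ["https://example.com/a", "x" * 50],
    ["https://example.com/b", "y" * 50],
    ["https://example.com/c", "z" * 200],
]

context = build_context(pairs, max_chars=200)
print(context.count("Source:"))
```

Because the retriever returns hits in ranked order, cutting off at the budget drops the least relevant documents first. A character count is only a rough proxy for tokens; a tokenizer-based budget would be more precise.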

In [0]:
from langchain_core.prompts import ChatPromptTemplate

# Define a prompt template that injects the retrieved context
system_prompt = (
    "Use the following pieces of context to answer the user's question. "
    "If you don't know the answer based *only* on the context, say that you don't know."
    "\n\nContext: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

# Render each [url, content] pair as readable text instead of passing a raw nested list
context_text = "\n\n".join(f"Source: {url}\n{content}" for url, content in combined_docs)
curated_response = chat_model.invoke(
    prompt.format_messages(
        input="Provide a step-by-step process for disaster recovery in Databricks",
        context=context_text,
    )
)

In [0]:

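from databricks_langchain import ChatDatabricks

# Recreate the chat model with a larger output budget so long answers are not cut off
chat_model = ChatDatabricks(
    endpoint="databricks-meta-llama-3-1-405b-instruct",
    temperature=0.1,
    max_tokens=4096,
)

# Baseline without context, for comparison against the RAG answer
resp = chat_model.invoke("Provide a step-by-step process for disaster recovery in Databricks")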

In [0]:
# Re-run the RAG prompt with the higher-token model.
# `prompt` and `combined_docs` are defined in the cells above.
context_text = "\n\n".join(f"Source: {url}\n{content}" for url, content in combined_docs)
curated_response = chat_model.invoke(
    prompt.format_messages(
        input="Provide a step-by-step process for disaster recovery in Databricks",
        context=context_text,
    )
)
print(curated_response.content)