# Quickstart

The quickstart is based on: https://python.langchain.com/docs/get_started/quickstart

In this quickstart we'll show you how to:

* Get set up with LangChain
* Use the most basic and common components of LangChain: prompt templates, models, and output parsers
* Use LangChain Expression Language, the protocol that LangChain is built on and which facilitates component chaining
* Build a simple application with LangChain

What we are **NOT** covering at the moment:
* Tracing your application with LangSmith
* Serving your application with LangServe

That's a fair amount to cover! Let's dive in.

# Setup

## Jupyter Notebook or Jupyter Lab

This guide (and most of the other guides in the documentation) uses Jupyter notebooks and assumes the reader does as well. Jupyter notebooks are well suited to learning how to work with LLM systems because things can often go wrong (unexpected output, API down, etc.), and going through guides in an interactive environment is a great way to better understand them.

You do not NEED to go through the guide in a Jupyter Notebook, but it is recommended. See [here](https://jupyter.org/install) for installation instructions.

# Installation

To set up the conda environment, run:

```bash
conda env create
```

This command creates a new conda environment called `langchain-snowpoc`.

In [1]:
# helper functions to display Markdown returned by functions
from IPython.display import display, Markdown

## langchain-snowpoc Library

> **Note**: If you run the Jupyter example from the main folder, there is no need to build or install the module, and you can skip this section. Otherwise, continue reading.

### Building the module

`tox` is used to build the module. Just run it in the main folder.

# Building with LangChain

LangChain enables building applications that connect external sources of data and computation to LLMs. In this quickstart, we will walk through a few different ways of doing that. We will start with a simple LLM chain, which relies only on information in the prompt template to respond. Next, we will build a retrieval chain, which fetches data from a separate database and passes it into the prompt template. We will then add chat history to create a conversation retrieval chain. This lets you interact with the LLM in a chat manner, so that it takes previous questions into account.

## LLM Chain

We are going to work with an experimental module `langchain-snowpoc`.

> **Note**: It is required that you have access to `Cortex` and `VECTOR` search features, and that your Snowflake connection is [configured](https://docs.snowflake.com/en/developer-guide/snowflake-cli-v2/connecting/connect).

Define some variables for convenience.

In [2]:
CONNECTION_NAME = "YOUR CONNECTION NAME"
MODEL_LLM = "gemma-7b"  # Pick a model that does not know what Cortex is; otherwise change the questions and the URL to a topic that is unknown to the model
MODEL_EMBEDDINGS = "e5-base-v2"
VECTOR_LENGTH = 768  # e5-base-v2 produces 768-dimensional embeddings

# We will be asking some questions; define the knowledge source and the questions here
URL_CORTEX_DOCUMENTATION = "https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions"
QUESTION_WHAT_IS_CORTEX = "What is Cortex in Snowflake?"
QUESTION_CAN_SNOWFLAKE_HELP_WITH_LLM = "Can Snowflake help run my LLM applications?"

We have to create a connection:

In [3]:
import snowflake.connector
snowflake_connection = snowflake.connector.connect(
        connection_name=CONNECTION_NAME,
    )
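
Before going further, it can help to confirm that the connection actually works. The sketch below is a hypothetical helper (`check_connection` is not part of `langchain-snowpoc`). It assumes only the standard cursor methods of `snowflake.connector` (`cursor()`, `execute()`, `fetchone()`, `close()`).

```python
def check_connection(connection):
    """Run a trivial query and return the Snowflake version string."""
    cursor = connection.cursor()
    try:
        cursor.execute("SELECT CURRENT_VERSION()")
        row = cursor.fetchone()
        return row[0]
    finally:
        # Always release the cursor, even if the query fails.
        cursor.close()
```

In the notebook you could call it as `print(check_connection(snowflake_connection))`. An exception at this point usually means the connection name or credentials need another look.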

Next, initialize the model. You can use either the `SQLCortex` or the `Cortex` class; they behave the same way. The only difference is that `Cortex` uses an experimental Python implementation:

In [4]:
from langchain_snowpoc.llms import SQLCortex
llm = SQLCortex(connection=snowflake_connection, model=MODEL_LLM)

With the LLM initialized, we can try using it! Let's ask it what Snowflake Cortex is. This is something that wasn't present in the training data, so the model shouldn't give a very good response.



In [5]:
print(llm.invoke(f"{QUESTION_WHAT_IS_CORTEX} Use no more than 200 words. If you don't know just say so."))



I do not have the knowledge to answer this question. I am not familiar with the term "Cortex" in the context of Snowflake.


We can also guide its response with a prompt template. Prompt templates convert raw user input into better input for the LLM.



In [6]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are world class Snowflake specialist. Answer using no more than 200 words. If you don't know just say so."),
    ("user", "{input}")
])
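
To make the idea concrete, here is a minimal plain-Python sketch of what a chat prompt template does conceptually: it fills placeholders in a list of (role, text) pairs. The helper `format_messages` and the toy template are illustrative only and are not how `ChatPromptTemplate` is implemented.

```python
def format_messages(templates, **variables):
    """Fill each (role, template) pair with the given variables."""
    return [(role, template.format(**variables)) for role, template in templates]

templates = [
    ("system", "You are a Snowflake specialist. If you don't know, just say so."),
    ("user", "{input}"),
]

messages = format_messages(templates, input="What is Cortex in Snowflake?")
for role, text in messages:
    print(f"{role}: {text}")
```

The system message stays fixed, while the user message is filled in at call time. This is why the chain is later invoked with a dictionary containing an `input` key.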

We can now combine these into a simple LLM chain:

In [7]:
chain = prompt | llm 
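
The `|` operator is what LangChain Expression Language uses to chain components. As a rough intuition only (this is a toy class, not LangChain's implementation), piping can be sketched with Python's `__or__` method, where the output of one step becomes the input of the next:

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Feed this step's output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt and the model.
fake_prompt = Step(lambda inputs: f"user: {inputs['input']}")
fake_llm = Step(lambda text: f"(model reply to '{text}')")

sketch_chain = fake_prompt | fake_llm
print(sketch_chain.invoke({"input": "What is Cortex in Snowflake?"}))
```

The composed object exposes the same `invoke` method as its parts, which is what lets a chain be extended with further steps.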

We can now invoke it and ask the same question. It still won't know the answer, but it should respond in a more proper tone for a Snowflake specialist!

In [8]:
chain.invoke({"input": QUESTION_WHAT_IS_CORTEX})

'\n\nI am not able to answer this question as I do not have the necessary knowledge about Snowflake. I am not a specialist in this area.'

The output of a chat model (and therefore of a chain that ends in one) is a message. However, it's often much more convenient to work with strings. Let's add a simple output parser to convert the chat message to a string. (In this example we do not actually get a message: as the output above shows, `SQLCortex` already returns a plain string. The LangChain documentation is not clear about what an LLM should return.)

In [9]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

We can now add this to the previous chain:

In [10]:
chain = prompt | llm | output_parser

We can now invoke it and ask the same question. The answer will be a string (rather than a `ChatMessage`), which in this example matches the previous output.

In [11]:
chain.invoke({"input": QUESTION_WHAT_IS_CORTEX})

'\n\nI am not able to answer this question as I do not have the necessary knowledge about Snowflake. I am not a specialist in this area.'
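
The output above still starts with `\n\n`. If that matters for your application, a small post-processing step can normalize it. The sketch below is a hypothetical helper, `to_text`, that accepts either a plain string or a message-like object with a `content` attribute. The whitespace stripping is an addition of this sketch and not something `StrOutputParser` is assumed to do.

```python
from types import SimpleNamespace


def to_text(output):
    """Return model output as a stripped string.

    Accepts either a plain string or a message-like object that
    carries its text in a `content` attribute.
    """
    text = output if isinstance(output, str) else output.content
    return text.strip()


# A plain string, as returned in this notebook.
print(repr(to_text("\n\nI am not able to answer this question.")))
# A message-like object, simulated here with SimpleNamespace.
print(repr(to_text(SimpleNamespace(content="  Hello  "))))
```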

## Retrieval Chain

In order to properly answer the original question, we need to provide additional context to the LLM. We can do this via retrieval. Retrieval is useful when you have too much data to pass to the LLM directly. You can then use a retriever to fetch only the most relevant pieces and pass those in.

In this process, we will look up relevant documents from a Retriever and then pass them into the prompt. A Retriever can be backed by anything - a SQL table, the internet, etc - but in this instance we will populate a vector store and use that as a retriever. For more information on vectorstores, see [this](https://python.langchain.com/docs/modules/data_connection/vectorstores).
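
As a toy illustration of what a vector-store retriever does, the sketch below ranks a few hand-written vectors by cosine similarity to a query vector and returns the closest texts. The vectors, the texts, and the choice of cosine similarity are assumptions made for this example; the similarity metric actually used by the Snowflake vector store is not shown in this guide.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k=2):
    """Return the k texts whose vectors are most similar to the query."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical three-dimensional "embeddings" for three documents.
toy_index = [
    ("Cortex LLM functions", [0.9, 0.1, 0.0]),
    ("Virtual warehouses", [0.1, 0.9, 0.0]),
    ("Data loading", [0.0, 0.2, 0.9]),
]

print(top_k([1.0, 0.0, 0.0], toy_index, k=1))
```

Real embeddings have hundreds of dimensions and are produced by an embedding model, but the ranking step follows the same pattern.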

First, we need to load the data that we want to index. In order to do this, we will use the WebBaseLoader.

In [12]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(URL_CORTEX_DOCUMENTATION)

docs = loader.load()

Next, we need to index it into a vectorstore. This requires a few components, namely an embedding model (provided by Snowflake) and a vectorstore (also in Snowflake).

In [13]:
from langchain_snowpoc.embedding import SnowflakeEmbeddings
embeddings = SnowflakeEmbeddings(
        connection=snowflake_connection, model=MODEL_EMBEDDINGS
)
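
The vector store needs to know the length of the embedding vectors, so it can be worth checking that the model's output length matches `VECTOR_LENGTH`. The helper below is a hypothetical sketch. It assumes that `SnowflakeEmbeddings` implements `embed_query`, the method defined by LangChain's standard `Embeddings` interface.

```python
def embedding_dimension(embeddings, sample_text="hello"):
    """Embed a short text and return the length of the resulting vector."""
    return len(embeddings.embed_query(sample_text))
```

In the notebook, `assert embedding_dimension(embeddings) == VECTOR_LENGTH` would then fail early if the two values disagree.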

In [14]:
from langchain_snowpoc.vectorstores import SnowflakeVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vector = SnowflakeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    vector_length=VECTOR_LENGTH,
    connection=snowflake_connection
)
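
The text splitter breaks long documents into smaller chunks before they are embedded. The sketch below shows the basic idea with fixed-size chunks that overlap slightly, so that text cut at a boundary still appears whole in one of the chunks. It is deliberately simpler than `RecursiveCharacterTextSplitter`, which also tries to split on natural separators. The function name and the tiny sizes are chosen for illustration only.

```python
def split_text(text, chunk_size=20, chunk_overlap=5):
    """Split text into fixed-size chunks that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    # Stop once the remaining text is already covered by the previous chunk.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


sample = "Snowflake Cortex gives you instant access to LLMs."
chunks = split_text(sample)
for chunk in chunks:
    print(repr(chunk))
```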

In [15]:
from langchain.chains.combine_documents import create_stuff_documents_chain

prompt = ChatPromptTemplate.from_template("""Use no more than 200 words. Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)
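
"Stuffing" means that all retrieved documents are concatenated and placed into the `{context}` slot of the prompt. A minimal sketch of that step, using plain strings and hypothetical toy documents (this is an illustration, not the code behind `create_stuff_documents_chain`):

```python
from types import SimpleNamespace

STUFF_TEMPLATE = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def stuff_documents(documents, question, separator="\n\n"):
    """Join the document texts and insert them into the prompt template."""
    context = separator.join(doc.page_content for doc in documents)
    return STUFF_TEMPLATE.format(context=context, input=question)


# Stand-ins for LangChain Document objects; only page_content is used.
toy_docs = [
    SimpleNamespace(page_content="Cortex offers LLM functions."),
    SimpleNamespace(page_content="The models are hosted by Snowflake."),
]

print(stuff_documents(toy_docs, "What is Cortex in Snowflake?"))
```

Because everything goes into a single prompt, this approach only works while the combined documents fit into the model's context window.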

In [16]:
from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [17]:
response = retrieval_chain.invoke({"input": QUESTION_WHAT_IS_CORTEX})
display(Markdown(response["answer"]))
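
Besides `answer`, the dictionary returned by a chain built with `create_retrieval_chain` also contains the retrieved documents under the `context` key. The hypothetical helper below lists where those documents came from. It assumes each document carries a `source` entry in its metadata (as `WebBaseLoader` sets it) and falls back to `"unknown"` in case the vector store does not preserve metadata.

```python
from types import SimpleNamespace


def list_sources(response):
    """Return the distinct `source` metadata values of the retrieved documents."""
    sources = []
    for doc in response.get("context", []):
        source = (doc.metadata or {}).get("source", "unknown")
        if source not in sources:
            sources.append(source)
    return sources


# A toy response shaped like the output of a retrieval chain.
toy_url = "https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions"
toy_response = {
    "input": "What is Cortex in Snowflake?",
    "context": [
        SimpleNamespace(page_content="chunk 1", metadata={"source": toy_url}),
        SimpleNamespace(page_content="chunk 2", metadata={"source": toy_url}),
        SimpleNamespace(page_content="chunk 3", metadata={}),
    ],
    "answer": "Cortex gives access to hosted LLMs.",
}

print(list_sources(toy_response))
```

In the notebook, `list_sources(response)` would show which pages the answer was grounded in.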

**Answer:**

Cortex in Snowflake is a platform that provides access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Meta, and Google. It also offers models that Snowflake has fine-tuned for specific use cases. These LLMs are fully hosted and managed by Snowflake, so there is no setup required.

## Conversation Retrieval Chain

The chain we've created so far can only answer single questions. One of the main types of LLM applications that people are building is chatbots. So how do we turn this chain into one that can answer follow-up questions?

We can still use the `create_retrieval_chain` function, but we need to change two things:

1. The retrieval method should no longer work on just the most recent input; it should take the whole history into account.
1. The final LLM chain should likewise take the whole history into account.

### Updating Retrieval

In order to update retrieval, we will create a new chain. This chain will take in the most recent input (`input`) and the conversation history (`chat_history`) and use an LLM to generate a search query.

In [18]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

# First we need a prompt that we can pass into an LLM to generate this search query

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
])
retriever_chain = create_history_aware_retriever(llm, retriever, prompt)
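
Conceptually, a history-aware retriever works as follows: when there is no chat history, the user input is used directly as the search query; otherwise an LLM first rewrites the conversation into a standalone query. The plain-Python sketch below mirrors that logic with hypothetical stand-ins (`fake_generate_query` and `fake_retrieve`) instead of a real model and vector store.

```python
def history_aware_retrieve(generate_query, retrieve, chat_history, user_input):
    """Retrieve documents, rewriting the query only when there is history."""
    if not chat_history:
        # Nothing to resolve against, so search with the raw input.
        return retrieve(user_input)
    query = generate_query(chat_history, user_input)
    return retrieve(query)


def fake_generate_query(chat_history, user_input):
    # Stand-in for the LLM call: glue the first question to the follow-up.
    first_question = chat_history[0][1]
    return f"{first_question} {user_input}"


def fake_retrieve(query):
    # Stand-in for the vector store lookup.
    return [f"documents matching: {query}"]


history = [("human", "Can Snowflake help run my LLM applications?"), ("ai", "Yes!")]
print(history_aware_retrieve(fake_generate_query, fake_retrieve, history, "Tell me how"))
print(history_aware_retrieve(fake_generate_query, fake_retrieve, [], "What is Cortex?"))
```

A follow-up such as "Tell me how" is useless as a search query on its own, which is why the rewriting step matters.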

We can test this out by passing in an instance where the user is asking a follow-up question.

In [19]:
from langchain_core.messages import HumanMessage, AIMessage
from pprint import pprint
chat_history = [HumanMessage(content="Can Snowflake help run my LLM applications?"), AIMessage(content="Yes!")]
retriever_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

[Document(page_content='Large Language Model (LLM) Functions (Snowflake Cortex) | Snowflake DocumentationDOCUMENTATION/Getting StartedGuidesDeveloperReferenceReleasesTutorialsStatusOverviewConnecting to SnowflakeVirtual warehousesDatabases, Tables, & ViewsData TypesData LoadingData UnloadingQueriesSnowflake CortexLarge Language Model FunctionsML-Powered FunctionsData Sharing & CollaborationAlerts & NotificationsSecurityData GovernancePrivacyOrganizations & AccountsBusiness Continuity & Data RecoveryPerformance OptimizationCost ManagementGuidesSnowflake CortexLarge Language Model Functions\n\nLarge Language Model (LLM) Functions (Snowflake Cortex)¶\n\n Preview Feature — Open\nAvailable to all accounts in selected regions.\n\nSnowflake Cortex gives you instant access to industry-leading large language models (LLMs) trained by researchers at\ncompanies like Mistral, Meta, and Google. It also offers models that Snowflake has fine-tuned for specific use cases.\nSince these LLMs are fully ho

You should see that this returns documents about Cortex. This is because the LLM generated a new query, combining the chat history with the follow-up question.

Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.

In [20]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "Use no more than 400 words. Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, prompt)

retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)

We can now test this out end-to-end:

In [21]:
chat_history = [HumanMessage(content=QUESTION_CAN_SNOWFLAKE_HELP_WITH_LLM), AIMessage(content="Yes!")]
full_response = retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

In [22]:
display(Markdown(
    full_response.get('answer') # get only the answer
))

## Summary of the text provided:

This text describes Snowflake Cortex LLM Functions, a service that gives you instant access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Meta, and Google.

**Key points:**

* **Function availability:** Currently available in the US East, West, Europe, and Azure regions.
* **Cost considerations:** Incurred based on the number of tokens processed.
* **Usage quotas:** Limits the number of tokens processed per minute and rows processed per minute.
* **Required privileges:** The `SNOWFLAKE.CORTEX_USER` database role is required to use these functions.
* **Error conditions:** Can occur due to system load, invalid options, budget exceeding, or other reasons.
* **Using with Python:** Available in Snowpark ML version 1.1.2 and later.

**Additional information:**

* The text includes examples of using the `COMPLETE`, `EXTRACT_ANSWER`, `SENTIMENT`, `SUMMARIZE`, and `TRANSLATE` functions.
* The text also mentions the limitations of the service, such as the lack of support for the advanced chat-style (multi-message) form of COMPLETE in Python.
* The text includes information on how to find more information and resources related to Snowflake Cortex LLM Functions.

We can see that this gives a coherent answer - we've successfully turned our retrieval chain into a chatbot!
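
To keep a conversation going, each question and answer has to be appended to the chat history before the next call. The helper below is a hypothetical sketch of that bookkeeping. By default it records turns as plain `(role, text)` tuples so that it runs without LangChain; the `human` and `ai` parameters let you supply message constructors instead.

```python
def ask(chain, chat_history, question,
        human=lambda text: ("human", text),
        ai=lambda text: ("ai", text)):
    """Send one question through the chain and record the turn in chat_history."""
    response = chain.invoke({"chat_history": chat_history, "input": question})
    answer = response["answer"]
    # Append both sides of the turn so the next call sees the full history.
    chat_history.extend([human(question), ai(answer)])
    return answer


class EchoChain:
    """Toy chain used only to demonstrate the bookkeeping."""

    def invoke(self, inputs):
        turns = len(inputs["chat_history"]) // 2
        return {"answer": f"answer {turns + 1} to: {inputs['input']}"}


toy_history = []
print(ask(EchoChain(), toy_history, "Can Snowflake help run my LLM applications?"))
print(ask(EchoChain(), toy_history, "Tell me how"))
print(len(toy_history))
```

In the notebook, the call could look like `ask(retrieval_chain, chat_history, "Tell me how", human=lambda t: HumanMessage(content=t), ai=lambda t: AIMessage(content=t))`, which stores the turns as the same message types used above.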