In [None]:
%pip install langchain langchain-openai tiktoken

In [30]:
# Make sure the OPENAI_API_KEY environment variable is set to your API key:
# export OPENAI_API_KEY="..."

In [31]:
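import os
import json

# Load settings from a local JSON config file (~/.langchain/config.json)
home_dir = os.path.expanduser("~")
cfgFile = os.path.join(home_dir, ".langchain", "config.json")
with open(cfgFile, "r") as f:  # the context manager closes the file after loading
    configData = json.load(f)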

#print(configData["openAI"]["apiKey"])


In [32]:

# set the OPENAI_API_KEY environment variable from the config file
os.environ["OPENAI_API_KEY"] = configData["openAI"]["apiKey"]




In [33]:

# read the OpenAI API key back from the environment
openai_api_key = os.getenv("OPENAI_API_KEY")
#print(openai_api_key)
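
The cells above can also be folded into one helper that prefers a key already exported in the environment and only falls back to the config file. This is a minimal sketch: `resolve_api_key` is a hypothetical helper name, and the `openAI` / `apiKey` layout is simply the one this notebook's config file uses, not a LangChain convention.

```python
import json
import os
from pathlib import Path


def resolve_api_key(config_path, env_var="OPENAI_API_KEY", environ=None):
    """Return the API key from the environment, else from a JSON config file.

    Hypothetical helper: the {"openAI": {"apiKey": ...}} layout mirrors the
    config file used in this notebook, not any LangChain standard.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var)
    if key:
        return key  # an already-exported key takes precedence
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"{env_var} is not set and {path} does not exist")
    with path.open("r", encoding="utf-8") as fh:
        config = json.load(fh)
    try:
        return config["openAI"]["apiKey"]
    except KeyError as exc:
        raise KeyError(f"{path} has no openAI.apiKey entry") from exc
```

Calling `resolve_api_key(cfgFile)` would then return the key whether it came from the shell or from the file, and fail with a readable error if neither is available.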



In [3]:
# now initialize the model:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()

In [None]:
llm.invoke("how can langsmith help with testing?")

Prompt templates convert raw user input into a better-structured input for the LLM.

In [35]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world-class technical documentation writer."),
    ("user", "{input}")
])
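
To make the idea concrete, the sketch below imitates what a chat prompt template does conceptually: it takes (role, text) pairs, fills the `{input}` placeholder, and returns a list of messages. It uses only plain Python; `format_chat_prompt` is a hypothetical name and this is not how `ChatPromptTemplate` is implemented.

```python
def format_chat_prompt(template_messages, **variables):
    """Fill {placeholders} in (role, text) pairs and return role/content dicts."""
    return [
        {"role": role, "content": text.format(**variables)}
        for role, text in template_messages
    ]


template = [
    ("system", "You are a world-class technical documentation writer."),
    ("user", "{input}"),
]
messages = format_chat_prompt(template, input="how can langsmith help with testing?")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The system message is passed through unchanged, and only the user message depends on the input.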

We can now combine these into a simple LLM chain:

In [36]:
chain = prompt | llm 
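
The `|` operator here is ordinary Python operator overloading: the left-hand object decides what "pipe into" means. The toy class below is a rough sketch of that idea under simplifying assumptions. `Step`, `fill_prompt`, and `fake_llm` are hypothetical names, no API call is made, and LangChain's own runnables do considerably more.

```python
class Step:
    """Wrap a one-argument function so steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


fill_prompt = Step(lambda inputs: f"user: {inputs['input']}")
fake_llm = Step(lambda prompt_text: f"echo({prompt_text})")  # stand-in, no API call

toy_chain = fill_prompt | fake_llm
print(toy_chain.invoke({"input": "hello"}))  # prints: echo(user: hello)
```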

We can now invoke it and ask the same question.
It still won't know the answer, but it should respond in a tone more appropriate for a technical writer!

In [None]:
chain.invoke({"input": "how can langsmith help with testing?"})

The output of a ChatModel (and therefore, of this chain) is a message. \
However, it's often much more convenient to work with strings. \
Let's add a simple output parser to convert the chat message to a string.

In [38]:
from langchain_core.output_parsers import StrOutputParser
output_parser = StrOutputParser()
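
Conceptually, a string output parser just pulls the text out of the message object. The sketch below shows that step with a stand-in message type. `ToyMessage` and `to_text` are hypothetical names used for illustration, not LangChain classes.

```python
from dataclasses import dataclass


@dataclass
class ToyMessage:
    """Stand-in for a chat message: the text lives in `.content`."""
    content: str
    role: str = "assistant"


def to_text(message):
    """Return the message text, passing plain strings through unchanged."""
    return message if isinstance(message, str) else message.content


reply = ToyMessage(content="example reply")
print(to_text(reply))  # prints: example reply
```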

We can now add this to the previous chain:

In [8]:
chain = prompt | llm | output_parser

We can now invoke it and ask the same question.
The answer will now be a string (rather than a ChatMessage).

In [None]:
chain.invoke({"input": "how can langsmith help with testing?"})

Retrieval Chain
A Retriever can be backed by anything - a SQL table, the internet, etc.\
But in this instance we will populate a vector store and use that as a retriever. 
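
To illustrate that a retriever is just "query in, relevant texts out", here is a toy retriever backed by a plain Python list and word overlap instead of a vector store. `keyword_retriever` and the three sample strings are made up for this sketch.

```python
def keyword_retriever(corpus):
    """Build a retriever over an in-memory list of strings (toy example)."""

    def retrieve(query, k=2):
        query_words = set(query.lower().split())
        # Score each text by how many query words it shares.
        scored = [
            (len(query_words & set(text.lower().split())), text) for text in corpus
        ]
        scored = [pair for pair in scored if pair[0] > 0]  # drop non-matches
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    return retrieve


retrieve = keyword_retriever([
    "unit tests check one function at a time",
    "vector stores index embeddings",
    "integration tests check several components together",
])
print(retrieve("how do unit tests work"))
```

A vector-store retriever keeps the same shape but replaces the word-overlap score with similarity between embeddings.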

In order to do this, we will use the WebBaseLoader. \
This requires installing BeautifulSoup:

In [None]:
%pip install beautifulsoup4

In [40]:
# now use WebBaseLoader to fetch the page and load it as documents
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")

docs = loader.load()

Next, we need to index it into a vectorstore. 
This requires a few components, namely an embedding model and a vectorstore.
For embedding models, we can use OpenAI or local models.

For local embeddings, make sure Ollama is running (the same setup you would use for a local LLM).


In [41]:
# for OpenAI embeddings we can use:
# from langchain_openai import OpenAIEmbeddings
# embeddings = OpenAIEmbeddings()

from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings()
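
An embedding model maps text to a vector so that similar texts end up close together. The sketch below uses word counts over a tiny fixed vocabulary as a stand-in for a real model, plus cosine similarity to compare vectors. `toy_embed` and the vocabulary are assumptions made for illustration, and real embeddings are dense, learned vectors rather than counts.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Map text to a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vocabulary = ["testing", "tracing", "prompt", "dataset"]
query_vec = toy_embed("testing with a dataset", vocabulary)
doc_a = toy_embed("dataset driven testing", vocabulary)
doc_b = toy_embed("prompt tracing", vocabulary)

# doc_a shares words with the query, doc_b shares none.
print(cosine_similarity(query_vec, doc_a) > cosine_similarity(query_vec, doc_b))
```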

Now, we can use this embedding model to ingest documents into a vectorstore. 
To keep things simple, we will use FAISS, a local vectorstore.
First we need to install the required package:

In [None]:
%pip install faiss-cpu

In [None]:
# Now build the index:
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embeddings)
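
Before indexing, the splitter cuts long documents into smaller chunks so that each chunk can be embedded and retrieved on its own. The sketch below shows the simplest version of that idea, fixed-size character windows with overlap. It is a deliberate simplification: `split_with_overlap` is a hypothetical function, whereas `RecursiveCharacterTextSplitter` tries to break on separators such as paragraph and line breaks before falling back to smaller units.

```python
def split_with_overlap(text, chunk_size=20, chunk_overlap=5):
    """Cut text into fixed-size character windows that overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# prints: ['abcd', 'defg', 'ghij']
```

The overlap means neighbouring chunks share some characters, which reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.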