# LangChain Samples

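LangChain is an open-source framework for building applications on top of large language models (LLMs). This notebook shows how to use LangChain with Google Cloud Platform (GCP): it loads a PDF from Cloud Storage, splits it into chunks, embeds the chunks into a vector store, retrieves the chunks relevant to a question, and summarizes them with a Vertex AI model.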

In [89]:
# Install dependencies
%pip install --upgrade -r requirements.txt

Defaulting to user installation because normal site-packages is not writeable

[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [90]:
# Global Variables (CHANGE THESE)
PROJECT_ID = "PROJECT-ID"

RAW_BUCKET = "GCS-BUCKET"

In [91]:
# Load documents from Cloud Storage into LangChain documents

from langchain.document_loaders import GCSFileLoader

loader = GCSFileLoader(project_name=PROJECT_ID, bucket=RAW_BUCKET, blob="RulesOfMachineLearning.pdf")
documents = loader.load()

In [92]:
# Split the documents into chunks so we can embed them

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(documents)
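
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a simplification, not how `RecursiveCharacterTextSplitter` is implemented: the real splitter tries to break on separators such as paragraphs and sentences, while this hypothetical `split_with_overlap` helper just slides a fixed-size window over the text. The returned start offsets play the role of `add_start_index=True`.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Return (start_index, chunk) pairs from a fixed-size sliding window.

    Hypothetical helper for illustration only; it ignores separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Toy input: 2500 characters made of repeating digits
sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample)

for start, chunk in pieces:
    print(start, len(chunk))
```

With these settings, each chunk shares its last 200 characters with the start of the next one, so a sentence that straddles a chunk boundary is still seen whole by at least one chunk.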

In [93]:
# Let's see how many chunks we have

len(all_splits)

86

In [94]:
# We now create a vector store that embeds each chunk so we can retrieve relevant chunks by similarity

from langchain.embeddings import VertexAIEmbeddings
from langchain.vectorstores import Chroma

vectorstore = Chroma.from_documents(documents=all_splits, embedding=VertexAIEmbeddings())

In [95]:
# Now we create a retriever which we can use to actually search the vectorstore

retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
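
Conceptually, a similarity retriever embeds the query, scores it against every stored chunk embedding, and returns the `k` best matches. The sketch below illustrates that idea with toy 2-dimensional vectors and cosine similarity. These are assumptions for illustration: the actual vector store uses its own index and distance metric, and `cosine_similarity` and `top_k` are hypothetical helpers, not LangChain or Chroma functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=6):
    """Return the indices of the k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(doc_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    return [idx for _, idx in scored[:k]]


# Toy "embeddings" standing in for the real chunk embeddings
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
toy_query = [1.0, 0.1]

ranking = top_k(toy_query, toy_docs, k=2)
print(ranking)
```

A larger `k` gives the downstream chain more context at the cost of longer prompts; the notebook uses `k=6`.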

In [96]:
# Let's ask a question to our retriever

retrieved_docs = retriever.get_relevant_documents(
    "What is the first rule of machine learning?"
)

In [97]:
# Let's see what we retrieved!

print(retrieved_docs[0].page_content)

11/26/23, 1:49 PM

Rules of Machine Learning: | Google for Developers

Rules of Machine Learning:

Best Practices for ML Engineering

Martin Zinkevich

This document is intended to help those with a basic knowledge of machine learning get the benefit of Google's best practices in machine learning. It presents a style for machine learning, similar to the Google C++ Style Guide and other popular guides to practical programming. If you have taken a class in machine learning, or built or worked on a machine-learned model, then you have the necessary background to read this document.

Rules of ML Rules of ML

Martin Zinkevich introduces 10 of his favorite rules of machine learning. Read on to learn all 43 rules!

Terminology

The following terms will come up repeatedly in our discussion of effective machine learning:


In [98]:
# Now we'll create a Vertex AI LLM
from langchain.llms import VertexAI

llm = VertexAI(temperature=0)

In [103]:
# Now we will create a chain to summarize the documents retrieved from our retriever

from langchain.chains.summarize import load_summarize_chain

chain = load_summarize_chain(llm, chain_type="refine")
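
The `"refine"` chain type builds a summary incrementally: it summarizes the first document, then asks the model to update that summary with each following document in turn. The sketch below shows the control flow only. The prompt wording is hypothetical (it is not LangChain's actual template), and `fake_llm` is a stand-in so the example runs without calling Vertex AI.

```python
def refine_summarize(chunks, llm):
    """Summarize the first chunk, then refine the summary chunk by chunk.

    `llm` is any callable that maps a prompt string to a completion string.
    """
    if not chunks:
        return ""
    summary = llm("Write a concise summary of the following:\n" + chunks[0])
    for chunk in chunks[1:]:
        # Each step sees only the running summary plus one new chunk
        summary = llm(
            "Existing summary:\n" + summary
            + "\nRefine it using this additional context:\n" + chunk
        )
    return summary


# Stand-in model that records prompts instead of calling a real LLM
calls = []


def fake_llm(prompt):
    calls.append(prompt)
    return f"summary-v{len(calls)}"


result = refine_summarize(["chunk a", "chunk b", "chunk c"], fake_llm)
print(result, len(calls))
```

Because each step depends on the previous summary, this approach makes one sequential model call per document, so six retrieved chunks mean six calls.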

In [104]:
summary = chain.run(retrieved_docs)

In [105]:
print(summary)

 This document presents a style guide for machine learning, similar to the Google C++ Style Guide and other popular guides to practical programming. It introduces 10 of the author's favorite rules of machine learning and provides a glossary of relevant terminology. The guide is intended to help those with a basic knowledge of machine learning get the benefit of Google's best practices in the field. If you have taken a class in machine learning, or built or worked on a machine-learned model, then you have the necessary background to read this document.

Machine learning (ML) is a subfield of artificial intelligence (AI) based on the idea
