<a href="https://colab.research.google.com/github/olonok69/LLM_Notebooks/blob/main/langchain/use_cases/Langchain_OpenAI_Use_cases_Q%26A.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangChain

LangChain is a framework for developing applications powered by language models.

https://python.langchain.com/docs/use_cases

## LangChain QA

https://python.langchain.com/docs/get_started/introduction

https://python.langchain.com/docs/use_cases/question_answering/

One of the most powerful applications enabled by LLMs is the question-answering (Q&A) chatbot: an application that answers questions about specific source information. Such applications rely on a technique known as Retrieval-Augmented Generation (RAG), which retrieves the passages most relevant to a question and passes them to the model as context.
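
The retrieve-then-generate idea can be sketched without any LLM or vector store. The example below is a minimal illustration, not LangChain's implementation: it assumes a toy word-overlap score in place of embeddings, and the helper names (`tokenize`, `retrieve`, `build_prompt`) and the sample passages are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved passages in front of the question."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical sample data, for illustration only
passages = [
    "Golf is won by the player with the fewest strokes.",
    "Chroma is a vector store for embeddings.",
]
question = "How is golf won?"

top = retrieve(question, passages)
rag_prompt = build_prompt(question, top)
print(rag_prompt)
```

In a real pipeline the word-overlap score would be replaced by embedding similarity, and `rag_prompt` would be sent to the LLM.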


https://python.langchain.com/docs/modules/model_io/prompts/

https://python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter

https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf

https://www.promptingguide.ai/techniques/knowledge



In [1]:
!pip install langchain langchain-community tiktoken -q
!pip install -U accelerate -q
!pip install -U unstructured numpy -q
!pip install openai chromadb -q
!pip install pypdf -q

In [2]:

from google.colab import output
output.enable_custom_widget_manager()

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
from google.colab import userdata
openai_api_key = userdata.get('KEY_OPENAI')

In [5]:
from langchain_community.llms import OpenAI
from langchain.prompts import PromptTemplate
import pprint


llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo-instruct', openai_api_key=openai_api_key)

# Create our template
template = """
%INSTRUCTIONS: Answer the question using the provided context. Make your answer as detailed as possible.
%CONTEXT:
The objective of golf is to play a set of holes in the least number of strokes. A round of golf typically consists of 18 holes.
Each hole is played once in the round on a standard golf course. Each stroke is counted as one point, and the total number of strokes is used to determine the winner of the game.

Golf is a precision club-and-ball sport in which competing players (or golfers) use many types of clubs to hit balls into a series of holes on a course using the fewest number of strokes.
The goal is to complete the course with the lowest score, which is calculated by adding up the total number of strokes taken on each hole. The player with the lowest score wins the game.
%QUESTION:
{text}
"""

# Create LangChain prompt template
prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)



In [6]:
sample1 = """
Part of golf is trying to get a higher point total than others. Yes or No?
"""

In [7]:
final_prompt = prompt.format(text=sample1)

print(final_prompt)



%INSTRUCTIONS: Answer the question using the provided context. Make your answer as detailed as possible.
%CONTEXT:
The objective of golf is to play a set of holes in the least number of strokes. A round of golf typically consists of 18 holes. 
Each hole is played once in the round on a standard golf course. Each stroke is counted as one point, and the total number of strokes is used to determine the winner of the game.

Golf is a precision club-and-ball sport in which competing players (or golfers) use many types of clubs to hit balls into a series of holes on a course using the fewest number of strokes. 
The goal is to complete the course with the lowest score, which is calculated by adding up the total number of strokes taken on each hole. The player with the lowest score wins the game.
%QUESTION:

Part of golf is trying to get a higher point total than others. Yes or No?
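
For a template with `{name}` placeholders, filling it in behaves much like Python's built-in `str.format`. The sketch below uses only the standard library to show the same substitution and to list the placeholders a template expects. The helper `template_variables` and the shortened template are hypothetical, and this is not how `PromptTemplate` is implemented internally.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the sorted placeholder names found in a format-style template."""
    return sorted(
        {name for _, name, _, _ in Formatter().parse(template) if name}
    )


# Shortened, hypothetical version of the template used in this notebook
template_sketch = "%CONTEXT:\n{context}\n%QUESTION:\n{text}\n"

variables = template_variables(template_sketch)
filled = template_sketch.format(
    context="The player with the lowest score wins the game.",
    text="Is a higher score better in golf? Yes or No?",
)
print(variables)
print(filled)
```

Listing the placeholders first is a cheap way to catch a missing variable before the prompt is sent to the model.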




In [8]:
output = llm.invoke(final_prompt)
pprint.pprint(output)

('\n'
 'No. The objective of golf is to play the course with the fewest number of '
 'strokes, so the goal is to have a lower point total than others.')


# Using Embeddings

https://python.langchain.com/docs/integrations/vectorstores

https://docs.trychroma.com/integrations/langchain
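
An embedding maps a piece of text to a vector, and a vector store answers a query by returning the texts whose vectors lie closest to the query vector. The sketch below illustrates that ranking step with cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the distance metric a given store uses may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors standing in for real embeddings
toy_vectors = {
    "golf scoring": [0.9, 0.1, 0.0],
    "golf clubs": [0.8, 0.2, 0.1],
    "vector databases": [0.0, 0.1, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

# Pick the stored text whose vector is most similar to the query vector
best = max(
    toy_vectors,
    key=lambda name: cosine_similarity(query_vector, toy_vectors[name]),
)
print(best)
```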



In [9]:
from langchain_community.llms import OpenAI

# The vector store we'll be using
from langchain_community.vectorstores import Chroma

# The chain that retrieves relevant documents and answers questions over them
from langchain.chains import RetrievalQA

# A simple document loader for plain-text files
from langchain_community.document_loaders import TextLoader

# The embedding engine that will convert our text to vectors
from langchain_community.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

In [10]:
text_path = "/content/drive/MyDrive/data/llm_doc.txt"

with open(text_path, 'r', encoding='windows-1252') as file:
    text = file.read()

# Pretty-print the first 1000 characters
pprint.pprint(text[:1000])

('\n'
 'Large Language Models are Zero-Shot Reasoners\n'
 '\n'
 '\n'
 '\n'
 'Takeshi Kojima\n'
 'The University of Tokyo\n'
 't.kojima@weblab.t.u-tokyo.ac.jp\n'
 '\n'
 'Shixiang Shane Gu\n'
 'Google Research, Brain Team\n'
 '\n'
 '\n'
 '\n'
 'Machel Reid\n'
 'Google Research?\n'
 '\n'
 'Yutaka Matsuo\n'
 'The University of Tokyo\n'
 '\n'
 'Yusuke Iwasawa\n'
 'The University of Tokyo\n'
 '\n'
 '\n'
 '\n'
 'Abstract\n'
 'Pretrained large language models (LLMs) are widely used in many sub-fields '
 'of natural language processing (NLP) and generally known as excellent '
 'few-shot learners with task-specific exemplars. Notably, chain of thought '
 '(CoT) prompting, a recent technique for eliciting complex multi-step '
 'reasoning through step-by- step answer examples, achieved the '
 'state-of-the-art performances in arithmetics and symbolic reasoning, '
 'difficult system-2 tasks that do not follow the standard scaling laws for '
 'LLMs. While these successes are often attributed to LLMs

In [11]:
loader = TextLoader(text_path, encoding='windows-1252')
doc = loader.load()
print(f"You have {len(doc)} document")
print(f"You have {len(doc[0].page_content)} characters in that document")

You have 1 document
You have 119397 characters in that document


In [12]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=400)
docs = text_splitter.split_documents(doc)

In [13]:
# Total number of characters across all chunks, used to compute the average chunk size
num_total_characters = sum(len(x.page_content) for x in docs)

print(f"Now you have {len(docs)} documents that have an average of {num_total_characters / len(docs):,.0f} characters (smaller pieces)")

Now you have 38 documents that have an average of 3,261 characters (smaller pieces)
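
The effect of `chunk_size` and `chunk_overlap` can be seen with a plain sliding window. This is a simplified sketch, not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break at separators such as paragraph and line breaks (one likely reason the average chunk above is shorter than 4,000 characters). The function name `chunk_text` and the sample string are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.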


In [14]:
# Get your embeddings engine ready
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed the chunks and store them, together with their raw text, in an in-memory Chroma vector store. Note: this makes API calls to OpenAI
docsearch = Chroma.from_documents(docs, embeddings)



In [15]:
len(docs[0].page_content)

3794

In [16]:
docs[0].page_content

'Large Language Models are Zero-Shot Reasoners\n\n\n\nTakeshi Kojima\nThe University of Tokyo\nt.kojima@weblab.t.u-tokyo.ac.jp\n\nShixiang Shane Gu\nGoogle Research, Brain Team\n\n\n\nMachel Reid\nGoogle Research?\n\nYutaka Matsuo\nThe University of Tokyo\n\nYusuke Iwasawa\nThe University of Tokyo\n\n\n\nAbstract\nPretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars. Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by- step answer examples, achieved the state-of-the-art performances in arithmetics and symbolic reasoning, difficult system-2 tasks that do not follow the standard scaling laws for LLMs. While these successes are often attributed to LLMs’ ability for few-shot learning, we show that LLMs are decent zero-shot reasoners by simply adding “Let’s think step by step” before ea

In [17]:
docs[0].metadata

{'source': '/content/drive/MyDrive/data/llm_doc.txt'}

In [18]:
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())

In [19]:
query = "Why Large Language models are zero reasoners?"
response = qa.invoke(query)

In [70]:
pprint.pprint(response)

{'query': 'Why Large Language models are zero reasoners?',
 'result': '\n'
           'Large language models are considered to be zero-shot reasoners '
           'because they have the ability to solve various tasks without any '
           'prior training or examples. This is achieved by simply '
           'conditioning the models on a few examples or instructions '
           'describing the task. This method, known as "prompting", has become '
           'a popular topic in natural language processing. Additionally, '
           'recent studies have shown that large language models have '
           'excellent zero-shot abilities in many tasks, such as reading '
           'comprehension, translation, and summarization. This suggests that '
           'these models have untapped and understudied fundamental zero-shot '
           'capabilities, which can be extracted by simple prompting.'}
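
The `"stuff"` chain type places all retrieved chunks into a single prompt and makes one call to the model. The sketch below shows that idea with plain strings. The prompt wording and the function name `stuff_documents` are assumptions for illustration, not LangChain's actual prompt or API.

```python
def stuff_documents(question: str, documents: list[str], separator: str = "\n\n") -> str:
    """Concatenate every retrieved chunk into one prompt for a single LLM call."""
    context = separator.join(documents)
    return (
        "Use the following pieces of context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Hypothetical retrieved chunks, for illustration only
retrieved = [
    "LLMs are decent zero-shot reasoners.",
    "Prompting conditions a model on instructions describing the task.",
]
stuffed_prompt = stuff_documents("What is prompting?", retrieved)
print(stuffed_prompt)
```

Because every chunk goes into one prompt, this approach only works while the combined text fits in the model's context window, which is one reason chunk size and the number of retrieved chunks matter.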


# Introduce a Memory Buffer

In [20]:
from langchain.memory import ConversationBufferMemory

In [21]:
template = """
Use the following context (delimited by <ctx></ctx>) and the chat history (delimited by <hs></hs>) to answer the question:
------
<ctx>
{context}
</ctx>
------
<hs>
{history}
</hs>
------
{question}
Answer:
"""
prompt = PromptTemplate(
    input_variables=["history", "context", "question"],
    template=template,
)

In [22]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=docsearch.as_retriever(),
    chain_type_kwargs={
        "verbose": False,
        "prompt": prompt,
        "memory": ConversationBufferMemory(
            memory_key="history",
            input_key="question"),
    }
)

In [23]:
query = "Why Large Language models are zero reasoners?"
response = qa.invoke(query)
pprint.pprint(response)

{'query': 'Why Large Language models are zero reasoners?',
 'result': '\n'
           'Large language models are zero-shot reasoners because they are '
           'able to solve various tasks without any examples or instructions, '
           'simply by conditioning the model on a few examples or instructions '
           'describing the task. This method of conditioning the language '
           'model is called "prompting" and has become a hot topic in natural '
           'language processing. Recent studies have shown that pre-trained '
           'models are not good at reasoning, but their ability can be '
           'substantially increased by making them produce step-by-step '
           'reasoning, either by fine-tuning or few-shot prompting. However, '
           'the context suggests that LLMs are not only good at few-shot '
           'learning, but also have decent zero-shot reasoning abilities, as '
           'demonstrated by the success of the Zero-shot-CoT method. This

In [24]:
query = "can you elaborate what it is prompting in this context?"
response = qa.invoke(query)
pprint.pprint(response)

{'query': 'can you elaborate what it is prompting in this context?',
 'result': '\n'
           'In this context, prompting refers to the method of conditioning a '
           'language model on a few examples or instructions to guide its '
           'generation of answers for desired tasks. This method has become '
           'popular in natural language processing and has been shown to '
           'significantly improve the performance of large language models. '
           'The context suggests that prompting can be used to enhance both '
           'few-shot and zero-shot reasoning abilities of LLMs, making them '
           'more adept at solving complex tasks.'}
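
The follow-up question above can only be answered sensibly because earlier turns are injected into the `{history}` slot of the prompt. A buffer memory can be pictured as a list of past turns rendered back into text. The class below is a hypothetical, simplified stand-in for that behaviour, not the `ConversationBufferMemory` implementation, and the `Human:`/`AI:` prefixes are an assumed format.

```python
class BufferMemorySketch:
    """Keep every question/answer pair and render them as a transcript."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def save(self, question: str, answer: str) -> None:
        """Append one completed exchange to the buffer."""
        self.turns.append((question, answer))

    def history(self) -> str:
        """Return all stored turns as text suitable for a {history} slot."""
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)


memory = BufferMemorySketch()
memory.save("What is prompting?", "Conditioning a model on instructions.")
memory.save("Can you elaborate?", "It guides the model toward the desired task.")

history_text = memory.history()
print(history_text)
```

Since the buffer grows with every turn, long conversations eventually need trimming or summarising to stay within the context window.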


In [25]:
book_path = "/content/drive/MyDrive/books/Generative AI on AWS.pdf"

In [26]:
from langchain_community.document_loaders import PyPDFLoader

In [27]:
loader = PyPDFLoader(book_path)
pages = loader.load_and_split()

In [28]:
pages[1]

Document(page_content='DATA“I am very excited about \nthis book —it has a great \nmix of all-important \nbackground/theoretical \ninfo and detailed, \nhands-on code, scripts, \nand walk-throughs. I \nenjoyed reading it, and  \nI know that you will too!” \n—Jeff Barr\nVP and Chief Evangelist @ AWSGenerative AI on AWS\nTwitter: @oreillymedia\nlinkedin.com/company/oreilly-media\nyoutube.com/oreillymedia Companies today are moving rapidly to integrate generative  \nAI into their products and services. But there’s a great deal  \nof hype (and misunderstanding) about the impact and \npromise of this technology. With this book, Chris Fregly,  \nAntje Barth, and Shelbee Eigenbrode from AWS help CTOs,  \nML practitioners, application developers, business analysts, \ndata engineers, and data scientists find practical ways to  \nuse this exciting new technology.\nYou’ll learn the generative AI project life cycle including  \nuse case definition, model selection, model fine-tuning, \nretrieval-aug

In [29]:
len(pages)

309

In [30]:
ids = docsearch.add_documents(pages)
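
Adding the PDF pages to the existing store means later queries search the paper and the book together. The toy store below sketches that behaviour: each added document receives an identifier, and documents from different sources live side by side. The class `ToyStore`, its methods, and the sample records are hypothetical and do not reflect Chroma's internals.

```python
import uuid


class ToyStore:
    """A dictionary-backed stand-in for a vector store (no embeddings)."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def add_documents(self, documents: list[dict]) -> list[str]:
        """Store each document under a fresh identifier and return the identifiers."""
        ids = []
        for doc in documents:
            doc_id = str(uuid.uuid4())
            self.records[doc_id] = doc
            ids.append(doc_id)
        return ids

    def sources(self) -> list[str]:
        """List the distinct sources currently held in the store."""
        return sorted({doc["source"] for doc in self.records.values()})


store = ToyStore()
store.add_documents([{"text": "zero-shot reasoning", "source": "llm_doc.txt"}])
new_ids = store.add_documents([
    {"text": "fine-tuning", "source": "book.pdf", "page": 1},
    {"text": "prompting", "source": "book.pdf", "page": 2},
])
print(len(new_ids), store.sources())
```

Keeping a `source` (and, for PDFs, a `page`) field on each record makes it possible to trace an answer back to the document it came from.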

In [31]:
query = "how we can fine tune a foundation model?"
response = qa.invoke(query)
pprint.pprint(response)

{'query': 'how we can fine tune a foundation model?',
 'result': '\n'
           'Fine-tuning a foundation model involves adapting it to a specific '
           'dataset or use case. This can be done by presenting a mix of '
           "instructions across many different tasks to maintain the model's "
           'ability to serve as a general-purpose generative model. It is '
           'recommended to establish a set of baseline evaluation metrics and '
           "compare the model's output before and after fine-tuning to measure "
           'its effectiveness. This process can be less costly than '
           'pretraining a model from scratch and can be done using techniques '
           'such as full fine-tuning or parameter-efficient methods.'}
