<a href="https://colab.research.google.com/github/XiaoHuang0803/ColabNotebooks/blob/main/LC02_langchain_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Install the LangChain integration packages for community loaders, OpenAI, and Pinecone
!pip install langchain_community
!pip install langchain_openai
!pip install langchain_pinecone


In [25]:
import os

from langchain import hub
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents.stuff import create_stuff_documents_chain
from langchain_community.document_loaders import TextLoader
from langchain_core.prompts import PromptTemplate
from langchain_core.prompts.base import format_document
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_text_splitters import CharacterTextSplitter
from pinecone import Pinecone, ServerlessSpec

In [3]:
# Set the index name and read the API keys from Colab secrets
from google.colab import userdata

INDEX_NAME = "medium-blogs-embedding-index"
PINECONE_API_KEY = userdata.get('pineConeAccessKey')
OPENAI_API_KEY = userdata.get('openAIAccessKey')
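
`google.colab.userdata` is only importable inside a Colab runtime. A minimal sketch of a fallback for running the notebook elsewhere, assuming the same secret names are exported as environment variables (the helper name `get_secret` is hypothetical, not part of any library):

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from Colab's secret store, else from the environment."""
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        return userdata.get(name)

    # Assumption: outside Colab the secret is exported under the same name.
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"secret {name!r} not found in Colab or the environment")
    return value
```

With this helper, the key lookups could be written as `get_secret('pineConeAccessKey')` and `get_secret('openAIAccessKey')`.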

In [5]:
print("loading ...")
loader = TextLoader("./mediumblog1.txt")
document = loader.load()  # returns a list of Document objects

loading ...


In [6]:
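print("splitting ...")
# Split on the default separator and pack the pieces into chunks of roughly 1000 characters, with no overlap
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(document)
print(f"created {len(texts)} chunks.")
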
splitting ...
created 20 chunks.
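
To make the effect of `chunk_size` concrete, here is a simplified stand-in for separator-based splitting with `chunk_overlap=0`. It is a sketch of the general idea, not the library's implementation, and `split_on_separator` is a hypothetical name:

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole rather than cut mid-paragraph.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Adding this piece would overflow, so close the current chunk.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
print(split_on_separator(sample, chunk_size=10))
# → ['aaaa\n\nbbbb', 'cccc']
```

The first two paragraphs fit within 10 characters together, so they share a chunk; the third would overflow and starts a new one.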


In [7]:
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
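
An embedding model maps text to a vector so that related texts score high under a similarity measure, and that score is what a vector store ranks on at query time. The toy sketch below uses word counts as a stand-in for real embeddings and cosine similarity as the measure; every name in it is hypothetical, and the metric a real index uses depends on how the index was created:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "pinecone is a vector database",
    "bananas are rich in potassium",
    "a vector database stores embeddings",
]
print(top_k("what is a vector database", chunks, k=2))
# → ['pinecone is a vector database', 'a vector database stores embeddings']
```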

In [8]:
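print("ingesting ...")
# langchain_pinecone reads the API key from the PINECONE_API_KEY environment variable
os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY
# Embed every chunk and upsert the vectors into the index named INDEX_NAME
PineconeVectorStore.from_documents(
    texts, embeddings, index_name=INDEX_NAME
)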

ingesting ...


<langchain_pinecone.vectorstores.PineconeVectorStore at 0x78d49459c110>

In [9]:
# Chat model shared by the examples below
llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY)

In [10]:
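# Baseline without RAG: send the question straight to the LLM.
# With no retrieved context, the model can answer confidently but incorrectly.
query = "What is pinecone in machine learning?"
chain = PromptTemplate.from_template(template=query) | llm
result = chain.invoke(input={})
print(result.content)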

In machine learning, a pinecone is a specialized hardware accelerator that is designed to speed up the training and inference processes of machine learning models. Pinecone accelerators are often used in data centers to improve the efficiency and performance of machine learning tasks, such as image recognition, natural language processing, and speech recognition. These hardware accelerators are optimized for parallel processing and are typically more powerful and energy-efficient compared to traditional CPUs and GPUs.


In [19]:
# Build a RAG chain from the prebuilt helpers
vectorstore = PineconeVectorStore(index_name=INDEX_NAME, embedding=embeddings)
retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")
combine_docs_chain = create_stuff_documents_chain(llm, retrieval_qa_chat_prompt)
retrieval_chain = create_retrieval_chain(
    retriever=vectorstore.as_retriever(),
    combine_docs_chain=combine_docs_chain,
)
result = retrieval_chain.invoke({"input": query})
print(result["answer"])



Pinecone is a fully managed cloud-based vector database designed to enable businesses and organizations to create and deploy large-scale machine learning applications. It offers features such as efficient retrieval of similar data points based on their vector representations, infrastructure management, high query throughput, low latency search, security compliance, user-friendly API integration for storing and retrieving vector data, real-time updates, and syncing capabilities with tools like Airbyte and monitoring with Datadog.
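
The dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved chunks under the `"context"` key, which helps when checking what grounded an answer. A minimal sketch of a preview helper follows; `show_sources` is a hypothetical name, and the stand-in result uses `SimpleNamespace` objects in place of real `Document` instances so that it runs without any API calls:

```python
from types import SimpleNamespace


def show_sources(result: dict, width: int = 80) -> list[str]:
    """Return one preview line per retrieved chunk in a retrieval-chain result."""
    lines = []
    for i, doc in enumerate(result.get("context", []), start=1):
        source = doc.metadata.get("source", "unknown")
        preview = doc.page_content[:width].replace("\n", " ")
        lines.append(f"[{i}] {source}: {preview}")
    return lines


# Stand-in for the output of retrieval_chain.invoke({"input": query}).
fake_result = {
    "input": "What is Pinecone?",
    "context": [
        SimpleNamespace(
            page_content="Pinecone is a vector database.",
            metadata={"source": "./mediumblog1.txt"},
        ),
    ],
    "answer": "A vector database.",
}
print("\n".join(show_sources(fake_result)))
# → [1] ./mediumblog1.txt: Pinecone is a vector database.
```

In the notebook itself, `show_sources(result)` could be called on the real result in the same way, assuming the stored chunks kept their `source` metadata.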


In [29]:
# Build the same RAG pipeline with LCEL and a custom prompt
template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum and keep the answer as concise as possible. Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
custom_rag_prompt = PromptTemplate.from_template(template)


def format_docs(docs):
    """Join the retrieved documents' text into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": vectorstore.as_retriever() | format_docs, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
)
res = rag_chain.invoke(query)
# res is an AIMessage, so this prints its full repr; res.content holds only the answer text
print(res)

content='Pinecone is a fully managed cloud-based vector database designed for large-scale ML applications, providing efficient retrieval of similar data points based on vector representations, high query throughput, and real-time updates. It offers user-friendly interfaces, infrastructure management, and security features, making it accessible and secure for businesses and organizations. Thanks for asking!' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 67, 'prompt_tokens': 889, 'total_tokens': 956, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-281ff61d-a323-4838-9482-b89728a589cf-0' usage_metadata={'input_tokens': 889, 'output_tokens': 67, 'total_tokens': 956, 'input_tok
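
In LCEL, a dictionary placed before `|` sends the same input to each of its values and collects the results under the matching keys, and `|` feeds one step's output into the next. The toy re-implementation below mimics that data flow in plain Python; it is not LangChain's code, and `Step`, `parallel`, and the stand-in steps are hypothetical names:

```python
class Step:
    """Wrap a function so steps can be chained with `|`, mimicking LCEL's pipe syntax."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # Run self first, then feed its output to the next step.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)


def parallel(**branches: Step) -> Step:
    """Send the same input to every branch and collect the results in a dict."""
    return Step(lambda x: {name: step.invoke(x) for name, step in branches.items()})


retrieve = Step(lambda q: ["doc about " + q])      # stand-in for the retriever
join_docs = Step(lambda docs: "\n\n".join(docs))   # plays the role of format_docs
passthrough = Step(lambda x: x)                    # plays the role of RunnablePassthrough
fill_prompt = Step(lambda d: f"{d['context']}\n\nQuestion: {d['question']}")

chain = parallel(context=retrieve | join_docs, question=passthrough) | fill_prompt
print(chain.invoke("pinecone"))
# prints "doc about pinecone", a blank line, then "Question: pinecone"
```

The query string reaches both branches unchanged: one branch turns it into context text, the other passes it through as the question.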