<center>

<img src="https://kajabi-storefronts-production.kajabi-cdn.com/kajabi-storefronts-production/file-uploads/sites/2148158644/images/85d1d5-f44-11de-03a0-b8b70657ae6f_pinecone.jpeg" width=80%>
</center>

**[Pinecone](https://www.pinecone.io/)** is a fully managed vector database built specifically for **high-performance vector search**. It helps you **store, index, and search vector embeddings** at scale—without managing the infrastructure yourself.

To use **Pinecone** for semantic vector storage and search, we’ll need:

* `langchain`: chains and embedding integrations
* `pinecone-client`: client for talking to **Pinecone DB**
* One embedding provider, either:
    * `openai` + `tiktoken`: for **OpenAI embeddings**, or
    * `google-generativeai`: for **Gemini embeddings**

In [6]:
# Install the required libraries (this notebook uses Gemini embeddings)
!pip install langchain pinecone-client pypdf -q
!pip install "google-ai-generativelanguage>=0.6.18,<0.7.0" langchain-google-genai -q


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3 -m pip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3 -m pip install --upgrade pip[0m


In [None]:
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter


In [2]:
loader = PyPDFLoader("../assets/MachineTranslationwithAttention.pdf")
pages = []
async for page in loader.alazy_load():
    pages.append(page)

In [3]:
pages[0].metadata

{'producer': 'pdfTeX-1.40.21',
 'creator': 'LaTeX with hyperref',
 'creationdate': '2021-11-05T20:59:50+00:00',
 'author': '',
 'keywords': '',
 'moddate': '2021-11-05T20:59:50+00:00',
 'ptex.fullbanner': 'This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2',
 'subject': '',
 'title': '',
 'trapped': '/False',
 'rgid': 'PB:355917108_AS:1086893412356097@1636146991241',
 'source': '../assets/MachineTranslationwithAttention.pdf',
 'total_pages': 11,
 'page': 0,
 'page_label': '1'}

In [4]:
pages[1].page_content

'Neural Machine Translation with Attention\nMohammad Wasil Saleem\nMatrikel-Nr.: 805779\nUniversit¨at Potsdam\nsaleem1@uni-potsdam.de\nSandeep Uprety\nMatrikel-Nr. 804982\nUniversit¨at Potsdam\nuprety@uni-potsdam.de\nAbstract\nIn recent years, the success achieved\nthrough neural machine translation has\nmade it mainstream in machine translation\nsystems. In this work, encoder-decoder\nwith attention system based on ”Neural\nMachine Translation by Jointly Learning\nto Align and Translate” by Bahdanau et al.\n(2014) has been used to accomplish the\nMachine Translation between English and\nSpanish Language which has not seen\nmuch research work done as compared\nto other languages such as German and\nFrench. We aim to demonstrate the re-\nsults similar to the breakthrough paper on\nwhich our work is based on. We achieved\na BLEU score of 25.37, which was close\nenough to what Bahdanau et al. (2014)\nachieved in their work.\n1 Introduction\nMachine Translation (MT) is the task of translat

In [5]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=25
)
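
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch that uses a fixed-size sliding window. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines), and the helper name `sliding_chunks` is hypothetical. It only illustrates how consecutive chunks share a few characters so that context is not lost at the boundary.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# The last 3 characters of each chunk reappear at the start of the next one.
print(chunks[0][-3:], chunks[1][:3])
```

With `chunk_size=500` and `chunk_overlap=25`, as configured above, neighbouring chunks can share up to 25 characters.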

In [6]:
text_chunks = text_splitter.split_documents(pages)
text_chunks[:3]

[Document(metadata={'producer': 'pdfTeX-1.40.21', 'creator': 'LaTeX with hyperref', 'creationdate': '2021-11-05T20:59:50+00:00', 'author': '', 'keywords': '', 'moddate': '2021-11-05T20:59:50+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2', 'subject': '', 'title': '', 'trapped': '/False', 'rgid': 'PB:355917108_AS:1086893412356097@1636146991241', 'source': '../assets/MachineTranslationwithAttention.pdf', 'total_pages': 11, 'page': 0, 'page_label': '1'}, page_content='See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/355917108\nNeural Machine Translation with Attention\nTechnical Report · August 2021\nDOI: 10.13140/RG.2.2.29381.37607/1\nCITATION\n1\nREADS\n5,448\n2 authors:\nMohammad Wasil Saleem\nUniversität Potsdam\n4 PUBLICATIONS\xa0\xa0\xa02 CITATIONS\xa0\xa0\xa0\nSEE PROFILE\nSandeep Uprety\nUniversität Potsdam\n1 PUBLICATION\xa0\xa0\xa01 CITATION\xa0\xa0\xa0

In [7]:
print(text_chunks[3].page_content)

(2014) has been used to accomplish the
Machine Translation between English and
Spanish Language which has not seen
much research work done as compared
to other languages such as German and
French. We aim to demonstrate the re-
sults similar to the breakthrough paper on
which our work is based on. We achieved
a BLEU score of 25.37, which was close
enough to what Bahdanau et al. (2014)
achieved in their work.
1 Introduction
Machine Translation (MT) is the task of translat-


In [8]:
print(text_chunks[4].page_content)

ing text without human assistance while preserv-
ing the meaning of input text. The early approach
to machine translation relied heavily on hand-
crafted translation rules and linguistic knowledge.
Started in early around 1950s, unlike rule-based
machine translation, Statistical machine transla-
tion (SMT) generated translations based on statis-
tical models whose parameters are derived from
the analysis of bilingual text corpora (Koehn et al.,
2003). Though reliable, for SMT, it can be hard


In [9]:
import os
from dotenv import load_dotenv

# Load the .env file
load_dotenv()

GOOGLE_API_KEY = os.getenv("GEMINI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_API_ENV = os.getenv("PINECONE_API_ENV")
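
`os.getenv` silently returns `None` when a variable is missing, which tends to surface later as a confusing authentication error. A minimal sketch of a fail-fast check follows; the helper name `require_env` and the `DEMO_API_KEY` variable are hypothetical and used only for illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Demo with a throwaway variable so the example runs without real credentials.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_env("DEMO_API_KEY"))
```

In this notebook the same helper could wrap the lookups above, for example `require_env("PINECONE_API_KEY")`.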

In [41]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI

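# Pass the key explicitly: it was read from GEMINI_API_KEY, not GOOGLE_API_KEY
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    google_api_key=GOOGLE_API_KEY,
)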
vector = embeddings.embed_query("hello, world!")
vector[:5]


[0.014134909026324749,
 -0.022324152290821075,
 -0.054603420197963715,
 -0.006284549366682768,
 -0.03392402455210686]

In [42]:
len(vector)     # Dimension of the embedding model

768
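
A vector search ranks stored embeddings by how close they are to the query embedding. The sketch below shows the idea with cosine similarity on toy 3-dimensional vectors instead of the 768-dimensional ones above. Cosine is an assumption here: the metric an index actually uses is chosen when the index is created. The function names `cosine_similarity` and `top_k` are hypothetical, and a real vector database uses approximate indexes rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [vec_id for vec_id, _ in ranked[:k]]


# Toy "index": ids mapped to made-up embeddings.
stored = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.0, 0.0], stored, k=2))
```

The dimension check mirrors a real constraint: every vector written to an index must have the same length as the index was created with.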

In [None]:
# Install the LangChain integration for Pinecone
!pip install langchain-pinecone -q

In [None]:
from langchain_pinecone.vectorstores import Pinecone as PC
from pinecone import Pinecone

pc = Pinecone(api_key=PINECONE_API_KEY)

index_name = "generative-ai"
index = pc.Index(index_name)
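
The cell above assumes that an index named `generative-ai` already exists and that its dimension matches the embedding model (768 here). If it does not exist yet, something like the following sketch could create it first. It assumes a recent `pinecone` client that exposes `list_indexes().names()` and `create_index(name=..., dimension=..., metric=..., spec=...)`; the helper name `ensure_index` is hypothetical, and the `spec` (for example a `ServerlessSpec` with a cloud and region) depends on your account, so it is passed in rather than hard-coded.

```python
def ensure_index(pc, index_name, dimension, spec, metric="cosine"):
    """Create the index only if it is missing, then return a handle to it.

    `pc` is expected to behave like a `pinecone.Pinecone` client (assumption).
    """
    existing = pc.list_indexes().names()
    if index_name not in existing:
        pc.create_index(
            name=index_name,
            dimension=dimension,  # must equal len(vector) from the embedding model
            metric=metric,
            spec=spec,
        )
    return pc.Index(index_name)


# Hypothetical usage (cloud and region are placeholders, not values from this notebook):
# from pinecone import ServerlessSpec
# index = ensure_index(pc, "generative-ai", 768, ServerlessSpec(cloud="aws", region="us-east-1"))
```

Keeping the client and spec as parameters also makes the helper easy to exercise with a stub object before touching a real account.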

### Create Embeddings for Each Text Chunk

In [44]:
docsearch = PC.from_texts(
    [t.page_content for t in text_chunks],
    embeddings,
    index_name=index_name
)

In [45]:
docsearch

<langchain_pinecone.vectorstores.Pinecone at 0x79ad18329610>

In [55]:
query = "What is attention mechanism?"

docs = docsearch.similarity_search(query)
docs

[Document(id='3b2a3d55-1d6e-4901-8e8f-9bb13f87648d', metadata={}, page_content='coder output and all of the encoder hidden state.\nWe need to learn this alignment. Each output of\nthe decoder can selectively pick out speciﬁc ele-\nments from the sequence to produce the output.\nSo, this allows the model to focus and pay more\n”Attention” to the relevant part of the input se-\nquence.\nThe ﬁrst attention model was proposed by Bah-\ndanau et al. (2014), there are several other types of\nattention proposed, such as the one by Luong et al.'),
 Document(id='58860f51-d46d-4cf9-a85a-3fc324e8149c', metadata={}, page_content='Figure 2: Attention Model (Luong et al., 2015)\nuse the information from the past, present as well\nas from the future. We need the entire sequence of\ndata before we can make any predictions.\nThe architecture that we are proposing here is\nbased on the Encoder-Decoder Framework. The\nencoder takes in the input sentence and converts\nthem into a vector representation\nThe

In [49]:
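# Pass the key explicitly: it was read from GEMINI_API_KEY, not GOOGLE_API_KEY
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", google_api_key=GOOGLE_API_KEY)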

In [56]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever())
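
The `"stuff"` chain type places all retrieved chunks into a single prompt and sends that prompt to the LLM in one call. A rough sketch of the idea follows. The prompt wording and the helper name `build_stuff_prompt` are hypothetical and do not reproduce LangChain's actual template, and the two context strings are toy placeholders rather than real retrieval results.

```python
def build_stuff_prompt(question, documents):
    """Concatenate every retrieved chunk into one context block ahead of the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Toy stand-ins for the chunks a retriever would return.
retrieved = [
    "Attention lets the decoder focus on relevant parts of the input sequence.",
    "The first attention model was proposed by Bahdanau et al. (2014).",
]
prompt = build_stuff_prompt("What is attention mechanism?", retrieved)
print(prompt)
```

Because every chunk ends up in one prompt, this approach is simple but limited by the model's context window, which is one reason the chunks above are kept small.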

In [57]:
qa.invoke(query)

{'query': 'What is attention mechanism?',
 'result': 'The attention mechanism addresses the issue of the encoder not being able to memorize words at the beginning of sentences, which leads to poor translation. It retains and utilizes all the hidden states of the input sentence during the decoding phase, creating an alignment between each time step of the decoder output and the encoder hidden state. This allows the model to focus and pay more attention to the relevant parts of the input sequence.'}

In [59]:
new_query = """
How can the attention mechanism improve the translation task?
What BLEU score was reported with and without the attention mechanism?
"""

response = qa.invoke(new_query)
response["result"]

'The attention mechanism can improve the translation task because the encoder-decoder model with attention mechanism cope better with long sentences.\n\nThe BLEU score reported by Bahdanau et al. (2014) when using the attention mechanism was 26.75, training with 1000 encoder and decoder dimensions, and training on corpus of 384M words. The text mentions that the model without attention mechanism was underperforming with long sentences, but it does not specify the BLEU score.'