### Prompt templates 
- Prompt templates translate user input and parameters into instructions for a language model.
- A prompt template outputs a PromptValue, which can be passed to an LLM or a ChatModel.

In [6]:
from langchain_core.prompts import ChatPromptTemplate
system_msg = "Translate the language from English to {language}"
prompt = ChatPromptTemplate.from_messages(
    [("system",system_msg), ("user","{text}")]
)
# Note: invoke returns a PromptValue, which replaces the template stored in `prompt`
prompt = prompt.invoke({"language":"Italian","text":"hi"})
print(prompt)

messages=[SystemMessage(content='Translate the language from English to Italian', additional_kwargs={}, response_metadata={}), HumanMessage(content='hi', additional_kwargs={}, response_metadata={})]
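
Conceptually, the template substitutes the input dict into each (role, template) pair. A minimal standard-library sketch of that idea follows; `render_messages` is a hypothetical helper, not part of LangChain, and it returns plain dicts rather than message objects.

```python
def render_messages(templates, values):
    """Fill each (role, template) pair with the given values."""
    return [
        {"role": role, "content": template.format(**values)}
        for role, template in templates
    ]


templates = [
    ("system", "Translate the language from English to {language}"),
    ("user", "{text}"),
]
rendered = render_messages(templates, {"language": "Italian", "text": "hi"})
for message in rendered:
    print(message["role"], "->", message["content"])
```

The real ChatPromptTemplate also validates input variables and produces typed message objects, which this sketch leaves out.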


In [None]:
# If we want to access messages directly
prompt.to_messages()

[SystemMessage(content='Translate the language from English to Italian', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='hi', additional_kwargs={}, response_metadata={})]

In [9]:
from langchain_ollama import ChatOllama
model = ChatOllama(model="llama2")
response = model.invoke(prompt)
response.content

'Ciao! (Chow)'

### Documents 
Documents are intended to represent a unit of text and associated metadata. A Document has three attributes:

- page_content: a string representing the content;
- metadata: a dict containing arbitrary metadata;
- id: (optional) a string identifier for the document.

The metadata attribute can capture information about the source of the document, its relationship to other documents, and other information.

Note that an individual Document object often represents a chunk of a larger document.
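
The three attributes can be pictured with a small stand-in class. This is only a sketch: `SimpleDocument` is a hypothetical dataclass written for illustration, not LangChain's own Document class, and the file name and page number are made-up example values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SimpleDocument:
    """Hypothetical stand-in mirroring the three attributes listed above."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None  # optional identifier, absent by default


# One chunk of a larger document, with metadata pointing back to its source.
chunk = SimpleDocument(
    page_content="Attention mechanisms relate positions of a sequence.",
    metadata={"source": "paper.pdf", "page": 2},  # illustrative values
)
print(chunk.metadata["page"], chunk.id)
```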



## Document Loaders
Document loaders are designed to load Document objects. LangChain has hundreds of integrations with data sources such as Slack, Notion, and Google Drive.
- When working with large datasets, you can use the .lazy_load method, which yields documents one at a time instead of loading them all into memory:



In [15]:
from langchain_community.document_loaders import PyPDFLoader

file = "../attention is all you need.pdf"
loader = PyPDFLoader(file_path=file)

# Lazy alternative for large files:
# for document in loader.lazy_load():
#     print(document)

docs = loader.load()  # PyPDFLoader returns one Document per PDF page
print(docs[10].page_content[:200])
print("Length:", len(docs))

[5] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Fethi Bougares, Holger Schwenk,
and Yoshua Bengio. Learning phrase representations using rnn encoder-decoder for statistical
machine translati
Length: 15
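
The difference between loading everything and loading lazily can be sketched with a plain generator. This is an illustration of the pattern only: `lazy_pages` and `eager_pages` are hypothetical helpers, and the "pages" are placeholder strings rather than real Document objects.

```python
from typing import Iterator, List


def lazy_pages(n_pages: int) -> Iterator[str]:
    """Yield one page at a time instead of building the whole list."""
    for number in range(n_pages):
        yield f"page {number}"


def eager_pages(n_pages: int) -> List[str]:
    """Materialize every page up front, as a load-style call would."""
    return list(lazy_pages(n_pages))


# Consume only what is needed; the remaining pages are never produced.
first_two = []
for page in lazy_pages(1_000_000):
    first_two.append(page)
    if len(first_two) == 2:
        break

print(first_two)
print(len(eager_pages(15)))
```

With the lazy version, memory use stays bounded by what the loop keeps, which is why it suits large datasets.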


### Splitting
- For both information retrieval and downstream question-answering purposes, a page may be too coarse a representation.
- Our goal in the end will be to retrieve Document objects that answer an input query, and further splitting our PDF will help ensure that the meanings of relevant portions of the document are not "washed out" by surrounding text.
- Large documents can dilute important information. Splitting helps improve retrieval accuracy.

In [17]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
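text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,       # maximum characters per chunk
    chunk_overlap=200,     # characters shared between consecutive chunks
    add_start_index=True,  # record each chunk's start offset in metadata["start_index"]
)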
split_docs = text_splitter.split_documents(docs)
len(split_docs)

52
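
To see what chunk size, overlap, and start index mean, here is a deliberately simplified fixed-window splitter. It is not the recursive algorithm: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, whereas the hypothetical `split_with_overlap` below cuts purely by character count.

```python
from typing import List, Tuple


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[Tuple[int, str]]:
    """Return (start_index, chunk) pairs from a fixed-size sliding window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for start, piece in pieces:
    print(start, piece)
```

Each chunk repeats the last 4 characters of the previous one, so text that straddles a boundary still appears whole in at least one chunk.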

## Embeddings
Vector search is a common way to store and search over unstructured data (such as unstructured text). The idea is to store numeric vectors that are associated with the text. Given a query, we can embed it as a vector of the same dimension and use vector similarity metrics (such as cosine similarity) to identify related text.
https://python.langchain.com/docs/integrations/text_embedding/
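
Cosine similarity, mentioned above, is the dot product of two vectors divided by the product of their lengths. A minimal standard-library sketch follows; the three-dimensional vectors are toy values chosen for illustration, not real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # same direction, different length
orthogonal = [0.0, 1.0, 0.0]      # shares no component with the query

print(cosine_similarity(query, same_direction))  # close to 1
print(cosine_similarity(query, orthogonal))      # 0.0
```

Because the result depends only on direction, two texts of very different lengths can still score as closely related.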

In [23]:
from langchain_ollama import OllamaEmbeddings
embeddings = OllamaEmbeddings(model="llama2")
# embed_documents expects a list of strings and returns one vector per string
vectors = embeddings.embed_documents([split_docs[0].page_content])
vectors

[[0.017090062,
  0.0027207343,
  0.0023896203,
  -0.0036132417,
  -0.018104615,
  ...

## Vector stores
LangChain VectorStore objects contain methods for adding text and Document objects to the store, and querying them using various similarity metrics. They are often initialized with embedding models, which determine how text data is translated to numeric vectors.
https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html

In [27]:
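from langchain_community.vectorstores import FAISS

# Note: this indexes the page-level docs; pass split_docs to index the smaller chunks instead
vector_store = FAISS.from_documents(
    documents=docs,
    embedding=embeddings,
)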


In [32]:
# Now that our vectors are in the vector store, we can perform operations such as similarity search.
res = vector_store.similarity_search(
    "What is attention "
)
print(res[0].page_content)

1 Introduction
Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks
in particular, have been firmly established as state of the art approaches in sequence modeling and
transduction problems such as language modeling and machine translation [ 35, 2, 5]. Numerous
efforts have since continued to push the boundaries of recurrent language models and encoder-decoder
architectures [38, 24, 15].
Recurrent models typically factor computation along the symbol positions of the input and output
sequences. Aligning the positions to steps in computation time, they generate a sequence of hidden
states ht, as a function of the previous hidden state ht−1 and the input for position t. This inherently
sequential nature precludes parallelization within training examples, which becomes critical at longer
sequence lengths, as memory constraints limit batching across examples. Recent work has achieved
significant improvements in computational efficiency through facto

In [38]:
res = vector_store.similarity_search_with_score("WHat is The goal of reducing sequential computation")
doc,score = res[0]
print("Score",score)
print("Document",doc)

Score 1.6813855
Document page_content='1 Introduction
Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks
in particular, have been firmly established as state of the art approaches in sequence modeling and
transduction problems such as language modeling and machine translation [ 35, 2, 5]. Numerous
efforts have since continued to push the boundaries of recurrent language models and encoder-decoder
architectures [38, 24, 15].
Recurrent models typically factor computation along the symbol positions of the input and output
sequences. Aligning the positions to steps in computation time, they generate a sequence of hidden
states ht, as a function of the previous hidden state ht−1 and the input for position t. This inherently
sequential nature precludes parallelization within training examples, which becomes critical at longer
sequence lengths, as memory constraints limit batching across examples. Recent work has achieved
significant improvements in
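
A note on reading the score: assuming the default FAISS configuration, the value returned is a distance rather than a similarity, so a lower score means a closer match. The toy search below sketches that ranking with Euclidean (L2) distance; `search_with_score`, the two-dimensional vectors, and the chunk labels are all hypothetical illustrations, not the FAISS implementation.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of the same dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_with_score(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k stored texts closest to the query, smallest distance first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


toy_store = [
    ("chunk about attention", [0.9, 0.1]),
    ("chunk about optimizers", [0.1, 0.9]),
]
results = search_with_score([1.0, 0.0], toy_store, k=2)
for text, score in results:
    print(round(score, 3), text)
```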

## Retrievers