In [1]:
# Data ingestion: load the PDF into a list of Document objects (PyPDFLoader yields one per page)
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("attention.pdf")
docs = loader.load()

In [2]:
# Transform: split the loaded pages into overlapping chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
# chunk_size and chunk_overlap are measured in characters by default
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = splitter.split_documents(docs)
documents[:1]

[Document(page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing 
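
To make the effect of chunk_size and chunk_overlap concrete, here is a minimal sketch that uses a plain fixed-width sliding window. It is a simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraph and line breaks, so its real chunk boundaries will differ. The helper name sliding_chunks and the sample string are hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-width windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

Each printed chunk starts with the last four characters of the previous one. That shared region is what chunk_overlap=200 provides in the cell above, so a sentence cut at a chunk boundary still appears whole in at least one chunk.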

In [3]:
# Embedding: embed each chunk with a local Ollama model and index the vectors in Chroma
# (alternative: from langchain_openai import OpenAIEmbeddings)
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
db = Chroma.from_documents(documents, OllamaEmbeddings())
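
Conceptually, the vector store keeps one embedding per chunk and answers a query by returning the chunks whose embeddings lie closest to the query embedding. The sketch below illustrates that idea with hand-made 3-dimensional vectors and cosine similarity. The vectors, the chunk labels, and the top_k helper are invented for illustration, and Chroma's actual index and default distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k entries most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "vector store": (chunk text, embedding) pairs with made-up values
toy_store = [
    ("chunk about attention", [0.9, 0.1, 0.0]),
    ("chunk about training data", [0.1, 0.9, 0.1]),
    ("chunk about the optimizer", [0.0, 0.2, 0.9]),
]

print(top_k([0.2, 0.8, 0.0], toy_store, k=2))
```

With real embeddings the vectors have hundreds or thousands of dimensions, but the ranking step follows the same principle.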

In [4]:
# LLM: llama2 served by a local Ollama instance
from langchain_community.llms import Ollama
llm = Ollama(model="llama2")
llm

Ollama()

In [5]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""Please provide an excellent answer to the user's query.
                                          <context>{context}</context>
                                          question:{input}""")

In [6]:
# Document chain: inserts the retrieved documents into {context} and calls the LLM
from langchain.chains.combine_documents import create_stuff_documents_chain
chain = create_stuff_documents_chain(llm, prompt)

# Retriever backed by the vector store
ret = db.as_retriever()

# Retrieval chain: fetch relevant chunks for {input}, then pass them to the document chain
from langchain.chains import create_retrieval_chain
chain_ret = create_retrieval_chain(ret, chain)
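
As a rough mental model, "stuffing" means concatenating the text of every retrieved document and substituting the result for {context} in the prompt. The sketch below mimics that with plain strings. The stuff_prompt helper, the TEMPLATE wording, the separator, and the sample chunks are all hypothetical; the real create_stuff_documents_chain works on Document objects and prompt templates rather than raw strings.

```python
# Hypothetical template with the same placeholders as the notebook's prompt
TEMPLATE = (
    "Answer the question using the context.\n"
    "<context>{context}</context>\n"
    "question:{input}"
)


def stuff_prompt(chunks, question, separator="\n\n"):
    """Join all chunks into one context string and fill the template."""
    context = separator.join(chunks)
    return TEMPLATE.format(context=context, input=question)


print(stuff_prompt(
    ["First retrieved chunk.", "Second retrieved chunk."],
    "What is the Transformer?",
))
```

Because every retrieved chunk goes into a single prompt, this approach only works while the combined text fits in the model's context window.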


In [7]:
res = chain_ret.invoke({'input':"WMT 2014 English-German dataset"})
res['answer']

'The question you provided is related to the comparison of self-attention layers and recurrent layers in neural machine translation models. Specifically, you are interested in understanding how these layers differ in terms of computational complexity and executed operations.\n\nTo answer your question, I will provide a brief overview of self-attention layers and recurrent layers, followed by a comparison of their computational complexity and executed operations.\n\nSelf-Attention Layers:\nSelf-attention layers are a type of neural network layer that allows the model to attend to different parts of the input sequence simultaneously and weigh their importance. In a self-attention layer, the output is computed as a weighted sum of the input values, where the weights are learned during training. Self-attention layers are particularly useful in natural language processing tasks, such as machine translation, as they can capture long-range dependencies in the input sequence.\n\nRecurrent Laye

In [9]:
res = chain_ret.invoke({'input':"what is python?"})
res['answer']

"Python is a high-level programming language that is widely used for various applications such as web development, data analysis, machine learning, and more. Here are some key features of Python:\n\n1. Interpreted Language: Python is an interpreted language, meaning that it can execute code directly without the need for compiling it first. This makes it a versatile language that can be used for rapid prototyping and development.\n2. Simple Syntax: Python has a simple syntax with clear guidelines for indentation, spacing, and commenting code. This makes it easier to read and write compared to other programming languages.\n3. Large Standard Library: Python has a vast standard library that includes modules for various tasks such as file I/O, networking, and data structures. It also has an extensive collection of third-party libraries and frameworks that can be used to perform specific tasks.\n4. Object-Oriented Programming: Python supports object-oriented programming (OOP) concepts such a
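
This answer appears to come from the model's general knowledge rather than from the paper, since the prompt does not restrict the model to the retrieved context. One way to check what the retriever supplied is to look at the context entry that create_retrieval_chain returns alongside answer. The helper below, summarize_context, is a hypothetical sketch. It is exercised on stand-in objects (FakeDoc, with placeholder text) so that it runs without Ollama, and it assumes each document exposes page_content and a metadata dict with a page key, as PyPDFLoader typically sets.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document, used only for this illustration."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_context(result, width=60):
    """Return one 'page N: snippet' line per retrieved document."""
    lines = []
    for doc in result.get("context", []):
        page = doc.metadata.get("page", "?")
        # Collapse newlines and repeated spaces, then truncate
        snippet = " ".join(doc.page_content.split())[:width]
        lines.append(f"page {page}: {snippet}")
    return lines


# Placeholder result shaped like the dict returned by chain_ret.invoke(...)
fake_result = {
    "input": "what is python?",
    "context": [FakeDoc("placeholder chunk text\nspanning two lines", {"page": 3})],
    "answer": "...",
}

for line in summarize_context(fake_result):
    print(line)
```

In the notebook, calling summarize_context(res) would list the chunks that were passed to the model for the query above.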