In [1]:
import getpass
import os

os.environ["COHERE_API_KEY"] = getpass.getpass("Cohere Key: ")
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI Key: ")

In [2]:
def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            [f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]
        )
    )

In [3]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OpenAIEmbeddings

documents = DirectoryLoader(
    'C:/Users/fjdj0/Desktop/Coding/learning/langchain/material/',
    glob='**/*.pdf',
    show_progress=True,
    use_multithreading=True,
    silent_errors=True,
).load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 20}
)

query = "Can some some numpy code and why we need to use numpy in simple terms please"
docs = retriever.get_relevant_documents(query)
pretty_print_docs(docs)

100%|██████████| 5/5 [00:30<00:00,  6.13s/it]


Document 1:

Data Processing Using Arrays Using NumPy arrays enables you to express many kinds of data processing tasks as concise array expressions that might otherwise require writing loops. This practice of replacing explicit loops with array expressions is commonly referred to as vectorization. In general, vectorized array operations will often be one or two (or more) orders of magnitude faster than their pure Python equivalents, with the biggest impact in any kind of numerical computations. Later, in
----------------------------------------------------------------------------------------------------
Document 2:

Computation on NumPy arrays can be very fast, or it can be very slow. The key to making it fast is to use vectorized operations, generally implemented through NumPy’s universal functions (ufuncs). This section motivates the need for NumPy’s ufuncs, which can be used to make repeated calculations on array elements much more efficient. It then introduces many of the mo
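
The chunk_size=500 and chunk_overlap=100 settings in the cell above explain why the printed documents stop mid-sentence: each document is one chunk of roughly 500 characters. The sketch below illustrates the overlap idea with a plain fixed-size sliding window. It is a simplification and not what RecursiveCharacterTextSplitter does internally (the real splitter breaks on separators such as paragraphs and spaces before merging pieces); the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size windows that share chunk_overlap characters.

    Simplified illustration only: it cuts at character offsets and ignores
    paragraph, sentence, and word boundaries.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


# Synthetic 1200-character sample so the window arithmetic is easy to follow.
sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
```

With these settings each window advances 400 characters, so the last 100 characters of one chunk reappear at the start of the next. That repetition is what lets a sentence cut at a chunk boundary still appear whole in at least one chunk.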

In [5]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank
from langchain_community.llms import OpenAI

llm = OpenAI(temperature=0)
compressor = CohereRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

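# Note: this query is unrelated to the indexed PDFs, so the reranker can
# only return loosely related passages (see the output below).
compressed_docs = compression_retriever.get_relevant_documents(
    "What did the president say about Ketanji Jackson Brown"
)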
pretty_print_docs(compressed_docs)

Document 1:

data, the Pandas package is a much better choice, and we’ll dive into a full discussion of it in the next chapter.
----------------------------------------------------------------------------------------------------
Document 2:

no means best practice for presenting data, but rather included as a demonstration of some of the available options.
----------------------------------------------------------------------------------------------------
Document 3:

dataset to exploit for this purpose, since it includes the median housing prices of thousands of districts, as well as other data.
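
The cell above follows a two-stage pattern: the base retriever returns up to k=20 candidates, and the reranker then rescores them and keeps only a few (three were printed here). The sketch below mimics that flow with toy scoring functions so it runs without API keys. The helpers `retrieve_then_rerank`, `word_overlap`, and `overlap_ratio` are hypothetical stand-ins and bear no relation to how FAISS or Cohere actually score documents.

```python
def retrieve_then_rerank(query, docs, first_stage_score, rerank_score, k=20, top_n=3):
    """Keep the k best docs by a cheap score, then reorder them by a second score."""
    candidates = sorted(docs, key=lambda d: first_stage_score(query, d), reverse=True)[:k]
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:top_n]


def word_overlap(query, doc):
    """Toy first-stage score: number of distinct query words found in the doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def overlap_ratio(query, doc):
    """Toy rerank score: fraction of the doc's words that are query words."""
    words = doc.lower().split()
    if not words:
        return 0.0
    query_words = set(query.lower().split())
    return sum(w in query_words for w in words) / len(words)


toy_docs = [
    "numpy arrays make numerical code fast",
    "pandas is built on numpy arrays and adds labeled tables for many other kinds of data",
    "matplotlib draws plots",
    "why numpy",
]
reranked = retrieve_then_rerank(
    "why use numpy arrays", toy_docs, word_overlap, overlap_ratio, k=3, top_n=2
)
print(reranked)
```

The first stage only decides which documents get a second look; the final order comes entirely from the second score. A reranker can therefore promote a passage the base retriever ranked low, but it cannot recover one that never made the candidate list.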


In [7]:
from langchain.chains import RetrievalQA
import markdown
chain = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0), retriever=compression_retriever
)
chain({"query": query})

{'query': '<p>Can some some numpy code and why we need to use numpy in simple terms please</p>',
 'result': '\nNumpy is a library in Python that allows us to efficiently store and manipulate numerical data. It has fast array-processing capabilities and can be used to pass data between different algorithms. It also has universal functions (ufuncs) that can perform fast element-wise arithmetic operations on arrays, making calculations much more efficient. This is why we use numpy - to make our numerical computations faster and more efficient.'}
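
RetrievalQA.from_chain_type uses the "stuff" chain type by default, which places all retrieved passages into a single prompt ahead of the question. A minimal sketch of that idea follows. The template wording and the function name `build_stuffed_prompt` are assumptions made for illustration and are not LangChain's actual prompt.

```python
def build_stuffed_prompt(question, passages):
    """Join all passages into one context block and append the question."""
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages standing in for the reranked documents.
prompt = build_stuffed_prompt(
    "Why use NumPy?",
    [
        "NumPy arrays support vectorized operations.",
        "Ufuncs apply element-wise arithmetic.",
    ],
)
print(prompt)
```

Because every passage ends up in one prompt, trimming the candidate list with a reranker before this step keeps the prompt short and limits how much irrelevant text the model sees.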

