In [23]:
import os

from langchain_community.document_loaders import UnstructuredFileLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA

In [None]:
os.environ["GROQ_API_KEY"] = "gsk.....PTO"
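
Hardcoding the key in a notebook cell makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key is either already exported in the shell or typed in interactively; the helper name `ensure_api_key` is hypothetical and not part of any library:

```python
import os
from getpass import getpass


def ensure_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return the key from the environment, prompting once if it is missing."""
    key = os.environ.get(name)
    if not key:
        # getpass hides the typed value so it does not end up in the cell output
        key = getpass(f"Enter {name}: ")
        os.environ[name] = key
    return key
```

Calling `ensure_api_key()` before constructing the LLM client would leave `GROQ_API_KEY` set in the environment without the value ever appearing in the notebook source.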

In [25]:
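# Fetch the PDF from the URL
import requests

url = "https://dspmuranchi.ac.in/pdf/Blog/Python%20Built-In%20Functions.pdf"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on an HTTP error instead of saving an error page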

In [26]:
# Save the PDF to a local file
with open("python_inbuildfunction.pdf", "wb") as f:
    f.write(response.content)
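
A server can return an HTML error page with a success status, in which case the saved file would not be a PDF at all. A rough sanity check, assuming it is enough to look for the `%PDF-` marker at the start of the payload; the function name `looks_like_pdf` is hypothetical:

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF header marker."""
    return data.startswith(b"%PDF-")


# Toy payloads for illustration only
print(looks_like_pdf(b"%PDF-1.7\n..."))      # a PDF-style header
print(looks_like_pdf(b"<html>Not found"))    # an HTML error page
```

In this notebook the check could be applied to `response.content` before writing the file.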

In [27]:
# Load the document
loader = UnstructuredFileLoader("python_inbuildfunction.pdf")

In [28]:
documents = loader.load()
documents

[Document(metadata={'source': 'python_inbuildfunction.pdf'}, page_content='Python Built-In Functions\n\nGaurav Kr. suman\n\nMIT5\n\n1. abs()\n\nThe abs() is one of the most popular Python built -in functions, which returns the absolute value of a number. A negative value’s absolute is that value is positive.\n\n>>> abs(-7)\n\n7\n\n>>> abs(7)\n\n7\n\n>>> abs(0)\n\n2. all()\n\nThe all() function takes a container as an argument. This Built in Functions returns True if all values in a python iterable have a Boolean value of True. An empty value has a Boolean value of False.\n\n>>> all({\'*\',\'\',\'\'})\n\nFalse\n\n>>> all([\' \',\' \',\' \'])\n\nTrue\n\n3. any()\n\nLike all(), it takes one argument and returns True if, even one value in the iterable has a Boolean value of True.\n\n>>> any((1,0,0))\n\nTrue\n\n>>> any((0,0,0))\n\nFalse\n\n4. ascii()\n\nIt is important Python built-in functions, returns a printable representation of a python object (like a string or a Python list). Let’s ta

In [29]:
text_splitter = CharacterTextSplitter(
    chunk_size=2000,
    chunk_overlap=400
)
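
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window sketch in plain Python. It is not how `CharacterTextSplitter` works internally (that class splits on a separator first and then merges pieces up to the size limit); the function `sliding_chunks` is a hypothetical illustration of the overlap idea only:

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the same reason an overlap of 400 is used above: a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks.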

In [30]:
texts = text_splitter.split_documents(documents)

In [31]:
type(texts)

list

In [32]:
len(texts)

7

In [33]:
texts[4]

Document(metadata={'source': 'python_inbuildfunction.pdf'}, page_content='frozenset() returns an immutable frozenset object.\n\n8 | P a g e\n\n>>> frozenset((3,2,4))\n\nfrozenset({2, 3, 4})\n\nRead Python Sets and Booleans for more on frozenset.\n\n24. getattr()\n\ngetattr() returns the value of an object’s attribute.\n\n>>> getattr(orange,\'size\')\n\n7\n\n25. globals()\n\nThis Python built-in functions, returns a dictionary of the current global symbol table.\n\n>>> globals()\n\n{‘__name__’: ‘__main__’, ‘__doc__’: None, ‘__package__’: None, ‘__loader__’: <class ‘_frozen_importlib.BuiltinImporter’>, ‘__spec__’: None, ‘__annotations__’: {}, ‘__builtins__’: <module ‘builtins’ (built-in)>, ‘fruit’: <class ‘__main__.fruit’>, ‘orange’: <__main__.fruit object at 0x05F937D0>, ‘a’: 2, ‘numbers’: [1, 2, 3], ‘i’: (2, 3), ‘x’: 7, ‘b’: 3}\n\n26. hasattr()\n\nLike delattr() and getattr(), hasattr() Python built-in functions, returns True if the object has that attribute.\n\n>>> hasattr(orange,\'si

In [34]:
embeddings = HuggingFaceEmbeddings()

In [35]:
persist_directory = "vector_db"

In [36]:
vectordb = Chroma.from_documents(
    documents=texts,
    embedding=embeddings,
    persist_directory=persist_directory
)

Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [37]:
# retriever
retriever = vectordb.as_retriever()
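
Conceptually, the retriever embeds the query and returns the few stored chunks whose vectors are closest to it. A toy sketch of that ranking step, assuming cosine similarity and hand-written two-dimensional vectors; the metric Chroma actually uses depends on how the collection is configured, and the names `cosine_similarity` and `top_k` are hypothetical:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for chunk embeddings
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.1], doc_vecs, k=2))
# → [0, 2]
```

Because only the top few chunks are returned, a question about "all functions" may be answered from a subset of the PDF. As far as I know, the number of chunks can be raised with `vectordb.as_retriever(search_kwargs={"k": ...})`.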

In [38]:
# LLM served by Groq
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0
)

In [39]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
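
The `"stuff"` chain type places all retrieved chunks into a single prompt and sends it to the LLM in one call. A minimal sketch of that idea; the prompt wording and the function `build_stuff_prompt` are hypothetical and are not LangChain's actual template:

```python
def build_stuff_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Toy chunks standing in for retrieved page_content strings
print(build_stuff_prompt("What does abs() do?", ["chunk one", "chunk two"]))
```

This approach is simple, but the combined chunks must fit in the model's context window, which is one reason the retriever limits how many chunks it returns.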

In [40]:
# Invoke the QA chain and get a response for the user's query
query = "what are the function from this pdf "
response = qa_chain.invoke({"query": query})

In [41]:
print(response)

{'query': 'what are the function from this pdf ', 'result': 'Based on the provided text, the following are the Python built-in functions mentioned:\n\n1. `divmod()`\n2. `enumerate()`\n3. `eval()`\n4. `exec()`\n5. `filter()`\n6. `float()`\n7. `format()`\n8. `frozenset()`\n9. `getattr()`\n10. `globals()`\n\nThese functions are used for various purposes such as:\n\n- Mathematical operations (`divmod()`)\n- Enumerating iterables (`enumerate()`)\n- Evaluating expressions (`eval()`)\n- Executing code dynamically (`exec()`)\n- Filtering iterables (`filter()`)\n- Converting data types (`float()`)\n- Formatting strings (`format()`)\n- Creating immutable sets (`frozenset()`)\n- Accessing object attributes (`getattr()`)\n- Accessing global variables (`globals()`)', 'source_documents': [Document(metadata={'source': 'python_inbuildfunction.pdf'}, page_content='>>> divmod(3,7)\n\n6 | P a g e\n\n(0, 3)\n\n>>> divmod(7,3)\n\n(2, 1) If you encounter any doubt in Python Built-in Function, Please Comment

In [42]:
print(response["result"])

Based on the provided text, the following are the Python built-in functions mentioned:

1. `divmod()`
2. `enumerate()`
3. `eval()`
4. `exec()`
5. `filter()`
6. `float()`
7. `format()`
8. `frozenset()`
9. `getattr()`
10. `globals()`

These functions are used for various purposes such as:

- Mathematical operations (`divmod()`)
- Enumerating iterables (`enumerate()`)
- Evaluating expressions (`eval()`)
- Executing code dynamically (`exec()`)
- Filtering iterables (`filter()`)
- Converting data types (`float()`)
- Formatting strings (`format()`)
- Creating immutable sets (`frozenset()`)
- Accessing object attributes (`getattr()`)
- Accessing global variables (`globals()`)


In [43]:
print(response["source_documents"][0].metadata["source"])

python_inbuildfunction.pdf
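
Indexing `[0]` shows only the first retrieved chunk's source. A small sketch for listing every distinct source instead, assuming each item exposes a `metadata` dict like the `Document` objects above; the helper `unique_sources` is hypothetical, and `SimpleNamespace` is used here only as a stand-in so the example runs without LangChain:

```python
from types import SimpleNamespace


def unique_sources(source_documents) -> list[str]:
    """Collect distinct 'source' values in first-seen order."""
    seen = []
    for doc in source_documents:
        source = doc.metadata.get("source", "unknown")
        if source not in seen:
            seen.append(source)
    return seen


# Stand-in objects mimicking the shape of retrieved documents
demo_docs = [
    SimpleNamespace(metadata={"source": "python_inbuildfunction.pdf"}),
    SimpleNamespace(metadata={"source": "python_inbuildfunction.pdf"}),
    SimpleNamespace(metadata={}),
]
print(unique_sources(demo_docs))
# → ['python_inbuildfunction.pdf', 'unknown']
```

With the chain above, the same helper could be called as `unique_sources(response["source_documents"])`.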


In [44]:
# Invoke the QA chain and get a response for the user's query
query = "Give me summary of all function from this pdf?"
response = qa_chain.invoke({"query": query})
print(response["result"])
print("*"*30)
print("Source Document:", response["source_documents"][0].metadata["source"])

Here's a summary of the Python built-in functions mentioned in the provided context:

1. **divmod()**: Returns the quotient and remainder of the division of two numbers.

   Example: `divmod(3, 7)` returns `(0, 3)` and `divmod(7, 3)` returns `(2, 1)`

2. **enumerate()**: Adds a counter to an iterable and returns it.

   Example: `for i in enumerate(['a', 'b', 'c']): print(i)` returns `(0, 'a') (1, 'b') (2, 'c')`

3. **eval()**: Parses a string as a Python expression and executes it.

   Example: `eval('x+7')` where `x = 7` returns `14`

4. **exec()**: Runs Python code dynamically.

   Example: `exec('a=2;b=3;print(a+b)')` returns `5`

5. **filter()**: Filters out items from an iterable for which a condition is true.

   Example: `list(filter(lambda x: x%2==0, [1, 2, 0, False]))` returns `[2, 0, False]`

6. **float()**: Converts an integer or a compatible value into a floating-point number.

   Example: `float(2)` returns `2.0` and `float('3')` returns `3.0`

7. **format()**: Formats a 