This notebook builds a chatbot-style pipeline for asking questions about a PDF document using the LLaMA3 language model, accessed through Groq. The PDF is parsed, split into chunks, embedded, and stored in a FAISS vector index. For each question, the most relevant chunks are retrieved and passed to LLaMA3 as context, so that answers are grounded in the content of the file rather than in the model's general knowledge.

In [None]:
import os
import pickle

import faiss
import nest_asyncio
import numpy as np
from groq import Groq
from langchain.text_splitter import RecursiveCharacterTextSplitter
from llama_index.core import SimpleDirectoryReader
from llama_parse import LlamaParse
from sentence_transformers import SentenceTransformer

In [4]:
# Access the LLM through Groq (the API key is read from the Groq_API_Key environment variable)
client = Groq(
    api_key=os.getenv("Groq_API_Key"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
    model="llama3-8b-8192",
)

print(chat_completion.choices[0].message.content)

Fast language models, particularly those designed for natural language processing (NLP) tasks, have become increasingly important in recent years due to their ability to process and generate human-like text at an unprecedented scale and speed. Here are some reasons why:

1. **Efficient Language Understanding**: Fast language models can quickly analyze vast amounts of text data, enabling them to extract insights, identify patterns, and make predictions with high accuracy. This is essential for tasks like sentiment analysis, topic modeling, and text classification.
2. **Scalability**: As data volumes continue to grow, fast language models can efficiently handle large amounts of text data, making them suitable for applications like information retrieval, language translation, and question-answering systems.
3. **Real-time Processing**: Fast language models enable real-time processing of text data, which is crucial for applications like chatbots, voice assistants, and live customer support

# RAG

In [4]:
parser = LlamaParse(
    api_key=os.getenv("LlamaIndex_API_Key"),
    result_type="markdown"  # "markdown" and "text" are available
)

In [5]:
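# Allow LlamaParse's async calls to run inside the notebook's already-running event loop
nest_asyncio.apply()
# Route .pdf files through LlamaParse when loading
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(input_files=['data/Rainwater_storage.pdf'], file_extractor=file_extractor).load_data()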

Started parsing the file under job_id 3635970e-e7a8-4c09-a63b-dcfefb6fdd49


In [6]:
def chunk_text_with_splitter(text, chunk_size=1000, chunk_overlap=100):
    """Split the text into smaller chunks using RecursiveCharacterTextSplitter."""
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    chunks = splitter.split_text(text)
    return chunks
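
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a plain sliding-window chunker. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, which this sketch does not. The name `sliding_window_chunks` and the sample string are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=100):
    """Cut text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Tiny values so the overlap is visible by eye
sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in demo_chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one. The overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.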

In [10]:
# Initialize FAISS Index
def create_faiss_index(dim):
    """Create a FAISS index for the given dimension."""
    index = faiss.IndexFlatL2(dim)  # Using L2 distance
    return index
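
`IndexFlatL2` performs an exhaustive search: it compares the query against every stored vector and ranks by squared Euclidean distance. The sketch below reproduces that idea in pure Python on made-up 2-D vectors. The helper names `squared_l2` and `brute_force_search` are hypothetical and not part of FAISS, which does the same work far more efficiently.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, query, top_k):
    """Return (distances, indices) of the top_k stored vectors closest to query."""
    # Score every stored vector, then sort by distance (ties broken by position)
    scored = sorted((squared_l2(vec, query), idx) for idx, vec in enumerate(vectors))
    nearest = scored[:top_k]
    return [dist for dist, _ in nearest], [idx for _, idx in nearest]


# Made-up vectors standing in for chunk embeddings
stored = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
distances, indices = brute_force_search(stored, [1.0, 1.0], top_k=2)
print(distances, indices)
```

The returned indices are positions in insertion order, which is why the chunk texts have to be kept in a separate list in the same order.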

In [11]:
# Initialize SentenceTransformer for embedding
model = SentenceTransformer('all-MiniLM-L6-v2')



In [22]:
# Function to store chunks into a pickle file
def store_chunks_to_pickle(chunks, file_name='text_chunks.pkl'):
    """Append chunks to a pickle file."""
    try:
        # Load existing chunks if file exists
        with open(file_name, 'rb') as file:
            existing_chunks = pickle.load(file)
    except FileNotFoundError:
        # If file doesn't exist, start with an empty list
        existing_chunks = []

    # Append new chunks to existing ones
    existing_chunks.extend(chunks)

    # Store the updated chunks back to the pickle file
    with open(file_name, 'wb') as file:
        pickle.dump(existing_chunks, file)

    print(f"Stored {len(chunks)} new chunks into {file_name}")
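
One consequence of appending is worth seeing in isolation: if the ingestion cell is rerun, the pickle file grows, while a freshly created FAISS index starts again from position 0. The sketch below uses a compact, hypothetical stand-in for the helper above (`append_chunks`, writing to a temporary file) to show the effect.

```python
import os
import pickle
import tempfile


def append_chunks(chunks, file_name):
    """Append chunks to a pickle file and return the new total (hypothetical helper)."""
    try:
        with open(file_name, "rb") as file:
            existing = pickle.load(file)
    except FileNotFoundError:
        existing = []
    existing.extend(chunks)
    with open(file_name, "wb") as file:
        pickle.dump(existing, file)
    return len(existing)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_chunks.pkl")
    first_total = append_chunks(["chunk A", "chunk B"], path)
    second_total = append_chunks(["chunk A", "chunk B"], path)  # simulated rerun

print(first_total, second_total)
```

After the second call the file holds four entries for two distinct chunks. With the same document this is harmless, but after ingesting a different document into a new index, position 0 of the index would still point at the old first chunk. Deleting `text_chunks.pkl` before rebuilding the index is one simple way to keep the two in step.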

In [23]:
def embed_text(text):
    """Convert a string (or a list of strings) to float32 embeddings."""
    return model.encode(text).astype('float32')  # FAISS expects float32

# Step 6: Store embeddings in FAISS
def store_embeddings(index, chunks):
    """Embed the chunks and add them to the FAISS index."""
    if not chunks:
        return index  # Nothing to add

    # Encode all chunks in one batch; the result has shape (len(chunks), dim)
    embeddings_np = embed_text(list(chunks))
    index.add(embeddings_np)
    return index

In [24]:
# Step 7: Process all documents and store their chunks in FAISS
# Take the index dimension from the embedding model (384 for all-MiniLM-L6-v2)
faiss_index = create_faiss_index(dim=model.get_sentence_embedding_dimension())

for doc_idx, document in enumerate(documents):
    # Extract text from each document
    doc_text = getattr(document, 'text', "")
    
    if doc_text:  # Only process if the document has text
        # Split the document into chunks
        chunks = chunk_text_with_splitter(doc_text, chunk_size=1000, chunk_overlap=100)

        # Store chunks in pickle file
        store_chunks_to_pickle(chunks, file_name='text_chunks.pkl')

        # Store chunks in FAISS
        faiss_index = store_embeddings(faiss_index, chunks)

        print(f"Processed Document {doc_idx + 1} with {len(chunks)} chunks.")
    else:
        print(f"Skipping empty document {doc_idx + 1}")

# Step 8: Save the FAISS index to disk
faiss.write_index(faiss_index, 'faiss_index.index')
print("Stored embeddings into FAISS index and saved to 'faiss_index.index'.")

# Optional: Load FAISS index from disk
# faiss_index = faiss.read_index('faiss_index.index')

# Step 9: Display the number of stored embeddings
print(f"Total number of embeddings stored: {faiss_index.ntotal}")

Stored 3 new chunks into text_chunks.pkl
Processed Document 1 with 3 chunks.
Stored 4 new chunks into text_chunks.pkl
Processed Document 2 with 4 chunks.
Stored 4 new chunks into text_chunks.pkl
Processed Document 3 with 4 chunks.
Stored 2 new chunks into text_chunks.pkl
Processed Document 4 with 2 chunks.
Stored 2 new chunks into text_chunks.pkl
Processed Document 5 with 2 chunks.
Stored 2 new chunks into text_chunks.pkl
Processed Document 6 with 2 chunks.
Stored 4 new chunks into text_chunks.pkl
Processed Document 7 with 4 chunks.
Stored 1 new chunks into text_chunks.pkl
Processed Document 8 with 1 chunks.
Stored 4 new chunks into text_chunks.pkl
Processed Document 9 with 4 chunks.
Stored 1 new chunks into text_chunks.pkl
Processed Document 10 with 1 chunks.
Stored 4 new chunks into text_chunks.pkl
Processed Document 11 with 4 chunks.
Stored 3 new chunks into text_chunks.pkl
Processed Document 12 with 3 chunks.
Stored 1 new chunks into text_chunks.pkl
Processed Document 13 with 1 chu

In [25]:
# Step 10: Load chunks from pickle (for future use)
def load_chunks_from_pickle(file_name='text_chunks.pkl'):
    with open(file_name, 'rb') as file:
        chunks = pickle.load(file)
    return chunks

In [29]:
def get_llm_response(query, top_k=5):
    """Retrieve relevant chunks from FAISS and ask the LLM to answer from them."""
    # Embed the query as a (1, dim) array, the shape FAISS expects
    query_embedding = embed_text(query).reshape(1, -1)
    # Load the FAISS index from file
    faiss_index = faiss.read_index('faiss_index.index')

    # Retrieve the top_k nearest neighbors
    distances, indices = faiss_index.search(query_embedding, top_k)
    # Load the chunk texts; position i in this list corresponds to vector i in the index
    chunks = load_chunks_from_pickle()
    retrieved_chunks = []
    for idx in indices[0]:
        # FAISS returns -1 when fewer than top_k vectors are available
        if 0 <= idx < len(chunks):
            retrieved_chunks.append(chunks[idx])

    # Construct the prompt for the LLM
    prompt = f"Based on the following information, {query}\n\n" + "\n\n".join(retrieved_chunks)

    # Use the Groq API to get the LLM response (the key is read from the environment)
    client = Groq(api_key=os.getenv("Groq_API_Key"))
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        model="llama3-8b-8192",
    )

    return chat_completion.choices[0].message.content
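
The prompt in `get_llm_response` simply prepends the question to the retrieved text. If the goal is for answers to rely on the document alone, one possible variation is to separate context from question and state that restriction explicitly. The sketch below is an assumption about how such a prompt could look, not something tested in this notebook; `build_grounded_prompt` and the sample chunks are hypothetical.

```python
def build_grounded_prompt(query, retrieved_chunks):
    """Assemble a prompt that labels each chunk and asks for context-only answers."""
    # Number the chunks so the model (and the reader) can tell them apart
    context = "\n\n".join(
        f"[{number}] {chunk.strip()}"
        for number, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up chunks for illustration only
demo_prompt = build_grounded_prompt(
    "How is rainwater stored?",
    ["Tanks hold runoff. ", "Covers limit evaporation."],
)
print(demo_prompt)
```

An instruction like this narrows the model's behaviour but does not guarantee it, so answers should still be checked against the retrieved chunks.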



In [34]:
# Example usage
query = "Give an intro of Modeling Rainwater Harvesting Systems with Covered Storage Tank on A Smartphone"
response = get_llm_response(query)
print(response)

Here's a possible introduction for "Modeling Rainwater Harvesting Systems with Covered Storage Tank on A Smartphone":

Rainwater harvesting and storage have become increasingly relevant in today's water-scarce world. With the increasing frequency of droughts and water scarcity, finding innovative solutions to meet our water needs is more crucial than ever. One such approach is the rainwater harvesting system with a covered storage tank, which offers a simple and energy-efficient way to collect and store rainwater for future use. This type of system, commonly referred to as RWHS, has gained popularity globally due to its potential to reduce the reliance on traditional water sources and minimize energy consumption. In this article, we present a smartphone-based model for designing and optimizing RWHS with covered storage tanks, with the aim of providing a reliable and sustainable solution for water management.
