1. Install Required Packages

First, install the necessary Python packages:

In [None]:
%pip install sentence-transformers transformers faiss-cpu langchain

2. Set Up Embeddings with sentence-transformers

We'll use a pre-trained model from sentence-transformers to generate embeddings for our documents.

In [1]:
from sentence_transformers import SentenceTransformer

# Load the pre-trained sentence transformer model (downloaded on first use, then cached locally)
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Example documents
documents = [
    "The revenue for the last quarter was $10 million.",
    "Our operating income increased by 15% compared to the previous year.",
    "The company plans to launch a new product next quarter.",
    "Our net profit margin has improved by 5%.",
    "We are investing heavily in research and development."
]

# Generate embeddings for the documents
document_embeddings = embedder.encode(documents)


3. Use FAISS for Vector Store

We'll use FAISS, an efficient similarity search library, to store and retrieve the document embeddings.

In [2]:
import faiss
import numpy as np

# Create a FAISS index
dimension = document_embeddings.shape[1]  # Dimensionality of the embeddings
index = faiss.IndexFlatL2(dimension)  # Exact (brute-force) search using L2 distance

# Add document embeddings to the index (FAISS expects float32 arrays)
index.add(np.asarray(document_embeddings, dtype=np.float32))

# Search the index for the k nearest documents to a query embedding
def search_faiss(query_embedding, k=5):
    # FAISS expects a 2-D float32 array of shape (n_queries, dimension)
    query = np.asarray([query_embedding], dtype=np.float32)
    distances, indices = index.search(query, k)
    return indices[0], distances[0]
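
To make the index less of a black box, here is a minimal, dependency-free sketch of exact L2 search. It is not FAISS's implementation, only an illustration of the kind of result `IndexFlatL2` returns: the k closest vectors, ordered by increasing squared L2 distance. The function name `brute_force_l2_search` and the toy vectors are made up for this example.

```python
def brute_force_l2_search(vectors, query, k):
    """Return (indices, squared_distances) of the k vectors closest to query."""
    scored = []
    for i, vec in enumerate(vectors):
        # Squared Euclidean distance between this vector and the query
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((sq_dist, i))
    scored.sort()  # Smallest distance first
    top = scored[:k]
    return [i for _, i in top], [d for d, _ in top]

# Hypothetical 2-D "embeddings" standing in for real document vectors
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
toy_indices, toy_distances = brute_force_l2_search(toy_vectors, [0.9, 0.1], k=2)
print(toy_indices)  # [1, 0]
```

A flat index compares the query against every stored vector, which is fine for a handful of documents like ours.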

4. Set Up a Local Language Model with Hugging Face Transformers

Now, let's set up a local language model that will generate answers based on the retrieved documents.

In [3]:
from transformers import pipeline

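# Load a local language model for text generation
# ('gpt2' is small and quick to download; any other causal LM available locally also works)
generator = pipeline('text-generation', model='gpt2')

# Define a function to generate answers
def generate_answer(context, query):
    prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
    # max_new_tokens limits only the generated continuation; max_length would also
    # count the prompt tokens, leaving little or no room for the answer
    response = generator(prompt, max_new_tokens=50, num_return_sequences=1)
    # Note: generated_text contains the prompt followed by the model's continuation
    return response[0]['generated_text']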
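
By default the text-generation pipeline returns the prompt followed by the continuation, and a small model such as GPT-2 often keeps writing past the answer (for example, by inventing a new "Question:" line). One possible way to isolate the answer is sketched below; `extract_answer` is a hypothetical helper, not part of `transformers`, and the sample strings are invented for illustration.

```python
def extract_answer(generated_text, prompt):
    """Return the first non-empty line the model wrote after the prompt."""
    # Drop the echoed prompt if the generated text starts with it
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    # Keep only the first non-empty line of the continuation
    for line in completion.splitlines():
        if line.strip():
            return line.strip()
    return ""

# Made-up strings that mimic the shape of the pipeline's output
sample_prompt = "Context: Revenue was $10 million.\n\nQuestion: What was the revenue?\nAnswer:"
sample_output = sample_prompt + " Ten million.\n\nQuestion: The next quarter"
print(extract_answer(sample_output, sample_prompt))  # Ten million.
```

This only trims the text; it does not make the model's answer any more accurate.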


5. Integrate Retrieval and Generation

Let's combine everything into a function that takes a query, retrieves the most relevant documents, and generates an answer.

In [5]:
def answer_question(query, k=3):
    # Generate embedding for the query
    query_embedding = embedder.encode(query)

    # Retrieve the top-k relevant documents
    indices, distances = search_faiss(query_embedding, k=k)
    # Skip -1 entries, which FAISS returns when fewer than k results exist
    relevant_docs = [documents[i] for i in indices if i != -1]

    # Combine the relevant documents into a single context
    context = " ".join(relevant_docs)

    # Generate the answer using the local language model
    answer = generate_answer(context, query)

    return answer
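
When debugging retrieval, it can help to look at which documents were selected and how far they were from the query before anything reaches the language model. The sketch below is one way to do that under stated assumptions: `select_documents` is a hypothetical helper, and the `max_distance` cutoff is an illustrative value, not a recommended threshold.

```python
def select_documents(documents, indices, distances, max_distance=None):
    """Pair each valid search hit with its document text and distance."""
    selected = []
    for idx, dist in zip(indices, distances):
        idx = int(idx)
        if idx < 0:
            # FAISS uses -1 for slots it could not fill
            continue
        if max_distance is not None and dist > max_distance:
            # Hit is too far from the query to be considered relevant
            continue
        selected.append((documents[idx], float(dist)))
    return selected

# Toy inputs shaped like the output of search_faiss
toy_documents = ["doc A", "doc B", "doc C"]
hits = select_documents(toy_documents, [2, 0, -1], [0.4, 1.7, 3.4e38], max_distance=1.0)
print(hits)  # [('doc C', 0.4)]
```

Inside `answer_question`, the same idea could be applied to the values returned by `search_faiss` before joining the documents into the context.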

6. Run the Q&A System

Finally, you can run the system with a sample query.

In [6]:
query = "What is the revenue for the last quarter?"
answer = answer_question(query)
print(answer)


Context: The revenue for the last quarter was $10 million. Our operating income increased by 15% compared to the previous year. The company plans to launch a new product next quarter.

Question: What is the revenue for the last quarter?
Answer: Our total revenue was $0.33 billion in the quarter ended June 30, 2015 compared to what was reported for that quarter from Q3 2016, or $2.15 bx.

Question: The next quarter sales are
