# **End-to-End RAG Tutorial Using Salesforce, Airbyte Cloud, Weaviate, and LangChain**
This notebook illustrates the setup of a Retrieval-Augmented Generation (RAG) pipeline.<br>
We assume Salesforce data has been loaded into a Weaviate vector store using Airbyte Cloud, and we use LangChain and OpenAI to perform RAG on the stored data.<br>
## **Prerequisites**
**1) OpenAI API Key**:
   - **Create an OpenAI Account**: Sign up for an account on [OpenAI](https://www.openai.com/).
   - **Generate an API Key**: Go to the API section and generate a new API key. For detailed instructions, refer to the [OpenAI documentation](https://beta.openai.com/docs/quickstart).

**2) Weaviate Cluster's Public URL and API Key**:

   - **Get your URL and Key**: Visit the [Weaviate Cloud console](https://console.weaviate.cloud/) and click the drop-down button on your cluster to view its URL and API key.


# **Installing Dependencies**


In [None]:
# Add virtual environment support if needed
!apt-get install -qq python3.10-venv

# Install required packages
%pip install --quiet openai langchain-openai tiktoken pandas weaviate-client langchain-weaviate langchain-community


# **Set Up Environment Variables**

In [None]:
import os

os.environ["WEAVIATE_URL"] = "YOUR_WEAVIATE_URL"
os.environ["WEAVIATE_API_KEY"] = "YOUR_WEAVIATE_API_KEY"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
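
Placeholder values are easy to forget, and a missing key only shows up later as an authentication error. As a minimal sketch (the helper name `missing_env_vars` is hypothetical and not part of any library), the check below reports which of the three variables are unset, empty, or still hold a `YOUR_...` placeholder before any network call is made.

```python
import os
from typing import Iterable, List, Mapping, Optional

# The three variables this notebook expects
REQUIRED_VARS = ("WEAVIATE_URL", "WEAVIATE_API_KEY", "OPENAI_API_KEY")

def missing_env_vars(
    names: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return names that are unset, empty, or still set to a YOUR_... placeholder."""
    env = os.environ if env is None else env
    missing = []
    for name in names:
        value = env.get(name, "").strip()
        if not value or value.startswith("YOUR_"):
            missing.append(name)
    return missing

missing = missing_env_vars()
if missing:
    print(f"Set these variables before continuing: {', '.join(missing)}")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.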

# **Initialize Weaviate Vector Store**

In [None]:
import weaviate
from weaviate.auth import AuthApiKey
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
import pandas as pd

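# Connect to Weaviate with API key
auth_config = AuthApiKey(api_key=os.getenv("WEAVIATE_API_KEY"))

# Note: weaviate.Client is the v3-style client API. Depending on the installed
# weaviate-client version, this call may be deprecated or unavailable.
weaviate_client = None  # stays None if the connection fails
try:
    weaviate_client = weaviate.Client(
        url=os.getenv("WEAVIATE_URL"),
        auth_client_secret=auth_config,
    )
    print("Successfully connected to Weaviate", flush=True)
except Exception as e:
    print(f"Error connecting to Weaviate: {e}", flush=True)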
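
Because the connection cell only prints on failure, later cells can run against a client that was never created. The sketch below is one way to fail fast; `ensure_ready` is a hypothetical helper, and it assumes the client object exposes an `is_ready()` method, as the Weaviate Python client does.

```python
from typing import Any

def ensure_ready(client: Any) -> Any:
    """Raise early if the client is missing or the cluster is not ready."""
    if client is None:
        raise RuntimeError("No Weaviate client: the connection step failed.")
    if not client.is_ready():
        raise RuntimeError("Weaviate cluster was reached but is not ready.")
    return client

# Usage in this notebook (assumes the connection cell has already run):
# weaviate_client = ensure_ready(globals().get("weaviate_client"))
```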



# **Embedding and Similarity Search with Weaviate**
Here we convert the user's query into an embedding using OpenAI and retrieve the most similar chunks from Weaviate. <br>
### Note: Change the collection and property names to match your own schema!
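
To make "similar" concrete: Weaviate runs the vector comparison server-side, but the idea can be sketched locally with cosine similarity, a common choice for text embeddings. The three-dimensional vectors and labels below are made up for illustration; real OpenAI embeddings have far more dimensions.

```python
import math
from typing import Dict, List, Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec: Sequence[float], store: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the k keys whose vectors are most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda key: cosine_similarity(query_vec, store[key]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy "vector store": label -> embedding
toy_store = {
    "lead at a bank": [0.9, 0.1, 0.0],
    "lead at a retailer": [0.1, 0.9, 0.0],
    "unrelated note": [0.0, 0.0, 1.0],
}
print(top_k([1.0, 0.2, 0.0], toy_store, k=2))
```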

In [None]:
from langchain_openai import OpenAIEmbeddings
from typing import List

# Initialize OpenAI client for embeddings
openai_embeddings = OpenAIEmbeddings(openai_api_key=os.getenv("OPENAI_API_KEY"))

# Convert the user's query into an embedding vector for similarity search
def get_embedding_from_openai(query: str) -> List[float]:
    return openai_embeddings.embed_query(query)

# Use Weaviate to find matching chunks
collection = "Lead"
text_property = "name"  # property to return for each match

def get_similar_chunks_from_weaviate(query: str) -> List[str]:
    try:
        embedding = get_embedding_from_openai(query)
        near_vector = {"vector": embedding}
        result = (
            weaviate_client.query
            .get(collection, [text_property])
            .with_near_vector(near_vector)
            .do()
        )

        # Expected shape: {"data": {"Get": {<collection>: [{<property>: ...}, ...]}}}
        if 'data' in result and 'Get' in result['data'] and collection in result['data']['Get']:
            return [res[text_property] for res in result['data']['Get'][collection]]
        print("Unexpected result format:", result, flush=True)
        return []
    except Exception as e:
        print(f"Error during Weaviate query: {e}", flush=True)
        return []
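
The query function above relies on a nested `data` / `Get` / collection response shape. The sketch below isolates that parsing step so it can be tried without a cluster; `extract_property_values` is a hypothetical helper, and both sample payloads are invented for illustration rather than captured from a real response.

```python
from typing import Any, Dict, List

def extract_property_values(result: Dict[str, Any], collection: str, prop: str) -> List[str]:
    """Pull one property out of a nested data/Get/<collection> response, skipping empty values."""
    data = result.get("data") or {}
    get_block = data.get("Get") or {}
    objects = get_block.get(collection) or []
    return [obj[prop] for obj in objects if obj.get(prop)]

# Hypothetical payloads for illustration only
sample = {"data": {"Get": {"Lead": [
    {"name": "Example Lead A"},
    {"name": None},
    {"name": "Example Lead B"},
]}}}
error_sample = {"errors": [{"message": "hypothetical error"}]}

print(extract_property_values(sample, "Lead", "name"))
print(extract_property_values(error_sample, "Lead", "name"))
```

Skipping empty values keeps `None` entries out of the prompt that is later built by joining the chunks.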


# **Building the RAG Pipeline and Asking a Question**
Finally, we use OpenAI to answer questions about our data! <br>
The three main steps of a RAG pipeline are: <br>
- Embedding the incoming query <br>
- Running a similarity search to find matching chunks <br>
- Sending the chunks to the LLM for completion
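
The three steps can be sketched as one function that takes each step as a callable. This is an illustrative skeleton under stated assumptions: `rag_answer` and the `fake_*` stand-ins are hypothetical and make no network calls; in this notebook the real steps are the OpenAI embedding call, the Weaviate query, and the OpenAI chat completion.

```python
from typing import Callable, List

def rag_answer(
    question: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float]], List[str]],
    complete: Callable[[str, List[str]], str],
) -> str:
    """Run embed -> search -> complete, short-circuiting when no context is found."""
    vector = embed(question)            # step 1: embed the query
    chunks = search(vector)             # step 2: similarity search
    if not chunks:
        return "No context found."
    return complete(question, chunks)   # step 3: LLM completion

# Stand-in implementations so the flow can be run offline
def fake_embed(text: str) -> List[float]:
    return [float(len(text))]

def fake_search(vector: List[float]) -> List[str]:
    return ["Chunk about leads"]

def fake_complete(question: str, chunks: List[str]) -> str:
    return f"{len(chunks)} chunk(s) used for: {question}"

print(rag_answer("Who are the leads?", fake_embed, fake_search, fake_complete))
```

Keeping the steps as separate callables makes it straightforward to swap the vector store or the model without touching the control flow.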

In [None]:
from typing import List
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Use OpenAI to complete the response
def get_completion_from_openai(question, document_chunks: List[str], model_name="gpt-3.5-turbo"):
    chunks = "\n\n".join(document_chunks)

    try:
        completion = client.chat.completions.create(
            model=model_name,
            messages=[
                {"role": "system", "content": "You are an assistant. Answer the question based on the context. Do not use any other information. Be concise."},
                {"role": "user", "content": f"Context:\n{chunks}\n\n{question}\n\nAnswer:"}
            ],
            max_tokens=150
        )
        return completion.choices[0].message.content.strip()
    except Exception as e:
        print(f"Error during OpenAI completion: {e}", flush=True)
        return "There was an error generating the response."


# Putting it all together
def get_response(query, model_name="gpt-3.5-turbo"):
    chunks = get_similar_chunks_from_weaviate(query)
    if len(chunks) == 0:
        return "I am sorry, I do not have the context to answer your question."
    else:
        return get_completion_from_openai(query, chunks, model_name)

# Ask a question
query = 'How many leads work at BNY?'
response = get_response(query)

print(f"\n\nResponse from LLM:\n\n{response}", flush=True)
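
The completion call caps the answer length with `max_tokens`, but nothing caps the size of the context that is sent. One possible safeguard, sketched below, is to keep only as many chunks as fit in a character budget before joining them. The helper `limit_chunks` and the default of 4000 characters are assumptions for illustration; a token-based budget would be more precise.

```python
from typing import List

def limit_chunks(chunks: List[str], max_chars: int = 4000, separator: str = "\n\n") -> List[str]:
    """Keep leading chunks whose joined length stays within max_chars."""
    kept: List[str] = []
    used = 0
    for chunk in chunks:
        # Account for the separator that join() adds between chunks
        cost = len(chunk) + (len(separator) if kept else 0)
        if used + cost > max_chars:
            break
        kept.append(chunk)
        used += cost
    return kept

trimmed = limit_chunks(["aaaa", "bbbb", "cccc"], max_chars=10)
print(trimmed, len("\n\n".join(trimmed)))
```

Since similarity search returns the closest matches first, dropping from the end discards the least relevant chunks.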
