## RAG implementation
#### This notebook briefly compares a RAG-enriched LLM with a plain LLM

In [15]:
import os
from dotenv import load_dotenv
import openai
import chromadb
import tiktoken


In [31]:
load_dotenv()

True

In [32]:
OPEN_AI_API_KEY = os.getenv("OPENAI_API_KEY")
CHROMA_DB_PATH = "./../data_collecting/chroma_db"

In [33]:
client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
collection = client.get_or_create_collection(name="python_data")

In [34]:
openai.api_key = OPEN_AI_API_KEY 

In [38]:
def get_openai_embedding(text):
    response = openai.embeddings.create(
        input=[text],
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding
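
Retrieval works by comparing the query embedding with stored document embeddings. As a minimal sketch of that idea, the snippet below computes cosine similarity on made-up 3-dimensional vectors. The function name `cosine_similarity` and the toy vectors are assumptions for illustration; the distance metric actually used by the `python_data` collection depends on how it was created, which this notebook does not show.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Toy "embeddings" (invented for illustration, not real model output)
query_vec = [1.0, 0.0, 1.0]
doc_close = [0.9, 0.1, 1.1]
doc_far = [-1.0, 1.0, 0.0]

print(cosine_similarity(query_vec, doc_close) > cosine_similarity(query_vec, doc_far))
```

Real embeddings from `get_openai_embedding` are much longer vectors, but the comparison works the same way.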

In [48]:
def retrieve_documents(query, python_version, top_k):
    query_embedding = get_openai_embedding(query)

    results = collection.query(
        query_embeddings=[query_embedding],
        n_results = top_k,
        where={"version": python_version}
    )

    return results["documents"][0] if "documents" in results and results["documents"] else []
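
A Chroma query result is a dictionary whose values are lists of lists, one inner list per query embedding, which is why `retrieve_documents` indexes `[0]`. The sketch below shows one way to pair documents with their distances and optionally drop weak matches. The helper `top_documents`, its `max_distance` parameter, and the `fake_results` dictionary are hypothetical and hand-written; they are not taken from the notebook's data.

```python
def top_documents(results, max_distance=None):
    """Return (document, distance) pairs for the first query in a Chroma-style result."""
    docs_per_query = results.get("documents") or [[]]
    dists_per_query = results.get("distances") or [[]]
    docs = docs_per_query[0]
    # Fall back to None distances if the result carries no distance list
    dists = dists_per_query[0] if dists_per_query[0] else [None] * len(docs)
    pairs = list(zip(docs, dists))
    if max_distance is not None:
        pairs = [(d, s) for d, s in pairs if s is not None and s <= max_distance]
    return pairs

# Hand-written stand-in for a query result (not real data)
fake_results = {
    "ids": [["a", "b"]],
    "documents": [["match statement docs", "struct module docs"]],
    "distances": [[0.21, 0.58]],
}

for doc, dist in top_documents(fake_results, max_distance=0.3):
    print(dist, doc)
```

Filtering by distance is a design choice: it can keep loosely related chunks out of the prompt, at the cost of sometimes returning no context at all.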


In [64]:
def generate_response(query, retrieved_docs, python_version):
    # Join the retrieved chunks into a single context block
    context = "\n\n".join(retrieved_docs)

    # The prompt body is left-aligned so no indentation leaks into the text sent to the model
    prompt = f"""Context:
{context}

User's question:
{query}

The question is about Python {python_version}.
Answer the question using the provided context. If the context is insufficient, say so. Do not make things up.
"""
    # Distinct local name so it is not confused with the Chroma `client`
    llm_client = openai.Client()
    response = llm_client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "system", "content": "You are a Python expert and you answer only based on the given context."},
                  {"role": "user", "content": prompt}]
    )

    return response.choices[0].message.content
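
Prompt assembly can also be kept separate from the API call, which makes it testable offline and gives one place to cap the context size. The following is a sketch under assumptions: `build_prompt` and its `max_chars` parameter are hypothetical, and the character budget is only a crude stand-in for a token budget (the notebook imports `tiktoken`, which could count tokens instead).

```python
def build_prompt(query, retrieved_docs, python_version, max_chars=6000):
    """Assemble the user prompt, keeping whole documents until the budget is used up."""
    kept, used = [], 0
    for doc in retrieved_docs:
        if used + len(doc) > max_chars:
            break  # stop at the first document that would exceed the budget
        kept.append(doc)
        used += len(doc)
    context = "\n\n".join(kept) if kept else "(no context retrieved)"
    return "\n".join([
        "Context:",
        context,
        "",
        "User's question:",
        query,
        "",
        f"The question is about Python {python_version}.",
        "Answer using only the context above. If it is insufficient, say so.",
    ])

print(build_prompt("What is SyntaxError?", ["doc one", "doc two"], "3.10"))
```

Because documents arrive ordered by relevance, dropping from the end of the list discards the weakest matches first.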

In [62]:
query = "How does match-case work in Python 3.10?"
retrieved_docs = retrieve_documents(query, "3.10", 3)
answer = generate_response(query, retrieved_docs, "3.10")


In [63]:
print(answer)

Match-case in Python 3.10 is a construct for pattern matching, introduced in Python 3.10. It is similar to the switch-case construct known from other programming languages, but offers much more flexible pattern-matching capabilities.

Here is how it works in brief:

1. **The `match` construct**: You specify the variable or expression you want to examine.

2. **The `case` construct**: You define the patterns against which you want to compare the result of the expression in `match`. Python matches patterns from top to bottom and executes the code block of the first `case` that matches.

In the example from the context:
```python
match x:
    case {1: _, 2: _}:
        ...
    case {**rest}:
        ...
```
- **`match x:`** – starts the matching block, where `x` is the variable to be matched.
- **`case {1: _, 2: _}:`** – the first case tries to match a dictionary `x` that contains the keys 1 and 2. The `_` symbol is used as a "wilder," which means that

In [75]:
example_queries = ["What is match_size?", "How does multiple inheritance work in Python?", "What is SyntaxError?"]

In [76]:
answers = []

In [77]:
# Create the OpenAI client once; the distinct name avoids overwriting the Chroma `client`
llm_client = openai.Client()
python_version = "3.10"

for query in example_queries:
    # RAG answer: retrieve context first, then generate
    retrieved_docs = retrieve_documents(query, python_version, 3)
    answer_rag = generate_response(query, retrieved_docs, python_version)

    # Baseline answer: same model, no retrieved context, so the system prompt
    # must not tell the model to rely on a context that was never supplied
    response = llm_client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "system", "content": "You are a Python expert."},
                  {"role": "user", "content": query + " Python version is: " + python_version}]
    )

    normal_answer = response.choices[0].message.content

    answers.append({"rag": answer_rag, "normal": normal_answer})
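
The comparison can also be written as a small harness that takes the two answer functions as arguments, so its logic can be checked without network access. This is only a sketch: `compare_answers`, `fake_rag`, and `fake_baseline` are hypothetical names, and the stubs return placeholder strings rather than model output.

```python
def compare_answers(queries, rag_answer, baseline_answer):
    """Run both answer functions on each query and collect the outputs."""
    rows = []
    for q in queries:
        rows.append({"query": q, "rag": rag_answer(q), "normal": baseline_answer(q)})
    return rows

# Stub answer functions so the sketch runs offline
def fake_rag(q):
    return f"[with context] {q}"

def fake_baseline(q):
    return f"[no context] {q}"

rows = compare_answers(["What is SyntaxError?"], fake_rag, fake_baseline)
print(rows[0]["rag"])
print(rows[0]["normal"])
```

In the notebook, the stubs would be replaced by callables that wrap `retrieve_documents` plus `generate_response` and the plain chat completion call, respectively. Storing the query next to both answers makes the printed comparison easier to read.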

In [79]:

for answer in answers:
    print("-"*90)
    print("1: " + answer["rag"])
    print("2: " + answer["normal"])

------------------------------------------------------------------------------------------
1: The term "match_size" is not explicitly defined or discussed in the context provided. As per the information you’ve shared, the context primarily discusses Python's `struct` module and how data is packed and unpacked using specified format strings, ensuring data alignment and specifying byte order.

If "match_size" refers to ensuring that sizes or lengths are aligned or compatible in a particular computational context (e.g., memory allocation, struct packing), it wasn’t specifically covered in the details you provided.

For clarity or detailed understanding of "match_size" within Python—or if it pertains to a specific library or function—it would be helpful to refer to more specific documentation or source that clearly defines or uses this term. As of the information you've shared, there is no sufficient detail to provide a concrete explanation or definition of "match_size" in Python 3.10.
2: 