# Python Question Answering Bot

## Initialization

In [1]:
import pathlib as pth
import sys

base_location = pth.Path.cwd().parent

sys.path.append((base_location / "src").as_posix())

In [2]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from yarl import URL

from wiki_helper.qna.impl.generative_model import LargeLanguageModel
from wiki_helper.qna.impl.knowledge_base import ExternalKnowledgeBase
from wiki_helper.qna.impl.system import RagSystemImpl
from wiki_helper.storing.impl.storage import VectorStorage, VectorStorageConnection

#### Model Locations

In [3]:
generative_model_location = (
    base_location / "models" / "generative_model" / "Llama-3.2-3B-Instruct-Q6_K.gguf"
)
embedding_model_location = base_location / "models" / "embedder"

#### Initialize Embedding Function

In [4]:
prompt_for_retrieval = "Represent this sentence for searching relevant passages:"

contextual_oven = HuggingFaceEmbedding(
    model_name=embedding_model_location.as_posix(),
    query_instruction=prompt_for_retrieval,
)
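The `query_instruction` above is a retrieval prompt: the usual idea is that it gets prepended to search queries, while the indexed passages are embedded without it. The snippet below is only a plain-Python illustration of that idea. `with_instruction` is a hypothetical helper, and the exact formatting applied inside `HuggingFaceEmbedding` may differ.

```python
def with_instruction(instruction: str, text: str) -> str:
    """Prepend a retrieval instruction to a query (hypothetical helper)."""
    # Assumption: instruction and query are joined by a single space.
    return f"{instruction} {text}".strip()


instruction = "Represent this sentence for searching relevant passages:"

# The question is embedded together with the instruction ...
query_text = with_instruction(instruction, "What is Python?")

# ... while a passage from the knowledge base is embedded unchanged.
passage_text = "Python is a high-level, general-purpose programming language."
```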

**NOTE**: the chroma-db server must be up and running before you execute the next cell.

In [5]:
settings = VectorStorageConnection(host="localhost", port=8000)
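One way to confirm that something is listening on the configured host and port before going further is a plain TCP probe. This is a minimal sketch using only the standard library. `is_port_open` is a hypothetical helper, not part of `wiki_helper`, and a successful connection only shows that the port is open, not that the service behind it is chroma-db.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Calling `is_port_open("localhost", 8000)` before building the storage gives an early, readable failure instead of an error deep inside the client.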

#### Build RAG System

In [6]:
system = RagSystemImpl(
    LargeLanguageModel(model_location=generative_model_location),
    ExternalKnowledgeBase("en"),
    VectorStorage(embedding_builder=contextual_oven, connection_settings=settings),
)

#### Now We Can Train the Model

Let's say we want our system to be able to answer questions about Python.

In [7]:
system.train(
    location=URL("https://en.wikipedia.org/wiki/Python_(programming_language)")
)

Okay, the system is ready. We can ask it questions!

In [8]:
system.answer("What is Python?")

'Python is a high-level, general-purpose programming language.'
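For intuition, the retrieval step of a RAG system can be sketched in a few lines of plain Python. This is a toy illustration, not the `wiki_helper` implementation: word counts stand in for the real embedding model, a list stands in for the vector storage, and `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9+]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the `top_k` passages most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        passages, key=lambda p: cosine(query_vector, embed(p)), reverse=True
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text a generative model would receive (assumed layout)."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


passages = [
    "Python is a high-level, general-purpose programming language.",
    "Assembly language is a low-level programming language.",
]
best = retrieve("What is Python?", passages)
prompt = build_prompt("What is Python?", best)
```

Here `best` holds the Python passage, since it shares more words with the question. A real system would pass `prompt` on to the language model to produce the answer.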

But I want to ask it more abstract questions about programming. Let's widen the knowledge base.

In [9]:
system.train(
    location=URL("https://en.wikipedia.org/wiki/High-level_programming_language")
)

In [10]:
system.train(
    location=URL("https://en.wikipedia.org/wiki/Low-level_programming_language")
)
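When several pages are added in a row, the repeated calls can be folded into a loop. This is a minimal sketch that relies only on what the cells above show, namely that `train` accepts a `location` keyword and can be called more than once. `Trainable` and `train_on_pages` are hypothetical names introduced for illustration.

```python
from typing import Any, Iterable, Protocol


class Trainable(Protocol):
    """Anything exposing train(location=...), as the system above does."""

    def train(self, location: Any) -> Any: ...


def train_on_pages(system: Trainable, locations: Iterable[Any]) -> int:
    """Call system.train once per location and return how many were processed."""
    count = 0
    for location in locations:
        system.train(location=location)
        count += 1
    return count
```

In this notebook the call would look like `train_on_pages(system, [URL("https://en.wikipedia.org/wiki/High-level_programming_language"), URL("https://en.wikipedia.org/wiki/Low-level_programming_language")])`.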

Now, the most important question:

In [11]:
system.answer("Which programming language is better: Python or C++? Why")

"Based on the provided context, it's difficult to definitively state that one programming language is categorically better than the other. Both Python and C++ are powerful languages with different design philosophies and use cases.\n\nHowever, based on various factors, such as:\n\n* **Ease of use**: Python is generally considered easier to learn and use, especially for beginners and rapid prototyping.\n* **Cross-platform compatibility**: Python can run on multiple operating systems, including Windows, macOS, and Linux.\n* **Libraries and frameworks**: Python has a vast collection of libraries and frameworks that simplify tasks such as data analysis, web development, and more.\n* **Speed**: C++ is generally considered faster than Python, especially for computationally intensive tasks.\n\nBased on these factors, it's difficult to say which programming language is categorically better. However, if I had to make an educated guess, I would say that Python is a better choice for most applica