In [35]:
!pipenv install streamlit

Installing streamlit...
Resolving streamlit...
Added streamlit to Pipfile's [packages] ...
✔ Installation Succeeded
Pipfile.lock (d427b6) out of date, updating to (2b9af0)...
Locking [packages] dependencies...
Building requirements...
Resolving dependencies...
✔ Success!
Locking [dev-packages] dependencies...
Updated Pipfile.lock (8930b5104d4eaa702716403af564fcd8bf933809b40fb85d66f6fe144c2b9af0)!
Installing dependencies from Pipfile.lock (2b9af0)...


In [34]:
import os
from elasticsearch import Elasticsearch
from groq import Groq

In [23]:
es = Elasticsearch("http://localhost:9200")
es.info()

ObjectApiResponse({'name': 'dc8e8916ea58', 'cluster_name': 'docker-cluster', 'cluster_uuid': '4f4NB6I4S8atKnaejD3pdg', 'version': {'number': '8.4.3', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '42f05b9372a9a4a470db3b52817899b99a76ee73', 'build_date': '2022-10-04T07:17:24.662462378Z', 'build_snapshot': False, 'lucene_version': '9.3.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})

In [24]:
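def retrieve_documents(query, index_name="course-questions", max_results=5):
    """Return the top FAQ documents matching `query`, restricted to one course."""
    # Reuses the module-level `es` client created above instead of
    # opening a new connection on every call.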
    search_query = {
        "size": max_results,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields"
                    }
                },
                "filter": {
                    "term": {
                        "course": "data-engineering-zoomcamp"
                    }
                }
            }
        }
    }
    
    response = es.search(index=index_name, body=search_query)
    # Each hit wraps the indexed document under the '_source' key.
    documents = [hit['_source'] for hit in response['hits']['hits']]
    return documents
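
The query above hard-codes the course filter. As a minimal sketch, assuming the same index may hold FAQs for other courses, the request body can be built by a pure function that takes the course as a parameter. `build_search_query`, its `course` parameter, and the `"machine-learning-zoomcamp"` value are hypothetical and not part of the notebook.

```python
def build_search_query(query, course="data-engineering-zoomcamp", max_results=5):
    """Return the request body used by retrieve_documents.

    Hypothetical helper: same query as in the notebook, with the course
    filter exposed as a parameter instead of being hard-coded.
    """
    return {
        "size": max_results,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        # "question^3" boosts matches in the question field by 3
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields",
                    }
                },
                # A filter clause restricts results without affecting the score.
                "filter": {"term": {"course": course}},
            }
        },
    }


# The course name here is an assumed example value.
body = build_search_query("how can I run kafka?", course="machine-learning-zoomcamp")
print(body["query"]["bool"]["filter"])
```

Because the function only builds a dictionary, it can be inspected or tested without a running Elasticsearch instance.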

In [25]:
client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

In [30]:
context_template = """
Section: {section}
Question: {question}
Answer: {text}
""".strip()

prompt_template = """
You're a course teaching assistant.
Answer the user QUESTION based on CONTEXT - the documents retrieved from our FAQ database.
Don't use other information outside of the provided CONTEXT.

QUESTION: {user_question}

CONTEXT:

{context}
""".strip()


def build_context(documents):
    """Render each document with context_template, separated by blank lines."""
    return "\n\n".join(context_template.format(**doc) for doc in documents).strip()


def build_prompt(user_question, documents):
    context = build_context(documents)
    prompt = prompt_template.format(
        user_question=user_question,
        context=context
    )
    return prompt

def ask_groq(prompt, model="llama3-8b-8192"):
    """Send the prompt as a single user message and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
    answer = response.choices[0].message.content
    return answer

def qa_bot(user_question):
    context_docs = retrieve_documents(user_question)
    prompt = build_prompt(user_question, context_docs)
    answer = ask_groq(prompt)
    return answer
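
To see what the context part of the prompt looks like before it is sent to the model, the formatting step can be run on its own. This is a standalone sketch: it repeats `context_template` so it runs without Elasticsearch or Groq, and `sample_docs` is made-up data for illustration only.

```python
context_template = """
Section: {section}
Question: {question}
Answer: {text}
""".strip()

# Made-up FAQ entries; real ones come from the Elasticsearch '_source' fields.
sample_docs = [
    {"section": "Module 1", "question": "Q1?", "text": "A1.", "course": "demo"},
    {"section": "Module 2", "question": "Q2?", "text": "A2.", "course": "demo"},
]

# str.format ignores unused keyword arguments, so the extra "course" key is harmless.
context = "\n\n".join(context_template.format(**doc) for doc in sample_docs)
print(context)
```

Note that a document missing `section`, `question`, or `text` raises a `KeyError` at the `format` call, so every indexed document needs all three fields.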

In [31]:
qa_bot("I'm getting invalid reference format: repository name must be lowercase")

'I see the issue here. The problem is likely due to the repository name in your Docker command, which needs to be in lowercase.'

In [32]:
qa_bot("I can't connect to postgres port 5432, my password doesn't work")

"I'd be happy to help!\n\nBased on the provided context, it seems that you're experiencing an issue connecting to Postgres port 5432, and your password doesn't work. \n\nTo resolve this issue, it's possible that there is another Postgres instance running on your machine, taking up port 5432. In this case, you could try changing the port used in your Docker container to a different one, such as 5431. You would then need to use this new port when connecting to pgcli or other tools.\n\nAdditionally, if you have a local Postgres installation, you may need to stop the service before running your Docker container.\n\nPlease let me know if you'd like more information on how to specify a different port in your Docker container, or if you'd like help troubleshooting a local Postgres installation."

In [33]:
qa_bot("how can I run kafka?")

'Based on the provided context, I would answer the question "how can I run kafka?" as follows:\n\nThe answer is not explicitly stated in the provided context. However, the closest related information is from the "Section: Module 6: streaming with kafka" where it provides instructions on how to run a Java Kafka producer/consumer/kstreams/etc in the terminal:\n\n"In the project directory, run:\njava -cp build/libs/<jar_name>-1.0-SNAPSHOT.jar:out src/main/java/org/example/JsonProducer.java"\n\nThis suggests that Kafka can be run by executing a Java program in the terminal.'
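
The Kafka answer shows the model hedging when the retrieved context does not directly answer the question. One related edge case is retrieval returning no documents at all. A minimal sketch, assuming you would rather return a fixed message than call the model with an empty CONTEXT: `qa_bot_with_fallback` and `NO_CONTEXT_MESSAGE` are hypothetical names, and the three callables stand in for `retrieve_documents`, `build_prompt`, and `ask_groq` so the example runs offline. It does not handle the case above, where documents are found but are only loosely relevant.

```python
NO_CONTEXT_MESSAGE = "I couldn't find anything about this in the FAQ."


def qa_bot_with_fallback(user_question, retrieve, build, ask):
    """Variant of qa_bot that skips the LLM call when retrieval returns nothing.

    retrieve, build and ask stand in for retrieve_documents, build_prompt
    and ask_groq; passing them in keeps this sketch runnable offline.
    """
    docs = retrieve(user_question)
    if not docs:
        return NO_CONTEXT_MESSAGE
    return ask(build(user_question, docs))


# Stubs used only to exercise the control flow; no network calls are made.
answer = qa_bot_with_fallback(
    "how can I run kafka?",
    retrieve=lambda q: [],
    build=lambda q, docs: f"{q} | {len(docs)} docs",
    ask=lambda prompt: f"LLM saw: {prompt}",
)
print(answer)
```

In the notebook, the equivalent call would pass the real functions: `qa_bot_with_fallback(question, retrieve_documents, build_prompt, ask_groq)`.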