In [None]:
!pip install tqdm notebook==7.1.2 openai elasticsearch pandas scikit-learn

# Q1. Running Elastic
Run Elasticsearch 8.4.3 and get the cluster information. If you run it on localhost, you can do it like this:


```
curl localhost:9200
```

What's the `version.build_hash` value?

### Running Elasticsearch

In [1]:
!docker run -d --rm --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.4.3

dc20ed9d998f5a2d160489bbfef796253ffbb29c2e7a67a5404cf56e6ca7071b


In [3]:
!curl localhost:9200

{
  "name" : "dc20ed9d998f",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "GHCuMFmvTgWWAHbUyim4Gw",
  "version" : {
    "number" : "8.4.3",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "42f05b9372a9a4a470db3b52817899b99a76ee73",
    "build_date" : "2022-10-04T07:17:24.662462378Z",
    "build_snapshot" : false,
    "lucene_version" : "9.3.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
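
The same value can also be read programmatically. The sketch below is only an illustration: it parses a trimmed copy of the response above (the `cluster_info_json` name and the reduced payload are assumptions made for the example). Against a live cluster, something like `requests.get("http://localhost:9200").json()` should yield the same structure.

```python
import json

# Trimmed copy of the response shown above (only the fields used here)
cluster_info_json = """
{
  "name": "dc20ed9d998f",
  "version": {
    "number": "8.4.3",
    "build_hash": "42f05b9372a9a4a470db3b52817899b99a76ee73"
  }
}
"""

# Parse the JSON and pick out the nested version.build_hash field
cluster_info = json.loads(cluster_info_json)
build_hash = cluster_info["version"]["build_hash"]
print(build_hash)
```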




# Q2. Indexing the data
## Getting the data
Now let's get the FAQ data. You can run this snippet:

In [4]:
import requests

docs_url = 'https://github.com/DataTalksClub/llm-zoomcamp/blob/main/01-intro/documents.json?raw=1'
docs_response = requests.get(docs_url)
documents_raw = docs_response.json()

documents = []

for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)

Note that you need to have the `requests` library:

In [None]:
!pip install requests

In [5]:
from elasticsearch import Elasticsearch

es_client = Elasticsearch("http://localhost:9200")
es_client.info()

ObjectApiResponse({'name': 'dc20ed9d998f', 'cluster_name': 'docker-cluster', 'cluster_uuid': 'GHCuMFmvTgWWAHbUyim4Gw', 'version': {'number': '8.4.3', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '42f05b9372a9a4a470db3b52817899b99a76ee73', 'build_date': '2022-10-04T07:17:24.662462378Z', 'build_snapshot': False, 'lucene_version': '9.3.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})

In [6]:
index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "section": {"type": "text"},
            "question": {"type": "text"},
            "course": {"type": "keyword"} 
        }
    }
}

index_name = "course-questions"

es_client.indices.create(index=index_name, body=index_settings)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'course-questions'})

In [7]:
from tqdm.auto import tqdm
for doc in tqdm(documents):
    es_client.index(index=index_name, document=doc)

100%|██████████| 948/948 [00:48<00:00, 19.73it/s]
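
Indexing one document per request took about 48 seconds here. A possible alternative is the bulk helper from the `elasticsearch` package, which sends many documents per request. The sketch below only builds the bulk actions so that it runs without a cluster; `bulk_actions` and `sample_docs` are hypothetical names, and the actual `helpers.bulk` call is left as a comment.

```python
def bulk_actions(docs, index_name):
    """Yield one bulk-index action per document."""
    for doc in docs:
        yield {"_index": index_name, "_source": doc}


# Two made-up documents in the same shape as the FAQ records
sample_docs = [
    {"text": "Use docker exec.", "section": "Docker", "question": "How do I run a command?", "course": "demo-course"},
    {"text": "Use docker cp.", "section": "Docker", "question": "How do I copy files?", "course": "demo-course"},
]

actions = list(bulk_actions(sample_docs, "course-questions"))
print(len(actions))

# With a running cluster (not executed here):
# from elasticsearch import helpers
# helpers.bulk(es_client, bulk_actions(documents, index_name))
```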


# Q3. Searching
Now let's search in our index.

We will execute a query "How do I execute a command in a running docker container?".

Use only `question` and `text` fields and give `question` a boost of 4, and use `"type": "best_fields"`.

What's the score for the top ranking result?

- 94.05
- 84.05
- 74.05
- 64.05

Look at the `_score` field.
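
As a rough sketch of what these settings do: with `best_fields`, the document score comes from the best-matching field, and a `^4` boost multiplies that field's score by 4 (assuming the default `tie_breaker` of 0). The function and the per-field scores below are made up for illustration; they are not values from this index, and the real scoring is done inside Elasticsearch.

```python
def best_fields_score(field_scores, boosts, tie_breaker=0.0):
    """Combine per-field scores the way a best_fields match is described:
    take the highest boosted score, plus tie_breaker times the rest."""
    boosted = [score * boosts.get(field, 1.0) for field, score in field_scores.items()]
    best = max(boosted)
    return best + tie_breaker * (sum(boosted) - best)


# Hypothetical raw scores for one document
hypothetical_scores = {"question": 10.0, "text": 25.0}

# With question^4 the boosted question score (40.0) beats the text score (25.0)
score = best_fields_score(hypothetical_scores, {"question": 4.0})
print(score)
```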

In [8]:
query = "How do I execute a command in a running docker container?"

In [9]:
search_query_q3 = {
    "size": 5,
    "query": {
        "bool": {
            "must": {
                "multi_match": {
                    "query": query,
                    "fields": ["question^4", "text"],
                    "type": "best_fields"
                }
            }
        }
    }
}

response = es_client.search(index=index_name, body=search_query_q3)

In [None]:
response['hits']['hits']

In [10]:
# Answer
response['hits']['hits'][0]

{'_index': 'course-questions',
 '_id': 'iTzjXpABinJoeCncQeRk',
 '_score': 84.050095,
 '_source': {'text': 'Launch the container image in interactive mode and overriding the entrypoint, so that it starts a bash command.\ndocker run -it --entrypoint bash <image>\nIf the container is already running, execute a command in the specific container:\ndocker ps (find the container-id)\ndocker exec -it <container-id> bash\n(Marcos MJD)',
  'section': '5. Deploying Machine Learning Models',
  'question': 'How do I debug a docker container?',
  'course': 'machine-learning-zoomcamp'}}

# Q4. Filtering
Now let's limit the search to questions from `machine-learning-zoomcamp`.

Return 3 results. What's the 3rd question returned by the search engine?

- How do I debug a docker container?
- How do I copy files from a different folder into docker container’s working directory?
- How do Lambda container images work?
- How can I annotate a graph?

In [11]:
search_query_q4 = {
    "size": 3,
    "query": {
        "bool": {
            "must": {
                "multi_match": {
                    "query": query,
                    "fields": ["question^4", "text"],
                    "type": "best_fields"
                }
            },
            "filter": {
                "term": {
                    "course": "machine-learning-zoomcamp"
                }
            }
        }
    }
}

response = es_client.search(index=index_name, body=search_query_q4)

In [12]:
# Answer
response['hits']['hits'][2]

{'_index': 'course-questions',
 '_id': 'qTzjXpABinJoeCncR-S4',
 '_score': 49.938507,
 '_source': {'text': 'You can copy files from your local machine into a Docker container using the docker cp command. Here\'s how to do it:\nIn the Dockerfile, you can provide the folder containing the files that you want to copy over. The basic syntax is as follows:\nCOPY ["src/predict.py", "models/xgb_model.bin", "./"]\t\t\t\t\t\t\t\t\t\t\tGopakumar Gopinathan',
  'section': '5. Deploying Machine Learning Models',
  'question': 'How do I copy files from a different folder into docker container’s working directory?',
  'course': 'machine-learning-zoomcamp'}}
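
The Q3 and Q4 queries differ only in `size` and the optional `course` filter, so they could be produced by one helper. This is a minimal sketch; `build_search_query` is a hypothetical name, not part of the homework or the Elasticsearch client.

```python
def build_search_query(query, size=5, course=None):
    """Build the multi_match query body, optionally filtered by course."""
    bool_query = {
        "must": {
            "multi_match": {
                "query": query,
                "fields": ["question^4", "text"],
                "type": "best_fields",
            }
        }
    }
    # Add the term filter only when a course is given
    if course is not None:
        bool_query["filter"] = {"term": {"course": course}}
    return {"size": size, "query": {"bool": bool_query}}


question = "How do I execute a command in a running docker container?"
q3_body = build_search_query(question)
q4_body = build_search_query(question, size=3, course="machine-learning-zoomcamp")
print(q4_body["query"]["bool"]["filter"])
```

Either body can then be passed to `es_client.search(index=index_name, body=...)` as in the cells above.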

# Q5. Building a prompt
Now we're ready to build a prompt to send to an LLM.

Take the records returned from Elasticsearch in Q4 and use this template to build the context. Separate context entries with two line breaks (`\n\n`):

```
context_template = """
Q: {question}
A: {text}
""".strip()

```


In [13]:
# Answer Q5.1: Context template
context_template = """
Q: {question}
A: {text}
""".strip()

In [14]:
# Format each hit with the template and separate entries with a blank line
context_entries = [
    context_template.format(question=hit['_source']['question'], text=hit['_source']['text']).strip()
    for hit in response['hits']['hits']
]
context = "\n\n".join(context_entries)

print(context)

Q: How do I debug a docker container?
A: Launch the container image in interactive mode and overriding the entrypoint, so that it starts a bash command.
docker run -it --entrypoint bash <image>
If the container is already running, execute a command in the specific container:
docker ps (find the container-id)
docker exec -it <container-id> bash
(Marcos MJD)

Q: How do I copy files from my local machine to docker container?
A: You can copy files from your local machine into a Docker container using the docker cp command. Here's how to do it:
To copy a file or directory from your local machine into a running Docker container, you can use the `docker cp command`. The basic syntax is as follows:
docker cp /path/to/local/file_or_directory container_id:/path/in/container
Hrithik Kumar Advani

Q: How do I copy files from a different folder into docker container’s working directory?
A: You can copy files from your local machine into a Docker container using the docker cp command. Here's how to 

Now use the context you just created along with the "How do I execute a command in a running docker container?" question to construct a prompt using the template below:
```
prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()
```

In [15]:
# Answer Q5.2: Prompt template
prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()


In [16]:
prompt = prompt_template.format(question=query, context=context).strip()
print(prompt)

You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: How do I execute a command in a running docker container?

CONTEXT:
Q: How do I debug a docker container?
A: Launch the container image in interactive mode and overriding the entrypoint, so that it starts a bash command.
docker run -it --entrypoint bash <image>
If the container is already running, execute a command in the specific container:
docker ps (find the container-id)
docker exec -it <container-id> bash
(Marcos MJD)

Q: How do I copy files from my local machine to docker container?
A: You can copy files from your local machine into a Docker container using the docker cp command. Here's how to do it:
To copy a file or directory from your local machine into a running Docker container, you can use the `docker cp command`. The basic syntax is as follows:
docker cp /path/to/local/file_or_directory container_id:

What's the length of the resulting prompt? (use the len function)

- 962
- 1462
- 1962
- 2462

In [None]:
len(prompt)
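
The two templating steps can be folded into a single function. The sketch below repeats the templates so that it runs on its own and uses made-up hits in the shape Elasticsearch returns; `build_prompt`, `toy_hits`, and `toy_prompt` are hypothetical names.

```python
context_template = """
Q: {question}
A: {text}
""".strip()

prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()


def build_prompt(question, hits):
    """Format each hit as a Q/A entry, join entries with blank lines, fill the prompt."""
    context = "\n\n".join(
        context_template.format(
            question=hit["_source"]["question"], text=hit["_source"]["text"]
        ).strip()
        for hit in hits
    )
    return prompt_template.format(question=question, context=context).strip()


# Made-up hits, shaped like response['hits']['hits']
toy_hits = [
    {"_source": {"question": "Q one?", "text": "Answer one."}},
    {"_source": {"question": "Q two?", "text": "Answer two."}},
]

toy_prompt = build_prompt("How do I run it?", toy_hits)
print(len(toy_prompt))
```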

# Q6. Tokens
When we use the OpenAI Platform, we're charged by the number of tokens we send in our prompt and receive in the response.

The OpenAI python package uses `tiktoken` for tokenization:

In [None]:
!pip install tiktoken

Let's calculate the number of tokens in our prompt:

In [None]:
import tiktoken
encoding = tiktoken.encoding_for_model("gpt-4o")

Use the `encode` function. How many tokens does our prompt have?

- 122
- 222
- 322
- 422

Note: to decode a token back into a word, you can use the `decode_single_token_bytes` function:

`encoding.decode_single_token_bytes(63842)`

In [None]:
len(encoding.encode(prompt))
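
Because billing is per token, a token count translates directly into a cost estimate. The sketch below shows only the arithmetic; the token counts and per-million-token prices are placeholder numbers, not actual OpenAI pricing, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(prompt_tokens, completion_tokens, input_price_per_1m, output_price_per_1m):
    """Return the estimated cost in dollars for one request."""
    total = prompt_tokens * input_price_per_1m + completion_tokens * output_price_per_1m
    return total / 1_000_000


# Placeholder values: 1000 prompt tokens, 500 completion tokens,
# $5 per 1M input tokens and $15 per 1M output tokens (hypothetical prices)
cost = estimate_cost(1000, 500, input_price_per_1m=5.0, output_price_per_1m=15.0)
print(f"${cost:.4f}")
```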

# Bonus: generating the answer (ungraded)
Let's send the prompt to OpenAI. What's the response?

Note: you can replace OpenAI with Ollama. See module 2.

In [17]:
from openai import OpenAI

In [21]:
# Point the OpenAI client at a local Ollama server; the API key is a placeholder
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

In [None]:
response = client.chat.completions.create(
    model='llama3',
    messages=[{"role": "user", "content": prompt}]
)
print(prompt)
print(response.choices[0].message.content)

In [25]:
response = client.chat.completions.create(
    model='phi3',
    messages=[{"role": "user", "content": prompt}]
)
print(prompt)
print(response.choices[0].message.content)

You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: How do I execute a command in a running docker container?

CONTEXT:
Q: How do I debug a docker container?
A: Launch the container image in interactive mode and overriding the entrypoint, so that it starts a bash command.
docker run -it --entrypoint bash <image>
If the container is already running, execute a command in the specific container:
docker ps (find the container-id)
docker exec -it <container-id> bash
(Marcos MJD)

Q: How do I copy files from my local machine to docker container?
A: You can copy files from your local machine into a Docker container using the docker cp command. Here's how to do it:
To copy a file or directory from your local machine into a running Docker container, you can use the `docker cp command`. The basic syntax is as follows:
docker cp /path/to/local/file_or_directory container_id:
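
The two cells above differ only in the model name, so the call could be wrapped in a small function. This is a sketch: `ask_llm` is a hypothetical helper, and the fake client below exists only so the example runs without Ollama or an OpenAI key. With the real `client` from the cell above, the call would be `ask_llm(client, prompt, model='llama3')`.

```python
from types import SimpleNamespace


def ask_llm(client, prompt, model="phi3"):
    """Send a single user message and return the text of the first choice."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


class FakeCompletions:
    """Stand-in that mimics the shape of client.chat.completions."""

    def create(self, model, messages):
        text = f"[{model}] echo: {messages[0]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))

answer = ask_llm(fake_client, "ping", model="llama3")
print(answer)
```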