# Limitations of Large Language Models

Large Language Models are great. They are flexible, surprisingly clever, and hold a considerable amount of knowledge on their own. They do fall short in some cases, though, especially when they have to adapt to new contextual information. Let's say you're trying to build an LLM application that answers all the questions you may have about BeCode rules. What does ChatGPT know about BeCode rules? Was any of it in its training data? Probably not much.

## How could we have an LLM answer BeCode questions?

Modern LLMs can have very long context windows. Some models, such as the Gemini model used below, accept around 1 million tokens (a token is a chunk of text, usually a bit shorter than a word). That is roughly two books from Game of Thrones, and the model would still be able to answer. We could paste all of the BeCode rules into the prompt and have the model answer questions based on them. But this comes with caveats, chiefly that sending a lot of content to an LLM is costly in both compute and money.
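
To get a feel for that cost, here is a minimal sketch that compares the size of a whole document with the size of a single paragraph. It assumes the common rule of thumb of about 4 characters per token for English text, which is only an approximation; `estimate_tokens` and the stand-in strings are made up for illustration, and the Gemini API can report exact token counts if you need them.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate. The 4 characters per token ratio is a rule of thumb, not an exact figure."""
    return len(text) // chars_per_token


# Stand-ins for real text: only the lengths matter here.
full_document = "x" * 400_000  # a long rules document
one_paragraph = "x" * 800      # a single relevant paragraph

print("Whole document:", estimate_tokens(full_document), "tokens (estimated)")
print("One paragraph: ", estimate_tokens(one_paragraph), "tokens (estimated)")
```

Since you pay per token on every request, sending only the relevant paragraph is far cheaper than sending the whole document each time.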

Wouldn't it be better to give it only the parts of the document that are useful for the prompt at hand? The LLM doesn't need to know how Moodle works in order to explain when the bootcamp holidays happen. The document with the BeCode rules is provided in `data/becode_rules.txt`.
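
A first step toward working with "parts of the document" is cutting it into paragraphs. The sketch below assumes paragraphs are separated by blank lines, which may or may not match the real file; `split_paragraphs` is a hypothetical helper and the sample text is placeholder content, not the actual BeCode rules.

```python
import re


def split_paragraphs(text):
    """Split text on blank lines and drop empty chunks."""
    chunks = re.split(r"\n\s*\n", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# Placeholder text, NOT the real BeCode rules.
sample = """Holidays: placeholder paragraph about holidays.

Sickness: placeholder paragraph about being sick.

Moodle: placeholder paragraph about Moodle."""

paragraphs = split_paragraphs(sample)
for index, paragraph in enumerate(paragraphs):
    print(index, paragraph)
```

To try it on the real document, you could read the file first, for example with `open("data/becode_rules.txt", encoding="utf-8").read()`, and pass the result to `split_paragraphs`.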

Make the LLM answer the following question by giving it the paragraph from `becode_rules.txt` that contains the answer. Paste that paragraph into the `context` variable in the code snippet underneath:

In [None]:
# Import the Google Gen AI Python SDK
from google import genai

API_KEY = "GEMINI API KEY HERE"

# Fail early if the placeholder was not replaced
if API_KEY == "GEMINI API KEY HERE":
    raise ValueError('Replace "GEMINI API KEY HERE" with your own API key')

# Create a client that authenticates with your API key
client = genai.Client(api_key=API_KEY)

question = 'I am sick, I sent an email to Antoine and Cindy, what else should I do?'
context = '''PASTE INFORMATION IN THE DOCUMENT ABOUT BEING SICK'''
prompt = f'Use the following snippet:\n {context}\n\n To answer this question: {question}'
print("Prompt:\n",prompt)

response = client.models.generate_content(
    model="gemini-2.0-flash", contents=prompt
)
print("\n\nAnswer:\n", response.text)

## Word2Vec makes a comeback

You just had to hand the answer to the LLM for it to tell you what to do. Not very handy: you might as well search the document yourself. But what if a program could perform that search automatically?

In the previous notebooks, you may have read that words can be turned into vectors that encode their meaning, and that words with similar meanings get similar vectors. What if you could do the same with many words at once? What if you could do it with whole paragraphs? Wouldn't that be great?

As it turns out, you can: you can make [embeddings for paragraphs](https://ai.google.dev/gemini-api/docs/embeddings). You can embed every paragraph of a document, and then use the paragraph whose vector is most similar to your prompt's vector to augment that prompt!
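
What does "similar vectors" mean in practice? One common measure is cosine similarity, which compares the direction of two vectors. The text above does not prescribe a specific measure, so treat this as one reasonable choice rather than the only one. The sketch uses tiny made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions), and `cosine_similarity` and `most_similar` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vector, paragraph_vectors):
    """Return the index of the closest paragraph vector and all the scores."""
    scores = [cosine_similarity(query_vector, v) for v in paragraph_vectors]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy data, invented for illustration only.
paragraphs = ["paragraph about holidays", "paragraph about being sick", "paragraph about Moodle"]
paragraph_vectors = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.2, 0.9]]
question_vector = [0.2, 0.8, 0.1]  # pretend embedding of the "I am sick" question

best_index, scores = most_similar(question_vector, paragraph_vectors)
context = paragraphs[best_index]
print("Scores:", [round(s, 3) for s in scores])
print("Selected context:", context)
```

The selected `context` can then be dropped into the prompt in place of the paragraph you pasted by hand earlier.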

In [None]:
result = client.models.embed_content(
    model="gemini-embedding-exp-03-07",
    contents="I am sick, I sent an email to Antoine and Cindy, what else should I do?",
)

print(result.embeddings)
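
The output above is a list of embedding objects rather than bare numbers. In the SDK's response, each item should expose its vector through a `values` attribute, so a small helper can turn the response into plain lists of floats. This is a sketch: `extract_vectors` is a hypothetical helper, and the stand-in objects below imitate `result.embeddings` so the code runs without an API key.

```python
from types import SimpleNamespace


def extract_vectors(embeddings):
    """Return one plain list of floats per embedding object."""
    return [list(embedding.values) for embedding in embeddings]


# Stand-in for result.embeddings; the numbers are invented.
fake_embeddings = [SimpleNamespace(values=[0.1, -0.2, 0.3])]

vectors = extract_vectors(fake_embeddings)
print("Number of vectors:", len(vectors))
print("Dimensions of the first vector:", len(vectors[0]))
```

With a real response you would call `extract_vectors(result.embeddings)` instead, and the resulting lists can be compared with each other numerically.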

In [None]:
###WORK IN PROGRESS### COMING SOON