# 1.1 Running inference on an LLM (Large Language Model)

The LLM is the core of the whole Nova system. It understands the user's request and can not only answer it but also act on it using tools. This notebook shows you how to interact with the LLM system.

Run this code so Python can find the scripts. This is not required when importing Nova from outside the root folder.

In [None]:
import sys
from pathlib import Path
module_path = Path().absolute().parent.parent
if str(module_path) not in sys.path:
    sys.path.append(str(module_path))
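To see what the cell above computes, here is a small sketch using a made-up notebook location. The path /home/user/nova/examples/llm is an assumption for illustration only; your folder layout may differ.

```python
from pathlib import PurePosixPath

# Hypothetical folder containing this notebook (illustration only).
notebook_dir = PurePosixPath("/home/user/nova/examples/llm")

# Two levels up is the project root, which is what gets added to sys.path.
module_path = notebook_dir.parent.parent
print(module_path)
```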

Import the Nova class and create an instance. This class is the API for Nova: everything you can do with Nova, you do through it.

In [None]:
from nova import *

nova = Nova()

Before we can run inference on an LLM, we need to set up a few things.  
You will need to choose an inference engine. This is essentially the service or system you want to use to run the LLM. By default, Nova comes with two inference engines, one using [LlamaCPP](https://github.com/ggml-org/llama.cpp) and one using [Groq](https://groq.com/):

In [None]:
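# Pick ONE engine and delete the other assignment. If you run this cell
# unchanged, the second assignment overwrites the first and Groq is used.

# LlamaCPP:
inference_engine = InferenceEngineLlamaCPP()

# Groq:
inference_engine = InferenceEngineGroq()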

Now you will need to create an LLMConditioning. This is essentially an object that contains all the parameters required to run the LLM. Note that the exact parameters can vary from engine to engine. Below are examples for both LlamaCPP and Groq.

In [None]:
# LlamaCPP conditioning:
conditioning = LLMConditioning(
    model="bartowski/Qwen2.5-7B-Instruct-1M-GGUF",
    file="*Q8_0.gguf"
)

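Parameters:  
- model: This must be a valid Hugging Face repo ID. Note that the model must be in GGUF format.  
- file: The name of the GGUF file in the repo to use. The example above passes a wildcard pattern ("*Q8_0.gguf") instead of a full file name.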
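The file value in the LlamaCPP example is a wildcard rather than a full file name. As a rough illustration of how a shell-style wildcard selects one file from a repo, here is a sketch using Python's standard fnmatch module. The file listing is made up, and whether Nova resolves the pattern in exactly this way is an assumption.

```python
from fnmatch import fnmatch

# Hypothetical file listing of a GGUF repo (names invented for illustration).
repo_files = [
    "Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf",
    "Qwen2.5-7B-Instruct-1M-Q8_0.gguf",
    "README.md",
]

# The same pattern as in the conditioning above.
pattern = "*Q8_0.gguf"

# Keep only the names that match the wildcard.
matches = [name for name in repo_files if fnmatch(name, pattern)]
print(matches)
```

If a pattern matches more than one file, it is safer to make it more specific so the choice of quantization is unambiguous.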

In [None]:
# Groq conditioning:
conditioning = LLMConditioning(
    model="llama-3.2-90b-vision-preview"
)

Parameters:  
- model: Which model to use. Make sure it is one of the models hosted on [Groq](https://groq.com/).

If you are using Groq as your inference engine, you will need to set your API key. The key will be encrypted and stored in a database. The next time Nova needs the API key, it will retrieve it in the background, so you do not have to pass it every time.

In [None]:
nova.edit_secret(Secrets.GROQ_API, "YOUR-API-KEY")
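To avoid typing the key directly into a notebook cell, one option is to read it from an environment variable first. The sketch below is only a suggestion: the helper read_api_key and the variable name GROQ_API_KEY are assumptions made for this example, not part of Nova.

```python
import os


def read_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is missing or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return key


# Usage with the objects from the cells above (requires the variable to be set):
# nova.edit_secret(Secrets.GROQ_API, read_api_key())
```

This keeps the key out of the saved notebook, which matters if the notebook is shared or committed to version control.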

If you are using an engine that pulls the model from Hugging Face, it is also a good idea to log in so you gain access to gated repos. Just like API keys, your Hugging Face token will be encrypted and stored.

In [None]:
nova.huggingface_login(overwrite=True, token="YOUR-HF-TOKEN")

# The next time you want to log in to Hugging Face, run:
nova.huggingface_login()

Now you need to configure the LLM system by passing it the inference engine and the conditioning.

In [None]:
nova.configure_llm(inference_engine=inference_engine, conditioning=conditioning)

# You need to apply your new configuration.
# Only after applying the configuration will the model be loaded into memory.
nova.apply_config_llm()

The LLM is now ready to be used. The LLM system takes in a Conversation object containing a list of messages.  
Here is a simple example showing you how to set up a chat with the LLM:

In [None]:
# 1. Create a new conversation
conversation = Conversation()

# 2. Create a message object
message = "What are the benefits of open source AI?"
user_message = Message(author="user", content=message)

# 3. Add the message to the conversation
conversation.add_message(user_message)

# 4. Run the LLM.
llm_response = nova.run_llm(conversation=conversation)

# 5. Print the result of the LLM
llm_response_text = llm_response.message
print(f"Assistant: {llm_response_text}")
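Because the conversation holds a list of messages, a follow-up question can be asked by adding more messages to the same conversation. The helper below is a hedged sketch of one chat turn, not part of Nova: chat_turn is a hypothetical name, and it assumes that the reply must be added to the conversation manually and that "assistant" is an accepted author value. If run_llm already appends its reply, drop the second add_message call.

```python
def chat_turn(conversation, user_text, run_llm, make_message):
    """Add a user message, run the model, and store the reply.

    conversation: object with an add_message(message) method
    run_llm:      callable taking the conversation, returning an object
                  with a .message attribute (the reply text)
    make_message: callable (author, content) -> message object
    """
    conversation.add_message(make_message("user", user_text))
    response = run_llm(conversation)
    # Assumption: the reply is not added automatically, so add it here.
    conversation.add_message(make_message("assistant", response.message))
    return response.message


# Usage with the objects from the cells above (assumptions as stated):
# reply = chat_turn(
#     conversation,
#     "Can you give a concrete example?",
#     run_llm=lambda c: nova.run_llm(conversation=c),
#     make_message=lambda author, text: Message(author=author, content=text),
# )
# print(f"Assistant: {reply}")
```

Passing run_llm and make_message in as arguments keeps the helper independent of Nova, so it can be tried with simple stand-in objects before a model is loaded.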