# 4.1 Creating a new inference engine

An inference engine is essentially a wrapper around the system that runs inference, either for an LLM or for a TTS model. It converts Nova's standardized data structures into the format the specific system expects and converts the output back into one of Nova's standardized data structures. Because Nova is modular, you can add your own inference engine. This notebook shows you how.

### Creating an LLM inference engine:

Navigate to ./Nova2/app/inference_engines/inference_llm and create a new Python file. The convention is to name the file "inference_nameofyourservice.py", but this is optional. Open the file and start with the imports:

In [None]:
from typing import List  # used in the run_inference signature below

from .inference_base_llm import InferenceEngineBaseLLM
from ...tool_data import *
from ...llm_data import *

Create a class that inherits from "InferenceEngineBaseLLM" and call the constructor of the parent class:

In [None]:
class InferenceEngine(InferenceEngineBaseLLM):
    def __init__(self) -> None:
        super().__init__()
        # Run further setup code here

You can now override the methods that are called when the inference engine is used:

In [None]:
def initialize_model(self, conditioning: LLMConditioning) -> None:
    """
    Here is where you set up your model based on the conditioning. If you are using a local solution,
    this method is where you load the model into memory.
    """

def run_inference(self, conversation: Conversation, tools: List[LLMTool] | None) -> LLMResponse:
    """
    This is where you unpack the conversation object and the tool list (if present) and run model inference.
    From the output, construct an "LLMResponse" object and return it.
    """

Finally, add the inference engine to ./Nova2/app/inference_engines/_\_init__.py by importing it:

In [None]:
from .inference_llm.inference_nameofyourservice import *  # module name only, without the ".py" extension

You can now select your inference engine via the API.
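
The steps in this section can be pulled together into a minimal, self-contained sketch. Nova's real classes cannot be imported outside the package, so the block below defines simplified stand-ins for "InferenceEngineBaseLLM", "LLMConditioning", "Conversation", "LLMTool" and "LLMResponse". Their fields (model, messages, author, content, name, message) are assumptions made for illustration and will likely differ from the actual definitions in llm_data and tool_data. The "backend" is a placeholder that simply upper-cases the last message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# --- Hypothetical stand-ins for Nova's data structures (NOT the real ones) ---
@dataclass
class LLMConditioning:
    model: str = "placeholder-model"


@dataclass
class Message:
    author: str
    content: str


@dataclass
class Conversation:
    messages: List[Message] = field(default_factory=list)


@dataclass
class LLMTool:
    name: str


@dataclass
class LLMResponse:
    message: str


class InferenceEngineBaseLLM:
    """Stand-in for the real base class."""

    def __init__(self) -> None:
        self._model: Optional[str] = None


# --- The engine itself, following the structure described above ---
class InferenceEngine(InferenceEngineBaseLLM):
    def __init__(self) -> None:
        super().__init__()

    def initialize_model(self, conditioning: LLMConditioning) -> None:
        # A real engine would load weights or create an API client here.
        self._model = conditioning.model

    def run_inference(
        self, conversation: Conversation, tools: Optional[List[LLMTool]]
    ) -> LLMResponse:
        if self._model is None:
            raise RuntimeError("initialize_model() must be called first")
        # 1. Convert the standardized conversation into the backend's format.
        payload = [
            {"role": m.author, "content": m.content} for m in conversation.messages
        ]
        # 2. Call the backend. Placeholder: upper-case the last message.
        reply = payload[-1]["content"].upper() if payload else ""
        # 3. Convert the backend output back into a standardized response.
        return LLMResponse(message=reply)


engine = InferenceEngine()
engine.initialize_model(LLMConditioning())
response = engine.run_inference(Conversation([Message("user", "hello")]), tools=None)
print(response.message)  # prints "HELLO"
```

In a real engine, step 2 would be replaced by a call to your model or service, and the stand-in classes by the imports shown at the start of this section.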

### Creating a TTS inference engine:

Navigate to ./Nova2/app/inference_engines/inference_tts and create a new Python file. The convention is to name the file "inference_nameofyourservice.py", but this is optional. Open the file and start with the imports:

In [None]:
from .inference_base_tts import InferenceEngineBaseTTS
from ...tts_data import TTSConditioning

Create a class that inherits from "InferenceEngineBaseTTS" and call the constructor of the parent class:

In [None]:
class InferenceEngine(InferenceEngineBaseTTS):
    def __init__(self) -> None:
        super().__init__()
        # Run further setup code here

You can now override the methods that are called when the inference engine is used:

In [None]:
def initialize_model(self, model: str) -> None:
    """
    Here is where you set up your model. If you are using a local solution,
    this method is where you load the model into memory.
    """

def run_inference(self, text: str, conditioning: TTSConditioning) -> bytes:
    """
    This is where you run inference on the model based on the conditioning and return the resulting audio data.
    Note that the audio data must be in MP3 or WAV format.
    """

Finally, add the inference engine to ./Nova2/app/inference_engines/_\_init__.py by importing it:

In [None]:
from .inference_tts.inference_nameofyourservice import *  # module name only, without the ".py" extension
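
As with the LLM engine, the TTS steps can be sketched as one self-contained example. The block below uses simplified stand-ins for "InferenceEngineBaseTTS" and "TTSConditioning"; the fields sample_rate and samples_per_char are assumptions made for illustration, not Nova's actual conditioning parameters. Instead of a real TTS model, the placeholder "synthesis" produces a 440 Hz tone whose length depends on the text, which is enough to show how to return valid WAV bytes using Python's standard wave module.

```python
import io
import math
import struct
import wave
from dataclasses import dataclass
from typing import Optional


# --- Hypothetical stand-ins for Nova's classes (NOT the real ones) ---
@dataclass
class TTSConditioning:
    sample_rate: int = 16000
    samples_per_char: int = 800  # placeholder: tone length per input character


class InferenceEngineBaseTTS:
    """Stand-in for the real base class."""

    def __init__(self) -> None:
        self._model: Optional[str] = None


# --- The engine itself, following the structure described above ---
class InferenceEngine(InferenceEngineBaseTTS):
    def __init__(self) -> None:
        super().__init__()

    def initialize_model(self, model: str) -> None:
        # A real engine would load the TTS model into memory here.
        self._model = model

    def run_inference(self, text: str, conditioning: TTSConditioning) -> bytes:
        if self._model is None:
            raise RuntimeError("initialize_model() must be called first")
        # Placeholder synthesis: a 440 Hz sine tone as 16-bit mono PCM samples.
        n_samples = len(text) * conditioning.samples_per_char
        frames = b"".join(
            struct.pack(
                "<h",
                int(12000 * math.sin(2 * math.pi * 440 * i / conditioning.sample_rate)),
            )
            for i in range(n_samples)
        )
        # Wrap the raw samples in a WAV container and return the bytes.
        buffer = io.BytesIO()
        with wave.open(buffer, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
            wav.setframerate(conditioning.sample_rate)
            wav.writeframes(frames)
        return buffer.getvalue()


engine = InferenceEngine()
engine.initialize_model("sine-placeholder")
audio = engine.run_inference("hello", TTSConditioning())
print(audio[:4])  # WAV data starts with the RIFF marker
```

If your TTS system already returns MP3 or WAV bytes, run_inference can return them directly; the wave wrapping above is only needed when the backend hands you raw PCM samples.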