-
Notifications
You must be signed in to change notification settings - Fork 116
User guide to the available LLMs
This page explains how to configure Newelle for the best assistant experience based on your needs. Note: Newelle can support basically any LLM provider, officially or through extensions. If you think something is missing, open an issue.
By default, Newelle will use our demo API. It is limited and it's intended use is only to make new users try out Newelle before configuring.
Recommended Providers:
For a more reliable and personalized experience, we strongly recommend using:
- A local model: Ideal for maximum privacy and performance on powerful hardware with dedicated GPUs. Go to Local Models section
- A provider with an API key: there are many providers that offer LLM API for free or for very cheap.
Note: Multiple providers give free API Keys that have limits compatible for personal use. This list contains a lot of Free LLM API providers.
If you need a provider that is not in Newelle's list, you can check if there is an extension here: https://github.com/FrancescoCaracciolo/Newelle-LLMS
Newelle supports any OpenAI-Compatible provider. Instructions here Here is the list of the supported providers ordered by how recommended they are.
- Go to https://ollama.com/ and register
- Go to Ollama Settings and create a new API Key
- Click on "Add an API Key"
- Give a name to the key, create it and copy it
- Open Newelle settings.
- Choose "Ollama Cloud" as the provider. (Click the install button near that row if you have not done so)
- Paste your Ollama Cloud API key into the "API Key" field.
- Select the desired model
Groq offers a free API key with generous limits and very fast inference speeds. At the moment, they don't offer many good models, therefore it is not the best option anymore.
- Visit the Groq console: Groq Console
- Sign up with your email address (verification required).
- Navigate to the "API Keys" section and click "Create API Key."

- Give your key a name and click "Add."
- Copy the generated key.
Go in the API Key section, click on "Create API Key" and then give any name to the key. After that, click on add and copy the key that was given to you.
- Open Newelle settings.
- Choose "Groq" as the provider.
- Paste your Groq API key into the "API Key" field.
- Select the desired model.

Groq supports many of the most powerful open source AI models. You can check the list here.
Local models are the best option for maximum privacy and performance on powerful hardware with dedicated GPUs. However, they are not recommended for laptops without a dedicated GPU or if you frequently use your device unplugged. Additionally, currently available local models might offer slightly lower quality results compared to cloud-based models.
- Open Newelle settings.
- Choose "Local Model" as the provider.
- Download a compatible model from the available options.
- Select the downloaded model.

Every model has a small description that explains the model characteristics. At the moment, using other models from a file is not supported. You can track the related issue here
Google Gemini's free tier allows access to their models with limits suitable for personal use.
- Log in to Google AI Studio with your Google account: Google AI Studio
- Click "Get API Key" on the top left corner.

- Open Newelle settings.
- Important: Click the download button to install the additional modules required for using Gemini (the download is very small).
- Choose "Google Gemini API" as the provider.
- Paste your Gemini API key into the "API Key" field.
- Select the desired model.

You can see the limits for each model here. You can check the available regions for the free tier here.
Using the Open AI APIs you can not only use OpenAI API, but also any service or program that supports OpenAI-compatible API.
Note: When using inference APIs, be sure that something is set in the API Key setting, otherwise it will give you "Connection error" Some example of programs/services that support OpenAI compatible APIs:
To use Groq in Newelle, go in Newelle settings, choose "OpenAI API" as a provider, and paste the key in the "API Key" setting.

From here you can also adjust multiple settings, for example:
- API Endpoint: this must be modified if you want to use and interference API. For example, if you want to use OpenRouter you can put
https://openrouter.ai/api/v1/. By default OpenAI endpoint is set - OpenAI Model: you can choose any model to put in the model parameter of the API
Newelle allows you to use a custom command run on your system to get the output message. The output of the command will be shown as a response.

This is an example that just gives you the list of prompts as a response.
Currently there are two ways to use local models in Newelle.
Newelle provides a simple way to download and use local models without any other setup. (By default they run on CPU)
In case you want to install llamacpp with GPU acceleration, click on the "install" button, select your hardware and follow the setup wizard. You can either choose to install a prebuilt binary of Llama.cpp or compile it on your machine. Newelle already offers optimized cmake flags by default, but you can also add your own.
Clicking on model library, you can easily explore and download new models.

You can also put your own models in GGUF format in the custom models folder.
In Newelle you can use your own Ollama instance.
- Download Ollama, the instructions are here. Note that many distros already package it in their repos.
- Install a model, you can find the list of models here. For example, to install qwen3 8b:
ollama pull qwen3:8b
- Start the Ollama instance:
ollama serve
You can easily use the ollama instance by selecting it in Newelle settings.

You can specify the model you want to use, and the endpoint of the instance. (If you followed the tutorial above, leave the default option)