Doc: Add IPEX-LLM model provider (#1417)
* Add related doc for IPEX-LLM accelerated Ollama provider

* Add description in provider title and other small updates
Oscilloscope98 committed Jun 4, 2024
1 parent 321050d commit ebf2a09
Showing 2 changed files with 41 additions and 1 deletion.
39 changes: 39 additions & 0 deletions docs/docs/reference/Model Providers/ipex_llm.md
@@ -0,0 +1,39 @@
# IPEX-LLM

:::info
[**IPEX-LLM**](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLMs on Intel CPU and GPU (e.g., a local PC with an iGPU, or a discrete GPU such as Arc A-Series, Flex, or Max) with very low latency.
:::

IPEX-LLM supports running an accelerated Ollama backend on Intel GPU. Refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html) in the official IPEX-LLM documentation for how to install and run an Ollama server accelerated by IPEX-LLM on Intel GPU. You can then configure Continue to use the IPEX-LLM accelerated `"ollama"` provider as follows:

```json title="~/.continue/config.json"
{
  "models": [
    {
      "title": "IPEX-LLM",
      "provider": "ollama",
      "model": "AUTODETECT"
    }
  ]
}
```

If you would like to reach the Ollama service from another machine, make sure you set or export the environment variable `OLLAMA_HOST=0.0.0.0` before executing the command `ollama serve` (a minimal server-side sketch is shown after the configuration below). Then, in the Continue configuration, set `"apiBase"` to the IP address and port of the remote machine:

```json title="~/.continue/config.json"
{
  "models": [
    {
      "title": "IPEX-LLM",
      "provider": "ollama",
      "model": "AUTODETECT",
      "apiBase": "http://your-ollama-service-ip:11434"
    }
  ]
}
```
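On the server side, this is a minimal sketch of the commands described above (assuming Ollama has already been installed and set up with IPEX-LLM as in the linked guide; your environment setup may differ):

```bash
# On the remote Intel GPU machine: bind Ollama to all network interfaces
# so that Continue running on another machine can reach it.
export OLLAMA_HOST=0.0.0.0

# Start the IPEX-LLM accelerated Ollama server.
ollama serve
```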

:::tip
- For more configuration options regarding completion or authentication, refer to the [Ollama provider reference](./ollama.md#completion-options); a minimal example is sketched after this tip.
- If you would like to preload the model before your first conversation with it in Continue, refer to [this section](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/continue_quickstart.html#pull-and-prepare-the-model) of the IPEX-LLM documentation for more information.
:::
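For illustration only, here is a hedged sketch of a configuration with completion options added (the field names under `completionOptions` follow the Ollama provider reference linked above and the values are assumptions for demonstration, not IPEX-LLM-specific settings):

```json title="~/.continue/config.json"
{
  "models": [
    {
      "title": "IPEX-LLM",
      "provider": "ollama",
      "model": "AUTODETECT",
      "completionOptions": {
        "temperature": 0.2,
        "maxTokens": 1024
      }
    }
  ]
}
```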
3 changes: 2 additions & 1 deletion docs/docs/setup/select-provider.md
@@ -20,12 +20,13 @@ You can run a model on your local computer using:
- [LM Studio](../reference/Model%20Providers/lmstudio.md)
- [Llama.cpp](../reference/Model%20Providers/llamacpp.md)
- [KoboldCpp](../reference/Model%20Providers/openai.md) (OpenAI compatible server)
- [llamafile](../reference/Model%20Providers/llamafile) ((OpenAI compatible server)
- [llamafile](../reference/Model%20Providers/llamafile) (OpenAI compatible server)
- [LocalAI](../reference/Model%20Providers/openai.md) (OpenAI compatible server)
- [Text generation web UI](../reference/Model%20Providers/openai.md) (OpenAI compatible server)
- [FastChat](../reference/Model%20Providers/openai.md) (OpenAI compatible server)
- [llama-cpp-python](../reference/Model%20Providers/openai.md) (OpenAI compatible server)
- [TensorRT-LLM](https://github.com/NVIDIA/trt-llm-as-openai-windows?tab=readme-ov-file#examples) (OpenAI compatible server)
- [IPEX-LLM](../reference/Model%20Providers/ipex_llm.md) (Local LLM on Intel GPU)

### Remote
