> ⚠️ **Cortex.cpp is currently under development. This documentation describes the intended behavior of Cortex, which may not yet be fully implemented in the codebase.**

## Overview
Cortex.cpp is a local AI engine for running and customizing LLMs. Cortex can be deployed as a standalone server or integrated into apps like [Jan.ai](https://jan.ai/).

Cortex.cpp is multi-engine: it uses `llama.cpp` as the default engine but also supports the following:
- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`onnx`](https://github.com/janhq/cortex.onnx)
- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)

To install Cortex.cpp, download the installer for your operating system from the following options:

<table>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Version Type</b></td>
    <td style="text-align:center"><b>Windows</b></td>
    <td colspan="2" style="text-align:center"><b>MacOS</b></td>
    <td colspan="2" style="text-align:center"><b>Linux</b></td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Stable (Recommended)</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>Download</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>Debian Download</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>Fedora Download</b>
      </a>
    </td>
  </tr>
</table>

## Libraries
- [cortex.py](https://github.com/janhq/cortex-python)

## Quickstart
### CLI
```bash
# 1. Start the Cortex.cpp server (runs at localhost:3928)
cortex

# 2. Start a model
cortex run <model_id>:[engine_name]

# 3. Stop a model
cortex stop <model_id>:[engine_name]

# 4. Stop the Cortex.cpp server
cortex stop
```
### API
1. Start the API server using the `cortex` command.
2. **Pull a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/{model_id}/pull
```
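The model-management endpoints used in this walkthrough (pull, start, stop) all share one URL shape, `/v1/models/{model_id}/{action}`. A minimal sketch of that pattern, assuming the default `localhost:3928` address; `model_endpoint` is an illustrative helper, not part of Cortex itself:

```python
# Illustrative helper (not a Cortex API): builds the management URL for a
# model action, following the /v1/models/{model_id}/{action} pattern above.
BASE_URL = "http://localhost:3928"  # default Cortex.cpp address

def model_endpoint(model_id: str, action: str) -> str:
    """Return the management URL for a model action: pull, start, or stop."""
    if action not in {"pull", "start", "stop"}:
        raise ValueError(f"unknown action: {action}")
    return f"{BASE_URL}/v1/models/{model_id}/{action}"

print(model_endpoint("mistral", "pull"))
# → http://localhost:3928/v1/models/mistral/pull
```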

3. **Start a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/{model_id}/start \
  --header 'Content-Type: application/json' \
  --data '{
    ...
}'
```

4. **Chat with a Model**
```bash
curl http://localhost:3928/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    ...
}'
```

5. **Stop a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/mistral/stop
```
6. Stop the Cortex.cpp server using the `cortex stop` command.
> **Note**:
> The API server is fully compatible with the OpenAI API, making it easy to integrate with any system or tool that supports OpenAI-compatible APIs.
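Because the server is OpenAI-compatible, any OpenAI-style client can target it. A minimal Python sketch using only the standard library — it assumes a model such as `mistral` has already been started, and `build_chat_request`/`send_chat` are illustrative helper names, not Cortex APIs:

```python
import json
from urllib import request

CHAT_URL = "http://localhost:3928/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    # OpenAI-style chat payload, matching the curl example above.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def send_chat(payload: dict) -> dict:
    # Requires a running Cortex.cpp server with the model started.
    req = request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("mistral", "Hello!")
# reply = send_chat(payload)  # uncomment with a live server
```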

## Built-in Model Library
Cortex.cpp supports various models available on the [Cortex Hub](https://huggingface.co/cortexso). Once downloaded, all model source files are stored at `C:\Users\<username>\AppData\Local\cortexcpp\models` (Windows path shown).

Here are examples of models you can use, by supported engine:

| Model          | llama.cpp<br>`:gguf` | TensorRT<br>`:tensorrt` | ONNXRuntime<br>`:onnx` | Command                         |
|----------------|----------------------|-------------------------|------------------------|---------------------------------|
| llama3.1       | ✅                   |                         | ✅                     | `cortex run llama3.1:gguf`      |
| llama3         | ✅                   | ✅                      | ✅                     | `cortex run llama3`             |
| mistral        | ✅                   | ✅                      | ✅                     | `cortex run mistral`            |
| qwen2          | ✅                   |                         |                        | `cortex run qwen2:7b-gguf`      |
| codestral      | ✅                   |                         |                        | `cortex run codestral:22b-gguf` |
| command-r      | ✅                   |                         |                        | `cortex run command-r:35b-gguf` |
| gemma          | ✅                   |                         | ✅                     | `cortex run gemma`              |
| mixtral        | ✅                   |                         |                        | `cortex run mixtral:7x8b-gguf`  |
| openhermes-2.5 | ✅                   | ✅                      | ✅                     | `cortex run openhermes-2.5`     |
| phi3 (medium)  | ✅                   |                         | ✅                     | `cortex run phi3:medium`        |
| phi3 (mini)    | ✅                   |                         | ✅                     | `cortex run phi3:mini`          |
| tinyllama      | ✅                   |                         |                        | `cortex run tinyllama:1b-gguf`  |

> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
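The RAM guidance above can be expressed as a quick pre-flight check. This is only a sketch of the note's rule of thumb — the thresholds are the rounded numbers from the note, not measured requirements:

```python
def min_ram_gb(model_size_billions: float) -> int:
    """Rule-of-thumb minimum RAM from the note above:
    8 GB for ~7B, 16 GB for ~14B, 32 GB for ~32B models."""
    if model_size_billions <= 7:
        return 8
    if model_size_billions <= 14:
        return 16
    return 32

# e.g. mistral (7B) needs ~8 GB; phi3 medium (14B) needs ~16 GB
print(min_ram_gb(7), min_ram_gb(14), min_ram_gb(32))
# → 8 16 32
```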

## Cortex.cpp CLI Commands
For complete details on CLI commands, please refer to our [CLI documentation](https://cortex.so/docs/cli).

## REST API
Cortex.cpp includes a REST API accessible at `localhost:3928`. For a complete list of endpoints and their usage, visit our [API documentation](https://cortex.so/api-reference).

## Uninstallation
### Windows
1. Navigate to **Add or Remove Programs**.
2. Search for Cortex.cpp.
3. Click **Uninstall**.
4. Delete the Cortex.cpp data folder located in your home folder.
### macOS
Run the uninstaller script:
```bash
sudo sh cortex-uninstall.sh
```
> **Note**:
> The script requires sudo permission.

### Linux
```bash
sudo apt remove cortexcpp
```

## Alternate Installation
We also provide Beta and Nightly versions.
<table>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Version Type</b></td>
    <td style="text-align:center"><b>Windows</b></td>
    <td colspan="2" style="text-align:center"><b>MacOS</b></td>
    <td colspan="2" style="text-align:center"><b>Linux</b></td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Beta Build</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>cortexcpp.exe</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.deb</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.AppImage</b>
      </a>
    </td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Nightly Build</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>cortexcpp.exe</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.deb</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.AppImage</b>
      </a>
    </td>
  </tr>
</table>

### Build from Source

#### Windows
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:
# Get the help information
cortex -h
```
#### macOS
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:
# Get the help information
cortex -h
```
#### Linux
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg: