> ⚠️ **Cortex.cpp is currently under development. This documentation describes the intended behavior of Cortex, which may not yet be fully implemented in the codebase.**

## Overview
Cortex.cpp is a local AI engine for running and customizing LLMs. Cortex can be deployed as a standalone server or integrated into apps like [Jan.ai](https://jan.ai/).

Cortex.cpp is multi-engine: it uses `llama.cpp` as the default engine but also supports the following:
- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`onnx`](https://github.com/janhq/cortex.onnx)
- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)

To install Cortex.cpp, download the installer for your operating system from the following options:

<table>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Version Type</b></td>
    <td style="text-align:center"><b>Windows</b></td>
    <td colspan="2" style="text-align:center"><b>MacOS</b></td>
    <td colspan="2" style="text-align:center"><b>Linux</b></td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Stable (Recommended)</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>Download</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>Debian Download</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>Fedora Download</b>
      </a>
    </td>
  </tr>
</table>

## Libraries
- [cortex.py](https://github.com/janhq/cortex-python)

## Quickstart
### CLI
```bash
# 1. Start the Cortex.cpp server (runs at localhost:3928)
cortex

# 2. Start a model
cortex run <model_id>:[engine_name]

# 3. Stop a model
cortex stop <model_id>:[engine_name]

# 4. Stop the Cortex.cpp server
cortex stop
```
### API
1. Start the API server using the `cortex` command.
2. **Pull a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/{model_id}/pull
```
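The model-management endpoints used in this walkthrough (pull, start, stop) all share one URL shape, `/v1/models/{model_id}/{action}`. A minimal sketch of that pattern, assuming the default `localhost:3928` address; `model_endpoint` is an illustrative helper, not part of Cortex itself:

```python
# Illustrative helper (not a Cortex API): builds the management URL for a
# model action, following the /v1/models/{model_id}/{action} pattern above.
BASE_URL = "http://localhost:3928"  # default Cortex.cpp address

def model_endpoint(model_id: str, action: str) -> str:
    """Return the management URL for a model action: pull, start, or stop."""
    if action not in {"pull", "start", "stop"}:
        raise ValueError(f"unknown action: {action}")
    return f"{BASE_URL}/v1/models/{model_id}/{action}"

print(model_endpoint("mistral", "pull"))
# → http://localhost:3928/v1/models/mistral/pull
```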

3. **Start a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/{model_id}/start \
  --header 'Content-Type: application/json' \
  --data '{
    ...
}'
```

4. **Chat with a Model**
```bash
curl http://localhost:3928/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    ...
}'
```

5. **Stop a Model**
```bash
curl --request POST \
  --url http://localhost:3928/v1/models/mistral/stop
```
6. Stop the Cortex.cpp server using the `cortex stop` command.
> **Note**:
> The API server is fully compatible with the OpenAI API, making it easy to integrate with any system or tool that supports OpenAI-compatible APIs.
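Because the server is OpenAI-compatible, any OpenAI-style client can target it. A minimal Python sketch using only the standard library — it assumes a model such as `mistral` has already been started, and `build_chat_request`/`send_chat` are illustrative helper names, not Cortex APIs:

```python
import json
from urllib import request

CHAT_URL = "http://localhost:3928/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    # OpenAI-style chat payload, matching the curl example above.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def send_chat(payload: dict) -> dict:
    # Requires a running Cortex.cpp server with the model started.
    req = request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("mistral", "Hello!")
# reply = send_chat(payload)  # uncomment with a live server
```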

## Built-in Model Library
Cortex.cpp supports various models available on the [Cortex Hub](https://huggingface.co/cortexso). Once downloaded, all model source files are stored at `C:\Users\<username>\AppData\Local\cortexcpp\models` (Windows path shown).

Here are examples of models you can use, by supported engine:

| Model          | llama.cpp<br>`:gguf` | TensorRT<br>`:tensorrt` | ONNXRuntime<br>`:onnx` | Command                         |
|----------------|----------------------|-------------------------|------------------------|---------------------------------|
| llama3.1       | ✅                   |                         | ✅                     | `cortex run llama3.1:gguf`      |
| llama3         | ✅                   | ✅                      | ✅                     | `cortex run llama3`             |
| mistral        | ✅                   | ✅                      | ✅                     | `cortex run mistral`            |
| qwen2          | ✅                   |                         |                        | `cortex run qwen2:7b-gguf`      |
| codestral      | ✅                   |                         |                        | `cortex run codestral:22b-gguf` |
| command-r      | ✅                   |                         |                        | `cortex run command-r:35b-gguf` |
| gemma          | ✅                   |                         | ✅                     | `cortex run gemma`              |
| mixtral        | ✅                   |                         |                        | `cortex run mixtral:7x8b-gguf`  |
| openhermes-2.5 | ✅                   | ✅                      | ✅                     | `cortex run openhermes-2.5`     |
| phi3 (medium)  | ✅                   |                         | ✅                     | `cortex run phi3:medium`        |
| phi3 (mini)    | ✅                   |                         | ✅                     | `cortex run phi3:mini`          |
| tinyllama      | ✅                   |                         |                        | `cortex run tinyllama:1b-gguf`  |

> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
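The RAM guidance above can be expressed as a quick pre-flight check. This is only a sketch of the note's rule of thumb — the thresholds are the rounded numbers from the note, not measured requirements:

```python
def min_ram_gb(model_size_billions: float) -> int:
    """Rule-of-thumb minimum RAM from the note above:
    8 GB for ~7B, 16 GB for ~14B, 32 GB for ~32B models."""
    if model_size_billions <= 7:
        return 8
    if model_size_billions <= 14:
        return 16
    return 32

# e.g. mistral (7B) needs ~8 GB; phi3 medium (14B) needs ~16 GB
print(min_ram_gb(7), min_ram_gb(14), min_ram_gb(32))
# → 8 16 32
```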

## Cortex.cpp CLI Commands
For complete details on CLI commands, please refer to our [CLI documentation](https://cortex.so/docs/cli).

## REST API
Cortex.cpp includes a REST API accessible at `localhost:3928`. For a complete list of endpoints and their usage, visit our [API documentation](https://cortex.so/api-reference).

## Uninstallation
### Windows
1. Navigate to **Add or Remove Programs**.
2. Search for Cortex.cpp.
3. Click **Uninstall**.
4. Delete the Cortex.cpp data folder located in your home folder.
### macOS
Run the uninstaller script:
```bash
sudo sh cortex-uninstall.sh
```
> **Note**:
> The script requires sudo permission.

### Linux
```bash
sudo apt remove cortexcpp
```

## Alternate Installation
We also provide Beta and Nightly versions.
<table>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Version Type</b></td>
    <td style="text-align:center"><b>Windows</b></td>
    <td colspan="2" style="text-align:center"><b>MacOS</b></td>
    <td colspan="2" style="text-align:center"><b>Linux</b></td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Beta Build</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>cortexcpp.exe</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.deb</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.AppImage</b>
      </a>
    </td>
  </tr>
  <tr style="text-align:center">
    <td style="text-align:center"><b>Nightly Build</b></td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/windows.png' style="height:14px; width: 14px" />
        <b>cortexcpp.exe</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>Intel</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/mac.png' style="height:15px; width: 15px" />
        <b>M1/M2/M3/M4</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.deb</b>
      </a>
    </td>
    <td style="text-align:center">
      <a href='https://github.com/janhq/cortex.cpp/releases'>
        <img src='https://github.com/janhq/docs/blob/main/static/img/linux.png' style="height:14px; width: 14px" />
        <b>cortexcpp.AppImage</b>
      </a>
    </td>
  </tr>
</table>

### Build from Source

#### Windows
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:
# Get the help information
cortex -h
```
#### macOS
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:
# Get the help information
cortex -h
```
#### Linux
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg: