---
title: Cortex Basic Usage
description: Cortex Usage Overview
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

Cortex has an [API server](https://cortex.so/api-reference) that runs at `localhost:39281`.

The server port can be configured in [`.cortexrc`](/docs/architecture/cortexrc) via the `apiServerPort` parameter.
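To double-check which port is configured before starting the server, you can inspect `.cortexrc` directly. This is a minimal sketch that assumes the file lives in your home directory (see the linked page for its actual location):

```bash
# Show the configured API server port (assumes ~/.cortexrc is the config location)
grep apiServerPort ~/.cortexrc
```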
## Server
### Start Cortex Server
```bash
# By default, the server starts on port 39281
cortex
# Start the server on a different address and port
cortex -a <address> -p <port_number>
# Set the data folder directory
cortex --dataFolder <dataFolderPath>
```
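For example, to serve the API on port 5000 of the local interface (the address and port below are illustrative):

```bash
# Start Cortex on 127.0.0.1:5000 instead of the default 39281
cortex -a 127.0.0.1 -p 5000
```

Subsequent API calls would then target `http://127.0.0.1:5000` rather than port `39281`.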
### Terminate Cortex Server
```bash
curl --request DELETE \
  --url http://127.0.0.1:39281/processManager/destroy
```

## Engines
Cortex currently supports three industry-standard engines: llama.cpp, ONNXRuntime, and TensorRT-LLM.

By default, Cortex installs the llama.cpp engine, which runs on most laptops, desktops, and operating systems.

For more information, see [Engine Management](/docs/engines).
### List Available Engines
```bash
curl --request GET \
  --url http://127.0.0.1:39281/v1/engines
```

### Install an Engine (e.g. llama-cpp)
```bash
curl --request POST \
  --url http://127.0.0.1:39281/v1/engines/install/llama-cpp
```

## Manage Models
### Pull Model
```bash
curl --request POST \
  --url http://127.0.0.1:39281/v1/models/pull \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "tinyllama:gguf",
    "id": "my-custom-model-id"
}'
```
If a download is interrupted, re-sending this request resumes it and fetches the remaining model files.

The downloaded models are saved to the [Cortex Data Folder](/docs/architecture/data-folder).
### Stop Model Download
```bash
curl --request DELETE \
  --url http://127.0.0.1:39281/v1/models/pull \
  --header 'Content-Type: application/json' \
  --data '{
    "taskId": "tinyllama:1b-gguf"
}'
```

### List Models
```bash
curl --request GET \
  --url http://127.0.0.1:39281/v1/models
```
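Every endpoint returns JSON, so if `jq` is installed you can pretty-print responses for easier reading. Shown here with the model list, but the same pattern works for any of the calls above:

```bash
# Pretty-print the model list; -s hides curl's progress meter, `jq .` formats the JSON
curl -s --request GET \
  --url http://127.0.0.1:39281/v1/models | jq .
```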
### Delete Model
```bash
curl --request DELETE \
  --url http://127.0.0.1:39281/v1/models/tinyllama:1b-gguf
```

## Run Models
### Start Model
```bash
# Start the model
curl --request POST \
  --url http://127.0.0.1:39281/v1/models/start \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "tinyllama:1b-gguf"
}'
```
### Create Chat Completion
```bash
# Invoke the chat completions endpoint
curl --request POST \
  --url http://127.0.0.1:39281/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Write a Haiku about cats and AI"
      }
    ],
    "model": "tinyllama:1b-gguf",
    "stream": false
}'
```
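The request above asks for a single JSON response (`"stream": false`). Because the endpoint follows the OpenAI chat-completions format, a streamed reply should also work; the sketch below assumes the server emits incremental chunks when `"stream": true`:

```bash
# Request a streamed completion; -N turns off curl's buffering so chunks
# are printed as soon as they arrive
curl -N --request POST \
  --url http://127.0.0.1:39281/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [{"role": "user", "content": "Write a Haiku about cats and AI"}],
    "model": "tinyllama:1b-gguf",
    "stream": true
}'
```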
### Stop Model
```bash
curl --request POST \
  --url http://127.0.0.1:39281/v1/models/stop \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "tinyllama:1b-gguf"
}'
```