This repository was archived by the owner on Jul 4, 2025. It is now read-only.
epic: Cortex Model Structures and simplified cortex run #1512
Closed
Labels
- category: model management (Model pull, yaml, model state)
- category: model running (Inference ux, handling context/parameters, runtime)
- type: epic (A major feature or initiative)
Description
Problem
- Model names are complex because they capture repo, source, version, and alias
- Model names should be simplified and compact so that they do not overwhelm users
- We need a clearer way to handle Model names for `models list`, `cortex pull` and `cortex run`
- This affects the core logic of how we handle models in `cortex.db`
Success Criteria
- `cortex pull` models (huggingface, cortexso) successfully pulls model
- `cortex.db` saves the right fields for pulled model
- `cortex models list` shows simplified table
- `cortex run` uses regex to ask user which model ID they want to run
repo, version, source, id, alias
Tasklist / Sub-issues
to be added
Eng Specs
from #1410
Concepts
- Model Repo: i.e. tinyllama, or bartowski/...
- Model Source: huggingface, cortex
- Model Version: i.e. specific quant, that belongs to a Model Repo
- Model ID: should be :
- Model Alias: user-defined shortname - Deprecated in favour of regex
1. Models Table & cortex models list
The Models table in cortex.db still remains the same as before
| model table | Description |
|---|---|
| model | Unique identifier for the model |
| author_repo_id | Author or repository identifier |
| branch_name | Doesn't exist |
| path_to_model_yaml | Path to the model's YAML file |
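As a rough illustration of the four fields above, here is a minimal Python sketch that stores them in SQLite. The table name `models`, the column types, the helper names `open_db` and `list_model_ids`, and the in-memory database are all assumptions made for this example; the spec only names the fields.

```python
import sqlite3

# Hypothetical schema: table name and column types are assumptions,
# only the four field names come from the spec.
SCHEMA = """
CREATE TABLE IF NOT EXISTS models (
    model              TEXT PRIMARY KEY,  -- unique identifier for the model
    author_repo_id     TEXT,              -- author or repository identifier
    branch_name        TEXT,              -- described in the spec as "Doesn't exist"
    path_to_model_yaml TEXT               -- path to the model's YAML file
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def list_model_ids(conn):
    """Return every stored model ID, sorted alphabetically."""
    rows = conn.execute("SELECT model FROM models ORDER BY model").fetchall()
    return [row[0] for row in rows]
```

Because `model` is declared as the primary key in this sketch, inserting the same model ID twice would raise `sqlite3.IntegrityError`, which is one way to keep the identifier unique.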
The result of `cortex models list` will be simplified as follows:
$ cortex models list
| Index | Model ID |
|-------|---------------------------------------------------------------------|
| 1 | tinyllama:1b-gguf |
| 2 | tinyllama:1b-gguf |
| 3 | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m |
| 4 | mistral:7b |
| 5 | nvidia-cloud/Mistral-Nemo-12b:int4 |
| 6 | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant |
- The `engine` will be inferred from the `model.yml` of each model
- We will read the `model.yml` through `path_to_model_yaml`.
- `Model ID` in the `model list` command result = `model` field in the `Models` table of `cortex.db`
When running, we would find the matching ID from the database. The model list could also include an option to filter models by a search term:
$ cortex models list mis
| Index | Model ID |
|-------|---------------------------------------------------------------------|
| 1 | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m |
| 2 | mistral:7b |
| 3 | nvidia-cloud/Mistral-Nemo-12b:int4 |
| 4 | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant |
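The filtered listing above could be produced by something like the following Python sketch. The function name `filter_model_ids` is hypothetical, and case-insensitive substring matching is an assumption inferred from the example, where `mis` matches both `mistral:7b` and `Mistral-Nemo-12b`.

```python
# Model IDs copied from the example listing above.
MODEL_IDS = [
    "tinyllama:1b-gguf",
    "tinyllama:1b-gguf",
    "bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m",
    "mistral:7b",
    "nvidia-cloud/Mistral-Nemo-12b:int4",
    "huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant",
]


def filter_model_ids(model_ids, query=""):
    """Keep the IDs that contain `query`, ignoring case.

    An empty query keeps every ID, mirroring `cortex models list`
    without an argument. Substring matching is an assumption.
    """
    needle = query.lower()
    return [model_id for model_id in model_ids if needle in model_id.lower()]


if __name__ == "__main__":
    for index, model_id in enumerate(filter_model_ids(MODEL_IDS, "mis"), start=1):
        print(f"| {index} | {model_id} |")
```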
2. cortex run with regex search (Deprecate model aliases)
- If only one model matches, we run that model
- If there are multiple models matched, we show a menu for the user to choose
- If no arg, we show all models and let user choose via menu.
- update API to match with this change
- we would no longer need model alias field.
Logic:
$ cortex models list
| Index | Model ID |
|-------|---------------------------------------------------------------------|
| 1 | tinyllama:1b-gguf |
| 2 | tinyllama:1b-gguf |
| 3 | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m |
| 4 | mistral:7b |
| 5 | nvidia-cloud/Mistral-Nemo-12b:int4 |
| 6 | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant |
$ cortex run mis
Please select an option:
1. mistral-nemo:12b-gguf-q8
2. huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant
3. mistral:7b
4. nvidia-cloud/Mistral-Nemo-12b:int4
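The selection rules for `cortex run` can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `resolve_run_target`, the `(action, matches)` return shape, case-insensitive matching, and the `"not_found"` outcome for zero matches are not part of the spec, which does not say what happens when nothing matches.

```python
import re


def resolve_run_target(model_ids, pattern=None):
    """Decide what `cortex run [pattern]` should do (hypothetical helper).

    Returns a tuple (action, matches):
      ("menu", ids)      no argument, or several IDs match: let the user choose
      ("run", [id])      exactly one ID matches: run it directly
      ("not_found", [])  nothing matches (assumed behaviour, not in the spec)
    """
    if not pattern:
        # No argument: show all models and let the user choose.
        return ("menu", list(model_ids))

    # Case-insensitive regex search is an assumption of this sketch.
    regex = re.compile(pattern, re.IGNORECASE)
    matches = [model_id for model_id in model_ids if regex.search(model_id)]

    if len(matches) == 1:
        return ("run", matches)
    if matches:
        return ("menu", matches)
    return ("not_found", [])
```

A caller would then either start the single matched model or print the numbered menu from `matches` and read the user's choice.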
Bug tracking
- bug: Could not able to run models since 172 #1505 (wontfix; should be resolved by this)