This repository was archived by the owner on Jul 4, 2025. It is now read-only.

epic: Cortex Model Structures and simplified cortex run #1512

@gabrielle-ong

Description

Problem

  • Model names are complex because they encode repo, source, version, and alias
  • Model names should be simple and compact, so users are not overwhelmed by long names
  • We need a clearer way to handle model names in cortex models list, cortex pull, and cortex run
  • This affects the core logic of how we handle models in cortex.db

Success Criteria

  1. cortex pull (huggingface, cortexso) successfully pulls the model
  2. cortex.db saves the right fields for the pulled model
  3. cortex models list shows a simplified table
  4. cortex run uses regex matching to ask the user which Model ID they want to run

repo, version, source, id, alias

Tasklist / Sub-issues

to be added

Eng Specs

from #1410

Concepts

  1. Model Repo: e.g. tinyllama, or bartowski/...
  2. Model Source: huggingface, cortex
  3. Model Version: e.g. a specific quant, which belongs to a Model Repo
  4. Model ID: should be :
  5. Model Alias: user-defined shortname - Deprecated in favour of regex
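
The sample IDs used later in this issue (for example tinyllama:1b-gguf) suggest that a Model ID joins a Model Repo and a Model Version with a colon. A minimal Python sketch under that assumption; `ModelRef` and `split_model_id` are hypothetical names for illustration, not Cortex functions:

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    repo: str
    version: Optional[str]


def split_model_id(model_id: str) -> ModelRef:
    """Split a Model ID on its last colon (assumed '<repo>:<version>' shape)."""
    repo, sep, version = model_id.rpartition(":")
    if not sep:
        # No colon at all: treat the whole string as the repo.
        return ModelRef(repo=model_id, version=None)
    return ModelRef(repo=repo, version=version)
```

Splitting on the last colon keeps any source prefix, such as huggingface.co/bartowski/..., inside the repo part.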

1. Models Table & cortex models list

The Models table in cortex.db remains the same as before:

| Models table column | Description                     |
|---------------------|---------------------------------|
| model               | Unique identifier for the model |
| author_repo_id      | Author or repository identifier |
| branch_name         | Doesn't exist                   |
| path_to_model_yaml  | Path to the model's YAML file   |
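
For illustration, the table above could be sketched with Python's built-in sqlite3 module. This assumes cortex.db is a SQLite database; the table name `models`, the column types, and the primary key are assumptions, since the issue only names the columns:

```python
import sqlite3

# Column names come from the issue; types and the primary key are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS models (
    model              TEXT PRIMARY KEY,  -- the Model ID shown to users
    author_repo_id     TEXT,
    branch_name        TEXT,              -- listed in the issue as "Doesn't exist"
    path_to_model_yaml TEXT
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def list_model_ids(conn: sqlite3.Connection) -> list:
    """Return every stored Model ID, sorted for stable output."""
    rows = conn.execute("SELECT model FROM models ORDER BY model")
    return [row[0] for row in rows]
```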

The output of cortex models list will be simplified as follows:

$ cortex models list
| Index | Model ID                                                           |
|-------|---------------------------------------------------------------------|
| 1     | tinyllama:1b-gguf                                                   |
| 2     | tinyllama:1b-gguf                                                   |
| 3     | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m     |
| 4     | mistral:7b                                                          |
| 5     | nvidia-cloud/Mistral-Nemo-12b:int4                                  |
| 6     | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant             |

  • The engine will be inferred from each model's model.yml
  • We will read model.yml through path_to_model_yaml
  • The Model ID shown by cortex models list is the model field in the Models table of cortex.db
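
As a rough sketch of inferring the engine, the snippet below reads a top-level `engine:` entry from the text of a model.yml. The key name `engine` is an assumption (the issue does not show the file's layout), and `read_engine` is a hypothetical helper; a real implementation would use a proper YAML parser:

```python
from typing import Optional


def read_engine(yaml_text: str) -> Optional[str]:
    """Return the value of a top-level 'engine:' key, or None if absent."""
    for line in yaml_text.splitlines():
        # Drop trailing comments; indented (nested) keys are ignored
        # because the line is not left-stripped before matching.
        content = line.split("#", 1)[0].rstrip()
        if content.startswith("engine:"):
            value = content[len("engine:"):].strip().strip("'\"")
            return value or None
    return None
```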

When running a model, we find the matching ID in the database. cortex models list could also accept an optional argument to filter the models shown:

$ cortex models list mis
| Index | Model ID                                                           |
|-------|---------------------------------------------------------------------|
| 1     | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m     |
| 2     | mistral:7b                                                          |
| 3     | nvidia-cloud/Mistral-Nemo-12b:int4                                  |
| 4     | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant             |
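
The filter shown above can be sketched as a regex search over the stored Model IDs. Case-insensitive matching is an assumption, inferred from `mis` matching `Mistral` in the example; `filter_models` is a hypothetical name:

```python
import re
from typing import Iterable, List, Optional


def filter_models(model_ids: Iterable[str], pattern: Optional[str] = None) -> List[str]:
    """Keep the IDs that the pattern matches anywhere; no pattern keeps all."""
    if not pattern:
        return list(model_ids)
    rx = re.compile(pattern, re.IGNORECASE)
    return [model_id for model_id in model_ids if rx.search(model_id)]
```

Using `search` rather than `match` lets a short fragment such as `mis` hit IDs where the text appears after a repo prefix.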

2. cortex run with regex search (Deprecate model aliases)

  • If only one model matches, we run that model
  • If multiple models match, we show a menu for the user to choose from
  • If no argument is given, we show all models and let the user choose via the menu
  • Update the API to match this change
  • We no longer need the model alias field

Logic:

$ cortex models list
| Index | Model ID                                                           |
|-------|---------------------------------------------------------------------|
| 1     | tinyllama:1b-gguf                                                   |
| 2     | tinyllama:1b-gguf                                                   |
| 3     | bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m     |
| 4     | mistral:7b                                                          |
| 5     | nvidia-cloud/Mistral-Nemo-12b:int4                                  |
| 6     | huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant             |

$ cortex run mis
Please select an option:
1. bartowski/Mistral-8b-instruct-gguf:Mistral-8b-instruct-8b.q4k_m
2. mistral:7b
3. nvidia-cloud/Mistral-Nemo-12b:int4
4. huggingface.co/bartowski/Mistral-8b-instruct-gguf:quant
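
The selection rules for cortex run could be sketched as follows. `resolve_model` and `prompt_choice` are hypothetical helpers, not part of the Cortex codebase, and returning `None` when nothing matches is an assumption because the issue does not specify that case:

```python
import re
from typing import Callable, List, Optional, Sequence


def prompt_choice(options: Sequence[str]) -> str:
    """Print a numbered menu and return the option the user picks."""
    print("Please select an option:")
    for index, option in enumerate(options, start=1):
        print(f"{index}. {option}")
    return options[int(input("> ")) - 1]


def resolve_model(
    model_ids: Sequence[str],
    query: Optional[str] = None,
    choose: Callable[[Sequence[str]], str] = prompt_choice,
) -> Optional[str]:
    """Pick the Model ID to run from the stored IDs and an optional regex."""
    candidates: List[str] = list(model_ids)
    if query:
        rx = re.compile(query, re.IGNORECASE)
        candidates = [m for m in candidates if rx.search(m)]
    if not candidates:
        return None           # assumed behaviour: nothing to run
    if len(candidates) == 1:
        return candidates[0]  # a single match runs directly
    return choose(candidates)  # several matches (or no query): show the menu
```

Passing the menu as a callback keeps the matching rules testable without a terminal.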

Bug tracking
