This repository was archived by the owner on Jul 4, 2025. It is now read-only.

enhancement: improved cortex model hub #1927

@ramonpzg

Description


The cortex hub currently lists the models that can be pulled with cortex pull <model-name>, without any additional information about each model. In the future, we want the hub to provide high- and low-level details per model: how it performs on the different hardware cortex can be deployed to, and how it scores on commonly used benchmarks like MMLU or SWE, on both regular and custom hardware. The information won't be the same as the Model Cards provided by HuggingFace; it will be more akin to a menu of useful information from which users can pick whichever recipe suits them best.

For example, each row will have a little drop-down arrow on the right-hand side:

Image

Each arrow will reveal a menu of metrics that the user can tweak for each model.

Image

Each model will also have a dedicated page with additional information that won't fit in the table above. The page will loosely resemble a model card but will focus on benchmarks, alongside a mini-tutorial for usage with Cortex. We'll call it a "Bench Card."

Image

We would like this to be fully automated:

  1. Pick a model
  2. Trigger a script that
    1. Quantizes the model
    2. Runs benchmarks on different hardware
    3. Captures the results
  3. Have an LLM describe
  4. Generate a YAML file and populate it
  5. Add it to the Hub
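
As a rough illustration of the benchmarking part of these steps, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `BenchResult`, `run_pipeline`, and `fake_token_stream` are hypothetical names, not Cortex APIs, and the quantization step is only a placeholder string. The one real measurement is time-to-first-token, taken against any iterable of tokens. The LLM description and the upload to the Hub are left out.

```python
import platform
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator


@dataclass
class BenchResult:
    """Metrics captured for one model on one architecture (hypothetical shape)."""
    architecture: str
    metrics: Dict[str, float] = field(default_factory=dict)


def time_to_first_token(stream: Iterable[str]) -> float:
    """Return the seconds elapsed until the stream yields its first token."""
    start = time.perf_counter()
    for _ in stream:
        return time.perf_counter() - start
    raise ValueError("stream produced no tokens")


def fake_token_stream(delay_s: float = 0.01) -> Iterator[str]:
    """Stand-in for a model's streaming output; NOT a Cortex API."""
    time.sleep(delay_s)  # simulated prompt-processing latency
    yield "hello"
    yield "world"


def run_pipeline(model: str, generate: Callable[[], Iterable[str]]) -> dict:
    """Pick a model, 'quantize' it (placeholder only), benchmark, capture results."""
    quantized = f"{model}-quantized"  # placeholder: no real quantization happens
    result = BenchResult(architecture=platform.machine() or "unknown")
    result.metrics["time-to-first-token"] = time_to_first_token(generate())
    return {"name": model, "artifact": quantized, "results": [result]}
```

Calling `run_pipeline("llama3", fake_token_stream)` returns a dictionary with one `BenchResult` for the current machine. In a real setup, `generate` would wrap the actual inference call and the script would run once per target hardware.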

For the YAML file, we can take inspiration from the one used in the model cards and have something like:

model-index:
  - name: llama3
    results:
      - task:
          type: hardware-benchmark
        architecture:
          - name: x86
            time-to-first-token: 0.7
            # ...
          - name: arm
            # ...
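
To show how step 4 (generating and populating the YAML file) might look, here is a small Python sketch that renders captured metrics as text. It assumes each architecture entry holds its metrics as plain keys next to `name`; that layout is a guess based on the draft in this issue, not a finalized schema, and `render_bench_yaml` is a hypothetical helper. It does no quoting or escaping, so it only suits simple names and numeric values.

```python
from typing import Dict


def render_bench_yaml(model: str, results: Dict[str, Dict[str, float]]) -> str:
    """Render captured metrics as YAML text.

    `results` maps an architecture name to its metrics, for example
    {"x86": {"time-to-first-token": 0.7}}. The layout is an assumption
    modelled on the draft in this issue.
    """
    lines = [
        "model-index:",
        f"  - name: {model}",
        "    results:",
        "      - task:",
        "          type: hardware-benchmark",
        "        architecture:",
    ]
    for arch, metrics in results.items():
        lines.append(f"          - name: {arch}")
        for key, value in metrics.items():
            # Metrics sit at the same indentation as `name` within the entry.
            lines.append(f"            {key}: {value}")
    return "\n".join(lines) + "\n"
```

For example, `render_bench_yaml("llama3", {"x86": {"time-to-first-token": 0.7}})` produces a `model-index` document with a single `x86` entry. A production version would more likely use a YAML library so that unusual strings are escaped correctly.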
