Description
Resources
- #1125
Original Post
- Focus the team on basic functionality, e.g. install and inference calls
- Out of scope: model quality testing (for now)
Problem
We currently test the end-to-end functionality of various models in the Hugging Face Cortex Hub manually. This process is time-consuming and prone to human error, leading to inconsistent test results.
Success Criteria
I want to have an automated end-to-end testing framework set up for the most common models in the Hugging Face Cortex Hub. This framework should automatically run tests for the following models:
- cortexso/llama3
- cortexso/llama3.1
- cortexso/gemma
- cortexso/gemma2
- cortexso/phi3
- cortexso/mistral
- cortexso/openhermes-2.5
- cortexso/tinyllama
- cortexso/qwen2
The tests should run either on weekends or whenever a new version of llama.cpp is released.
The results should be easily accessible and provide clear feedback on the models' performance and functionality.
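As a rough sketch of what one such test could look like, the following parametrized pytest case installs a model and makes a single inference call. The `cortex pull` and `cortex run` subcommands, the `--detach` flag, and the local API port are assumptions about the CLI, not confirmed details; adjust them to the actual interface.

```python
# Hypothetical E2E test sketch: assumes a `cortex` CLI with `pull`/`run`
# subcommands and an OpenAI-compatible chat endpoint served locally.
import subprocess

import pytest
import requests

MODELS = [
    "cortexso/llama3",
    "cortexso/llama3.1",
    "cortexso/gemma",
    "cortexso/gemma2",
    "cortexso/phi3",
    "cortexso/mistral",
    "cortexso/openhermes-2.5",
    "cortexso/tinyllama",
    "cortexso/qwen2",
]

API_URL = "http://127.0.0.1:39281/v1/chat/completions"  # port is an assumption


@pytest.mark.parametrize("model", MODELS)
def test_install_and_inference(model):
    # Step 1: install - pull the model from the Hugging Face Cortex Hub.
    pull = subprocess.run(["cortex", "pull", model], capture_output=True, timeout=1800)
    assert pull.returncode == 0, pull.stderr.decode()

    # Step 2: load the model. `--detach` is assumed; use whatever flag
    # the CLI actually provides to background the model server.
    run = subprocess.run(["cortex", "run", model, "--detach"], capture_output=True, timeout=600)
    assert run.returncode == 0, run.stderr.decode()

    # Step 3: inference call - a single chat completion is enough to prove
    # the end-to-end path works; model quality testing is out of scope.
    resp = requests.post(
        API_URL,
        json={
            "model": model,
            "messages": [{"role": "user", "content": "Say hello in one word."}],
            "max_tokens": 8,
        },
        timeout=120,
    )
    assert resp.status_code == 200
    assert resp.json()["choices"][0]["message"]["content"]
```

Keeping each test to install-plus-one-inference-call matches the scope above: it verifies the pipeline end to end without asserting anything about output quality.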
Additional Context
Automating the testing process will not only save time but also ensure that any changes or updates to the models do not break existing functionality. It would be beneficial to integrate this testing with CI/CD pipelines so that any new model version is automatically tested before deployment.
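One way to wire the "new llama.cpp release" trigger into a CI job could be a small check script that compares the latest published tag against the last one tested. The GitHub releases endpoint is real, but the state-file name and exit-code convention below are illustrative assumptions, not an existing setup.

```python
# Hypothetical trigger check: exit 0 if llama.cpp has a release we have
# not tested yet, so a CI job can gate the test suite on this script.
import json
import pathlib

import requests

STATE = pathlib.Path("last_tested_release.json")  # illustrative state file


def latest_llamacpp_tag() -> str:
    # GitHub's public REST API: latest published release for a repo.
    resp = requests.get(
        "https://api.github.com/repos/ggerganov/llama.cpp/releases/latest",
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["tag_name"]


def should_run_suite() -> bool:
    tag = latest_llamacpp_tag()
    last = json.loads(STATE.read_text())["tag"] if STATE.exists() else None
    if tag != last:
        STATE.write_text(json.dumps({"tag": tag}))
        return True
    return False


if __name__ == "__main__":
    # Exit code 0 means "new release found, run the tests".
    raise SystemExit(0 if should_run_suite() else 1)
```

The weekend runs could then be a plain scheduled CI job, with this script deciding whether an extra run is needed between schedules.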