Assessment: AI assessment pipeline #791

@vprashrex

Is your feature request related to a problem?
Partners currently run separate AI assessment flows by hand. This duplicates work and leaves no standardized way to compare evaluations on the same dataset.

Describe the solution you'd like
Build a shared Kaapi assessment pipeline that includes:

  • Dataset APIs
  • Multimodal processing
  • Multiple config-based batch runs
  • Run status tracking
  • Retries
  • Cron polling
  • SSE updates
  • Result exports
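
As a rough illustration of how config-based batch runs, run status tracking, and retries from the list above could fit together, here is a minimal Python sketch. Every name in it (`RunStatus`, `AssessmentRun`, `execute_run`, `run_batch`, the `evaluate` callback, and the `max_attempts` default) is hypothetical and not taken from the Kaapi codebase. It also assumes a failed run is retried as a whole rather than row by row.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RunStatus(str, Enum):
    # Hypothetical lifecycle states for one assessment run.
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class AssessmentRun:
    # One run = one config applied to the whole dataset.
    config_name: str
    status: RunStatus = RunStatus.PENDING
    attempts: int = 0
    results: list = field(default_factory=list)
    error: Optional[str] = None


def execute_run(run, dataset, evaluate, max_attempts=3):
    """Evaluate every row under one config, retrying the whole run on failure."""
    while run.attempts < max_attempts:
        run.attempts += 1
        run.status = RunStatus.RUNNING
        try:
            run.results = [evaluate(run.config_name, row) for row in dataset]
            run.status = RunStatus.COMPLETED
            run.error = None
            return run
        except Exception as exc:  # assumption: any exception is retryable
            run.error = str(exc)
    run.status = RunStatus.FAILED
    return run


def run_batch(configs, dataset, evaluate, max_attempts=3):
    """Run the same dataset once per config so results are directly comparable."""
    return [
        execute_run(AssessmentRun(name), dataset, evaluate, max_attempts)
        for name in configs
    ]
```

Because every config sees the same dataset rows, the returned runs can be compared side by side. A real pipeline would presumably persist each status change so that pollers and clients can observe it.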

Why is this enhancement needed?
This enhancement will reduce duplicate partner efforts, ensure evaluations are repeatable, and improve scalability for high-volume programs like Inquilab.

Original issue

Describe the current behavior
Partners run similar AI assessment flows separately, with manual prompt testing and no shared way to compare evaluations and configs on the same dataset.

Describe the enhancement you'd like
Build a shared Kaapi assessment pipeline with dataset APIs, multimodal processing, multiple config-based batch runs, run status tracking, retries, cron polling, SSE updates, and result exports.
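
Two of the pieces named here, SSE updates and result exports, are mostly a matter of wire and file formats. The Python sketch below shows one possible shape for each, using only the standard library. The function names, the `run_status` event name, and the payload fields are illustrative assumptions, not an existing Kaapi API.

```python
import csv
import io
import json


def sse_event(event, data):
    """Frame one Server-Sent Events message.

    An SSE message is a set of "field: value" lines ended by a blank line.
    json.dumps emits a single line by default, so it fits one data field.
    """
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def export_results_csv(rows):
    """Serialize result rows (dicts sharing the same keys) to CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A streaming endpoint could yield `sse_event("run_status", {...})` each time a run changes state, and the export function could back a download once a run completes.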

Why is this enhancement needed?
Reduces duplicate partner effort, makes evaluations repeatable, and scales better for high-volume programs (like Inquilab).


Labels

enhancement (New feature or request)
