[feature request] Cache predictions in the evaluation pipeline #14

@NISH1001

Description

What

Currently, evalem.pipelines.SimpleEvaluationPipeline is stateless: forward-pass results (both inference outputs and evaluation results) aren't cached within the pipeline object. This is fine for inference + evaluation on a small sample size. However, for a larger dataset, say the full SQuAD v2 train split of ~86k samples, re-running inference to get predictions is time-consuming whenever we want to swap out the Evaluator object.

Why

To speed up evaluation without re-running the forward pass on a huge dataset.
This would also help with debugging on large samples: right now, runtime errors (say, a tokenization error caused by weird texts) only surface at a late stage in the pipeline, which is a bummer to catch.

How

Maybe we can have a new CachedSimpleEvaluationPipeline, or something like that, that can persist predictions to, and load them from, external files (text, JSON, etc.).
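One possible shape for this, as a minimal sketch: the class name CachedSimpleEvaluationPipeline comes from this issue, but its constructor arguments, the run signature, and the JSON cache layout below are all illustrative assumptions, not the actual evalem API.

```python
import hashlib
import json
from pathlib import Path


class CachedSimpleEvaluationPipeline:
    """Hypothetical caching pipeline: runs the model's forward pass once,
    persists predictions to a JSON file, and reloads them on later runs,
    so swapping evaluators no longer re-triggers inference."""

    def __init__(self, model, evaluators, cache_dir="cache"):
        self.model = model              # callable: inputs -> predictions
        self.evaluators = evaluators    # callables: (predictions, references) -> metric
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _cache_path(self, inputs):
        # Key the cache on the inputs themselves, so a changed
        # dataset automatically invalidates the cached predictions.
        key = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
        return self.cache_dir / f"predictions-{key}.json"

    def run(self, inputs, references):
        path = self._cache_path(inputs)
        if path.exists():
            # Cache hit: skip the expensive forward pass entirely.
            predictions = json.loads(path.read_text())
        else:
            predictions = self.model(inputs)
            path.write_text(json.dumps(predictions))
        return {type(e).__name__: e(predictions, references) for e in self.evaluators}
```

With this design, evaluating the same 86k predictions against a different Evaluator costs only the metric computation, since the second run reads predictions back from disk instead of calling the model.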


cc: @muthukumaranR
