[LLama] Failed to evaluate truthfulqa #617

@stamalakhov

What

Evaluating the "truthfulqa" benchmark with the lm_eval package fails. At a minimum, the benchmark needs a working generate method.

  1. To make generate from transformers (provided by GenerationMixin) usable, we need to support DynamicCache from transformers. Right now our cache is just a list of KV tuples, and without a KV cache generation is very slow.
  2. Or we need to reimplement generate ourselves.
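
For a sense of scale on the second option, here is a minimal sketch of a greedy decoding loop that reuses a legacy-style cache (a list of per-layer key/value tuples). Everything in it is an assumption made for illustration: `toy_step` is a hypothetical stand-in for a model forward pass, `greedy_generate` is not an existing function in this project, lm_eval, or transformers, and plain Python lists stand in for tensors. A real replacement would also need batching, sampling options, and stop-sequence handling.

```python
from typing import Callable, List, Optional, Tuple

# Legacy-style cache: one (keys, values) pair per layer. Real models store
# tensors here; plain lists of ints stand in for them in this sketch.
LegacyCache = List[Tuple[List[int], List[int]]]
StepFn = Callable[[List[int], Optional[LegacyCache]],
                  Tuple[List[float], LegacyCache]]

VOCAB_SIZE = 5  # hypothetical toy vocabulary


def toy_step(new_ids: List[int],
             past: Optional[LegacyCache]) -> Tuple[List[float], LegacyCache]:
    """Hypothetical single-layer 'model': returns next-token logits and cache."""
    keys, values = past[0] if past else ([], [])
    # Append only the new tokens instead of recomputing the whole prefix.
    keys = keys + new_ids
    values = values + new_ids
    # Toy logits: prefer the token that follows the last one seen.
    target = (keys[-1] + 1) % VOCAB_SIZE
    logits = [1.0 if tok == target else 0.0 for tok in range(VOCAB_SIZE)]
    return logits, [(keys, values)]


def greedy_generate(step: StepFn,
                    prompt_ids: List[int],
                    max_new_tokens: int,
                    eos_id: Optional[int] = None) -> List[int]:
    """Greedy decoding that feeds only the newest token after the first step."""
    output = list(prompt_ids)
    past: Optional[LegacyCache] = None
    next_input = list(prompt_ids)  # first call processes the full prompt
    for _ in range(max_new_tokens):
        logits, past = step(next_input, past)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        output.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
        next_input = [next_id]  # the cache already covers earlier tokens
    return output
```

With the toy step above, `greedy_generate(toy_step, [0, 1], 4)` returns `[0, 1, 2, 3, 4, 0]`. The point of the sketch is the control flow: the full prompt is processed once, and each later step passes a single token plus the cache, which is what keeps generation from slowing down as the sequence grows.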
