Add TemplateLM boilerplate LM class #1279

Merged
54 commits merged on Feb 22, 2024
Commits
b4fcc09
loglikelihood refactor using template lm
anjor Jan 13, 2024
ea44741
linter
anjor Jan 13, 2024
4e6a870
fix whitespace in target + prompt for CoT gsm8k (#1275)
haileyschoelkopf Jan 15, 2024
d444e9a
Make `parallelize=True` vs. `accelerate launch` distinction clearer i…
haileyschoelkopf Jan 15, 2024
d41a351
Allow parameter edits for registered tasks when listed in a benchmark…
lintangsutawika Jan 15, 2024
db3ee51
Fix data-parallel evaluation with quantized models (#1270)
haileyschoelkopf Jan 15, 2024
1c07f70
Rework documentation for explaining local dataset (#1284)
lintangsutawika Jan 15, 2024
b716761
Re-add citation
StellaAthena Jan 15, 2024
370cbbe
Update CITATION.bib (#1285)
haileyschoelkopf Jan 15, 2024
4702624
Update nq_open.yaml (#1289)
haileyschoelkopf Jan 16, 2024
0013399
Update README.md with custom integration doc (#1298)
msaroufim Jan 16, 2024
8783281
Update nq_open.yaml (#1305)
Hannibal046 Jan 18, 2024
5762058
Update task_guide.md (#1306)
daniellepintz Jan 18, 2024
55e51ec
Update pyproject.toml (#1312)
haileyschoelkopf Jan 18, 2024
5fb93fc
Fix polemo2_in.yaml config name (#1313)
lhoestq Jan 18, 2024
7724bf1
Update pyproject.toml (#1314)
haileyschoelkopf Jan 18, 2024
3688b1f
Fix group register (#1315)
lintangsutawika Jan 18, 2024
b6051f9
Update task_guide.md (#1316)
djstrong Jan 18, 2024
d0de14e
Update polemo2_in.yaml (#1318)
lintangsutawika Jan 19, 2024
e7daca5
don't pass extra kwargs to mamba any more (#1328)
haileyschoelkopf Jan 22, 2024
e8bc89d
Fix Issue regarding stderr (#1327)
lintangsutawika Jan 22, 2024
fd94748
Add `local-completions` support using OpenAI interface (#1277)
mgoin Jan 22, 2024
ea12d33
fallback to classname when LM doesnt have config (#1334)
nairbv Jan 22, 2024
413f183
fix a trailing whitespace that breaks a lint job (#1335)
nairbv Jan 22, 2024
9703c8a
skip "benchmarks" in changed_tasks (#1336)
baberabb Jan 23, 2024
0ffc6b6
Update migrated HF dataset paths (#1332)
haileyschoelkopf Jan 23, 2024
fa05528
Don't use `get_task_dict()` in task registration / initialization (#1…
haileyschoelkopf Jan 23, 2024
4a2c48a
manage default (greedy) gen_kwargs in vllm (#1341)
baberabb Jan 23, 2024
08af37f
modified default gen_kwargs to work better with CLI; changed prompt_l…
baberabb Jan 24, 2024
279c5b5
update links to task_guide.md (#1348)
haileyschoelkopf Jan 24, 2024
7cf3083
`Filter` docs not offset by `doc_id` (#1349)
baberabb Jan 25, 2024
5f09e98
Add FAQ on `lm_eval.tasks.initialize_tasks()` to README (#1330)
haileyschoelkopf Jan 25, 2024
e29ed4e
Refix issue regarding stderr (#1357)
thnkinbtfly Jan 26, 2024
ca8a014
Add causalLM OpenVino models (#1290)
NoushNabi Jan 26, 2024
5f77a8f
Apply some best practices and guideline recommendations to code (#1363)
LSinev Jan 28, 2024
c986b5f
serialize callable functions in config (#1367)
baberabb Jan 29, 2024
87ea8d3
delay filter init; remove `*args` (#1369)
baberabb Jan 30, 2024
6f4e5df
Fix unintuitive `--gen_kwargs` behavior (#1329)
haileyschoelkopf Jan 31, 2024
5ff7c41
Publish to pypi (#1194)
anjor Jan 31, 2024
68a193b
Make dependencies compatible with PyPI (#1378)
haileyschoelkopf Jan 31, 2024
492191d
Add support for RWKV models with World tokenizer (#1374)
PicoCreator Jan 31, 2024
477058a
add bypass metric (#1156)
baberabb Jan 31, 2024
8d974bf
loglikelihood refactor using template lm
anjor Jan 13, 2024
3b07548
Merge branch 'main' into anjor/loglikelihood-refactor-2
anjor Jan 31, 2024
b9436a9
lint
anjor Jan 31, 2024
907968c
code review
anjor Feb 21, 2024
bb5481a
Merge branch 'main' into anjor/loglikelihood-refactor-2
anjor Feb 21, 2024
129a2ee
neuron optimum
anjor Feb 21, 2024
a97260e
Mention TemplateLM in model_guide.md
haileyschoelkopf Feb 22, 2024
63564e7
Update lm_eval/api/model.py
haileyschoelkopf Feb 22, 2024
acff950
fix linter
haileyschoelkopf Feb 22, 2024
5c17420
fix format
haileyschoelkopf Feb 22, 2024
b481947
fix format
haileyschoelkopf Feb 22, 2024
63d58f7
fix format
haileyschoelkopf Feb 22, 2024
2 changes: 1 addition & 1 deletion docs/model_guide.md
@@ -66,7 +66,7 @@ All three request types take as input `requests` of type `list[Instance]` that h
- It should return `(ll,) : Tuple[float]` , a.k.a. solely the *loglikelihood* of producing each piece of text given no starting input.


To allow a model to be evaluated on all types of tasks, you will need to implement these three types of measurements (note that `loglikelihood_rolling` is a special case of `loglikelihood`). For a reference implementation, check out `lm_eval/models/huggingface.py` !
To allow a model to be evaluated on all types of tasks, you will need to implement these three types of measurements (note that `loglikelihood_rolling` is a special case of `loglikelihood`). For a reference implementation, check out `lm_eval/models/huggingface.py` ! Additionally, check out `lm_eval.api.model.TemplateLM` for a class that abstracts away some commonly used functions across LM subclasses, or see if your model would lend itself well to subclassing the `lm_eval.models.huggingface.HFLM` class and overriding just the initialization or a couple methods!

**Tip: be careful of indexing in loglikelihood!**

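The doc change above suggests subclassing `TemplateLM` to inherit shared boilerplate. As a rough illustration only — the real base class lives in `lm_eval.api.model` and receives `Instance` objects whose `.args` carry each request, while this standalone sketch mirrors the pattern with plain tuples and a hypothetical character-level tokenizer:

```python
# Standalone sketch of the TemplateLM pattern added in this PR. The real
# base class lives in lm_eval.api.model and receives Instance objects
# (request in `.args`); here plain (context, continuation) tuples and a
# toy character-level tokenizer stand in for illustration.
import abc
from typing import List, Tuple


class TemplateLMSketch(abc.ABC):
    """Mirror of the boilerplate TemplateLM factors out of LM subclasses."""

    @property
    @abc.abstractmethod
    def eot_token_id(self) -> int: ...

    @abc.abstractmethod
    def tok_encode(self, string: str, **kwargs) -> List[int]: ...

    @abc.abstractmethod
    def _loglikelihood_tokens(self, requests, **kwargs): ...

    def _encode_pair(
        self, context: str, continuation: str
    ) -> Tuple[List[int], List[int]]:
        # Shift trailing whitespace from context onto continuation so the
        # token boundary does not depend on where the caller put the space.
        n_spaces = len(context) - len(context.rstrip())
        if n_spaces > 0:
            continuation = context[-n_spaces:] + continuation
            context = context[:-n_spaces]
        whole_enc = self.tok_encode(context + continuation, add_special_tokens=False)
        context_enc = self.tok_encode(context, add_special_tokens=False)
        return context_enc, whole_enc[len(context_enc):]

    def loglikelihood(self, requests) -> List[Tuple[float, bool]]:
        new_reqs = []
        for context, continuation in requests:  # real code: req.args
            if context == "":
                # the end-of-text token stands in for the empty context
                pair = ([self.eot_token_id], self.tok_encode(continuation))
            else:
                pair = self._encode_pair(context, continuation)
            new_reqs.append(((context, continuation), *pair))
        return self._loglikelihood_tokens(new_reqs)


class ToyLM(TemplateLMSketch):
    """One token id per character; a fake scorer for demonstration."""

    @property
    def eot_token_id(self) -> int:
        return 0

    def tok_encode(self, string: str, **kwargs) -> List[int]:
        return [ord(c) for c in string]

    def _loglikelihood_tokens(self, requests, **kwargs):
        # Pretend loglikelihood: minus the continuation length in tokens.
        return [(-float(len(cont_enc)), True) for _, _, cont_enc in requests]


lm = ToyLM()
print(lm.loglikelihood([("Hello ", "world")]))  # [(-6.0, True)]
```

A subclass supplies only `eot_token_id`, `tok_encode`, and `_loglikelihood_tokens` (plus `loglikelihood_rolling` and `generate_until` in the real class); the encode-and-dispatch logic comes for free, which is exactly the duplication this PR deletes from the HF, Neuron, OpenAI, and vLLM backends below.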
58 changes: 58 additions & 0 deletions lm_eval/api/model.py
@@ -247,3 +247,61 @@ def fn(requests):

def get_cache_hook(self):
return CacheHook(self)


class TemplateLM(LM):
"""
A class acting as intermediary between the LM base class
and boilerplate often included in other LM subclasses.
"""

@property
@abc.abstractmethod
def eot_token_id(self):
pass

@abc.abstractmethod
def tok_encode(self, string: str, **kwargs):
pass

@abc.abstractmethod
def _loglikelihood_tokens(self, requests, **kwargs):
pass

def _encode_pair(self, context, continuation):
n_spaces = len(context) - len(context.rstrip())
if n_spaces > 0:
continuation = context[-n_spaces:] + continuation
context = context[:-n_spaces]

whole_enc = self.tok_encode(context + continuation, add_special_tokens=False)
context_enc = self.tok_encode(context, add_special_tokens=False)

context_enc_len = len(context_enc)
continuation_enc = whole_enc[context_enc_len:]

return context_enc, continuation_enc

def loglikelihood(self, requests) -> List[Tuple[float, bool]]:
new_reqs = []
for context, continuation in [req.args for req in requests]:
if context == "":
# end of text as context
context_enc, continuation_enc = (
[self.eot_token_id],
self.tok_encode(continuation),
)
else:
context_enc, continuation_enc = self._encode_pair(context, continuation)

new_reqs.append(((context, continuation), context_enc, continuation_enc))

return self._loglikelihood_tokens(new_reqs)

@abc.abstractmethod
def loglikelihood_rolling(self, requests) -> List[Tuple[float, bool]]:
pass

@abc.abstractmethod
def generate_until(self, requests) -> List[str]:
pass
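One subtle piece of the boilerplate above is the whitespace shuffle at the top of `_encode_pair`. A standalone sketch of just that pre-tokenization step (toy code, not from the PR) shows the intent: the split point is normalized before tokenizing, so `"Hello "` + `"world"` and `"Hello"` + `" world"` yield the same context/continuation boundary:

```python
def split_pair(context: str, continuation: str):
    """Mimic _encode_pair's pre-tokenization step: move any trailing
    whitespace from the context onto the continuation."""
    n_spaces = len(context) - len(context.rstrip())
    if n_spaces > 0:
        continuation = context[-n_spaces:] + continuation
        context = context[:-n_spaces]
    return context, continuation


print(split_pair("Hello ", "world"))  # ('Hello', ' world')
print(split_pair("Hello", " world"))  # ('Hello', ' world')
```

Without this normalization, a BPE tokenizer could absorb the trailing space into a context token in one call shape and into the continuation in the other, skewing which tokens get scored.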
37 changes: 2 additions & 35 deletions lm_eval/models/huggingface.py
@@ -24,7 +24,7 @@

from lm_eval import utils
from lm_eval.api.instance import Instance
from lm_eval.api.model import LM
from lm_eval.api.model import TemplateLM
from lm_eval.api.registry import register_model
from lm_eval.models.utils import (
Collator,
@@ -64,7 +64,7 @@ def _get_accelerate_args(


@register_model("hf-auto", "hf", "huggingface")
class HFLM(LM):
class HFLM(TemplateLM):
"""
An abstracted Huggingface model class. Enables usage with both models of
`transformers.AutoModelForCausalLM` and `transformers.AutoModelForSeq2SeqLM` classes.
@@ -780,39 +780,6 @@ def _select_cont_toks(

return logits

def _encode_pair(
self, context: str, continuation: str
) -> Tuple[List[int], List[int]]:
n_spaces = len(context) - len(context.rstrip())
if n_spaces > 0:
continuation = context[-n_spaces:] + continuation
context = context[:-n_spaces]

whole_enc = self.tok_encode(context + continuation, add_special_tokens=False)
context_enc = self.tok_encode(context, add_special_tokens=False)

# whole_enc = self.tok_encode(context + continuation)
# context_enc = self.tok_encode(context, add_special_tokens=False)
context_enc_len = len(context_enc)
continuation_enc = whole_enc[context_enc_len:]
return context_enc, continuation_enc

def loglikelihood(self, requests: List[Instance]) -> List[Tuple[float, bool]]:
new_reqs = []
for context, continuation in [req.args for req in requests]:
if context == "":
# end of text as context
context_enc, continuation_enc = (
[self.eot_token_id],
self.tok_encode(continuation),
)
else:
context_enc, continuation_enc = self._encode_pair(context, continuation)

new_reqs.append(((context, continuation), context_enc, continuation_enc))

return self._loglikelihood_tokens(requests=new_reqs)

def loglikelihood_rolling(self, requests: List[Instance]) -> List[float]:
loglikelihoods = []

35 changes: 2 additions & 33 deletions lm_eval/models/neuron_optimum.py
@@ -15,7 +15,7 @@

import lm_eval.models.utils
from lm_eval import utils
from lm_eval.api.model import LM
from lm_eval.api.model import TemplateLM
from lm_eval.api.registry import register_model
from lm_eval.models.utils import stop_sequences_criteria

@@ -172,7 +172,7 @@ def generate(


@register_model("neuronx")
class NEURON_HF(LM):
class NEURON_HF(TemplateLM):
"""
Enables usage with on AWS Neuron
using the HuggingFace Transformers + Transformers neuronx library.
@@ -447,37 +447,6 @@ def _select_cont_toks(self, logits, contlen=None, inplen=None):

return logits

def _encode_pair(self, context, continuation):
n_spaces = len(context) - len(context.rstrip())
if n_spaces > 0:
continuation = context[-n_spaces:] + continuation
context = context[:-n_spaces]

whole_enc = self.tok_encode(context + continuation, add_special_tokens=False)
context_enc = self.tok_encode(context, add_special_tokens=False)

# whole_enc = self.tok_encode(context + continuation)
# context_enc = self.tok_encode(context, add_special_tokens=False)
context_enc_len = len(context_enc)
continuation_enc = whole_enc[context_enc_len:]
return context_enc, continuation_enc

def loglikelihood(self, requests):
new_reqs = []
for context, continuation in [req.args for req in requests]:
if context == "":
# end of text as context
context_enc, continuation_enc = (
[self.eot_token_id],
self.tok_encode(continuation),
)
else:
context_enc, continuation_enc = self._encode_pair(context, continuation)

new_reqs.append(((context, continuation), context_enc, continuation_enc))

return self._loglikelihood_tokens(new_reqs)

def loglikelihood_rolling(self, requests):
loglikelihoods = []

35 changes: 3 additions & 32 deletions lm_eval/models/openai_completions.py
@@ -8,7 +8,7 @@

import lm_eval.models.utils
from lm_eval import utils
from lm_eval.api.model import LM
from lm_eval.api.model import LM, TemplateLM
from lm_eval.api.registry import register_model
from lm_eval.models.utils import retry_on_specific_exceptions
from lm_eval.utils import eval_logger
@@ -75,7 +75,7 @@ def completion():


@register_model("openai-completions", "local-completions")
class OpenaiCompletionsLM(LM):
class OpenaiCompletionsLM(TemplateLM):
_DEFAULT_MAX_LENGTH = 2048

def __init__(
@@ -171,41 +171,12 @@ def device(self):
# Isn't used because we override _loglikelihood_tokens
raise NotImplementedError()

def tok_encode(self, string: str) -> List[int]:
def tok_encode(self, string: str, **kwargs) -> List[int]:
return self.tokenizer.encode(string)

def tok_decode(self, tokens: List[int]) -> str:
return self.tokenizer.decode(tokens)

def _encode_pair(
self, context: str, continuation: str
) -> Tuple[List[int], List[int]]:
n_spaces = len(context) - len(context.rstrip())
if n_spaces > 0:
continuation = context[-n_spaces:] + continuation
context = context[:-n_spaces]
whole_enc = self.tok_encode(context + continuation)
context_enc = self.tok_encode(context)
context_enc_len = len(context_enc)
continuation_enc = whole_enc[context_enc_len:]
return context_enc, continuation_enc

def loglikelihood(self, requests) -> List[Tuple[float, bool]]:
new_reqs = []
for context, continuation in [req.args for req in requests]:
if context == "":
# end of text as context
context_enc, continuation_enc = (
[self.eot_token_id],
self.tok_encode(continuation),
)
else:
context_enc, continuation_enc = self._encode_pair(context, continuation)

new_reqs.append(((context, continuation), context_enc, continuation_enc))

return self._loglikelihood_tokens(new_reqs)

def _loglikelihood_tokens(
self, requests, disable_tqdm: bool = False
) -> List[Tuple[float, bool]]:
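Beyond the deletions, the one signature edit in this file is widening `tok_encode` to accept `**kwargs`: `TemplateLM._encode_pair` passes `add_special_tokens=False`, which some backends honor and others have no use for. A toy sketch of that calling convention (names hypothetical, not from the PR):

```python
def tok_encode_flexible(string: str, **kwargs) -> list:
    # Accepts (and here ignores) options such as add_special_tokens=False,
    # so shared caller code can use one call shape across every backend.
    return [ord(c) for c in string]


# Shared TemplateLM code can now call every backend the same way:
print(tok_encode_flexible("hi", add_special_tokens=False))  # [104, 105]
```

Backends whose tokenizers do support the option simply forward `kwargs` instead of dropping them.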
35 changes: 2 additions & 33 deletions lm_eval/models/vllm_causallms.py
@@ -5,7 +5,7 @@
from tqdm import tqdm

from lm_eval.api.instance import Instance
from lm_eval.api.model import LM
from lm_eval.api.model import TemplateLM
from lm_eval.api.registry import register_model
from lm_eval.models.utils import Collator, divide
from lm_eval.utils import (
@@ -35,7 +35,7 @@ def run_inference_one_model(


@register_model("vllm")
class VLLM(LM):
class VLLM(TemplateLM):
_DEFAULT_MAX_LENGTH = 2048

def __init__(
@@ -194,37 +194,6 @@ def _model_generate(
)
return outputs

def _encode_pair(
self, context: str, continuation: str
) -> Tuple[List[int], List[int]]:
n_spaces = len(context) - len(context.rstrip())
if n_spaces > 0:
continuation = context[-n_spaces:] + continuation
context = context[:-n_spaces]

whole_enc = self.tok_encode(context + continuation, add_special_tokens=False)
context_enc = self.tok_encode(context, add_special_tokens=False)

context_enc_len = len(context_enc)
continuation_enc = whole_enc[context_enc_len:]
return context_enc, continuation_enc

def loglikelihood(self, requests: List[Instance]) -> List[Tuple[float, bool]]:
new_reqs = []
for context, continuation in [req.args for req in requests]:
if context == "":
# end of text as context
context_enc, continuation_enc = (
[self.eot_token_id],
self.tok_encode(continuation),
)
else:
context_enc, continuation_enc = self._encode_pair(context, continuation)

new_reqs.append(((context, continuation), context_enc, continuation_enc))

return self._loglikelihood_tokens(new_reqs)

def loglikelihood_rolling(self, requests: List[Instance]) -> List[float]:
loglikelihoods = []
