Security features from the Hugging Face datasets library #1135
Comments
Thanks, @lhoestq! Will see about the smoothest way to make this non-invasive for users while still having them acknowledge that they may be running untrusted code.
I'm happy to help on this issue if it's something you'd still like, so that lm-evaluation-harness can bump its `datasets` dependency. I see we include it as a parameter when initializing the model constructor, and I have some steps in mind.

We mentioned this may be invasive for users and introduces an extra level of friction: they don't usually need to respond to CLI prompts to continue when using lm-evaluation-harness. Perhaps a log line entry is enough here? Let me know what you think @haileyschoelkopf
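As a hedged illustration of the "log line instead of an interactive prompt" idea: rather than blocking on a CLI confirmation, the harness could emit a warning when a task's dataset is defined by a script. The helper name and wiring below are hypothetical, not existing harness code.

```python
import logging

logger = logging.getLogger(__name__)


def warn_if_remote_code(task_name: str, requires_remote_code: bool) -> None:
    # Surface the security implication in the logs instead of prompting the user.
    if requires_remote_code:
        logger.warning(
            "Task %s loads a Hugging Face dataset defined by a Python script; "
            "it will run with trust_remote_code=True. Review the dataset repo if this is a concern.",
            task_name,
        )
```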
Hi @veekaybee, this'd be awesome! I agree that, since the current behavior is to run datasets from HF datasets, remote code or no, manually setting it to `True` wouldn't change what we already do today. Maybe @lhoestq has suggestions for how to non-invasively incorporate this while not making it too much friction for users?
I'm taking a look at how some other libraries handle this. Perhaps we could do some variation of the first approach, where we indicate in the docs that users should set it, and then check the environment variable when a simple evaluation for HFLM is instantiated?
Hmmm... I think for models, allowing it to be passed through the model constructor's kwargs could work.
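As a rough sketch of that idea (not the harness's actual `HFLM` implementation): a model wrapper could accept a `trust_remote_code` kwarg, fall back to an opt-in environment variable when it is not given, and forward the decision to `datasets` when the task's data is loaded. The class name, the `LM_EVAL_TRUST_REMOTE_CODE` variable, and the assumption that `load_dataset` accepts `trust_remote_code` (datasets >= 2.16) are all illustrative.

```python
import os
from typing import Optional

from datasets import load_dataset


class EvalModelSketch:
    """Illustrative stand-in for an HF-backed model wrapper, not the real HFLM class."""

    def __init__(self, pretrained: str, trust_remote_code: Optional[bool] = None, **kwargs):
        if trust_remote_code is None:
            # Fall back to an opt-in environment variable when the kwarg isn't passed explicitly.
            trust_remote_code = os.getenv("LM_EVAL_TRUST_REMOTE_CODE", "0") == "1"
        self.pretrained = pretrained
        self.trust_remote_code = trust_remote_code

    def load_task_dataset(self, dataset_path: str, dataset_name: Optional[str] = None):
        # Forward the user's decision to datasets at the moment the task's data is loaded.
        return load_dataset(dataset_path, dataset_name, trust_remote_code=self.trust_remote_code)
```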
It would be great to make the list of tasks that require `trust_remote_code`, e.g. with something like:

```python
from datasets import get_dataset_config_names

tasks_that_require_trust_remote_code = []
for task in tasks:
    try:
        get_dataset_config_names(task.DATASET_PATH, trust_remote_code=False)
    except ValueError as e:
        if "trust_remote_code=True" in str(e):
            tasks_that_require_trust_remote_code.append(task)
```

Maybe for all the tasks that we can already trust we can add `trust_remote_code=True` explicitly. Note that there is also an environment variable for this in `datasets`.
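The environment variable referred to above is presumably `HF_DATASETS_TRUST_REMOTE_CODE` from the `datasets` library (an assumption here; check the datasets docs for the exact name and accepted values). A minimal usage sketch with a placeholder dataset path:

```python
import os

# Set before importing datasets so the value is picked up; "1" opts in, "0" opts out.
os.environ["HF_DATASETS_TRUST_REMOTE_CODE"] = "1"

from datasets import load_dataset

# With the variable set, script-based datasets load without passing trust_remote_code per call.
ds = load_dataset("some-org/some-script-based-dataset")  # placeholder repo, for illustration only
```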
I want to summarize so far for my own understanding, let me know if I'm missing or mischaracterizing anything:

Datasets: Datasets is HuggingFace's library for loading datasets for model training/tuning. Some of these datasets require running extra scripts to fully populate the data. There is a change in the new version that would allow users to pass this decision explicitly when calling the loading function.

LM-Harness: Within the datasets used for evaluation, some are defined by such scripts. This means when we run evaluation, regardless of what model type we are evaluating, the task is run and the script for the task is run. For example, evaluating `squadv2` will trigger this issue or warning (if squadv2 is a dataset with scripts that need to be executed).

In order to understand when this would be a user issue, we'd like to understand which datasets actually require `trust_remote_code`. One option would be to set it per task, which would mean it gets picked up from the configuration file.
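To make the summary concrete, a run along these lines is the scenario being described: the task's dataset is downloaded when evaluation starts, and if it is defined by a loading script, that script executes no matter which model is being evaluated. This sketch assumes the `lm_eval.simple_evaluate` Python entry point and the `squadv2` task name; treat both as illustrative and adjust to the harness version in use.

```python
import lm_eval

# Evaluating any model on a script-backed task causes that task's dataset script to run.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-70m",  # arbitrary small model for illustration
    tasks=["squadv2"],
    limit=8,  # small limit, just to exercise the data-loading path
)
```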
@veekaybee This is correct! @lhoestq thank you, that environment variable is quite helpful, exactly what I was hoping might exist!
In this case, then, would it make sense to do the following: declare it in the config, and handle it within the Model constructor?

A couple things I'm not clear on: should we set the environment variable from the TaskConfig directly, or once we instantiate the class? And also, how does the TaskConfig dict get passed to the HFLM class? Can we call it directly?
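As a stand-in sketch of the config-plus-constructor pairing being proposed (the field names, the env-var handoff, and the dict representation of the task YAML are all assumptions for illustration):

```python
import os

# Task-config side: the task declares that its dataset needs remote code.
# (In the harness this would live in the task's YAML; shown here as a dict.)
task_config = {
    "task": "squadv2",
    "dataset_path": "squad_v2",
    "dataset_kwargs": {"trust_remote_code": True},
}

# Model/loader side: honor the task's setting by exporting the datasets environment
# variable before any dataset is loaded (variable name assumed, as noted earlier).
if task_config.get("dataset_kwargs", {}).get("trust_remote_code"):
    os.environ["HF_DATASETS_TRUST_REMOTE_CODE"] = "1"
```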
So to clarify, both would be involved: the setting in the config, and the handling in the Model. This should behave as anticipated, I think, even in the absence of the flag.
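And a sketch of the "absence of the flag" case: if a task's config says nothing about remote code, the loader can simply pass nothing through and let `datasets` apply its own default (True today, False after the planned major release). Function and argument names are illustrative.

```python
from typing import Optional

from datasets import load_dataset


def load_for_task(dataset_path: str, dataset_name: Optional[str] = None, dataset_kwargs=None):
    # Only forward trust_remote_code if the task config set it; otherwise defer to
    # the datasets library default.
    kwargs = dict(dataset_kwargs or {})
    return load_dataset(dataset_path, dataset_name, **kwargs)
```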
Hi ! FYI the
In the next version of `datasets` we are adding a new argument `trust_remote_code` to let users decide whether or not to trust the dataset scripts in Python on Hugging Face. The default in the next release will still be `trust_remote_code=True`, but in the coming months we plan to do a major release and set it to `False` by default.

Most datasets in `lm-evaluation-harness` are defined on HF using dataset scripts, and may require passing `trust_remote_code=True`. Those datasets could also be converted to e.g. JSONL or Parquet datasets on HF (e.g. like Muennighoff/babi).

Let me know if you have questions on this change and if I can help making a smooth transition. Ideally this should make `lm-evaluation-harness` safer for people using datasets defined with Python code.

cc @haileyschoelkopf
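For the conversion route mentioned above, one common pattern is to load a script-based dataset once with the script explicitly trusted and re-upload it with `push_to_hub`, which stores plain Parquet files on the Hub so later loads need no remote code (similar to how Muennighoff/babi is hosted). The repository names below are placeholders.

```python
from datasets import load_dataset

# Load the script-based dataset once, explicitly trusting its loading script.
ds = load_dataset("some-org/script-based-dataset", trust_remote_code=True)  # placeholder repo

# Re-upload as plain data files; the Hub stores these as Parquet, so future loads
# of the new repo require no dataset script at all.
ds.push_to_hub("your-username/script-free-copy")  # placeholder target repo
```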