
Worksheets.codalab.org down - prohibits HELM from completing #1930

Closed
JasperHG90 opened this issue Oct 22, 2023 · 13 comments
Labels
bug Something isn't working competition Support for the NeurIPS Large Language Model Efficiency Challenge p1 Priority 1 (Required for release)

Comments

@JasperHG90

JasperHG90 commented Oct 22, 2023

Hi!

I'm trying to run HELM with MMLU scenarios. It appears that https://worksheets.codalab.org/ is down, which causes HELM to fail when using these scenarios. I'm not sure whether this data is yours or belongs to the scenarios' authors, so I thought I'd post it here in case it is HELM-related.

Best,

J.

@msaroufim msaroufim added bug Something isn't working competition Support for the NeurIPS Large Language Model Efficiency Challenge labels Oct 22, 2023
@msaroufim
Collaborator

msaroufim commented Oct 22, 2023

Hi @yifanmai, this is critical to fix for the LLM competition, since otherwise we'd need to remove all MMLU dataset perturbations and CNN/DM from our configuration before the Wednesday deadline, which doesn't feel great:

# {description: "summarization_cnndm:model=neurips/local,max_eval_instances=9",priority: 1}
# "data_augmentation=canonical"

As a backup I can remove MMLU, but then we'd be defining the datasets a day before the competition deadline, which is not great.

@msaroufim msaroufim added the p1 Priority 1 (Required for release) label Oct 22, 2023
@percyliang
Contributor

Azure unfortunately disabled the CodaLab server due to a technical glitch; trying to get support to bring it back. In the meantime, perhaps we can send you the relevant files?

@msaroufim
Collaborator

msaroufim commented Oct 22, 2023

Would it be possible to temporarily change the download link to your own mirror? It'd be much more convenient for the leaderboard and the competitors to just reinstall HELM from source rather than have to manually download a dataset and place it in the right location. Although tbh either would be preferable to delaying the competition, since submissions are due in 3 days on Oct 25.

@JasperHG90
Author

@percyliang if you could share the files with me that would be appreciated, thanks!

@yifanmai
Collaborator

Deploying a hotfix shortly #1931

@yifanmai
Collaborator

I have mirrored the files and updated main to use the new URLs. Please try pulling main and re-running.

As an aside, because of #1932, helm-run may have downloaded and cached empty versions of these files, i.e. you may need to run the following to get rid of the empty files:

# fixes error in dialect_perturbation
rm -rf benchmark_output/perturbations/dialect
# fixes error in summarization_scenario
rm -rf benchmark_output/scenarios/summarization
# fixes error in summarization_metrics
rm -rf benchmark_output/v1/eval_cache

You'll need to run the respective command if you see one of these error messages:

  File "/home/yifanmai/oss/helm/src/helm/benchmark/augmentations/dialect_perturbation.py", line 129, in load_mapping_dict
    return json.load(f)
  File "/usr/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
  File "/home/yifanmai/oss/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 125, in get_instances
    dataset, article_key, summary_key = self._load_dataset(self.dataset_name, output_path)
  File "/home/yifanmai/oss/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 111, in _load_dataset
    dataset = self._download_dataset(url, "xsum-sampled", output_path)
  File "/home/yifanmai/oss/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 99, in _download_dataset
    dataset = pickle.load(fin)
EOFError: Ran out of input
  File "/home/yifanmai/oss/helm/src/helm/benchmark/metrics/summarization_metrics.py", line 198, in evaluate_generation
    self._load_qafacteval(eval_cache_path)
  File "/home/yifanmai/oss/helm/src/helm/benchmark/metrics/summarization_metrics.py", line 85, in _load_qafacteval
    qafacteval_scores = pickle.load(fin)
EOFError: Ran out of input
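All three tracebacks above fail while reading a zero-byte cached file: json.load and pickle.load both choke on empty input. As a narrower alternative to deleting the whole directories, here is a minimal sketch (not part of HELM itself) that walks those cache directories and removes only zero-byte files, so helm-run re-downloads just what is broken; the paths are the ones from the rm commands above and should be adjusted to your benchmark_output location:

```python
import os

# Cache locations from the rm commands above; adjust to your setup.
CACHE_DIRS = [
    "benchmark_output/perturbations/dialect",
    "benchmark_output/scenarios/summarization",
    "benchmark_output/v1/eval_cache",
]

def remove_empty_files(dirs):
    """Delete zero-byte files under the given directories and return
    the paths removed, so helm-run re-downloads only those files."""
    removed = []
    for root_dir in dirs:
        for root, _, files in os.walk(root_dir):
            for name in files:
                path = os.path.join(root, name)
                if os.path.getsize(path) == 0:
                    os.remove(path)
                    removed.append(path)
    return removed

if __name__ == "__main__":
    for path in remove_empty_files(CACHE_DIRS):
        print("removed empty cache file:", path)
```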

@JasperHG90
Author

Thanks! I can try it out tonight.

@anmolagarwal999

anmolagarwal999 commented Oct 23, 2023

@yifanmai The fix still does not work for me, even after removing the cached data. I get the error below. EDIT: Manually downloading the file from the cloud mirror and placing it in the correct location works.

Error when running summarization_cnndm:temperature=0.3,device=cpu,model=neurips_local:
Traceback (most recent call last):
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/runner.py", line 173, in run_all
    self.run_one(run_spec)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/runner.py", line 221, in run_one
    instances = scenario.get_instances(scenario_output_path)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 137, in get_instances
    dataset, article_key, summary_key = self._load_dataset(self.dataset_name, output_path)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 128, in _load_dataset
    dataset = self._download_dataset(url, "cnndm", output_path)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/scenarios/summarization_scenario.py", line 102, in _download_dataset
    dataset = pickle.load(fin)
_pickle.UnpicklingError: invalid load key, '<'.

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.02it/s]
} [0.541s]
Traceback (most recent call last):
  File "/home/anmol/anaconda3/envs/wizard_coder/bin/helm-run", line 8, in <module>
    sys.exit(main())
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/common/hierarchical_logger.py", line 104, in wrapper
    return fn(*args, **kwargs)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/run.py", line 309, in main
    run_benchmarking(
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/run.py", line 111, in run_benchmarking
    runner.run_all(run_specs)
  File "/home/anmol/nips_challenge/efficiency_challenge_repo/external_repos/helm_tracking_remote/helm/src/helm/benchmark/runner.py", line 182, in run_all
    raise RunnerError(f"Failed runs: [{failed_runs_str}]")
helm.benchmark.runner.RunnerError: Failed runs: ["summarization_cnndm:temperature=0.3,device=cpu,model=neurips_local"]
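The invalid load key, '<' error above is a different failure mode from the empty-file case: pickles written with modern protocols start with a binary header byte (b'\x80'), so a file beginning with < almost always means an HTML error page was saved in place of the dataset. A hedged sketch (not HELM code; the classification names are my own) for diagnosing a cached file before deciding to delete it:

```python
import pickle

def diagnose_cached_pickle(path):
    """Classify a cached dataset file: 'empty' (EOFError case above),
    'html' (an error page cached in place of the pickle, the
    UnpicklingError case), or 'pickle' if it actually loads."""
    with open(path, "rb") as f:
        head = f.read(1)
    if not head:
        return "empty"   # zero-byte file: pickle.load raises EOFError
    if head == b"<":
        return "html"    # leading '<': invalid load key, '<'
    with open(path, "rb") as f:
        pickle.load(f)   # raises if the file is truncated or corrupt
    return "pickle"
```

If this reports anything other than 'pickle', deleting the file and re-running helm-run should trigger a fresh download.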

@agoncharenko1992

@yifanmai I have the same error as above

@anmolagarwal999

> @yifanmai I have the same error as above

@agoncharenko1992 Manually downloading the file and placing it in the correct folder (benchmark_output/scenarios/summarization/data) works.

@yifanmai
Collaborator

Thanks for the bug report; will investigate shortly.

@yifanmai
Collaborator

This should be fixed by #1935. You may have to delete the cached file to force a re-download: rm -rf benchmark_output/scenarios/summarization

@pranavjain

CodaLab is back online now. Please do let us know if you are still facing issues.
