Add WandbUploader and refactored HFUploader #87

Merged
mys007 merged 17 commits into main from refactor-uploaders
Oct 17, 2025

mys007 (Contributor) commented Oct 8, 2025

PR Checklist

  • Use descriptive commit messages.
  • Provide tests for your changes.
  • Update any related documentation and include any relevant screenshots.
  • Check if changes need to be made to docs (README or any guides in /docs/).
  • Reflect the changes you made in the changelog.

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

  • Add WandbUploader for uploading results as a W&B artifact:

    • Supports uploading to W&B storage, or referencing custom storage via a custom function registered with register_artifact_upload_function.
    • Compresses large files (especially jsonl) into individual gz files.
    • Versions the artifact with an alias that is a hash of the results, computed only over the parts that are (ideally) reproducible, so deterministic completions produce the same hash. This required excluding irrelevant info (e.g. times, paths) and sorting JSON keys consistently.
  • Refactor HFProcessor into HFUploader. It now shares a unified base class with WandbUploader, is far less spaghetti-like, and has more meaningful unit tests.

  • Modify the output directory so that the hash excludes irrelevant info (e.g. times, paths). Bonus: introducing new fields to EvalConfig does not necessarily lead to new hashes.

  • Add an option to delete the local output directory once the upload succeeds. This frees local disk space when the results will be available elsewhere anyway.
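
For reviewers, here is a minimal sketch of how a reproducible results hash like the one described above could be computed. It is an illustration only, not the code in this PR: the key names in VOLATILE_KEYS, the helper names strip_volatile and results_alias, the choice of SHA-256, and the 16-character truncation are all assumptions made for the example.

```python
import hashlib
import json

# Hypothetical set of non-reproducible fields. The PR names times and paths
# as examples but does not list the actual keys.
VOLATILE_KEYS = frozenset({"start_time", "end_time", "output_dir"})


def strip_volatile(value):
    """Recursively drop volatile keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item)
            for key, item in value.items()
            if key not in VOLATILE_KEYS
        }
    if isinstance(value, list):
        return [strip_volatile(item) for item in value]
    return value


def results_alias(records, length=16):
    """Return a short hex digest over the reproducible part of the records."""
    digest = hashlib.sha256()
    for record in records:
        # sort_keys and fixed separators make the serialization independent
        # of dict insertion order and whitespace.
        line = json.dumps(
            strip_volatile(record),
            sort_keys=True,
            separators=(",", ":"),
            ensure_ascii=False,
        )
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()[:length]
```

Two runs that differ only in timestamps, paths, or key order then map to the same alias, while any change in a completion changes it. Record order still affects the digest in this sketch, so the records would need a stable order before hashing.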

TODO / in progress: tests for WandbUploader

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note
on the hardware and config this has been tested on, as well as any relevant
additional information.

Added/updated tests?

  • Yes
  • No, and this is why: please replace this line with details on why tests
    have not been included
  • I need help with writing tests

[optional] Are there any post deployment tasks we need to perform?

dylan-rodriquez (Contributor) commented:

Is there an example run we can look at where the artifacts are uploaded?

@mys007 mys007 enabled auto-merge (squash) October 17, 2025 21:21
@mys007 mys007 merged commit a53b816 into main Oct 17, 2025
10 checks passed
@mys007 mys007 deleted the refactor-uploaders branch October 17, 2025 21:41
3 participants