Regroup all swebenchmultimodal hyperparameters in constants.py #368
Summary
This PR creates a single source of truth for all constant values used in the SWE-Bench Multimodal benchmark implementation by consolidating them into benchmarks/swebenchmultimodal/constants.py.
Fixes #367
Changes
New File
- benchmarks/swebenchmultimodal/constants.py: Contains all hyperparameters and constant values for the SWE-Bench Multimodal benchmark
Updated Files
- benchmarks/swebenchmultimodal/run_infer.py: Updated to import constants from constants.py
- benchmarks/swebenchmultimodal/eval_infer.py: Updated to import constants from constants.py
- benchmarks/swebenchmultimodal/build_images.py: Updated to import constants from constants.py
- tests/test_swebenchmultimodal.py: Added comprehensive tests for all constants
Constants Consolidated
- DEFAULT_DATASET, DEFAULT_SPLIT
- DOCKER_IMAGE_PREFIX
- BUILD_TARGET
- WORKSPACE_DIR
- ENV_SKIP_BUILD, ENV_RUNTIME_API_KEY, ENV_SDK_SHORT_SHA, ENV_REMOTE_RUNTIME_STARTUP_TIMEOUT, ENV_RUNTIME_API_URL
- DEFAULT_SKIP_BUILD, DEFAULT_REMOTE_RUNTIME_STARTUP_TIMEOUT, DEFAULT_RUNTIME_API_URL
- GIT_USER_EMAIL, GIT_USER_NAME, GIT_COMMIT_MESSAGE
- ENV_SETUP_COMMANDS
- ALLOWED_IMAGE_TYPES
- DEFAULT_EVAL_WORKERS, DEFAULT_MODEL_NAME
- SOLVEABLE_KEYWORD
- SETUP_FILES_TO_REMOVE
- ANNOTATIONS_FILENAME
Testing
All 43 tests pass, including 15 new tests specifically for the constants module.
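The PR's actual tests are not reproduced here. As a rough illustration only, tests for a constants module of this kind might look like the sketch below. The constant names come from the list in this PR; every value, the in-file stand-in for constants.py, and the read_setting helper are hypothetical and are there only so the example runs on its own.

```python
import os
from types import SimpleNamespace

# Stand-in for benchmarks/swebenchmultimodal/constants.py.
# Names are taken from the PR description; ALL values are placeholders
# and are not the values used in the repository.
constants = SimpleNamespace(
    ENV_SKIP_BUILD="SKIP_BUILD",  # placeholder
    DEFAULT_SKIP_BUILD="1",  # placeholder
    ENV_REMOTE_RUNTIME_STARTUP_TIMEOUT="REMOTE_RUNTIME_STARTUP_TIMEOUT",  # placeholder
    DEFAULT_REMOTE_RUNTIME_STARTUP_TIMEOUT="600",  # placeholder
    ALLOWED_IMAGE_TYPES=("image/png", "image/jpeg"),  # placeholder
)


def read_setting(env_name, default, environ=None):
    """Hypothetical helper: return the environment override if set, else the default."""
    env = os.environ if environ is None else environ
    return env.get(env_name, default)


def test_env_names_are_nonempty_strings():
    # Each ENV_* constant should name an environment variable.
    for name in ("ENV_SKIP_BUILD", "ENV_REMOTE_RUNTIME_STARTUP_TIMEOUT"):
        value = getattr(constants, name)
        assert isinstance(value, str) and value


def test_default_used_when_env_unset():
    # With an empty environment, the paired DEFAULT_* value is returned.
    result = read_setting(
        constants.ENV_SKIP_BUILD, constants.DEFAULT_SKIP_BUILD, environ={}
    )
    assert result == constants.DEFAULT_SKIP_BUILD


def test_env_overrides_default():
    # A value set in the environment takes precedence over the default.
    env = {constants.ENV_SKIP_BUILD: "0"}
    result = read_setting(
        constants.ENV_SKIP_BUILD, constants.DEFAULT_SKIP_BUILD, environ=env
    )
    assert result == "0"


def test_allowed_image_types_is_nonempty():
    assert len(constants.ALLOWED_IMAGE_TYPES) > 0
```

Saved as a file, the sketch can be collected by pytest. In the real test suite the stand-in namespace would presumably be replaced by an import of the constants module itself.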
Benefits