fix key error in dpgen2 download #297
Signed-off-by: zjgemi <liuxin_zijian@163.com>
📝 Walkthrough
The change updates the artifact retrieval logic in dpgen2/utils/download_dpgen2_artifacts.py: artifacts are now looked up with .get(name) instead of direct key access.
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant _dl_step_item
participant Step
Caller->>_dl_step_item: Call with step and artifact name
_dl_step_item->>Step: Access step.inputs.artifacts.get(name)
_dl_step_item->>Step: Access step.outputs.artifacts.get(name)
_dl_step_item-->>Caller: Return artifact(s) or None if not present
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dpgen2/utils/download_dpgen2_artifacts.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: build (3.10)
- GitHub Check: build (3.9)
- GitHub Check: build (3.8)
      target = step.inputs.artifacts.get(name)
  elif io in ["output"]:
-     target = step.outputs.artifacts[name]
+     target = step.outputs.artifacts.get(name)
💡 Verification agent
🧩 Analysis chain
Add explicit None checking for robust error handling.
While using .get(name) prevents KeyError exceptions, passing None to download_artifact() may cause unexpected exceptions that aren't caught by the current error handling. Consider checking for None explicitly:
  if io in ["input"]:
-     target = step.inputs.artifacts.get(name)
+     target = step.inputs.artifacts.get(name)
+     if target is None:
+         logging.warning(f"artifact {name} not found in step {step_key}")
+         return
  elif io in ["output"]:
-     target = step.outputs.artifacts.get(name)
+     target = step.outputs.artifacts.get(name)
+     if target is None:
+         logging.warning(f"artifact {name} not found in step {step_key}")
+         return

Also note the inconsistency: similar functions _dload_input_lower (line 220) and _dload_output_lower (line 245) still use direct dictionary access, which could cause KeyError in the same scenarios.
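The concern raised here can be reproduced in isolation. The sketch below is illustrative only: `fake_download_artifact` and `fetch` are hypothetical stand-ins, not dpgen2 or dflow functions, and the stand-in downloader is assumed to dereference its argument, so that passing None fails with an error other than KeyError.

```python
import logging
from types import SimpleNamespace


def fake_download_artifact(artifact, path=None):
    """Hypothetical stand-in for a downloader: it reads an attribute of the artifact."""
    return f"{path}/{artifact.key}"


def fetch(artifacts, name, step_key="demo-step"):
    """Look up an artifact by name and download it only if it exists."""
    target = artifacts.get(name)  # returns None instead of raising KeyError
    if target is None:
        logging.warning("artifact %s not found in step %s", name, step_key)
        return None
    return fake_download_artifact(target, path="out")


# A plain dict plays the role of step.inputs.artifacts in this sketch.
artifacts = {"models": SimpleNamespace(key="models.tgz")}
```

Without the guard, a missing name would reach the stand-in as None and fail with AttributeError, which an `except KeyError` handler would not catch.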
🏁 Script executed:
#!/bin/bash
# Check for other direct dictionary access patterns that might need the same fix
rg -n "\.artifacts\[" --type py
Length of output: 14051
🏁 Script executed:
#!/bin/bash
# Inspect the implementations of _dload_input_lower and _dload_output_lower
rg -n -C3 "def _dload_input_lower" dpgen2/utils/download_dpgen2_artifacts.py
rg -n -C3 "def _dload_output_lower" dpgen2/utils/download_dpgen2_artifacts.py
Length of output: 392
🏁 Script executed:
#!/bin/bash
# Display the bodies of _dload_input_lower and _dload_output_lower to inspect artifact access
sed -n '206,260p' dpgen2/utils/download_dpgen2_artifacts.py
sed -n '230,290p' dpgen2/utils/download_dpgen2_artifacts.py
Length of output: 3347
Fix None handling and unify artifact access across download functions
The use of .get(name) in _dl_step_item avoids KeyError but may pass None into download_artifact(). We should explicitly guard against missing artifacts and apply the same pattern in _dload_input_lower and _dload_output_lower to keep behavior consistent.
• In _dl_step_item (around lines 360–370), after
    target = step.inputs.artifacts.get(name)
  add:
    if target is None:
        logging.warning(f"artifact {name} not found in step {step_key}; skipping download")
        return
  (and similarly for the “output” branch).
• In _dload_input_lower (lines 206–214), replace direct access:
- download_artifact(step.inputs.artifacts[kk], path=pref, skip_exists=True)
+ target = step.inputs.artifacts.get(kk)
+ if target is None:
+     logging.warning(f"input artifact {kk} for step {step.key} missing; skipping")
+     continue
+ download_artifact(target, path=pref, skip_exists=True)
• In _dload_output_lower (lines 230–238), do the same:
- download_artifact(step.outputs.artifacts[kk], path=pref, skip_exists=True)
+ target = step.outputs.artifacts.get(kk)
+ if target is None:
+     logging.warning(f"output artifact {kk} for step {step.key} missing; skipping")
+     continue
+ download_artifact(target, path=pref, skip_exists=True)

These changes ensure we never call download_artifact(None, …) and maintain consistent error handling.
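The skip-on-missing loop suggested for the two `_dload_*_lower` functions can be sketched as a self-contained function. This is a minimal illustration under stated assumptions: `download_present`, the injected `download` callable, and the fake step object are all hypothetical and only mimic the shape of the calls quoted in the review; they are not the dpgen2 implementation.

```python
import logging
from types import SimpleNamespace


def download_present(step, keys, download, prefix="."):
    """Download every requested input artifact that exists; skip the rest.

    `download` is injected so the sketch runs without any workflow backend.
    Returns the list of keys that were actually downloaded.
    """
    downloaded = []
    for kk in keys:
        target = step.inputs.artifacts.get(kk)
        if target is None:
            logging.warning(
                "input artifact %s for step %s missing; skipping", kk, step.key
            )
            continue  # move on to the next key instead of aborting the loop
        download(target, path=prefix, skip_exists=True)
        downloaded.append(kk)
    return downloaded


# Hypothetical step with a single input artifact.
fake_step = SimpleNamespace(
    key="demo-step",
    inputs=SimpleNamespace(artifacts={"init_models": "artifact-a"}),
)

calls = []


def record_download(artifact, path, skip_exists):
    """Record the call instead of touching the network or the filesystem."""
    calls.append((artifact, path, skip_exists))


result = download_present(fake_step, ["init_models", "iter_data"], record_download)
```

Using `continue` rather than `return` matters inside a loop: one missing artifact should not prevent the remaining keys from being downloaded.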
🤖 Prompt for AI Agents
In dpgen2/utils/download_dpgen2_artifacts.py around lines 360 to 370, add
explicit None checks after using .get(name) to retrieve artifacts in
_dl_step_item; if the artifact is None, log a warning and return early to avoid
passing None to download_artifact. Similarly, in _dload_input_lower (lines
206–214) and _dload_output_lower (lines 230–238), replace direct dictionary
access of artifacts with .get(kk) and add the same None checks, logging a
warning and skipping the missing artifact with continue. This unifies artifact
access patterns and prevents unexpected exceptions from passing None to
download_artifact.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/utils/test_dl_dpgen2_arti.py (1)
31-32: Good addition to support the updated artifact access pattern.
The .get() method correctly delegates to the existing __getitem__ method, aligning with the main code changes that use .get(name) instead of direct key access to prevent KeyErrors.
Consider whether the method should support the optional default parameter (get(key, default=None)) to fully mirror the standard dictionary .get() behavior, especially if future test scenarios need to verify handling of missing artifacts that return None.
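One way the nitpick could be addressed is sketched below. `MockArtifacts` is a hypothetical test double written for illustration; it is not the class used in tests/utils/test_dl_dpgen2_arti.py, and it assumes that `__getitem__` raises KeyError for unknown keys, as a dict does.

```python
class MockArtifacts:
    """Hypothetical test double mimicking a mapping of artifact names to artifacts."""

    def __init__(self, known):
        self._known = dict(known)

    def __getitem__(self, key):
        # Raises KeyError for unknown keys, like a plain dict.
        return self._known[key]

    def get(self, key, default=None):
        """Mirror dict.get: delegate to __getitem__, fall back to `default`."""
        try:
            return self[key]
        except KeyError:
            return default


mock = MockArtifacts({"models": "artifact-a"})
```

With this shape, a test can check both the normal path (`mock.get("models")`) and the missing-artifact path (`mock.get("absent")` returning None) without changing `__getitem__`.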
📒 Files selected for processing (2)
tests/utils/test_dl_dpgen2_arti.py (1 hunks)
tests/utils/test_dl_dpgen2_arti_by_def.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/utils/test_dl_dpgen2_arti_by_def.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/utils/test_dl_dpgen2_arti.py (1)
tests/utils/test_dl_dpgen2_arti_by_def.py (1)
get(33-34)
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #297      +/-   ##
==========================================
- Coverage   84.32%   84.20%   -0.12%
==========================================
  Files         104      104
  Lines        6041     6105      +64
==========================================
+ Hits         5094     5141      +47
- Misses        947      964      +17