Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions dpgen2/utils/download_dpgen2_artifacts.py
Original file line number Diff line number Diff line change
Expand Up @@ -359,9 +359,9 @@ def _dl_step_item(
[step_key, io, name] = item.split(global_step_def_split)
pref = _item_path(prefix, item)
if io in ["input"]:
target = step.inputs.artifacts[name]
target = step.inputs.artifacts.get(name)
elif io in ["output"]:
target = step.outputs.artifacts[name]
target = step.outputs.artifacts.get(name)
Comment on lines +362 to +364
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Add explicit None checking for robust error handling.

While using .get(name) prevents KeyError exceptions, passing None to download_artifact() may cause unexpected exceptions that aren't caught by the current error handling. Consider checking for None explicitly:

 if io in ["input"]:
-    target = step.inputs.artifacts.get(name)
+    target = step.inputs.artifacts.get(name)
+    if target is None:
+        logging.warning(f"artifact {name} not found in step {step_key}")
+        return
 elif io in ["output"]:
-    target = step.outputs.artifacts.get(name)
+    target = step.outputs.artifacts.get(name)
+    if target is None:
+        logging.warning(f"artifact {name} not found in step {step_key}")
+        return

Also note the inconsistency: similar functions _dload_input_lower (line 220) and _dload_output_lower (line 245) still use direct dictionary access, which could cause KeyError in the same scenarios.


🏁 Script executed:

#!/bin/bash
# Check for other direct dictionary access patterns that might need the same fix
rg -n "\.artifacts\[" --type py

Length of output: 14051


🏁 Script executed:

#!/bin/bash
# Inspect the implementations of _dload_input_lower and _dload_output_lower
rg -n -C3 "def _dload_input_lower" dpgen2/utils/download_dpgen2_artifacts.py
rg -n -C3 "def _dload_output_lower" dpgen2/utils/download_dpgen2_artifacts.py

Length of output: 392


🏁 Script executed:

#!/bin/bash
# Display the bodies of _dload_input_lower and _dload_output_lower to inspect artifact access
sed -n '206,260p' dpgen2/utils/download_dpgen2_artifacts.py
sed -n '230,290p' dpgen2/utils/download_dpgen2_artifacts.py

Length of output: 3347


Fix None handling and unify artifact access across download functions

The use of .get(name) in _dl_step_item avoids KeyError but may pass None into download_artifact(). We should explicitly guard against missing artifacts and apply the same pattern in _dload_input_lower and _dload_output_lower to keep behavior consistent.

• In _dl_step_item (around lines 360–370), after

target = step.inputs.artifacts.get(name)

add:

if target is None:
    logging.warning(f"artifact {name} not found in step {step_key}; skipping download")
    return

(and similarly for the “output” branch).

• In _dload_input_lower (lines 206–214), replace direct access:

- download_artifact(step.inputs.artifacts[kk], path=pref, skip_exists=True)
+ target = step.inputs.artifacts.get(kk)
+ if target is None:
+     logging.warning(f"input artifact {kk} for step {step.key} missing; skipping")
+     continue
+ download_artifact(target, path=pref, skip_exists=True)

• In _dload_output_lower (lines 230–238), do the same:

- download_artifact(step.outputs.artifacts[kk], path=pref, skip_exists=True)
+ target = step.outputs.artifacts.get(kk)
+ if target is None:
+     logging.warning(f"output artifact {kk} for step {step.key} missing; skipping")
+     continue
+ download_artifact(target, path=pref, skip_exists=True)

These changes ensure we never call download_artifact(None,…) and maintain consistent error handling.

🤖 Prompt for AI Agents
In dpgen2/utils/download_dpgen2_artifacts.py around lines 360 to 370, add
explicit None checks after using .get(name) to retrieve artifacts in
_dl_step_item; if the artifact is None, log a warning and return early to avoid
passing None to download_artifact. Similarly, in _dload_input_lower (lines
206–214) and _dload_output_lower (lines 230–238), replace direct dictionary
access of artifacts with .get(name) and add the same None checks with warnings
and early returns. This unifies artifact access patterns and prevents unexpected
exceptions from passing None to download_artifact.

else:
raise RuntimeError("unknown io style {io}")
try:
Expand Down
3 changes: 3 additions & 0 deletions tests/utils/test_dl_dpgen2_arti.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,9 @@


class MockedArti:
def get(self, key):
return self.__getitem__(key)

def __getitem__(
self,
key,
Expand Down
3 changes: 3 additions & 0 deletions tests/utils/test_dl_dpgen2_arti_by_def.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@


class MockedArti:
def get(self, key):
return self.__getitem__(key)

def __getitem__(
self,
key,
Expand Down