feat: update pipeline input_dataset structure and sort pipelines by update time #20
MOLYHECI merged 2 commits into OpenDCAI:backend
Conversation
Pull request overview
This pull request updates the pipeline input_dataset structure to support both string and dictionary formats (with id and location fields), and adds sorting of pipelines by update time in descending order.
Key changes:
- Extended `input_dataset` to accept either a string ID or a structured object with an ID and canvas location
- Added sorting so pipelines are returned ordered by most recently updated first
- Implemented backward compatibility handling for existing string-based dataset references
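The descending sort described in the bullets above can be sketched as follows (a minimal illustration; only the `updated_at` field name is taken from the reviewed files, the helper name and data shape are assumptions):

```python
from typing import Any, Dict, List

def sort_pipelines_by_update_time(pipelines: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order pipelines so the most recently updated comes first.

    Entries missing `updated_at` sort last, since the empty string compares
    lowest against ISO-8601 timestamp strings.
    """
    return sorted(pipelines, key=lambda p: p.get("updated_at") or "", reverse=True)
```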
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| backend/app/services/pipeline_registry.py | Updated _find_dataset_id to return dict with id and location; added Union type import; implemented pipeline sorting by updated_at field |
| backend/app/services/dataflow_engine.py | Added type checking to handle both string and dict formats for input_dataset when retrieving dataset ID |
| backend/app/schemas/pipelines.py | Created PipelineInputDataset model with id and location fields; updated PipelineConfig.input_dataset to accept Union of string and PipelineInputDataset |
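Based on the file descriptions above, the new schema in backend/app/schemas/pipelines.py might look roughly like this (a sketch assuming Pydantic v2; exact field types and defaults are guesses, not the actual code):

```python
from typing import Tuple, Union

from pydantic import BaseModel

class PipelineInputDataset(BaseModel):
    id: str
    location: Tuple[int, int] = (0, 0)  # canvas position of the dataset node

class PipelineConfig(BaseModel):
    # Accepts either the legacy plain dataset ID or the new structured form.
    input_dataset: Union[str, PipelineInputDataset]
```

Note that in Pydantic's default lax mode a list such as `[0, 0]` may be coerced to the `(0, 0)` tuple, while strict mode would reject it; aligning the returned literal with the annotation avoids the ambiguity either way.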
```diff
 ds_id = hashlib.md5(rel_path_from_cwd.encode("utf-8")).hexdigest()[:10]
 if container.dataset_registry.get(ds_id):
-    return ds_id
+    return {"id": ds_id, "location": [0, 0]}
```
Type mismatch: The function returns a list [0, 0] for the location field, but the PipelineInputDataset schema in pipelines.py expects a tuple (0, 0). This inconsistency could cause type validation errors when using Pydantic models. Change [0, 0] to (0, 0) to match the schema definition.
```python
def _find_dataset_id(self, pipeline_file_path: str) -> Union[str, Dict[str, Any]]:
    """
    Look up first_entry_file_name in the pipeline file and resolve the
    corresponding dataset ID.
    Returns {"id": "...", "location": [0, 0]} or "".
    """
```
Documentation inconsistency: The docstring states the function returns a list for location ({"id": "...", "location": [0, 0]}), but the PipelineInputDataset schema expects a tuple (0, 0). Update the docstring to reflect the correct type: {"id": "...", "location": (0, 0)}.
```diff
-    Returns {"id": "...", "location": [0, 0]} or "".
+    Returns {"id": "...", "location": (0, 0)} or "".
```
```diff
 for ds in all_datasets:
     if ds.get("root") == rel_path_from_cwd:
-        return ds.get("id")
+        return {"id": ds.get("id"), "location": [0, 0]}
```
Type mismatch: The function returns a list [0, 0] for the location field, but the PipelineInputDataset schema in pipelines.py expects a tuple (0, 0). This inconsistency could cause type validation errors when using Pydantic models. Change [0, 0] to (0, 0) to match the schema definition.
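Putting the suggested fixes together, the hashing branch of _find_dataset_id might look like this as a standalone sketch (the function and parameter names here are simplified for illustration; dataset_registry stands in for container.dataset_registry):

```python
import hashlib
from typing import Any, Dict, Union

def find_dataset_entry(rel_path_from_cwd: str,
                       dataset_registry: Dict[str, Any]) -> Union[str, Dict[str, Any]]:
    """Resolve a dataset by the MD5-derived ID used in the PR diff.

    Returns {"id": ..., "location": (0, 0)} when the dataset is registered,
    and "" otherwise, mirroring the docstring in the reviewed code.
    """
    ds_id = hashlib.md5(rel_path_from_cwd.encode("utf-8")).hexdigest()[:10]
    if dataset_registry.get(ds_id):
        # Tuple literal, matching the PipelineInputDataset annotation.
        return {"id": ds_id, "location": (0, 0)}
    return ""
```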