feat: implement comprehensive graph deprovisioning system (#348)
- Added a `--skip-backup` option to the deprovision command for skipping final backup creation.
- Introduced a `DeprovisioningConfig` class for managing deprovisioning settings, including retention days and backup hosting durations.
- Updated the `GraphDeprovisionResponse` model with new fields for backup creation status, subgraphs deleted count, and records cleaned status.
- Enhanced the `deprovision_graph` function to handle the new parameters and provide detailed output on the deprovisioning process.
- Implemented a `deprovision_suspended_graphs` job and sensor to automate deprovisioning of suspended graphs past their retention period.
- Added tests for the new deprovisioning functionality, covering scenarios such as backup skipping and shared repository restrictions.
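A rough sketch of what such a config class might look like. The `backup_hosting_days` tiers mirror the values quoted in the review below; the `suspended_retention_days` field name and its default are assumptions for illustration only:

```python
from dataclasses import dataclass, field


@dataclass
class DeprovisioningConfig:
    """Illustrative deprovisioning settings holder (field names partly assumed)."""

    # Days a suspended graph is kept before becoming eligible for deprovisioning
    # (hypothetical name and default)
    suspended_retention_days: int = 30

    # How long final backups stay hosted after deprovisioning, per graph tier
    backup_hosting_days: dict[str, int] = field(
        default_factory=lambda: {
            "ladybug-standard": 90,
            "ladybug-large": 180,
            "ladybug-xlarge": 365,
        }
    )

    def hosting_days_for(self, tier: str, default: int = 90) -> int:
        """Look up the backup hosting window for a tier, falling back to a default."""
        return self.backup_hosting_days.get(tier, default)


cfg = DeprovisioningConfig()
```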
@claude please review this PR
**Claude** finished @jfrench9's task in 3m 57s

## PR Review: Graph Deprovisioning System

I've completed a thorough review of the graph deprovisioning implementation. This is a well-structured PR that adds critical infrastructure lifecycle management capabilities. Here's my detailed analysis:

### ✅ Strengths

1. Excellent Architecture & Design
2. Comprehensive Safety Checks
3. Proper Integration
4. Test Coverage
### 🔍 Issues & Recommendations

#### Critical Issues

**1. CRITICAL: Missing Session Commit Will Roll Back All Changes**

The admin router opens a session but never commits it:

```python
session = next(get_db_session())
try:
    ...
    result = await service.deprovision_graph(...)
    ...
finally:
    session.close()
```

There's no `session.commit()` call anywhere in this flow.
This is verified by checking existing admin routers.

**Impact:** The deprovisioning feature will appear to work but won't persist any changes to the database. The graph will remain in its original state.

**Fix Required:**

```python
result = await service.deprovision_graph(
    graph_id,
    session,
    create_backup=not skip_backup,
    skip_backup_check=skip_backup,
)
session.commit()  # ADD THIS LINE
```

**2. CRITICAL: Potential Data Loss - No Backup Cleanup Mechanism**

In `_clean_pg_records`:

```python
def _clean_pg_records(self, graph_id: str, session: Session, result: DeprovisionResult) -> None:
    """Delete associated PostgreSQL records (credits, users, schemas, files).

    GraphBackup records are intentionally kept for post-deprovisioning hosting.
    """
```

The `GraphBackup` records are kept for hosting, but nothing ever deletes them once the hosting period ends.
Without cleanup, backups will accumulate indefinitely, causing storage costs and compliance issues.

**Recommendation:** Add a Dagster job for backup cleanup after the hosting period, following the existing job patterns.

#### High Priority Issues

**3. Subgraph Records Not Cleaned Up**

The cleanup queries filter on the parent graph ID only:

```python
session.query(GraphCreditTransaction).filter(
    GraphCreditTransaction.graph_id == graph_id  # Only parent ID
).delete(synchronize_session=False)
```

This means subgraph-specific records (credits, users, files, schemas) remain in the database after deprovisioning.

**Recommendation:** Either include subgraph IDs in the cleanup filters or clean each subgraph's records before it is deleted.
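The fix for issue 3 amounts to widening the delete filter from one parent ID to the parent plus its subgraph IDs. A minimal illustration using stdlib `sqlite3` in place of the project's SQLAlchemy models (the table, column, and ID values here are hypothetical):

```python
import sqlite3

# In-memory stand-in for the credit-transaction table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE graph_credit_transactions (id INTEGER PRIMARY KEY, graph_id TEXT)"
)
conn.executemany(
    "INSERT INTO graph_credit_transactions (graph_id) VALUES (?)",
    [("kg1",), ("kg1_sub1",), ("kg1_sub2",), ("other",)],
)


def clean_records(conn: sqlite3.Connection, graph_id: str, subgraph_ids: list[str]) -> int:
    """Delete records for a parent graph AND all of its subgraphs."""
    ids = [graph_id, *subgraph_ids]
    placeholders = ",".join("?" * len(ids))
    cur = conn.execute(
        f"DELETE FROM graph_credit_transactions WHERE graph_id IN ({placeholders})",
        ids,
    )
    return cur.rowcount


deleted = clean_records(conn, "kg1", ["kg1_sub1", "kg1_sub2"])
# Records belonging to other graphs are untouched
remaining = conn.execute("SELECT graph_id FROM graph_credit_transactions").fetchall()
```

The same shape translates to SQLAlchemy with a `Model.graph_id.in_(ids)` filter instead of the equality check.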
**4. Dagster Job Uses `asyncio.run()`**

The job wraps the async service call in a fresh event loop on every run:

```python
result = asyncio.run(service.deprovision_graph(graph_id, session, create_backup=True))
```

This creates a new event loop each time, which is wasteful and fails outright if the calling context already has a running loop.

**Recommendation:** Use Dagster's async op support or run in a thread pool executor, following patterns used elsewhere in your Dagster jobs.

#### Medium Priority Issues

**5. Backup Created Before Subgraphs Are Deleted**

In the deprovisioning order at lines 126-134, the final backup is created before subgraphs are deleted. If someone creates or modifies a subgraph between backup and deletion, those changes won't be captured. Consider whether subgraphs should be backed up separately or if the order should change.

**6. Config Uses Field Factory Where a Constant Would Suffice**

In `DeprovisioningConfig`:

```python
backup_hosting_days: dict[str, int] = field(
    default_factory=lambda: {
        "ladybug-standard": 90,
        "ladybug-large": 180,
        "ladybug-xlarge": 365,
    }
)
```

While technically correct, a simple constant would suffice here since the mapping never changes.

**7. Missing Type Hint for `graph` Parameter**

```python
async def _create_final_backup(self, graph, result: DeprovisionResult) -> None:
```

The `graph` parameter has no type annotation.

#### Minor Issues

**8. Logging Context Could Be Richer**

Several logging statements could include more context.
**9. Test Coverage Could Include More Edge Cases**

While test coverage is good, consider adding tests for additional edge cases.

**10. Documentation Gap**

There are no README updates documenting the new functionality.
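For issue 4, one stdlib-only alternative to per-call `asyncio.run()` is a single long-lived event loop in a worker thread, with synchronous job code submitting coroutines via `asyncio.run_coroutine_threadsafe`. This is a sketch, not the project's code; `deprovision` is a hypothetical stand-in for the real service call:

```python
import asyncio
import threading

# One long-lived loop shared by all job invocations, instead of
# asyncio.run() creating and tearing down a loop on every call.
_loop = asyncio.new_event_loop()
_thread = threading.Thread(target=_loop.run_forever, daemon=True)
_thread.start()


async def deprovision(graph_id: str) -> dict:
    """Hypothetical stand-in for service.deprovision_graph(...)."""
    await asyncio.sleep(0)  # placeholder for real async work
    return {"graph_id": graph_id, "success": True}


def run_job(graph_id: str) -> dict:
    """Synchronous entry point a Dagster op could call safely."""
    future = asyncio.run_coroutine_threadsafe(deprovision(graph_id), _loop)
    return future.result(timeout=30)


result = run_job("kg123")
```

Dagster's native async op support, where available, avoids even this much plumbing.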
### 📋 Code Quality & Best Practices

- Follows CLAUDE.md Guidelines: ✅
- Error Handling: ✅ Excellent
- Code Structure: ✅ Well organized
### 🎯 Final Verdict

❌ **Request Changes - Critical Bug Present**

This PR implements a critical feature with solid architecture and comprehensive testing. However, issue #1 (missing session commit) is a critical bug that will prevent the feature from working at all in production. No database changes will be persisted.

**Must Fix Before Merge:** the missing session commit (issue 1) and the backup cleanup gap (issue 2).

**Should Fix:** the high priority issues (3-4).

**Nice to Have:** the medium and minor issues (5-10).

### 💡 Testing Recommendation

Before merging, I strongly recommend exercising the deprovisioning flow end to end and confirming that the changes actually persist after the session closes.

Great work on the architecture and comprehensive testing! Once the session commit issue is fixed, this will be a robust deprovisioning system. The design patterns are excellent and the safety checks are thorough.
- Added a method to clean subgraph PostgreSQL records (schemas, files) before marking the subgraph as deprovisioned.
- Ensured session commit is called after deprovisioning to finalize changes in the database.
Summary
This PR introduces a complete graph deprovisioning system that enables automated cleanup and lifecycle management of graph resources within the robosystems platform. The implementation provides both manual and automated deprovisioning capabilities with comprehensive configuration options.
Key Accomplishments
Core Infrastructure

- A graph deprovisioning service (`deprovision_service.py`) that handles the complete lifecycle of graph resource cleanup, including validation, resource identification, and safe removal processes

Dagster Integration

- A `deprovision_suspended_graphs` job and sensor that automate deprovisioning of suspended graphs past their retention period

API Enhancements

- A `--skip-backup` option on the deprovision command and new `GraphDeprovisionResponse` fields reporting backup creation status, subgraphs deleted, and record cleanup
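The response fields described above might be shaped roughly as follows. This is a plain dataclass used purely for illustration; the real model may well be Pydantic, and the field names `backup_created`, `subgraphs_deleted`, and `records_cleaned` are assumptions derived from the change descriptions:

```python
from dataclasses import dataclass


@dataclass
class GraphDeprovisionResponse:
    """Illustrative shape of the deprovision response (field names assumed)."""

    graph_id: str
    success: bool
    backup_created: bool    # was a final backup taken?
    subgraphs_deleted: int  # how many subgraphs were removed
    records_cleaned: bool   # were associated PostgreSQL records removed?


resp = GraphDeprovisionResponse(
    graph_id="kg123",
    success=True,
    backup_created=False,  # e.g. when --skip-backup was passed
    subgraphs_deleted=2,
    records_cleaned=True,
)
```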
Breaking Changes
None. All changes are additive and maintain backward compatibility with existing graph management functionality.
Testing Coverage

Tests cover the new deprovisioning functionality, including backup skipping and shared repository restriction scenarios.
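A sketch of how the backup-skipping behavior might be tested, using a trivial fake in place of the real service. The fake and its interface are assumptions for illustration, not the project's actual test code:

```python
class FakeBackupStore:
    """Records whether a final backup was requested."""

    def __init__(self) -> None:
        self.backups: list[str] = []

    def create_backup(self, graph_id: str) -> None:
        self.backups.append(graph_id)


def deprovision_graph(graph_id: str, store: FakeBackupStore, *, skip_backup: bool = False) -> dict:
    """Simplified stand-in for the service: optionally back up, then report."""
    if not skip_backup:
        store.create_backup(graph_id)
    return {"graph_id": graph_id, "backup_created": not skip_backup}


def test_skip_backup_skips_final_backup() -> None:
    store = FakeBackupStore()
    result = deprovision_graph("kg123", store, skip_backup=True)
    assert result["backup_created"] is False
    assert store.backups == []


def test_default_creates_backup() -> None:
    store = FakeBackupStore()
    result = deprovision_graph("kg123", store)
    assert result["backup_created"] is True
    assert store.backups == ["kg123"]


test_skip_backup_skips_final_backup()
test_default_creates_backup()
```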
Infrastructure Considerations
The implementation follows established patterns in the codebase and integrates seamlessly with existing graph lifecycle management systems.
🤖 Generated with Claude Code
Branch Info:
`feature/graph-deprovisioning` → `main`

Co-Authored-By: Claude <noreply@anthropic.com>