Refactor graph storage limits and add usage monitoring sensor #592
Conversation
- Updated graph limits configuration to replace node and relationship limits with a soft storage cap for instances across various tiers.
- Introduced a new sensor for monitoring graph instance storage usage, sending alerts when usage approaches or exceeds defined limits.
- Enhanced ingestion limit checks to focus on operational safety limits for materialization, ensuring no out-of-memory errors occur.
- Updated API models and responses to reflect changes in storage limits and usage tracking.
- Added tests for the new usage monitoring functionality and adjusted existing tests to align with the updated graph limits structure.
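The soft storage cap described above amounts to a threshold check per tier. A minimal sketch follows; the tier names, byte values, warning ratio, and the names `SOFT_CAPS_BYTES` and `check_soft_cap` are all illustrative assumptions, not the PR's actual configuration or API.

```python
# Illustrative soft-cap check; tier names and cap sizes are hypothetical.
SOFT_CAPS_BYTES = {
    "standard": 5 * 1024**3,   # assumed 5 GiB soft cap
    "premium": 50 * 1024**3,   # assumed 50 GiB soft cap
}

WARN_RATIO = 0.8  # alert when usage approaches the cap (assumed threshold)

def check_soft_cap(tier: str, used_bytes: int) -> str:
    """Classify usage against the tier's soft cap: ok / warning / exceeded."""
    cap = SOFT_CAPS_BYTES[tier]
    if used_bytes >= cap:
        return "exceeded"
    if used_bytes >= WARN_RATIO * cap:
        return "warning"
    return "ok"
```

Because the cap is soft, "exceeded" here would trigger an alert rather than a hard rejection, which matches the monitoring-sensor approach the PR takes.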
@claude please review this PR
Claude finished @jfrench9's task in 1m 49s

PR Review: Refactor graph storage limits and add usage monitoring sensor
Summary

This is a well-scoped refactor that replaces the node/relationship count limits model with a storage-based soft-limit model and adds proactive monitoring via a Dagster sensor. The design direction is sound — storage-based limits are simpler to reason about and closer to infrastructure reality. The coverage improvements are substantial. A few issues worth addressing before merging.

Issues

Medium

1. The method is defined but never called from production code. The limits router calls …

2. Sensor creates two event loops per graph:

```python
# For every graph, this runs twice:
loop = asyncio.new_event_loop()
try:
    storage_check = loop.run_until_complete(...)
finally:
    loop.close()
```

With 100 graphs you're creating and destroying 200 event loops. Prefer a single:

```python
storage_check = asyncio.run(IngestionLimitChecker.check_instance_storage(...))
```

Even better, consolidate all async work for a single graph into one coroutine and run it once.

3. `'<div style="display:flex;justify-content:space-between;align-items:center;">'`

4. …

5. Dedup keys stored in … Using …

Minor

6. The refactored limits router now calls …

7. `for (gid, is_parent), size_bytes in zip(database_ids, sizes, strict=False):`

8. Default tier fallback silently misclassifies untiered graphs:

```python
graph_tier = graph.graph_tier or "ladybug-standard"
```

If a graph has no tier set, it inherits the …

Positive Notes
- Simplified asynchronous handling in the usage monitor sensor by replacing manual event loop management with `asyncio.run()`.
- Removed the deprecated `check_graph_usage` method from `IngestionLimitChecker` to streamline the ingestion limit checks.
- Updated API models to include `node_count` in the `InstanceUsage` response, ensuring consistency across usage tracking.
- Adjusted tests to remove obsolete checks related to the removed `check_graph_usage` method, maintaining focus on current functionality.
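The `node_count` addition to `InstanceUsage` can be pictured as a field on the response model. This dataclass sketch is purely illustrative: only the names `InstanceUsage` and `node_count` come from the PR, and the other fields are guesses about what a usage payload might carry.

```python
from dataclasses import dataclass

@dataclass
class InstanceUsage:
    # `node_count` is the field added in this change; the other fields
    # are assumed for illustration and may not match the real model.
    graph_id: str
    storage_bytes: int
    node_count: int

usage = InstanceUsage(graph_id="g1", storage_bytes=2_147_483_648, node_count=120_000)
```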
@claude have we fixed all issues?
Claude finished @jfrench9's task in 1m 19s

Issue Status Check
Here's the current status of all issues from the previous review:

Medium Issues
Minor Issues
Code Quality Comment (github-code-quality bot)
Summary

5 of 9 issues resolved. Four remain open:
- Updated the graph usage monitor sensor to include a check for assigned tiers, ensuring only parent graphs with a tier are processed.
- Modified the Redis client initialization to use MCP_CACHE for deduplication keys, improving cache management.
- Improved documentation within the sensor to clarify the impact of async Graph API calls on latency and the rationale behind the 6-hour interval.
- Added comments in the graph limits retrieval function to highlight potential latency issues with parallel API calls and suggested caching strategies.
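The deduplication scheme above (a per-graph key with a TTL so each graph alerts at most once per window) follows the Redis `SET key value NX EX ttl` idiom. The sketch below uses an in-memory stand-in so it is self-contained; in production the call would go through the MCP_CACHE Redis client, and the key naming is a hypothetical convention.

```python
import time

class FakeCache:
    """In-memory stand-in for Redis SET ... NX EX semantics (illustrative only)."""

    def __init__(self) -> None:
        self._store: dict[str, float] = {}  # key -> expiry timestamp

    def set_nx_ex(self, key: str, ttl_seconds: float) -> bool:
        """Return True if the key was newly set, i.e. no unexpired duplicate."""
        now = time.monotonic()
        expiry = self._store.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate within the dedup window
        self._store[key] = now + ttl_seconds
        return True

def should_send_alert(cache: FakeCache, graph_id: str,
                      window_seconds: float = 6 * 3600) -> bool:
    # One alert per graph per 6-hour window; key format is assumed.
    return cache.set_nx_ex(f"usage-alert:{graph_id}", window_seconds)
```

Because the dedup state lives in the cache rather than sensor memory, a sensor restart within the window does not re-send alerts.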
Summary
Refactors the graph limits and storage constraint system, introducing a dedicated usage monitoring sensor and email notification capabilities. The changes consolidate how graph tier limits are defined, enforced, and communicated to users, while adding proactive storage usage monitoring via a Dagster sensor.
Key Accomplishments
Storage Limits & Tier Configuration
- Restructured the `graph_tier.py` configuration to better represent storage constraints and tier boundaries
- Updated the tier configuration (`github/configs/graph.yml`) to align with the new constraint model
- Refactored the `ingestion_limits.py` middleware to simplify limit enforcement logic and improve clarity around storage-based constraints

Usage Monitoring Sensor

- Added a new Dagster sensor (`usage_monitor.py`) that proactively monitors graph storage usage across accounts

Email Notifications via SES

- Added an SES helper module (`operations/aws/ses.py`) to support sending usage alert emails

API & Router Updates

- Updated the API models (`models/api/graphs/limits.py`) to reflect the new storage constraint structure

Breaking Changes

- The `limits` response models have been restructured. Clients consuming the graph limits endpoint may need to update their parsing logic to accommodate the new field names/structure.
- The `graph.yml` tier configuration format has changed, which may affect deployment pipelines or infrastructure-as-code that references these values directly.

Testing

- New sensor tests (`tests/dagster/test_usage_monitor.py` — 196 lines)
- New SES tests (`tests/operations/aws/test_ses.py` — 101 lines)
- Expanded ingestion limit tests (`test_ingestion_limits.py` — net +365 lines of coverage improvements)

Infrastructure Considerations
🤖 Generated with Claude Code
Branch Info:
`refactor/graph-storage-constraint` into `main`

Co-Authored-By: Claude <noreply@anthropic.com>