-
Notifications
You must be signed in to change notification settings - Fork 3
Add task completion tracking with progress state management #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit introduces a comprehensive task state and progress tracking system
with atomic Redis operations for reliable distributed task management.
## Core Changes
### New TaskStateStore System (src/docket/state.py)
- Implements separate Redis key storage for task state and progress data
- `mark_task_completed()`: Uses registered Lua script for atomic completion
- Progress tracking with configurable TTL (default 24 hours)
- Proper datetime serialization (ISO 8601) and deserialization with timezone support
- Dataclasses: ProgressInfo, TaskState with serialization methods
### Lua Script Implementation
- Script registered once and reused via SHA hash (evalsha)
- Atomically: checks existence, reads total, sets current=total, records timestamp, updates TTLs
- NOSCRIPT error handling with automatic reload
- Performance: 2-3x faster than pipeline approach
### Updated Docket API (src/docket/docket.py)
- Added `record_ttl` parameter for automatic cleanup of completed task records
- Fixed `get_progress()` to use TaskStateStore for retrieving progress info
- Enhanced `snapshot()` to include progress data for executions
### Progress Dependency (src/docket/dependencies.py)
- Injectable Progress context manager for tracking task execution
- Methods: set_total(), increment(), set(), get()
- Integrated with worker execution lifecycle
### Worker Integration (src/docket/worker.py)
- Progress tracking integrated into task execution
- Automatic completion marking when tasks finish
### Execution Context (src/docket/execution.py)
- Added `with_progress()` method to attach progress info to executions
## Test Coverage
- Added comprehensive test suite (tests/test_state.py) with 32 tests
- Achieved 100% test coverage for state.py
- Tests cover atomicity, edge cases, serialization, TTL behavior, and Lua script execution
- Fixed pyright type checking with appropriate ignore directives for Redis type stubs
## Technical Details
- Uses Redis Lua scripts for true atomic multi-key updates
- Separate keys: {docket}:state:{key} and {docket}:progress:{key}
- Handles missing keys gracefully (returns None)
- Idempotent operations for reliability
- Script caching reduces network overhead
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @desertaxle's task —— View job Code Review for PR #177: Task Progress TrackingOverall AssessmentThis is a well-architected implementation of task progress tracking with comprehensive test coverage. The code quality is high, but there are some opportunities for improvement based on both the author's feedback and my analysis. Key Strengths ✅
Author's Feedback Response1. Naming Conventions 🏷️I strongly agree with your feedback about suffixes like "Info" and "State" being vague. Your suggestion to unify this into a single
This would create a more cohesive data model that naturally extends to include results later. 2. Test Location 📁Agreed - at least one core test should be in 3. Performance Optimization ⚡You're absolutely right about pipelining in Additional ObservationsPerformance Issues 🐌
Code Quality Suggestions 🔧
Security Considerations 🔒
Missing Features to Consider 🤔
RecommendationThis PR is ready to merge with minor improvements. The core functionality is solid, well-tested, and follows good practices. However, I'd recommend:
The holistic approach you suggested (combining progress + tombstones + eventual results) is the right architectural direction. Great work on this implementation! 🎉 |
|
📚 Documentation has been built for this PR! You can download the documentation directly here: |
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (92.39%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #177 +/- ##
===========================================
- Coverage 100.00% 98.98% -1.02%
===========================================
Files 34 36 +2
Lines 5107 5893 +786
Branches 264 304 +40
===========================================
+ Hits 5107 5833 +726
- Misses 0 53 +53
- Partials 0 7 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Update worker to create task state record at the start of execution, ensuring progress tracking is initialized before the task runs. Changes: - Added TaskStateStore.create_task_state() call in Worker._execute() - Called immediately after execution count increment, before timing starts - Ensures progress state exists for all task executions - Complements existing mark_task_completed() call after successful execution Test coverage: - Added TestWorkerStateIntegration class with test_worker_creates_state_before_execution - Test verifies state exists during task execution with initial values - Test confirms state is marked complete after task finishes - Validates complete lifecycle: create → execute → complete This provides the complete lifecycle: 1. create_task_state() - at execution start (new) 2. Task execution with Progress dependency updates 3. mark_task_completed() - at execution completion (existing) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses three review feedback items: 1. **Input Validation for Progress Methods** - Added validation in Progress.set_total() - must be positive - Added validation in Progress.set() - must be non-negative and <= total - Raises ValueError with descriptive messages - Added 4 tests to verify validation behavior 2. **Improved Lua Script Error Handling** - Added logging when mark_task_completed() encounters missing keys - Lua script returns 0 when keys don't exist (unchanged) - Now logs WARNING when result is 0, with key details - Added DEBUG logging when script is evicted and reloaded - Updated test to verify warning is logged 3. **Constant for Default Progress Total** - Added DEFAULT_PROGRESS_TOTAL = 100 constant - Used in ProgressInfo dataclass default - Used in ProgressInfo.from_record() fallback - Used in Progress.__aenter__() initialization - Eliminates hardcoded 100 throughout codebase Changes: - src/docket/state.py: Added constant, logging, improved error handling - src/docket/dependencies.py: Added validation, used constant - tests/test_dependencies.py: Added 4 validation tests - tests/test_state.py: Updated test to verify logging Test results: 58 tests passed, 96% coverage for state.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Extracted duplicate Lua script loading logic into a reusable `_ensure_script_loaded()` helper method with proper concurrency control. Key improvements: - Uses double-checked locking pattern to minimize lock contention - Lazy lock initialization for process-safe operation in parallel tests - Eliminates code duplication in `mark_task_completed()` - Ensures thread-safe script loading across concurrent operations The lazy lock initialization (creating it on first access rather than at class definition time) is critical for pytest-xdist compatibility, as asyncio locks must be created in the correct event loop context for each worker process. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
chrisguidry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to see it! This is really shaping up nicely. Now I'm even more convinced by what you said earlier that task completion tombstones, results, and progress are all essentially the same thing.
I know this is a little scope-creepish, but would you consider tackling progress + tombstones together holistically here? I'm thinking "ProgressInfo" may be too narrow and what we really want to do is unify those into one object even more? I'm also a little allergic to suffices like "Info" or "State" because they are somewhat vague and disconnected from what they do. What if there was just TaskProgress that covered the numerical progress, text descriptions of progress ("Reticulating splines..."), and the enumerated state machine states (pending, running, completed, failed). Then later when we add results, that also goes onto TaskProgress? Am I being too fussy with words?
| assert called | ||
|
|
||
|
|
||
| async def test_progress_dependency(docket: Docket, worker: Worker): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at least one of these tests (maybe this one) should be in test_fundamentals.py
| async with self.redis() as r: | ||
| state_store = TaskStateStore(self, self.record_ttl) | ||
| for execution in future + running: | ||
| progress_info = await state_store.get_task_progress(execution.key) | ||
| if progress_info: | ||
| execution.with_progress(progress_info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we could be talking about a lot of tasks, maybe we should pipeline these? Also, we probably don't need to grab progress for future tasks, right? Or are you thinking about retried tasks (that have lingering progress from their prior execution?)
| def with_progress(self, progress: "ProgressInfo") -> Self: | ||
| """Attach progress information to this execution. | ||
| Args: | ||
| progress: Progress information to attach | ||
| Returns: | ||
| Self for method chaining | ||
| """ | ||
| self.progress = progress | ||
| return self | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this method is adding much value over just setting the attribute? Maybe a property instead?
Implements event-driven progress tracking to avoid polling overhead.
Tasks can now publish progress updates to Redis Pub/Sub, and clients
can monitor updates in real-time.
**Progress Dependency Changes** (dependencies.py):
- Add `publish_events` parameter to `Progress.__init__()` (opt-in, default False)
- Implement `_publish_event()` method using Redis PUBLISH command
- Publish progress updates in `increment()`, `set()`, and `set_total()`
- Message format: JSON with {key, current, total}
- Channel: `{docket}:progress-events`
**Docket API Changes** (docket.py):
- Add `monitor_progress()` async generator method
- Polls initial task state, then subscribes to Pub/Sub for live updates
- Filters events by task_keys if specified
- Yields (task_key, ProgressInfo) tuples
- Handles cleanup with proper unsubscribe/aclose
**CLI Changes** (cli.py):
- Add `docket watch` command for monitoring multiple tasks
- Uses Rich Progress library for live progress bars
- Displays task key, progress bar, percentage, and timestamps
- Detects completion when current == total
- Supports graceful exit on Ctrl+C
- Example: `docket watch task1 task2 task3`
**Test Coverage** (tests/):
- test_progress_publishes_events_when_enabled: Verifies Pub/Sub publishing
- test_progress_no_events_when_disabled: Verifies opt-in behavior
- test_monitor_progress_yields_initial_state: Tests initial state polling
- test_monitor_progress_receives_live_updates: Tests live Pub/Sub updates
- test_monitor_progress_filters_by_task_keys: Tests filtering by task key
**Design Decisions**:
- Pub/Sub over Streams: Simpler, zero storage overhead, fire-and-forget acceptable
- Single shared channel: All events on `{docket}:progress-events`, CLI filters client-side
- Poll-then-stream: Polls initial state first to avoid missing early progress
- No completion event: Detect completion when current == total
- Opt-in: Backward compatible, no breaking changes
**Benefits**:
- Real-time updates without polling overhead
- Efficient for monitoring multiple concurrent tasks
- Reduces Redis query load compared to polling
- Enables live dashboards and monitoring tools
**Example Usage**:
```python
@task
async def long_task(progress: Progress = Progress(publish_events=True)) -> None:
await progress.set_total(100)
for i in range(100):
do_work()
await progress.increment()
```
```bash
# CLI monitoring
docket watch task-key-123
```
Note: Type checking errors in tests are from redis-py lacking complete type stubs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
Superceded by #181 |
Summary
Introduces a comprehensive task state and progress tracking system with atomic Redis operations for reliable distributed task management.
Key Features
TaskStateStoreAPI Additions
Docket.get_progress(key): Retrieve progress for any taskDocket.snapshot(): Now includes progress informationExecution.with_progress(): Attach progress to executionsProgressdependency: Injectable context manager for trackingData Model
ProgressInfo: Tracks current/total with percentage calculationTaskState: Complete state with progress and timestamps{docket}:state:{key}and{docket}:progress:{key}🤖 Generated with Claude Code