Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jitter to retries #1274

Merged
merged 4 commits into from
Mar 7, 2024
Merged

Add jitter to retries #1274

merged 4 commits into from
Mar 7, 2024

Conversation

AhmedSoliman
Copy link
Contributor

@AhmedSoliman AhmedSoliman commented Mar 7, 2024

This also removes partition_id from metric labels because of high cardinality


Example from `http localhost:5122/metrics  | grep restate_`

### TaskCenter

```
# TYPE restate_task_center_finished_total counter
restate_task_center_finished_total{kind="Disposable",status="completed"} 3
restate_task_center_finished_total{kind="SystemBoot",status="completed"} 1
# TYPE restate_task_center_spawned_total counter
restate_task_center_spawned_total{kind="ConnectionReactor"} 2
restate_task_center_spawned_total{kind="Disposable"} 3
restate_task_center_spawned_total{kind="RpcServer"} 1
restate_task_center_spawned_total{kind="SystemService"} 3
restate_task_center_spawned_total{kind="IngressServer"} 1
restate_task_center_spawned_total{kind="MetadataBackgroundSync"} 1
restate_task_center_spawned_total{kind="Shuffle"} 1024
restate_task_center_spawned_total{kind="PartitionProcessor"} 1024
restate_task_center_spawned_total{kind="RoleRunner"} 1
```

### Network

```
# TYPE restate_network_connection_created_total counter
restate_network_connection_created_total{direction="outgoing"} 1
restate_network_connection_created_total{direction="incoming"} 1
# TYPE restate_network_message_sent_total counter
restate_network_message_sent_total 3
# TYPE restate_network_message_received_total counter
restate_network_message_received_total 3
# TYPE restate_network_message_processing_duration_seconds summary
restate_network_message_processing_duration_seconds{quantile="0"} 0.000020292
restate_network_message_processing_duration_seconds{quantile="0.5"} 0.000027418295730394435
restate_network_message_processing_duration_seconds{quantile="0.9"} 0.000027418295730394435
restate_network_message_processing_duration_seconds{quantile="0.95"} 0.000027418295730394435
restate_network_message_processing_duration_seconds{quantile="0.99"} 0.000027418295730394435
restate_network_message_processing_duration_seconds{quantile="0.999"} 0.000027418295730394435
restate_network_message_processing_duration_seconds{quantile="1"} 0.000082792
restate_network_message_processing_duration_seconds_sum 0.000130501
restate_network_message_processing_duration_seconds_count 3
# TYPE restate_network_connection_send_duration_seconds summary
restate_network_connection_send_duration_seconds{quantile="0"} 0.000012083
restate_network_connection_send_duration_seconds{quantile="0.5"} 0.000015583466555819782
restate_network_connection_send_duration_seconds{quantile="0.9"} 0.000015583466555819782
restate_network_connection_send_duration_seconds{quantile="0.95"} 0.000015583466555819782
restate_network_connection_send_duration_seconds{quantile="0.99"} 0.000015583466555819782
restate_network_connection_send_duration_seconds{quantile="0.999"} 0.000015583466555819782
restate_network_connection_send_duration_seconds{quantile="1"} 0.000047875
restate_network_connection_send_duration_seconds_sum 0.000075541
restate_network_connection_send_duration_seconds_count 3
```
Copy link

github-actions bot commented Mar 7, 2024

Test Results

 92 files   -  2   92 suites   - 2   4m 6s ⏱️ +28s
 73 tests  -  5   48 ✅ + 6  25 💤  - 11  0 ❌ ±0 
196 runs   - 10  131 ✅ +20  65 💤  - 30  0 ❌ ±0 

Results for commit a0c42fc. ± Comparison against base commit 681cd8f.

This pull request removes 5 tests.
dev.restate.e2e.node.EmbeddedHandlerApiTest ‑ consecutiveSideEffects(int)
dev.restate.e2e.node.EmbeddedHandlerApiTest ‑ delayedIncrementCounter(int, IngressClient)
dev.restate.e2e.node.EmbeddedHandlerApiTest ‑ incrementCounter(int, IngressClient)
dev.restate.e2e.node.EmbeddedHandlerApiTest ‑ oneWayIncrementCounter(int, IngressClient)
dev.restate.e2e.node.EmbeddedHandlerApiTest ‑ sideEffectAndAwakeable(int)

♻️ This comment has been updated with latest results.

- IngressDispatcher is not a coroutine anymore
- Handles inputs concurrently
- Writes to bifrost and receives responses from network
Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements @AhmedSoliman. LGTM. +1 for merging.

crates/types/src/retries.rs Outdated Show resolved Hide resolved
@AhmedSoliman AhmedSoliman merged commit 2455c41 into main Mar 7, 2024
4 checks passed
@AhmedSoliman AhmedSoliman deleted the pr1274 branch March 7, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invoker is liable to synchronized retries (stumping)
2 participants