Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

da: add submit, get context timeouts #1507

Merged
merged 4 commits into from
Feb 5, 2024
Merged

da: add submit, get context timeouts #1507

merged 4 commits into from
Feb 5, 2024

Conversation

tuxcanfly
Copy link
Contributor

@tuxcanfly tuxcanfly commented Jan 26, 2024

Overview

This PR adds context with sane default timeouts to submit and get blob methods on the DA client. Fixes #1036

Checklist

  • New and updated code has appropriate documentation
  • New and updated code has new and/or updated testing
  • Required CI checks are passing
  • Visual proof for any user facing features like CLI or documentation updates
  • Linked issues closed with keywords

Summary by CodeRabbit

  • New Features

    • Introduced timeouts for block submission and retrieval to enhance performance and reliability.
  • Bug Fixes

    • Improved error handling in test scenarios for timeouts and maximum blob size constraints.
  • Tests

    • Added new tests to cover context cancellation and specific error scenarios.

Copy link
Contributor

coderabbitai bot commented Jan 26, 2024

Walkthrough

The recent updates to the da.go and da_test.go files focus on enhancing the data availability (DA) layer's reliability and efficiency through improved timeout handling and error management. These changes aim to mitigate issues related to transaction timeouts during block submission and retrieval, ensuring more robust interaction between Rollkit and Polaris, thereby addressing critical concerns highlighted in issue #1036.

Changes

Files Change Summary
da/da.go - Added time imports.
- Added submitTimeout and retrieveTimeout constants.
- Modified context usage for timeouts.
da/da_test.go - Added bytes import.
- Modified Submit method to handle context cancellation.
- Refactored tests for error scenarios related to timeouts and max blob size.

Assessment against linked issues

Objective Addressed Explanation
Identify and resolve DA layer submission failures due to transaction timeouts. (#1036)
Investigate root cause of submission failures and errors related to ERR Error on broadcastTxCommit. (#1036) Investigation details are not mentioned in the summary; unclear if addressed.
Address height mismatch between Rollkit and Polaris due to pending block usage. (#1036) No direct mention of addressing height mismatches in the changes.
Evaluate solutions like adjusting timeout_commit, switching to different broadcast methods, or transitioning RPC frameworks to improve transaction submissions and timeouts. (#1036) Changes include timeout adjustments, but it's unclear if other solutions were evaluated or implemented.
Ensure Rollkit handles submission errors effectively to maintain sync with Polaris during transaction processing. (#1036)

Poem

In the realm of code, where data flows,
A rabbit hopped, tweaking as it goes.
Timeouts set, errors met with grace,
Ensuring every block finds its place.
🌟🐰✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between d2bfbc3 and 0ecd8d6.
Files selected for processing (1)
  • da/da.go (4 hunks)
Additional comments: 3
da/da.go (3)
  • 8-8: The import of the time package is correctly added to support the new timeout functionality.
  • 123-125: The introduction of a timeout context in SubmitBlocks using submitTimeout is correctly implemented. However, ensure that the cancellation function, cancel, is always called, even if the Submit method does not return an error, to avoid potential leaks of context resources.
  • 175-177: Similar to SubmitBlocks, the RetrieveBlocks method correctly applies a timeout context using retrieveTimeout. The use of cancel after retrieving the blocks ensures that resources are properly released.

da/da.go Outdated Show resolved Hide resolved
Copy link

codecov bot commented Jan 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c9d8467) 50.23% compared to head (61bb3c0) 50.31%.
Report is 5 commits behind head on main.

❗ Current head 61bb3c0 differs from pull request most recent head 42ffbb5. Consider uploading reports for the commit 42ffbb5 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1507      +/-   ##
==========================================
+ Coverage   50.23%   50.31%   +0.08%     
==========================================
  Files          52       52              
  Lines        6762     6735      -27     
==========================================
- Hits         3397     3389       -8     
+ Misses       3024     3004      -20     
- Partials      341      342       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tuxcanfly tuxcanfly added T:bug Something isn't working C:data-availability Component: Data Availability layer abstraction labels Jan 26, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 803b02a and 5f1c3ed.
Files selected for processing (1)
  • da/da.go (4 hunks)
Files skipped from review as they are similar to previous changes (1)
  • da/da.go

da/da.go Outdated Show resolved Hide resolved
da/da.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between a4b3cd9 and fc38b78.
Files selected for processing (1)
  • da/da.go (4 hunks)
Files skipped from review as they are similar to previous changes (1)
  • da/da.go

Manav-Aggarwal
Manav-Aggarwal previously approved these changes Feb 1, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between d91adbc and 3603be5.
Files selected for processing (1)
  • da/da.go (4 hunks)
Files skipped from review as they are similar to previous changes (1)
  • da/da.go

@gupadhyaya
Copy link
Member

@tuxcanfly is there a way to test this? i don't think it is addressing any problem.

MSevey
MSevey previously approved these changes Feb 2, 2024
@tuxcanfly tuxcanfly dismissed stale reviews from MSevey and Manav-Aggarwal via 6ff199c February 2, 2024 02:51
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 3603be5 and 6ff199c.
Files selected for processing (2)
  • da/da.go (4 hunks)
  • da/da_test.go (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • da/da.go
Additional comments: 5
da/da_test.go (5)
  • 4-4: The addition of the bytes package is appropriate given the use of bytes.Repeat in the test TestMockDAErrors under the submit_timeout scenario. This ensures the necessary functionality is available for manipulating byte slices in tests.
  • 71-76: Modification to the Submit method in MockDA to handle context cancellation is a crucial improvement. It allows the method to respect context timeouts or cancellations, aligning with the PR's objective to introduce context with timeouts for submission operations. This change ensures that the DA client can properly handle scenarios where a submission might take longer than expected, thereby preventing indefinite blocking.
  • 84-102: The refactoring of the test function TestMockDAErrors to include a submit_timeout scenario is well-implemented. It effectively tests the new context cancellation handling in the Submit method by simulating a timeout. This is achieved by setting up the mock to return after a delay (After(100*time.Millisecond)) and then calling doTestSubmitTimeout with a context that times out sooner (time.Second). This test ensures that the Submit method behaves as expected under timeout conditions, which is critical for the reliability of the DA client.
  • 104-107: The addition of a max_blob_size_error test scenario within TestMockDAErrors is a good practice. It tests the DA client's behavior when an error occurs while fetching the maximum blob size, which is a potential failure point. By mocking an error response for MaxBlobSize and then verifying the error handling in doTestMaxBlockSizeError, this test ensures that the DA client can gracefully handle errors related to maximum blob size constraints.
  • 161-169: The implementation of doTestSubmitTimeout function correctly tests the timeout behavior introduced in the Submit method. By setting a shorter submitTimeout than the mock's delay and asserting that the response message contains "context deadline exceeded", this test validates that the DA client correctly handles submission timeouts. This is a critical test for ensuring that the DA client's submission process is robust and can handle network or processing delays gracefully.

@tuxcanfly
Copy link
Contributor Author

tuxcanfly commented Feb 2, 2024

@gupadhyaya added test case for context timeout. The submit call relies on a sync endpoint in core and can stall the block submission loop if we don't use a context deadline, see:

https://gist.github.com/tuxcanfly/207ab3dc1fac4bb9c38bedad2a053f18

for an example, but there can be other reasons the core endpoint can timeout e.g. during inscriptions flood or improperly configured reverse proxy.

There's also a related issue here which is that if a transaction that is already in the mempool, then it should not be retried with backoff immediately, because it will error with tx already exists in cache until it gets cleared from the mempool. Edit: created #1522

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between a9e49a2 and 42ffbb5.
Files selected for processing (2)
  • da/da.go (4 hunks)
  • da/da_test.go (3 hunks)
Files skipped from review as they are similar to previous changes (2)
  • da/da.go
  • da/da_test.go

@tuxcanfly tuxcanfly added this pull request to the merge queue Feb 5, 2024
Merged via the queue into main with commit 6bc4b2a Feb 5, 2024
17 checks passed
@tuxcanfly tuxcanfly deleted the tux/da-timeout branch February 5, 2024 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:data-availability Component: Data Availability layer abstraction T:bug Something isn't working
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

DA layer submission failure due to tx timeout
4 participants