Skip to content

Extend the task future tracking to a Map instead of a single object.#1263

Merged
jiajunwang merged 1 commit intoapache:masterfrom
jiajunwang:accessor
Aug 13, 2020
Merged

Extend the task future tracking to a Map instead of a single object.#1263
jiajunwang merged 1 commit intoapache:masterfrom
jiajunwang:accessor

Conversation

@jiajunwang
Copy link
Copy Markdown
Contributor

@jiajunwang jiajunwang commented Aug 13, 2020

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

Fixes #1262

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Since the ZkBucketDataAccessor is used to access multiple paths, a single task future track is not enough. This change uses a map with the path as the key to track tasks for different paths.
Also, enhance the test case to cover the scenario of multiple paths being accessed in a short period.

Tests

  • The following tests are written for this issue:

TestZkBucketDataAccessor test method modified to test with multiple paths.

  • The following is the result of the "mvn test" command on the appropriate module:

Helx-core

[ERROR] Failures:
[ERROR] TestP2PNoDuplicatedMessage.testP2PStateTransitionEnabled:174 expected:<2200> but was:<2196>
[INFO]
[ERROR] Tests run: 1163, Failures: 1, Errors: 0, Skipped: 1
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:20 h
[INFO] Finished at: 2020-08-12T21:25:40-07:00
[INFO] ------------------------------------------------------------------------

Re-run

[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 98.501 s - in org.apache.helix.integration.messaging.TestP2PNoDuplicatedMessage
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:45 min
[INFO] Finished at: 2020-08-12T21:34:55-07:00
[INFO] ------------------------------------------------------------------------

Helix-rest

[INFO] Tests run: 165, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 130.89 s - in TestSuite
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 165, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:19 min
[INFO] Finished at: 2020-08-12T21:38:32-07:00
[INFO] ------------------------------------------------------------------------

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

Since the ZkBucketDataAccessor is used to access multiple paths, a single task future track is not enough. This change uses a map with the path as the key to track tasks for different paths.
Also enhance the test case to cover the scenario of multiple paths being accessed in a short period.
Copy link
Copy Markdown
Contributor

@NealSun96 NealSun96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Although I'm curious to know: if a path is updated on a frequent interval which is less than the TTL, its stale versions will steadily grow in size. Is that a case we would worry about?

@jiajunwang
Copy link
Copy Markdown
Contributor Author

Looks good to me. Although I'm curious to know: if a path is updated on a frequent interval which is less than the TTL, its stale versions will steadily grow in size. Is that a case we would worry about?

The TTL design is to prevent GC a node that might being accessed by other threads/nodes. If update is frequent, then we may have multiple version during the TTL window. But they will eventually be cleaned up.

@jiajunwang
Copy link
Copy Markdown
Contributor Author

This PR is ready to be merged.

@jiajunwang jiajunwang merged commit 46e9227 into apache:master Aug 13, 2020
@jiajunwang jiajunwang deleted the accessor branch August 13, 2020 19:43
huizhilu pushed a commit to huizhilu/helix that referenced this pull request Aug 16, 2020
…pache#1263)

Since the ZkBucketDataAccessor is used to access multiple paths, a single task future track is not enough. This change uses a map with the path as the key to track tasks for different paths.
Also enhance the test case to cover the scenario of multiple paths being accessed in a short period.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ZkBucketAccessor does not clean up stale version when it is used to access multiple paths

2 participants