Remove the catalyst dependencies in backup transport #12056

Merged 7 commits on Sep 2, 2020

Conversation

LuQQiu
Contributor

@LuQQiu LuQQiu commented Sep 1, 2020

No description provided.

@alluxio-bot
Contributor

Automated checks report:

  • AmplabJenkins build check: PENDING
    • We were not able to detect AmplabJenkins test results on this PR. Status will update when testing completes.
  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11121/
Test FAILed.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11122/
Test FAILed.

@LuQQiu
Contributor Author

LuQQiu commented Sep 1, 2020

06:05:08.418 [ERROR] alluxio.master.journal.raft.RaftJournalTest.gainPrimacyAfterCatchup
06:05:08.418 [ERROR] Run 1: RaftJournalTest.gainPrimacyAfterCatchup:286 » Timeout Timed out waiting for fu...
06:05:08.419 [ERROR] Run 2: RaftJournalTest.gainPrimacyAfterCatchup:286 » Timeout Timed out waiting for fu...

@LuQQiu
Contributor Author

LuQQiu commented Sep 1, 2020

Jenkins, test this please

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11130/
Test FAILed.

@LuQQiu
Contributor Author

LuQQiu commented Sep 1, 2020

17:39:17.802 [ERROR] alluxio.server.ft.journal.raft.EmbeddedJournalIntegrationTest.restartStress Time elapsed: 120.198 s <<< ERROR!
java.util.concurrent.TimeoutException: Timed out waiting for 11 successes options: WaitForOptions{interval=20, timeout=120000} last value: false
at alluxio.server.ft.journal.raft.EmbeddedJournalIntegrationTest.restartStress(EmbeddedJournalIntegrationTest.java:337)

@LuQQiu
Contributor Author

LuQQiu commented Sep 1, 2020

Jenkins, test this please

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11131/
Test FAILed.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11132/
Test FAILed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-PR-Builder/11134/
Test PASSed.

@alluxio-bot
Contributor

Automated checks report:

  • AmplabJenkins build check: PASS
  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

All checks passed!

Contributor

@bf8086 bf8086 left a comment

Thanks @LuQQiu! Overall this looks good. I added a couple of questions inline.

*
* @param <T> the listener type
*/
public class Listeners<T> implements Iterable<Listener<T>> {

Contributor

Is there a Java built-in or Guava alternative to this listener that we can use?

Contributor Author

The current implementation binds the listeners to a shared thread context, so every listener's accept method is executed by the same executor.
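
To make the point above concrete, here is a minimal sketch of listeners that share one executor. It is an illustration only, not the PR's `Listeners` class: the names `SharedExecutorListeners` and `SharedExecutorListenersDemo`, the use of `Consumer` as the listener type, and the returned futures are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical stand-in for a listener registry bound to one executor. */
class SharedExecutorListeners<T> {
  private final ExecutorService mExecutor;
  private final List<Consumer<T>> mListeners = new CopyOnWriteArrayList<>();

  SharedExecutorListeners(ExecutorService executor) {
    mExecutor = executor;
  }

  void add(Consumer<T> listener) {
    mListeners.add(listener);
  }

  /** Submits one task per listener to the shared executor and returns the futures. */
  List<Future<?>> accept(T event) {
    List<Future<?>> futures = new ArrayList<>();
    for (Consumer<T> listener : mListeners) {
      futures.add(mExecutor.submit(() -> listener.accept(event)));
    }
    return futures;
  }
}

class SharedExecutorListenersDemo {
  public static void main(String[] args) throws Exception {
    // A single-threaded executor plays the role of the shared thread context.
    ExecutorService executor = Executors.newSingleThreadExecutor();
    SharedExecutorListeners<String> listeners = new SharedExecutorListeners<>(executor);
    Set<String> threadNames = ConcurrentHashMap.newKeySet();
    listeners.add(event -> threadNames.add(Thread.currentThread().getName()));
    listeners.add(event -> threadNames.add(Thread.currentThread().getName()));
    for (Future<?> future : listeners.accept("backup-started")) {
      future.get(); // wait until each callback has run
    }
    executor.shutdown();
    System.out.println("distinct callback threads: " + threadNames.size());
  }
}
```

Because every callback is queued on the same single-threaded executor, callbacks run one at a time and in submission order, which is the property a plain listener list from the JDK or Guava would not give by itself.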

* The context for Grpc messaging single thread.
* This context uses a {@link ScheduledExecutorService} to schedule events on the context thread.
*/
public class GrpcMessagingContext {

Contributor

Is this class needed? Will there be any problem running everything without this context?

Contributor Author

This class mainly stores two things: one executor and one serializer.
Throughout the code, we make sure that responses are read and requests are written with the same serializer.
All requests are handled by the same executor, and all listener callbacks are handled by that same executor as well.
This is mostly for correctness: it reduces potential race conditions.

Contributor Author

I took a deeper look at the code. The backup leader may create more than one bi-di stream with the client (in case it needs to reconnect?), but all the streams share the same context (same executor and serializer).
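
The arrangement described in this thread can be sketched roughly as follows: one context object owns a single-threaded executor and a serializer, and several connections (streams) hold a reference to that same context. Everything here is hypothetical and for illustration only; `MessageSerializer`, `Utf8Serializer`, `MessagingContextSketch`, and `ConnectionSketch` are invented names, and the real `GrpcMessagingContext` may differ in its API.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;

/** Hypothetical serializer interface; the PR's serializer is not shown in this thread. */
interface MessageSerializer {
  byte[] write(String message);

  String read(byte[] bytes);
}

/** Trivial UTF-8 implementation, used only to keep the sketch self-contained. */
class Utf8Serializer implements MessageSerializer {
  @Override
  public byte[] write(String message) {
    return message.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public String read(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}

/** Hypothetical context: one executor plus one serializer. */
class MessagingContextSketch {
  private final ScheduledExecutorService mExecutor =
      Executors.newSingleThreadScheduledExecutor();
  private final MessageSerializer mSerializer;

  MessagingContextSketch(MessageSerializer serializer) {
    mSerializer = serializer;
  }

  ScheduledExecutorService executor() {
    return mExecutor;
  }

  MessageSerializer serializer() {
    return mSerializer;
  }

  void close() {
    mExecutor.shutdownNow();
  }
}

/** Hypothetical connection; several connections may share one context. */
class ConnectionSketch {
  private final String mName;
  private final MessagingContextSketch mContext;

  ConnectionSketch(String name, MessagingContextSketch context) {
    mName = name;
    mContext = context;
  }

  /** Decodes and handles a request on the context thread, tagging the reply with that thread. */
  Future<String> handle(byte[] request) {
    return mContext.executor().submit(() -> {
      String message = mContext.serializer().read(request);
      return mName + ":" + message + "@" + Thread.currentThread().getName();
    });
  }
}
```

Creating two `ConnectionSketch` instances with the same `MessagingContextSketch` means their requests are decoded by the same serializer and handled on the same thread, so a reconnect that opens a second stream does not introduce a second handler thread.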

Contributor

I see. We can leave it as is for now.

For the serializer, we will probably replace it with protobuf's built-in serialization. gRPC also provides some guarantees on message order, so this class may eventually become unnecessary.

Contributor

@bf8086 bf8086 left a comment

LGTM.

@LuQQiu
Contributor Author

LuQQiu commented Sep 2, 2020

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit 1468422 into Alluxio:ratis-journal Sep 2, 2020