Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EMQX-9152 optimize mria running nodes #135

Merged

Conversation

SergeTupchiy
Copy link
Contributor

No description provided.

@SergeTupchiy SergeTupchiy self-assigned this Mar 29, 2023
@SergeTupchiy SergeTupchiy force-pushed the EMQX-9152-optimize-mria-running-nodes branch 2 times, most recently from ede2e6d to f958508 Compare March 31, 2023 10:11
@@ -187,7 +233,7 @@ ping(Node, Member) ->
ping(Node, _Member, 0) ->
?LOG(error, "Failed to ping ~s~n", [Node]);
ping(Node, Member, Retries) ->
case mria_node:is_running(Node) of
case mria_node:is_running(Node, ?MODULE, is_running) of
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this should give slightly stronger guarantee, that Node is ready to receive a ping.
Previously it was calling mria_sup:is_running/0 which checks if mria_sup is a registered process.
However, supervisor process is registered at the early stage: before its init returns and all the children are started (https://github.com/emqx/otp/blob/emqx-OTP-25.1.2/lib/stdlib/src/gen.erl#L192).
Thus, it was possible to send a cast message to mria_membership server on another node when mria_sup was already running but mria_membership was not yet.

@SergeTupchiy SergeTupchiy force-pushed the EMQX-9152-optimize-mria-running-nodes branch from f958508 to a9df38e Compare March 31, 2023 20:55
@SergeTupchiy SergeTupchiy marked this pull request as ready for review March 31, 2023 20:55
@SergeTupchiy SergeTupchiy force-pushed the EMQX-9152-optimize-mria-running-nodes branch 2 times, most recently from 8b8b379 to 92405a3 Compare April 3, 2023 14:59
Comment on lines 347 to 348
, ?snk_meta := #{node := ObserveNode}}
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit (coding style):

Suggested change
, ?snk_meta := #{node := ObserveNode}}
),
, ?snk_meta := #{node := ObserveNode}
}),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.


%% @doc Is the application running?
%% M:F/0 must return boolean()
-spec is_running(node(), atom(), atom()) -> boolean().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be done via mria_config:callback() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think it will rather differ from existing callbacks both semantically and name-vise:

-type callback() :: start
                  | stop
                  | {start | stop, mria_rlog:shard()}
                  | core_node_discovery.

Adding something like is_mria_membership_running will not explain on which event this callback is expected to be called.
Also, all callbacks are now only used outside Mria (no callback is registered within Mria), but this one would seem to be private (I think we don't want to allow changing the way Mria is going to check if mria_membership is already running on anther node).

My intention of adding it to mria_node module was to (hopefully) raise some awareness that mria_node:is_running/0 may be not perfect in all the cases, so a developer is encouraged to implement a specific boolean function and use it with mria_node:is_running/3 ...

@SergeTupchiy SergeTupchiy force-pushed the EMQX-9152-optimize-mria-running-nodes branch from 92405a3 to 02fe9e8 Compare April 3, 2023 17:34

%% @doc Is the application running?
%% M:F/0 must return boolean()
-spec is_running(node(), atom(), atom()) -> boolean().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-spec is_running(node(), atom(), atom()) -> boolean().
-spec is_running(node(), module(), atom()) -> boolean().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

#{ ?snk_kind := Kind
, node := ActionNode
, ?snk_meta := #{node := ObserveNode}
}).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}).
}).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Old mria_membership API is kept compatible: no function shows info about replicant nodes.
@SergeTupchiy SergeTupchiy force-pushed the EMQX-9152-optimize-mria-running-nodes branch from 02fe9e8 to 1273fd2 Compare April 5, 2023 08:23
@SergeTupchiy SergeTupchiy merged commit f2a9f4f into emqx:main Apr 5, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants