Fix unavailableNodes list having joined nodes #45

Merged
merged 2 commits into from
Dec 6, 2023

Conversation

Contributor

@arcusfelis arcusfelis commented Dec 4, 2023

A ping result of pang could arrive after nodeup, leaving an already joined node in the unavailableNodes list.

This fixes the wrong unavailableNodes list:

Result 0 {
  "data" : {
    "cets" : {
      "systemInfo" : {
        "unavailableNodes" : [ // these should not be here
          "mongooseim@mongooseim-1.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-2.mongooseim.default.svc.cluster.local"
        ],
        "remoteUnknownTables" : [
          
        ],
        "remoteNodesWithoutDisco" : [
          
        ],
        "remoteNodesWithUnknownTables" : [
          
        ],
        "remoteNodesWithMissingTables" : [
          
        ],
        "remoteMissingTables" : [
          
        ],
        "joinedNodes" : [
          "mongooseim@mongooseim-0.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-1.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-2.mongooseim.default.svc.cluster.local"
        ],
        "discoveryWorks" : true,
        "discoveredNodes" : [
          "mongooseim@mongooseim-0.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-1.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-2.mongooseim.default.svc.cluster.local"
        ],
        "conflictTables" : [
          
        ],
        "conflictNodes" : [
          
        ],
        "availableNodes" : [
          "mongooseim@mongooseim-0.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-1.mongooseim.default.svc.cluster.local",
          "mongooseim@mongooseim-2.mongooseim.default.svc.cluster.local"
        ]
      }
    }
  }
}

Tested with:
esl/MongooseIM#4185

Tested with Helm tests.
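
The race described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual cets_discovery code: the module name `pang_race_sketch`, both function names, and the map-based state are hypothetical, and the list of connected nodes is passed in explicitly (real code would more likely consult `nodes()`).

```erlang
-module(pang_race_sketch).
-export([handle_nodeup/2, handle_ping_result/4]).

%% Hypothetical state: a map with a single unavailable_nodes list.
%% A nodeup event removes the node from the unavailable list.
handle_nodeup(Node, State = #{unavailable_nodes := Unavailable}) ->
    State#{unavailable_nodes := lists:delete(Node, Unavailable)}.

%% Connected is the list of currently connected nodes (an assumption of
%% this sketch, passed in to keep the function pure).
handle_ping_result(Node, pong, _Connected, State) ->
    handle_nodeup(Node, State);
handle_ping_result(Node, pang, Connected, State = #{unavailable_nodes := Unavailable}) ->
    case lists:member(Node, Connected) of
        %% A late pang for a node that is already connected is ignored.
        true -> State;
        %% Otherwise the node is recorded as unavailable (without duplicates).
        false -> State#{unavailable_nodes := lists:usort([Node | Unavailable])}
    end.
```

Without the `lists:member/2` check, a pang processed after nodeup would put the node back into the unavailable list, which matches the symptom shown in the status output.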

Ping result with pang could arrive after nodeup

codecov bot commented Dec 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9c5b2c9) 98.25% compared to head (c0c4fb8) 98.26%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #45      +/-   ##
==========================================
+ Coverage   98.25%   98.26%   +0.01%     
==========================================
  Files          10       10              
  Lines         745      750       +5     
==========================================
+ Hits          732      737       +5     
  Misses         13       13              


arcusfelis added a commit to esl/MongooseIM that referenced this pull request Dec 4, 2023
@arcusfelis arcusfelis marked this pull request as ready for review December 6, 2023 11:14
Member

@chrzaszcz chrzaszcz left a comment

Looks good in general. I think the node state could be simplified later, because there are too many lists in the status; a single list with a status for each node would do instead.

test/cets_SUITE.erl (outdated)
@arcusfelis arcusfelis force-pushed the disco_pang_result_arrives_after_nodeup branch 2 times, most recently from 888c199 to ee5db7e Compare December 6, 2023 15:02
Cond = fun() -> lists:member(Node, nodes()) end,
cets_test_wait:wait_until(Cond, true),
Me ! send_ping_result_called,
%% Similate pang returned to cets_discovery.
Member

Simulate? Simulated? I can't understand this sentence.

meck:passthrough([Pid, Node, pang])
end),
try
Setup = setup_two_nodes_and_discovery(Config, [wait]),
%% setup_two_nodes_and_discovery would call disconnect_node node
Member

"disconnect_node node"? Can't understand it either. Is it a typo or did you mean a particular node?

?assertMatch([Node1, Node2], maps:get(joined_nodes, Status)),
%% Ensure that send_ping_result was called
%% (it is not called if node is in the nodes() list)
receive_message(send_ping_result_called)
Member

@chrzaszcz chrzaszcz Dec 6, 2023

It doesn't make sense to check it after the testcase, because it might have arrived after we checked the unavailable nodes. I would move it before all the ?assertMatch calls.
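
The ordering suggested here can be sketched as follows: wait for the notification message first, and only then inspect the state. This is a minimal sketch under stated assumptions. The module `receive_before_assert_sketch` and its `receive_message/2` are hypothetical stand-ins; the suite's own `receive_message/1` helper may be implemented differently.

```erlang
-module(receive_before_assert_sketch).
-export([receive_message/2, demo/0]).

%% Hypothetical helper: wait up to Timeout milliseconds for a message
%% equal to Msg (Msg is already bound, so only that exact term matches).
receive_message(Msg, Timeout) ->
    receive
        Msg -> ok
    after Timeout ->
        {error, timeout}
    end.

demo() ->
    Me = self(),
    %% Stands in for the mocked send_ping_result notifying the test process.
    spawn(fun() -> Me ! send_ping_result_called end),
    %% Wait for the event first...
    ok = receive_message(send_ping_result_called, 5000),
    %% ...so that any check made from here on runs after the event happened.
    checked_after_event.
```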

Contributor Author

It makes some sense, because this is what the function looks like:

ping_not_connected_nodes(Nodes) ->
    Self = self(),
    NotConNodes = Nodes -- [node() | nodes()],
    [
        spawn_link(fun() -> cets_ping:send_ping_result(Self, Node, cets_ping:ping(Node)) end)
     || Node <- lists:sort(NotConNodes)
    ],
    ok.

So there is a chance that the node got connected for a reason unrelated to the current test. In that case cets_ping:send_ping_result/3 would never be called :)
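
The filtering step in the function above can be isolated into a pure function to show why no ping is spawned for a node that is already connected. This is a sketch only: the module `not_connected_sketch` and `not_connected/3` are hypothetical, and the local node and connected nodes are passed as arguments instead of calling `node()` and `nodes()`.

```erlang
-module(not_connected_sketch).
-export([not_connected/3]).

%% Mirrors `Nodes -- [node() | nodes()]` from ping_not_connected_nodes/1,
%% with the local node (Self) and the connected nodes given explicitly.
%% Only the nodes returned here would get a ping, and therefore a
%% send_ping_result call.
not_connected(Discovered, Self, Connected) ->
    lists:sort(Discovered -- [Self | Connected]).
```

For example, `not_connected_sketch:not_connected([a, b, c], a, [b])` returns `[c]`, while with `c` also connected the result is `[]` and nothing would be pinged.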

Member

@chrzaszcz chrzaszcz left a comment

I added a few more comments... I really think this case got overly verbose and it's hard to understand some of the comments. I would leave only the ones that are most important, i.e. the ones that explain when the node is up, and that we send pang afterwards.

?assertMatch([], maps:get(unavailable_nodes, Status)),
%% Ensure that Node2 is in a list of joined nodes
?assertMatch([Node1, Node2], maps:get(joined_nodes, Status)),
%% Ensure that send_ping_result was called
Member

This comment is unnecessary because it only restates receive_message(send_ping_result_called) in different words.

%% Ensure that Node2 is in a list of joined nodes
?assertMatch([Node1, Node2], maps:get(joined_nodes, Status)),
%% Ensure that send_ping_result was called
%% (it is not called if node is in the nodes() list)
Member

...but this one I don't get at all. Which node? And previous line says "ensure it's called" while this one says "it is not called"... very confusing.

@arcusfelis arcusfelis force-pushed the disco_pang_result_arrives_after_nodeup branch from ee5db7e to 02fb836 Compare December 6, 2023 15:39
Make waiting condition into send_ping_result
@arcusfelis arcusfelis force-pushed the disco_pang_result_arrives_after_nodeup branch from 02fb836 to c0c4fb8 Compare December 6, 2023 15:41
Member

@chrzaszcz chrzaszcz left a comment

Looks better and more readable now, fine for me 👍

@chrzaszcz chrzaszcz merged commit 0e3f83e into main Dec 6, 2023
8 checks passed
@chrzaszcz chrzaszcz deleted the disco_pang_result_arrives_after_nodeup branch December 6, 2023 15:44
arcusfelis added a commit to esl/MongooseIM that referenced this pull request Dec 6, 2023
Includes Fix unavailableNodes list having joined nodes
esl/cets#45