Increase the chance of local and remote part communicate in multinode test #1914

Merged
Barthelemy merged 1 commit into AliceO2Group:master from knopers8:fix-multinode on Aug 7, 2023

Conversation

@knopers8 knopers8 commented Aug 4, 2023

Collaborator

In one of the failing test executions I noticed that the remote part sometimes starts a bit later, probably because the CI machine is overloaded or the libraries take a long time to load. In that specific case, only one message with MOs could have been received, but it was sent together with EndOfStream. The remote part received the EndOfStream but not the message with MOs; maybe one got there faster than the other. While this could be fixed at the level of passing data around, I do not assume this is reliable at this stage anyway.

Two mitigations are put in place:

  • the producer runs 5 s longer (5 more messages), so there is less chance that the only MonitorObjectCollection the proxy and Merger receive is sent during EOS
  • the remote workflow part is started first, so it is less likely that a MonitorObjectCollection is published before the proxy and Merger start.
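
To illustrate why both mitigations help, here is a rough toy model in Python. It is only a sketch and not the actual test or framework code: the function name `usable_messages`, the message counts and the 9.5 s start-up delay are made up for illustration. The only thing taken from the description above is the rate of one message per second and the assumption that the last message, sent together with EndOfStream, cannot be relied upon.

```python
def usable_messages(n_messages: int, period_s: float, consumer_delay_s: float) -> int:
    """Count the messages a late-starting consumer can rely on.

    Toy model (an assumption, not the framework's real behaviour):
    - the producer publishes message i at time i * period_s, for i = 1..n_messages;
    - the consumer only sees messages published at or after consumer_delay_s;
    - the last message goes out together with EndOfStream and is treated as unreliable.
    """
    # Publication times of every message except the last one (the EOS companion).
    reliable_times = [i * period_s for i in range(1, n_messages)]
    return sum(1 for t in reliable_times if t >= consumer_delay_s)


if __name__ == "__main__":
    # Hypothetical numbers: 10 messages at 1 Hz, remote part ready after 9.5 s.
    print(usable_messages(10, 1.0, 9.5))  # prints 0: only the EOS companion is left
    # Mitigation 1: the producer runs 5 s longer, i.e. 5 more messages.
    print(usable_messages(15, 1.0, 9.5))  # prints 5
    # Mitigation 2: the remote part is started first, so it is ready from the start.
    print(usable_messages(10, 1.0, 0.0))  # prints 9
```

In this model either mitigation alone already leaves several messages that do not coincide with EOS. Neither removes the race itself, which matches the intent above of increasing the chance rather than guaranteeing delivery.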

@knopers8 knopers8 marked this pull request as ready for review August 4, 2023 07:15
@knopers8 knopers8 requested a review from Barthelemy as a code owner August 4, 2023 07:15

@Barthelemy Barthelemy left a comment

Collaborator

good thank you

@Barthelemy Barthelemy merged commit 8e7e62d into AliceO2Group:master Aug 7, 2023
@knopers8 knopers8 deleted the fix-multinode branch August 15, 2023 07:15