move all subkernel message handling to aux channel #2302

Closed
sbourdeauducq opened this issue Jan 9, 2024 · 1 comment · Fixed by #2396

Comments

@sbourdeauducq
Member

@sbourdeauducq sbourdeauducq added this to the ARTIQ-8 milestone Jan 9, 2024
@Spaqin
Collaborator

Spaqin commented Feb 6, 2024

Below is some additional explanation of why #2291 used the main DRTIO protocol to indicate that 'async' messages are waiting in the satellite.

Originally, the aux protocol ran with these assumptions:

  • all transactions are initiated by master,
  • transactions always go from master to one satellite,
  • downstream satellite is always ready to receive,
  • all transactions go from master to downstream satellites, never upstream,
  • only one transaction is open at a time.

And that's it, quite simple.

However, once DDMA and subkernel support were added, it got a bit more complicated. DDMA needs to report its status when it finishes its work, and subkernels can initiate various transactions on their own to various destinations. And of course we value our time: initially the DDMA status was sent along with DestinationReply, which is sent periodically every 200 ms; with subkernel activity, or with larger data that has to be broken down into several messages, that could take way too long.

With the current simple system in place, messages cannot be sent upstream blindly: the master or repeater may be busy handling something and not pick up the message in time, and if another message arrives before the first one is copied away, the new message is lost. Even a downstream sender has to allow for processing time before sending another message when no response is expected.
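To illustrate the failure mode described above, here is a minimal sketch in Python. It is only a toy model, not ARTIQ firmware code: the class `SingleSlotReceiver` and its methods are hypothetical names, and the assumption is simply that the receiver has room for one unread message and that a message arriving while the slot is occupied is dropped.

```python
class SingleSlotReceiver:
    """Hypothetical receiver with room for exactly one unread message."""

    def __init__(self):
        self.slot = None   # the one message waiting to be copied away
        self.lost = []     # dropped messages, recorded only for illustration

    def deliver(self, message):
        """Called by a sender that transmits blindly, without flow control."""
        if self.slot is not None:
            # The previous message has not been copied away yet,
            # so the new one is lost.
            self.lost.append(message)
            return False
        self.slot = message
        return True

    def pick_up(self):
        """Called by the receiver once it is no longer busy."""
        message, self.slot = self.slot, None
        return message


receiver = SingleSlotReceiver()
first_ok = receiver.deliver("msg A")
second_ok = receiver.deliver("msg B")  # arrives while the receiver is busy
picked_up = receiver.pick_up()

print(first_ok, second_ok, picked_up, receiver.lost)
```

In this model the sender gets no indication that "msg B" was dropped unless some acknowledgement or flow control is added on top.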

That's why, as a quick fix, a flag indicating that messages are pending was implemented on the main channel.
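Roughly, the idea behind the flag can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual implementation from #2291: `Satellite`, `pending_flag`, `handle_request` and `master_drain` are hypothetical names. The satellite never sends upstream on its own; it only raises a flag, and the master then fetches the queued messages with ordinary master-initiated transactions.

```python
from collections import deque


class Satellite:
    """Hypothetical satellite that queues async messages instead of sending them."""

    def __init__(self):
        self.outbox = deque()

    def queue_async(self, message):
        # The message stays local until the master asks for it.
        self.outbox.append(message)

    @property
    def pending_flag(self):
        # Stand-in for the "messages pending" flag seen via the main channel.
        return bool(self.outbox)

    def handle_request(self):
        # Reply to a master-initiated request with the oldest queued message.
        return self.outbox.popleft() if self.outbox else None


def master_drain(satellite):
    """Master side: fetch messages one transaction at a time while the flag is set."""
    received = []
    while satellite.pending_flag:
        received.append(satellite.handle_request())
    return received


sat = Satellite()
sat.queue_async("DDMA status")
sat.queue_async("subkernel message")
drained = master_drain(sat)
print(drained, sat.pending_flag)
```

Because every transfer is still started by the master and only one is open at a time, the original assumptions of the aux protocol stay intact.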

With asynchronous messages and concurrent transactions, the state of the satellite will also have to be taken into consideration. Doing this within the aux protocol, without hardware flow control, means overcoming the following challenges:

  • clearing the pathway for the next message as quickly as possible, if possible separating receiving a message from handling it,
  • making sure no packet gets lost in both upstream and downstream directions,
  • minimizing the number of potential message retransmissions,
  • keeping the latency as consistent and low as possible.
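One possible way to picture the first three points is a bounded software queue between the receive path and the handler. The Python sketch below is purely illustrative and is not a proposal for the actual firmware design: `BufferedReceiver`, `send_all` and the refuse-and-retry behaviour are assumptions made for the example.

```python
from collections import deque


class BufferedReceiver:
    """Hypothetical receiver that separates receiving from handling."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.handled = []

    def receive(self, message):
        """Fast path: copy the message away and report whether it was accepted."""
        if len(self.queue) >= self.capacity:
            return False  # no room: the sender has to retransmit later
        self.queue.append(message)
        return True

    def handle_one(self):
        """Slow path: process one queued message, if there is one."""
        if self.queue:
            self.handled.append(self.queue.popleft())


def send_all(receiver, messages):
    """Send every message, retrying refused ones; returns the retransmission count."""
    retransmissions = 0
    for message in messages:
        while not receiver.receive(message):
            retransmissions += 1
            receiver.handle_one()  # receiver makes progress and frees a slot
    return retransmissions


rx = BufferedReceiver(capacity=2)
retries = send_all(rx, ["m0", "m1", "m2", "m3", "m4"])
while rx.queue:
    rx.handle_one()
print(retries, rx.handled)
```

No message is lost and the order is preserved, but every refusal costs a retransmission, so the queue depth trades memory against retransmissions and latency.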
