Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Z21 pending request queue #73

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

gfgit
Copy link
Contributor

@gfgit gfgit commented Jun 24, 2023

Pending request queue allows to retry sending messages after a timeout if expected reply is not received.
This also allows to detect a message is a reply to our own changes and react to it differently

See #60 for earlier discussion.

NOTE: needs #82 applied first

Both this and #82 LocoCache try to resolve the "async loop" issue (See #57).
This 2 solutions are orthogonal and can live together.

@gfgit
Copy link
Contributor Author

gfgit commented Jun 24, 2023

Quick comparison with #59 approach (not fully accurate):

LocoCache approach

PROS:

  • Less memory footprint when number of loco is low and many messages are sent
  • Can extract individual changes from LanXLocoInfo like change only direction or only emergency stop
    This is possible because we cache last value of direction, speed and emergency stop per each loco.
  • Can somewhat detect speed trend change: user reduces speed while train is accelerating or viceversa

CONS:

  • Weak logic based on heuristic. Many edge cases
  • Logic is more obscure
  • More memory when many loco are running
  • Annoying "ignore" delay when user moves throttle right after Traintastic has updated same loco
    This is needed to avoid receiving our own reply but creates discrepancy between real loco state and cached one.

Pending Request Queue approach

PROS:

  • While logic is more complex, it's easier to understand
  • More accurate for own replies (which get then ignored)
  • Expandable to other message types
  • Allows re-sending a message if no reply is received so it workarounds UDP packet loss
  • Logic is much more robust
  • No delay needed for throttle messages

CONS:

  • Can detect external changes as our own replies if messages match.
    This shouldn't be a big problem but there are at least some edge cases (see below)
  • Potential risk of big memory footprint is sending a lot of messages in a fast manner
  • Does not check received message order so 2 replies referring to same object can be
    detected in wrong order.
  • Cannot extract individual changes because it has no track of last value

Edge case for reply ordering and external changes

  • Set a Train to speed 100.
  • It will slowly set speed from 0 to 100 at regular intervals
  • Say you already have enqueued from 0 to 20 and Z21 still has not answered
  • You should expect answers in ascending order from 0 to 20
  • If you get speed 15 after speed 5 it means someone else has explicitly set speed to 15
  • This "15" might be detected as our own reply because is less than 20 and anyway less than target 100
  • So Train will continue to accelerate to 100 exceeding user set speed 15

We should detect the "bump" from 5 to 15 which should not interpreted as "train is already at 15"
but instead as "accelerate until 15"

This can be fixed adding some sort of counter to tracked replies.

Future work

The solution of course will be "best of both worlds". I'll try to tune both approaches to work together
and understand which one is doing what exactly and what are the edge cases left.

It already shows big improvement when simulating enormous network delays (100 ms for every message on Z21 side)
but it's not yet perfect.
It really needs testing with real hardware (which unfortunately I don't have) because different behaviors might arise

@reinder
Copy link
Member

reinder commented Jun 26, 2023

Looking good, I really like it. Nice work! I don't own a z21 or Z21 yet so can't test it against a real one, I only have a Digikeijs DR5000 which supports the Z21 protocol.

If you get speed 15 after speed 5 it means someone else has explicitly set speed to 15

Or that 6 .. 14 are dropped, it's UDP after all. But under normal circumstances without a high network load it probably won't happen.

Maybe you can add a debug option to log some additional info about the retries, that might be useful if we find people to do some testing with it.

p.s. As the Z21 implementation is receiving many improvements, it seems fair to me that you add a Copyright line for your name. (Don't know how this normally works...this is my first Open Source project with contributions :))

@gfgit
Copy link
Contributor Author

gfgit commented Jun 27, 2023

p.s. As the Z21 implementation is receiving many improvements, it seems fair to me that you add a Copyright line for your name. (Don't know how this normally works...this is my first Open Source project with contributions :))

I'm glad!

I forgot build is failing because I've based it on top of #56 so I'll need to modify it a bit

@gfgit gfgit mentioned this pull request Nov 2, 2023
@gfgit gfgit force-pushed the work/z21_pending_queue branch 2 times, most recently from 08a1cb3 to 2f9cc61 Compare November 2, 2023 16:57
@gfgit
Copy link
Contributor Author

gfgit commented Nov 2, 2023

Rebased onto #82

@gfgit gfgit force-pushed the work/z21_pending_queue branch 2 times, most recently from a7a7a1b to 14a13c5 Compare November 5, 2023 18:56
server/src/hardware/protocol/z21/clientkernel.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/clientkernel.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.cpp Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.hpp Outdated Show resolved Hide resolved
server/src/hardware/protocol/z21/messages.hpp Outdated Show resolved Hide resolved
Now m_isUpdatingFromKernel

P.S. I'm tired of rebasing!
This makes it possible to detect replies from
Z21 originated by our own requests and process
them differently than externally generated
messages.

This also enables resending requests which did
not receive the expected reply in timeout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants