Skip to content

Conversation

@Bastian-Krause
Copy link
Member

@Bastian-Krause Bastian-Krause commented Jun 15, 2022

Description
labgrid-client console --loop should keep trying to connect to the console even if the serial port is currently unvailable.
The console method is declared async because it needs to handle crossbar communication with the coordinator concurrently, see [1] (e.g. place kicks, resouces becoming available).

Until now it only tries again if microcom comes back with a return code != 0. But in case the resource is unavailable, labgrid bails out early because target.get_resource() raises a NoResourceFoundError.

Calling target.get_resource() waits until the resources are available by default. This is done in a blocking poll()/time.sleep() loop until a timeout is reached allowing no crossbar communication in the meantime. This again means changes in resource availability we wait for cannot arrive. So there is no point in waiting for them at all.

Fix that by awaiting the NetworkSerialPort resource while allowing other tasks to run when --loop is given. Afterwards "wait" for the NetworkSerialPort for 0 seconds. This either succeeds or a NoResourceFoundError is raised, both happens instantly. To make this work, instant Timeout(0.0) is now allowed.

[1] 19757a1 ("remote/client: close console connection on place release")

Checklist

  • Documentation for the feature
  • Tests for the feature
  • CHANGES.rst has been updated
  • PR has been tested

Fixes #926

@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #935 (d7ac4bb) into master (1d1cecb) will decrease coverage by 0.0%.
The diff coverage is 20.0%.

❗ Current head d7ac4bb differs from pull request most recent head 2230536. Consider uploading reports for the commit 2230536 to get more accurate results

@@           Coverage Diff            @@
##           master    #935     +/-   ##
========================================
- Coverage    56.8%   56.7%   -0.1%     
========================================
  Files         149     149             
  Lines       11091   11097      +6     
========================================
  Hits         6301    6301             
- Misses       4790    4796      +6     
Impacted Files Coverage Δ
labgrid/remote/client.py 45.1% <11.1%> (-0.2%) ⬇️
labgrid/util/timeout.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d1cecb...2230536. Read the comment docs.

@khilman
Copy link
Contributor

khilman commented Jun 15, 2022

Tested-by: Kevin Hilman khilman@baylibre.com

@Bastian-Krause
Copy link
Member Author

We've stumbled upon blocking code called from async code. So let's not merge this yet, I'll come back with another version later.

In certain situations it makes sense to specify a timeout of 0, e.g. if
no blocking sleeps are desired in a method that awaits some condition.

Let's add an easy way to allow an instant timeout by considering zero a
valid value for timeouts.

Signed-off-by: Bastian Krause <bst@pengutronix.de>
@Bastian-Krause
Copy link
Member Author

The blocking wait for the NetworkSerialPort should now be fixed. labgrid-client console will now wait for 10 seconds and actually detect if the resource became available in the meantime. With --loop it will wait indefinitely until the resource becomes available. If the resource is available, it will start microcom within 0.1 seconds (so we should not miss too much output).

@khilman Please test :)

…source

`labgrid-client console --loop` should keep trying to connect to the
console even if the serial port is currently unvailable.
The console method is declared async because it needs to handle
crossbar communication with the coordinator concurrently, see [1] (e.g.
place kicks, resouces becoming available).

Until now it only tries again if microcom comes back with a return code
!= 0. But in case the resource is unavailable, labgrid bails out early
because target.get_resource() raises a NoResourceFoundError.

Calling target.get_resource() waits until the resources are available by
default. This is done in a blocking poll()/time.sleep() loop until a
timeout is reached allowing no crossbar communication in the meantime.
This again means changes in resource availability we wait for cannot
arrive. So there is no point in waiting for them at all.

Fix that by awaiting the NetworkSerialPort resource while allowing other
tasks to run when --loop is given. Afterwards "wait" for the
NetworkSerialPort for 0 seconds. This either succeeds or a
NoResourceFoundError is raised, both happens instantly.

[1] 19757a1 ("remote/client: close console connection on place release")

Signed-off-by: Bastian Krause <bst@pengutronix.de>
@Bastian-Krause Bastian-Krause changed the title remote/client: allow console looping when resource unavailable remote/client: no block in async console, loop even on unavailable resource Jun 16, 2022
@Bastian-Krause
Copy link
Member Author

Fixed s/labgrid/remote/ in the commit's subject line.

@Emantor Emantor merged commit 4d0f657 into labgrid-project:master Jun 20, 2022
@khilman
Copy link
Contributor

khilman commented Jun 20, 2022

I tested this, and it works OK the first time (e.g. if board is powered off when console --loop is run, and then powered on after console --loop has been started.)

However, if the board is then powered off, then console --loop exits with

Got EOF from port
connection lost
labgrid-client: error: Not all resources are available: {NetworkSerialPort(target=Target(name='am335x-icev2', env=None), name='USBSerialPort', state=<BindingState.bound: 1>, avail=False, host='nuc-1', port=None, speed=115200, protocol='rfc2217')}
This may be caused by disconnected exporter or wrong match entries.
You can use the 'show' command to review all matching resources.

Ideally, console --loop should continue to loop waiting for the serial port to appear again instead of exit.

Bastian-Krause added a commit to Bastian-Krause/labgrid that referenced this pull request Jun 22, 2022
…ilability

PR labgrid-project#935 added async waiting for the NetworkSerialPort. This makes
`labgrid-client console --loop` work as long as the resource changes
from being unavailable to being available. It does not work vice versa,
however: the resources are not being updated before being checked for
availability. The code inside the while loop does not run in this case.
This makes the subsequent `target.await_resources()` raise a
NoResourceFoundError which is not handled anywhere causing an error
message and exit of labgrid-client despite being called with ``--loop``.

Fix that by updating the resources before checking the resource's
availability. `labgrid-client console --loop` will now loop even if the
NetworkSerialPort changes from being available to unavailable.

Fixes: 2230536 ("remote/client: no block in async console, loop even on unavailable resource")
Signed-off-by: Bastian Krause <bst@pengutronix.de>
@Bastian-Krause Bastian-Krause deleted the bst/lgc-con-loop branch June 27, 2022 10:32
ynezz pushed a commit to ynezz/labgrid that referenced this pull request Jul 12, 2022
…ilability

PR labgrid-project#935 added async waiting for the NetworkSerialPort. This makes
`labgrid-client console --loop` work as long as the resource changes
from being unavailable to being available. It does not work vice versa,
however: the resources are not being updated before being checked for
availability. The code inside the while loop does not run in this case.
This makes the subsequent `target.await_resources()` raise a
NoResourceFoundError which is not handled anywhere causing an error
message and exit of labgrid-client despite being called with ``--loop``.

Fix that by updating the resources before checking the resource's
availability. `labgrid-client console --loop` will now loop even if the
NetworkSerialPort changes from being available to unavailable.

Fixes: 2230536 ("remote/client: no block in async console, loop even on unavailable resource")
Signed-off-by: Bastian Krause <bst@pengutronix.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

labgrid-client console --loop does not loop

4 participants