Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port infos on Control+ hub not always available #63

Closed
ractive opened this issue Dec 8, 2019 · 26 comments
Closed

Port infos on Control+ hub not always available #63

ractive opened this issue Dec 8, 2019 · 26 comments
Labels
bug Something isn't working

Comments

@ractive
Copy link

ractive commented Dec 8, 2019

I'm implementing a react based remote control app using node-poweredup:
https://brick-remote.ractive.ch
I'm testing and developing this with the Control+ hubs.

The issue is that when a hub is connected, its port infos are not always available.

In this example ports A, B and D are connected, but port A never shows any info, port B most of the time and port D almost always:

I'm getting the port info with hub.getPortDeviceType(port) (see here) after a call to await hub.connect() (see here).
The ports that are identified as "unkown" also don't get sensor messages. While e..g port B gets "rotate" events, port A does not.

Steering the motors on those ports although works and e.g. a motor on port A that is recognized as "unknown" can be successfully controlled by calling hub.setMotorSpeed("A", 50) (see here).

Any ideas what is wrong here so that the port infos are not correctly set after the hub is connected?

The source code of the app can be found here and started locally with yarn start dev. Debug logging is currently enabled (see here).

@nathankellenicki
Copy link
Owner

nathankellenicki commented Dec 8, 2019

The reason for this is that you aren’t waiting to query for attached devices after connecting. Attached devices aren’t available immediately after connection - there is a handshake process, and then the hub discovers and advertises attached devices.

You can wait for the “attach”/“detach” events to be emitted from the hub for the ports you want. At that point you’ll know it’s available and ready. You can also use this to update your UI as you remove and plug in new devices.

@ractive
Copy link
Author

ractive commented Dec 8, 2019

I added an event listener for the "attach" events like this:

        poweredUP.on("discover", async (hub: Hub) => { // Wait to discover hubs
            hub.on("attach", (port, device) => {
                console.log(`Device attached to port ${port} (Device ID: ${device})`);
            });

            console.log("Connecting to hub:", hub.uuid);
            await hub.connect(); // Connect to hub
            console.log("Connected ✔");
            .....
        }

But only events for certain ports are received:
Sometimes...

Connecting to hub: 2E/BcBFQvFKbZ4oPyTBxMQ==
Device attached to port D (Device ID: 46)
Device attached to port AB (Device ID: 47)
Device attached to port ACCEL (Device ID: 58)
Device attached to port TILT (Device ID: 59)
Connected ✔

sometimes...

Connecting to hub: 2E/BcBFQvFKbZ4oPyTBxMQ==
Device attached to port AB (Device ID: 47)
Device attached to port ACCEL (Device ID: 58)
Device attached to port TILT (Device ID: 59)
Connected ✔

Port A again is never "attached".

Any ideas?

@aileo
Copy link
Contributor

aileo commented Dec 8, 2019

I have the same problem with the port A of my boost move hub, which was not detected most of the time. I did not find the circonstances that make it be found or not. I thought it was low battery, it happens less since I changed them but still.

I note we both use the Web Bluetooth version of this lib. It may be a thing to look at, I wonder if it drops some messages.

@nathankellenicki
Copy link
Owner

Interesting. Are you both using the Web Bluetooth version, or Node.js?

If you are able, can you enable localStorage.debug = "lpf2hub"; (for Web Bluetooth) or export DEBUG=lpf2hub (for Node.js) and paste the output the next time you see this issue?

@ractive
Copy link
Author

ractive commented Dec 9, 2019

I also use web bluetooth, yes.

I note we both use the Web Bluetooth version of this lib. It may be a thing to look at, I wonder if it drops some messages.

I also was wondering if the browser version has some different behaviour, but then I realized that the setup also was wrong.

I now changed the way how I set up the event listeners and handle the connected hubs. I realized that the event listeners need to be registered before calling await hub.connect(). Otherwise the attach events are fired before the listeners were registered. So my code look something like this now:

poweredUP.on("discover", async (hub: Hub) => { // Wait to discover hubs
    // Puts this hub into the list of "known hubs" with the connected flag set to false
    dispatch({type: ActionType.DISCOVER, payload: {hub}});
    // Add event listeneres....
    hub.on("attach", (port, device) => {
        dispatch({type: ActionType.ATTACH_PORT, payload: {hub, port}});
    });

    await hub.connect();

    // Sets the connected state of this hub to true
    dispatch({type: ActionType.CONNECT, payload: {hub}});
});

This seems to improve the situation so that attach events from port A are now "sometimes" being handled. But many times they are still missing.
I guess the node-poweredup web bluetooth example should be changed like this as well.

If you are able, can you enable localStorage.debug = "lpf2hub"; (for Web Bluetooth) or export DEBUG=lpf2hub (for Node.js) and paste the output the next time you see this issue?

I can provide you some debug output later on, yes. It also should easy to enable debug output by adding the "debug" key to the localStorage for https://brick-remote.ractive.ch 😉

@aileo
Copy link
Contributor

aileo commented Dec 9, 2019

I made some tests on chrome 78.0.3904.108 on windows 10 and could not reproduce on my own project, the web bluethooth example and https://brick-remote.ractive.ch/ (which is very nice).

I tried without any device on C and D ports at startup, with boost devices, with other devices... nothing wrong. Only once on the web bluetooth example, I got the port A about 2s late.

It could also be a firmware problem, I don't remember if I had an update since or not.

@ractive
Copy link
Author

ractive commented Dec 10, 2019

Interesting: I tested it yesterday on three computers and the results were completely different (all tests using the latest Chrome 78):

  • First gen MacBook Air with High Sierra: all ports always got the "attach" event. All fine! 👍
  • Lenovo X1 extreme with the latest Windows 10: The picture I described above: Port A almost never recognized, port B sometimes and port D most of the times
  • Surface Pro 4 with the latest Windows 10: No "attach" events for any port is received at all... 😱

[...] and could not reproduce [...]

And all seems fine on your system as well.

So the question is - why do different systems behave differently... 🤔

I'll collect some logs of the different system so that we can compare them...

@ractive
Copy link
Author

ractive commented Dec 10, 2019

I saved some logs. I have a control+ hub with two x-large motors connected to the ports A and B and a large motor connected to D.

This log here from the MacBook air is the happy case, where all ports on the control+ hub get "attached" (and are properly shown):
brick-remote.ractive.ch-macbook-air.log

On the Surface pro, no port is attached at all:
brick-remote.ractive.ch-surface.log

On the lenovo x1 extreme, only ports B and D are attached:
brick-remote.ractive.ch-lenovo.log

@nathankellenicki: Do you see anything suspicious?

@nathankellenicki
Copy link
Owner

@nathankellenicki: Do you see anything suspicious?

I think I do actually! I notice that a) Messages seem to be correct, and b) You do get port advertisement messages. I also notice that they tend to appear in order - starting at 0 and ending . at 99/100.

In the last two examples I see that it's always the first couple that are missed. There is a weird hacky flow in this library to figure out the hub type before connecting by actually connecting but caching received messages that don't apply to the discovery in a mailbox, to be redelivered after actually connecting. I wonder if a those two port messages come through before the connect but are for some reason not redelivered.

I'll keep digging!

@dlech
Copy link

dlech commented Dec 11, 2019

There is a weird hacky flow in this library to figure out the hub type before connecting

Why not get the hub type from the advertising data before connecting?

@nathankellenicki
Copy link
Owner

nathankellenicki commented Dec 11, 2019

There is a weird hacky flow in this library to figure out the hub type before connecting

Why not get the hub type from the advertising data before connecting?

@dlech Advertising and Manufacturer Data is not currently implemented in Web Bluetooth on Chrome (or any browser for that matter). The only way to do it currently is to connect to the hub and ask it.

https://developer.mozilla.org/en-US/docs/Web/API/BluetoothAdvertisingData

@nathankellenicki
Copy link
Owner

@ractive I've published a new version (5.0.2) with some more debug output to help solve this issue. Would you mind trying the three scenarios you posted with this new version?

You'll need to expand the debug namespaces a bit: localStorage.debug = "poweredup,bledevice,lpf2hub";

Thanks!

@ractive
Copy link
Author

ractive commented Dec 11, 2019

Here we go:

Happy case on MacBook Air with all ports detected:
brick-remote.ractive.ch-macbookair-5.0.2.log

Ports B and D detected on the lenovo:
brick-remote.ractive.ch-lenovo-5.0.2.log

Worst case scenario on the surface pro with no ports detected at all:
brick-remote.ractive.ch-surface-5.0.2.log

I hope this gives you some more insights @nathankellenicki.

To produce logfiles yourself , you can open https://brick-remote.ractive.ch/?debug=* and reload the page again (repace * with poweredup,bledevice,lpf2hub or whatever you want to have in localStorage.debug). Then scan for a hub and see the logs in the console.

@aileo
Copy link
Contributor

aileo commented Dec 11, 2019

I want to play the log game too, so I grabbed my lenovo X380 and my control+ with two basic motors on A & B and two XL motors on C & D.
I removed the 0x45 (port value) messages from the logs because accelerometer/gyro/tilt send way too much messages and the rssi ones.
I also prefixed with >>> the hub attached I/O messages.

Scanning for hubs...
bledevice Service/characteristic discovery started +0ms
Scanning result: true
bledevice Service/characteristic discovery started +1s
bledevice Service/characteristic discovery finished +2ms
poweredup Hub hT/xVVHcwxWIrb1h4EuUuA== discovered +0ms
Connecting to hub: hT/xVVHcwxWIrb1h4EuUuA==
bledevice Service/characteristic discovery started +743ms
bledevice Service/characteristic discovery finished +4ms
>>>  lpf2hub Received Message (LPF2_ALL) +0ms Uint8Array(15) [15, 0, 4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +3ms Uint8Array(10) [10, 0, 65, 1, 2, 1, 0, 0, 0, 1]
>>>  lpf2hub Received Message (LPF2_ALL) +4ms Uint8Array(15) [15, 0, 4, 2, 1, 47, 0, 0, 16, 0, 0, 0, 16, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 2, 2, 1, 0, 0, 0, 1]
>>>  lpf2hub Received Message (LPF2_ALL) +3ms Uint8Array(15) [15, 0, 4, 3, 1, 47, 0, 0, 16, 0, 0, 0, 16, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +0ms Uint8Array(10) [10, 0, 65, 3, 2, 1, 0, 0, 0, 1]
lpf2hub Combining ports +0ms C D
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(6) [6, 0, 97, 1, 2, 3]
>>>  lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(15) [15, 0, 4, 50, 1, 23, 0, 0, 0, 0, 16, 0, 0, 0, 16]
>>>  lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(15) [15, 0, 4, 59, 1, 21, 0, 0, 0, 0, 16, 0, 0, 0, 16]
>>>  lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(15) [15, 0, 4, 60, 1, 20, 0, 0, 0, 0, 16, 0, 0, 0, 16]
>>>  lpf2hub Received Message (LPF2_ALL) +0ms Uint8Array(15) [15, 0, 4, 61, 1, 60, 0, 0, 0, 0, 16, 0, 0, 0, 16]
>>>  lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(15) [15, 0, 4, 96, 1, 60, 0, 1, 0, 0, 0, 1, 0, 0, 0]
>>>  lpf2hub Received Message (LPF2_ALL) +0ms Uint8Array(15) [15, 0, 4, 97, 1, 57, 0, 1, 0, 0, 0, 1, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 97, 0, 1, 0, 0, 0, 1]
>>>  lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(15) [15, 0, 4, 98, 1, 58, 0, 1, 0, 0, 0, 1, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 98, 0, 1, 0, 0, 0, 1]
>>>  lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(15) [15, 0, 4, 99, 1, 59, 0, 1, 0, 0, 0, 1, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 99, 0, 1, 0, 0, 0, 1]
>>>  lpf2hub Received Message (LPF2_ALL) +3ms Uint8Array(15) [15, 0, 4, 100, 1, 54, 0, 1, 0, 0, 0, 1, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 60, 0, 1, 0, 0, 0, 1]
lpf2hub Sent Message (LPF2_ALL) +0ms Uint8Array(10) [10, 0, 65, 59, 0, 1, 0, 0, 0, 1]
lpf2hub Sent Message (LPF2_ALL) +103ms Uint8Array(5) [5, 0, 1, 2, 2]
lpf2hub Sent Message (LPF2_ALL) +0ms Uint8Array(5) [5, 0, 1, 3, 5]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(5) [5, 0, 1, 4, 5]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(5) [5, 0, 1, 5, 2]
lpf2hub Sent Message (LPF2_ALL) +0ms Uint8Array(5) [5, 0, 1, 6, 2]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(5) [5, 0, 1, 13, 5]
poweredup Hub hT/xVVHcwxWIrb1h4EuUuA== connected +147ms
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 61, 0, 10, 0, 0, 0, 1]
Connected ✔
lpf2hub Received Message (LPF2_ALL) +80ms Uint8Array(5) [5, 0, 5, 65, 6]
lpf2hub Received Message (LPF2_ALL) +113ms Uint8Array(10) [10, 0, 71, 2, 2, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(8) [8, 0, 69, 2, 0, 0, 0, 0]
lpf2hub Received Message (LPF2_ALL) +119ms Uint8Array(10) [10, 0, 71, 3, 2, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +3ms Uint8Array(8) [8, 0, 69, 3, 0, 0, 0, 0]
>>>  lpf2hub Received Message (LPF2_ALL) +117ms Uint8Array(9) [9, 0, 4, 16, 2, 47, 0, 2, 3]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 16, 2, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +119ms Uint8Array(10) [10, 0, 71, 97, 0, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +59ms Uint8Array(10) [10, 0, 71, 98, 0, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 71, 99, 0, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +50ms Uint8Array(10) [10, 0, 71, 60, 0, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +5ms Uint8Array(10) [10, 0, 71, 59, 0, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +59ms Uint8Array(6) [6, 0, 1, 2, 6, 0]
lpf2hub Received Message (LPF2_ALL) +65ms Uint8Array(9) [9, 0, 1, 3, 6, 32, 0, 0, 16]
lpf2hub Received Message (LPF2_ALL) +0ms Uint8Array(9) [9, 0, 1, 4, 6, 0, 0, 0, 8]
lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(6) [6, 0, 1, 6, 6, 100]
lpf2hub Received Message (LPF2_ALL) +12ms Uint8Array(11) [11, 0, 1, 13, 6, 144, 132, 43, 76, 240, 213]
lpf2hub Received Message (LPF2_ALL) +2ms Uint8Array(10) [10, 0, 71, 61, 0, 10, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 71, 16, 2, 1, 0, 0, 0, 1]
lpf2hub Received Message (LPF2_ALL) +0ms Uint8Array(8) [8, 0, 69, 16, 0, 0, 0, 0]

Then I unplug the A port.

>>>  lpf2hub Received Message (LPF2_ALL) +50ms Uint8Array(5) [5, 0, 4, 0, 0]

And I replug it, then the magic happens

>>>  lpf2hub Received Message (LPF2_ALL) +55ms Uint8Array(15) [15, 0, 4, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(10) [10, 0, 65, 0, 2, 1, 0, 0, 0, 1]
lpf2hub Combining ports +1ms A B
lpf2hub Sent Message (LPF2_ALL) +1ms Uint8Array(6) [6, 0, 97, 1, 0, 1]
lpf2hub Received Message (LPF2_ALL) +61ms Uint8Array(5) [5, 0, 5, 65, 6]
lpf2hub Received Message (LPF2_ALL) +1ms Uint8Array(5) [5, 0, 5, 97, 5]

From my understanding, I miss the port A attachement message at first but I get one if I unplug/replug the device.

This tends to confirm @nathankellenicki theory.

@ractive
Copy link
Author

ractive commented Dec 11, 2019

That's a good finding. The brick-remote app now handles both attach and detach events and updates the "hub panel" properly when they happen.

So if you now unplug a motor that was shown as "not attached" and then plug it in again, it will be correctly recognized. So really just the initial attach message seem to be lost...

@nathankellenicki
Copy link
Owner

I hope this gives you some more insights @nathankellenicki.

@ractive It does. So I see from the logs for your Lenovo that the first message that gets stashed is for port 1, and on your Surface the first one is port 50. So you're potentially missing a whole bunch of messages.

I'm not sure yet, but it could be a couple of things. The first, that my stashing code is wrong and is missing a whole bunch of messages. I'll put some more debug code in and ask you to help me again. The second, that the Web Bluetooth stack is a half-baked implementation (which we know to be true!) that seems to get notified on characteristics before the client has subscribed to notifications, which would be a bigger problem. More than likely the first. :)

To produce logfiles yourself , you can open https://brick-remote.ractive.ch/?debug=* and reload the page again (repace * with poweredup,bledevice,lpf2hub or whatever you want to have in localStorage.debug). Then scan for a hub and see the logs in the console.

I checked out the project and would love to debug myself, but unfortunately a) On my MacBook Pro I can't reproduce, and b) I'm still on a work trip and only have a Powered UP hub with me, and there seems to be a bug causing React to fail to render when it tries to see what devices are on C and D (PUP hub only has A and B). I plan to raise an issue against your repo with some more info if that's ok. :)

@ractive
Copy link
Author

ractive commented Dec 12, 2019

I'll put some more debug code in and ask you to help me again.

Yes of course. It's easy to generate these logs.

PUP hub only has A and B

I was looking for a way to determine what ports a hub actually has, but there's no way to do that at the moment, so I just blindly try A-D. So it could well be that it blows up when it tries to get infos about non existing ports. 🙈
I need to see if I can get a powered up hub to test this.

I plan to raise an issue against your repo with some more info if that's ok. :)

Of course! :-)

@aileo
Copy link
Contributor

aileo commented Dec 13, 2019

I tried to understand the mailbox:

The device mailbox is emptied in subscribeToCharacteristic in webbledevice

subscribeToCharacteristic is called from both _determineLPF2HubType in poweredup-browser and connect in lpf2hub

I am not sure how it works:

  • poweredup-browser call _determineLPF2HubType in the discover.
  • it start to listen to characteristics and add messages to the mailbox (except the one about hub type).
  • it creates the hub object and keep stacking messages to the mailbox until connect is called.
  • connect is called, it change the listener and read from the mailbox.

I am lost at the _determineLPF2HubType method:

If there can be more than one message in data, then all messages are not read as it process only one message per call.

@aileo
Copy link
Contributor

aileo commented Dec 13, 2019

I also noticed startNotifications is called twice since it is in subscribeToCharacteristic. And the old event listener is removed before the new one is added but I doubt it would lead to miss a message.

@nathankellenicki
Copy link
Owner

@ractive I've just published a new version (5.0.3) with a tiny bit more debug to log all incoming data before they get emitted as full message frames. This should see if it ever comes through at all. Mind giving it a try again?

@aileo That's a good catch within _determineLPF2HubType. The reason it appends to a buffer is that a single message may be split across multiple BLE frames, so we wait until the length of the buffer matches (or exceeds) the length specified in the LWP3 message. But I hadn't taken into account multiple messages arriving in a single frame (entirely probable), so you're right - if that happens, one of them will be missed (ironically I do elsewhere, in lpf2hub.ts, in the normal message parsing).

If this is indeed the case, the debug I added in 5.0.3 should show evidence of it.

@nathankellenicki
Copy link
Owner

I also noticed startNotifications is called twice since it is in subscribeToCharacteristic. And the old event listener is removed before the new one is added but I doubt it would lead to miss a message.

Also a good catch. I suspect as the add/remove happen in the same thread tick, you're right, no messages would be lost. But I could solve this by keeping a reference and removing after the new one is added.

@ractive
Copy link
Author

ractive commented Dec 16, 2019

@nathankellenicki
Copy link
Owner

Here some logs with version 5.0.3:

So it seems in the case of the Surface, it was successful? I see it discovering all attached devices. Do you have a log from an unsuccessful case?

Lenovo I can see the issue clearly. :)

@ractive
Copy link
Author

ractive commented Dec 16, 2019

Oh. I didn't even check what devices have been attached on the surface. Here's another log, where only port D is attached, but port A & B are still missing. But yeah, there's some improvement here because for the first time at least one port has been attached on the surface. I tried several times and always just port D was attached...
brick-remote.ractive.ch-surface-5.0.3-2.log

Another thing that I noticed on the surface was that now sometimes the hub is not discovered properly anymore. A log of such an attempt is here:
brick-remote.ractive.ch-surface-5.0.3-not-connected.log

@nathankellenicki
Copy link
Owner

Well, I haven't actually changed any code, just added logging. :) So if there's an improvement, it has to be a coincidence!

Thanks for that. I think the problem is deeper than I thought - even when logging all data, the first message that comes in is for portId 0x04 - nothing beforehand.

Perhaps there's some difference between Bluetooth stacks on Windows vs macOS/Linux whereby it doesn't wait for startNotifications to actually get the hub to start sending messages, meaning some lost.

I'll need to try getting my hands on a Windows machine to debug this any further myself I think.

@nathankellenicki nathankellenicki added the bug Something isn't working label Feb 24, 2020
@nathankellenicki
Copy link
Owner

Closing as no further activity, and I'm unable to repro this. If this is still an issue, feel free to reopen. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants