Emergency timeout for failed HCI sends #5

Merged 3 commits on Mar 22, 2019


@jlassahnrigado jlassahnrigado commented Mar 14, 2019

We've seen a stack trace where go-ble locked up because ble/linux/hci.(*HCI).send waits forever on a select for data to return.
This probably indicates some kind of problem deeper in the Bluetooth stack, but it would be good if the library didn't hang in these situations.

This PR now also includes some cleanup of the socket Read path to handle lockups on send and on device stop, which surfaced while testing the send-failure fix.

@estutzenberger estutzenberger left a comment

Looks mostly fine to me. Just some clarifying stuff on the comments. How confident are you in these fixes?

s.rmu.Lock()
n, err := unix.Read(s.fd, p)
s.rmu.Unlock()
// Close always sends a dummy command to wake up Read
Contributor

This comment is a little unclear. Maybe Read needs to be removed?

Contributor Author

Would changing "wake up Read" to "wake this function up" help?

Contributor

I guess? Let me check that I understand this: when the socket is closed, a dummy command is always sent to wake up the socket read function.

@jlassahnrigado

I'm confident these changes fix known problems, and pretty sure they won't cause other damage. I'm a bit worried there might still be other concurrency problems in both the command state machine and the socket close logic.

@estutzenberger estutzenberger merged commit fd3e6c7 into master Mar 22, 2019
@jlassahnrigado jlassahnrigado deleted the fix/failed_send branch June 6, 2019 21:18