Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for data races #8

Merged
merged 3 commits into from
May 4, 2024
Merged

Conversation

aleksamagicka
Copy link
Contributor

Fixes which should improve #7. See the commit messages for reasoning.

Introduce cmd_buffer, a separate buffer for storing only
the command that is sent to the device. Before this separation,
the existing buffer was shared for both the command and the
report received in ccp_raw_event(), which was copied into it.

However, because of hidraw, the raw event parsing may be trigerred
in the middle of sending a command, resulting in outputting gibberish
to the device.

Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
@MisterZ42
Copy link
Owner

Hi, thanks for the work.

  1. I am not sure about complete_all. This should be used, when multiple threads could be waiting. But because of the mutex, there is only one thread who can be waiting inside of send_usb_cmd. And with the spinlocks, the completion should be safe. Am I missing something here? If it is only because of "signify", maybe complete would be more correct.
  2. Description of "hwmon: (corsair_cpro) Protect ccp->wait_input_report with a spinlock" should probably say "send_usb_cmd" instead of "get_data" in the letter?

Tested it for a short while and the problem with sending junk is gone when bombarding the device with different hidraw calls while using the driver. That reduces the possiblity for wrong outputs a bit!

Would you like to put it to the hwmon mailing list? (after maybe addressing my 2 comments) It is after all your work. (Only if you want)

Thank you.

@aleksamagicka
Copy link
Contributor Author

I am not sure about complete_all. This should be used, when multiple threads could be waiting. But because of the mutex, there is only one thread who can be waiting inside of send_usb_cmd. And with the spinlocks, the completion should be safe. Am I missing something here? If it is only because of "signify", maybe complete would be more correct.

Yeah, I think a complete() would work as well, it's just that I was looking at existing upstream code and I used complete_all() there.

Description of "hwmon: (corsair_cpro) Protect ccp->wait_input_report with a spinlock" should probably say "send_usb_cmd" instead of "get_data" in the letter?

Right, will fix.

Would you like to put it to the hwmon mailing list? (after maybe addressing my 2 comments) It is after all your work. (Only if you want)

Once we're all happy with it, of course!

@MisterZ42
Copy link
Owner

MisterZ42 commented Apr 25, 2024

I am not sure about complete_all. This should be used, when multiple threads could be waiting. But because of the mutex, there is only one thread who can be waiting inside of send_usb_cmd. And with the spinlocks, the completion should be safe. Am I missing something here? If it is only because of "signify", maybe complete would be more correct.

Yeah, I think a complete() would work as well, it's just that I was looking at existing upstream code and I used complete_all() there.

I would prefer complete, because (in my opinion) it represents more correctly what is actually happening.

Description of "hwmon: (corsair_cpro) Protect ccp->wait_input_report with a spinlock" should probably say "send_usb_cmd" instead of "get_data" in the letter?

Right, will fix.

Would you like to put it to the hwmon mailing list? (after maybe addressing my 2 comments) It is after all your work. (Only if you want)

Once we're all happy with it, of course!

Great, with these two changes, you can put it directly to the hwmon maintainer(s). I will add the acked-by there and pull it into this repository, once it is applied to hwmon-next.

@aleksamagicka aleksamagicka force-pushed the sync-fixes branch 2 times, most recently from d019676 to ee94f6d Compare April 25, 2024 17:52
@aleksamagicka
Copy link
Contributor Author

Thanks. Pushed the current commits for a sanity check, likely sending tomorrow or during the weekend.

@MisterZ42
Copy link
Owner

Small detail: #include <linux/spinlock.h> should be added. (missed it before, sorry)
But looks good otherwise. Thank you.

@MisterZ42
Copy link
Owner

Looks all good to me.

corsair-cpro.c Outdated
spin_lock(&ccp->wait_input_report_lock);
if (!completion_done(&ccp->wait_input_report)) {
memcpy(ccp->buffer, data, min(IN_BUFFER_SIZE, size));
complete(&ccp->wait_input_report);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think complete_all is necessary for the !completion_done check to work as intended.

If only complete is used, then it will increment <completion>->done, but wait_for_... will immediately decrement it again. Thus, <completion>->done would remain 0 almost all the time, making the check (which can summed by if (!<completion>->done)) effectively useless.

On the other hand, completion_all sets the inner done field to a marker value (UINT_MAX) which isn't touched by wait_for.... This means that once completion_all is called, completion_done will return true until reinit_completion is called.

Since we want to use this check to try to only process reports that have been requested by the driver (at the very least as an optimization, but also to try to avoid clobbering buffer while some other part of the driver is reading it), we need the branch to not be retaken until reinit_completion is called again.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, you are right. It is needed. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonasmalacofilho Right, thanks.

I'll see if I can create a document for liquidtux as a guide for this pattern, with this pitfall noted.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea!

…_raw_event()

In ccp_raw_event(), the ccp->wait_input_report completion is
completed once. Since we're waiting for exactly one report in
send_usb_cmd(), use complete_all() instead of complete()
to mark the completion as spent.

Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
Through hidraw, userspace can cause a status report to be sent
from the device. The parsing in ccp_raw_event() may happen in
parallel to a send_usb_cmd() call (which resets the completion
for tracking the report) if it's running on a different CPU where
bottom half interrupts are not disabled.

Add a spinlock around the complete_all() in ccp_raw_event() and
reinit_completion() in send_usb_cmd() to prevent race issues.

Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
@jonasmalacofilho
Copy link

LGTM.

@MisterZ42
Copy link
Owner

Looks all good. Thank you both!

@aleksamagicka
Copy link
Contributor Author

It makes sense to add Fixes: to all three commits, right? (Right now it's only present in the first one.)

@MisterZ42
Copy link
Owner

It makes sense to add Fixes: to all three commits, right? (Right now it's only present in the first one.)

Makes sense to me.

@aleksamagicka
Copy link
Contributor Author

Sent it to LKML, hopefully all goes well.

@MisterZ42 MisterZ42 merged commit 0a4780a into MisterZ42:master May 4, 2024
@aleksamagicka aleksamagicka deleted the sync-fixes branch May 4, 2024 13:44
@aleksamagicka
Copy link
Contributor Author

Nice, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants