Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] rewrite locking in split transaction handlers #18417

Merged

Conversation

KarlK90
Copy link
Member

@KarlK90 KarlK90 commented Sep 19, 2022

Description

The access to the split shared memory shall be as short as possible in order to not block any pending transaction initiated by the main half of the keyboard. Most transaction handlers are deterministic and short in runtime and therefore can simply take the lock for the whole duration of the handler. Other handlers execute code that might end up calling user code or invoke interactions with external peripherals. Those handlers need a specialized implementation that takes a lock for the shortest possible duration. This is done in the following commit.

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Keyboard (addition or update)
  • Keymap/layout/userspace (addition or update)
  • Documentation

Issues Fixed or Closed by This PR

Checklist

  • My code follows the code style of this project: C, Python
  • I have read the PR Checklist document and have made the appropriate changes.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

@github-actions github-actions bot added the core label Sep 19, 2022
The access to the split shared memory shall be as short as possible in
order to not block any pending transaction initiated by the main half of
the keyboard. Most transaction handlers are deterministic and short in
runtime and therefore can simply take the lock for the whole duration of
the handler. Other handlers execute code that might end up calling user
code or invoke interactions with external peripherals. Those handlers
need a specialized implementation that takes a lock for the shortest
possible duration. This is done in the following commit.
@KarlK90 KarlK90 force-pushed the fix/transaction-handlers-short-critical-sections branch from e8fc119 to ea4b76d Compare September 19, 2022 13:24
@@ -535,7 +576,11 @@ static bool oled_handlers_master(matrix_row_t master_matrix[], matrix_row_t slav
}

static void oled_handlers_slave(matrix_row_t master_matrix[], matrix_row_t slave_matrix[]) {
if (split_shmem->current_oled_state) {
split_shared_memory_lock();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of having a lock around a single read of a byte; either you read the value before or after the concurrent update, but there is no possibility of getting an inconsistent view of a single byte.

Copy link
Member Author

@KarlK90 KarlK90 Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I didn't think of that. I can surely change that, on the other hand I'm thinking: if someone changes the datatype in the future that doesn't have this atomic guarantees will they remember to introduce a lock here? I would bet that they don't do that, so while redundant at this point it could stay as a precaution? But I'm not commited to keeping it by all means. I totally forgot that the next step will be adding checksums to all transactions, so the atomic guarantees won't be there in the near future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the concern for defensive measures, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing a checksum on a single byte isn't going to speed things up either -- may as well just transfer the actual byte unconditionally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, but the checksum is actually meant to prevent/detect data corruption as we usually don't use any resilient to means of communication (except for odd/even parity checks for some USART peripherals).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood; but in that circumstance the request/response nature of the protocol should probably be modified such that at a certain amount of data transfer, you transmit payload+checksum in a single operation. That's a longer term discussion, at least.

Copy link
Member Author

@KarlK90 KarlK90 Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure the refactoring should be done after some discussion, does this prevent merging the changes in this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

@@ -566,15 +611,19 @@ static bool st7565_handlers_master(matrix_row_t master_matrix[], matrix_row_t sl
}

static void st7565_handlers_slave(matrix_row_t master_matrix[], matrix_row_t slave_matrix[]) {
if (split_shmem->current_st7565_state) {
split_shared_memory_lock();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

@@ -384,7 +408,11 @@ static bool backlight_handlers_master(matrix_row_t master_matrix[], matrix_row_t
}

static void backlight_handlers_slave(matrix_row_t master_matrix[], matrix_row_t slave_matrix[]) {
backlight_set(split_shmem->backlight_level);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep this line - locking is not necessary for accessing a single byte.

@KarlK90 KarlK90 marked this pull request as ready for review September 19, 2022 14:15
@KarlK90 KarlK90 requested a review from a team September 19, 2022 14:15
@mats-nilsson
Copy link

I patched this PR and #18418 and #18419 on my split with STM32F411, and no issues detected. My previously experienced errors with lockups were addressed by #18418 alone.

Copy link
Member

@drashna drashna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__attribute__((weak))

Changes look good, and doesn't appear to break anything.

@drashna drashna requested a review from a team September 24, 2022 07:14
@KarlK90 KarlK90 requested a review from tzarc September 25, 2022 16:34
@tzarc tzarc merged commit 56f7b34 into qmk:develop Oct 4, 2022
ramonimbao pushed a commit to ramonimbao/qmk_firmware that referenced this pull request Nov 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants