Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Unite half-duplex and full-duplex serial drivers #13081

Merged
merged 5 commits into from Jul 1, 2021

Conversation

KarlK90
Copy link
Member

@KarlK90 KarlK90 commented Jun 2, 2021

Description

So this is a follow-up on #9842 with a plot twist as it removes the DMA backed full duplex driver and adds the full duplex functionality to the existing interrupt based driver. There are multiple reasons for this change:

  1. The DMA based driver was faster than the existing half-duplex driver, but wasn't actually faster than a full duplex driver backed by interrupts. The biggest overhead comes from clearing the input queues, after sending on the half-duplex I guess.
  2. The original DMA driver hadn't any flow control, so possible baud rates without timeouts are tightly coupled to optimization levels and the both halves reacting to transactions at the same pace. I extended the driver to use a state machine and software flow control in my personal repo. This solved this problem, but this driver is much more complex.
  3. Even the extended driver is a bit slower than the full duplex interrupt based driver. So there are no benefits left using the DMA backed one. QMK is mostly single threaded at this point and the transactions sizes are a couple of bytes (the biggest being the matrix), so DMA doesn't offer anything at this point.
  4. The driver straight-out didn't work on the djiinn by @tzarc for unknown reasons and it uses DMA channels which could lead to conflicts, especially for people building boards and not having the deeper embedded knowledge about DMA conflicts.

The benefits of this Implementation:

  1. Faster than the DMA full duplex driver.
  2. Maintainable.
  3. Better compatibility with MCUs.
  4. It targets Extensible split data sync #11930 by default.

Tested on GD32VF103 and STM32F303 in both half-duplex and full-duplex configuration.

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Keyboard (addition or update)
  • Keymap/layout/userspace (addition or update)
  • Documentation

Issues Fixed or Closed by This PR

  • none

Checklist

  • My code follows the code style of this project: C, Python
  • I have read the PR Checklist document and have made the appropriate changes.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

@tzarc tzarc added the awaiting_pr Relies on another PR to be merged first label Jun 2, 2021
@tzarc
Copy link
Member

tzarc commented Jun 2, 2021

Targets #11930 -- might end up cherry-picking it into that PR, given the prerequisite.

@tzarc tzarc self-requested a review June 2, 2021 12:22
@KarlK90 KarlK90 marked this pull request as ready for review June 2, 2021 12:40
@github-actions github-actions bot added keymap via Adds via keymap and/or updates keyboard for via support labels Jun 2, 2021
@github-actions github-actions bot removed keymap via Adds via keymap and/or updates keyboard for via support labels Jun 2, 2021
@KarlK90 KarlK90 force-pushed the unite-serial-driver branch 3 times, most recently from ab0cf21 to 9d205a7 Compare June 3, 2021 14:46
@KarlK90
Copy link
Member Author

KarlK90 commented Jun 3, 2021

I was able to save quite a few cycles for the half-duplex implementation by discarding the bytes from the input queue after transmitting without actually reading them back. Also sdClear (which is now called usart_reset) is only called in case of transmission errors on the slave side and on every iteration on the master side. Also it just hard resets the input queue. With these changes the full-duplex and half-duplex driver are within a few hundred matrix scans, full-duplex still being slightly faster.

@KarlK90 KarlK90 mentioned this pull request Jun 3, 2021
14 tasks
@firetech
Copy link
Contributor

firetech commented Jun 3, 2021

Responding to a comment from #13089 (that's more relevant here). Main topic is the custom serial driver I implemented for porting Ergodox Infinity over to using split_common.

From what I see in your code the serial_config, assigned serial driver and some hooks into the side specific init code would have to be configurable right? Feel free to chime in the serial PR :-)

@KarlK90 I think that's about it. The serial driver is already configurable (SERIAL_USART_DRIVER), but the Ergodox Infinity is a bit special since it uses one U(S)ART port for the master and a different for the slave (in theory allowing "infinite" daisychaining). A separate define (e.g. SERIAL_USART_SLAVE_DRIVER) that is set to SERIAL_USART_DRIVER by default would solve that issue, though.

The initialization code in my driver (here and here) is taken directly from the corresponding code in serial_link (currently only used by Ergodox Infinity), not entirely sure what it actually does. :P

@KarlK90
Copy link
Member Author

KarlK90 commented Jun 3, 2021

So I just pushed some changes that make the driver much more configurable and STM32 independent.

  • The serial driver is now assignable and not a fixed constant. So re-assignable from init code.
SerialDriver* serial_driver = &SERIAL_USART_DRIVER;
  • For side specific initializations overrideable weak defined hooks exist, e.g. here you can assign a different serial driver.
/**
 * @brief Overridable master specific initializations.
 */
__attribute__((weak)) void usart_master_init(void) { usart_init(); }

/**
 * @brief Overridable slave specific initializations.
 */
__attribute__((weak)) void usart_slave_init(void) { usart_init(); }
  • For a non standard serial config just define SERIAL_USART_CONFIG and this will be used.
#if defined(SERIAL_USART_CONFIG)
static SerialConfig serial_config = SERIAL_USART_CONFIG;
#else
static SerialConfig serial_config = {
    .speed = (SERIAL_USART_SPEED), /* speed - mandatory */
    .cr1   = (SERIAL_USART_CR1),
    .cr2   = (SERIAL_USART_CR2),
#    if !defined(SERIAL_USART_FULL_DUPLEX)
    .cr3   = ((SERIAL_USART_CR3) | USART_CR3_HDSEL) /* activate half-duplex mode */
#    else
    .cr3 = (SERIAL_USART_CR3)
#    endif
};
#endif

@firetech
Copy link
Contributor

firetech commented Jun 3, 2021

  • The serial driver is now assignable and not a fixed constant. So re-assignable from init code.
static SerialDriver* serial_driver = &SERIAL_USART_DRIVER;

In practice, since that is a static variable, can I really change its value from code outside that file? 🤔

  • For a non standard serial config just define SERIAL_USART_CONFIG and this will be used.

Technically, it's the STM32 config that's non-standard. The only standardized field in SerialConfig is the speed. https://chibiforge.org/doc/19.1/full_rm/struct_serial_config.html#details 😉

@KarlK90
Copy link
Member Author

KarlK90 commented Jun 3, 2021

You're right I have dropped the static qualifier. Maybe QMK standard 😛

@firetech
Copy link
Contributor

firetech commented Jun 3, 2021

You're right I have dropped the static qualifier. Maybe QMK standard 😛

Another solution would have been to send a pointer to the variable as an argument to usart_master/slave_init, i.e. void usart_master_init(SerialDriver** driver). 🤷

@firetech
Copy link
Contributor

firetech commented Jun 4, 2021

So, I've gotten this to work on Ergodox Infinity now. Prototype code here.

However, due to incompatible code in usart_init (even though it's overwritten elsewhere), it still doesn't compile for K20x unless I comment out the palSetLineMode calls in usart_init. Some more #define magic is needed there for full compatibility...

Also, I realized I also added some code to my custom serial driver to make it possible to use a master half without a slave connected (since the constant wait for serial read timeout make the scan rate unusably slow otherwise). I would not at all call it perfect, but with SERIAL_USART_TIMEOUT set to 50, I don't really notice any missed keys while typing very fast (get_current_wpm reported well over 200 "WPM" of ordered nonsense).

Not sure how relevant that feature is, it's definitely not a requirement. I partly implemented it for fun, and partly because I thought more than once that the firmware was broken while porting Ergodox Infinity to split_common (the Infinity uses the same USB-C/micro B port for USB input from host to master and serial input from master to slave, so you need to disconnect the slave from the master to be able to flash the slave, making disconnects a bit more common than for other split boards).

@KarlK90
Copy link
Member Author

KarlK90 commented Jun 4, 2021

Alright, I guarded that stm32 specific init code behind a define and added a info message about the necessity to init the pins yourself. In the future this can be expanded to support specific ports like Kinetis quite easily.

The driver is now static again and the side specific init functions get a pointer passed to it.

The connect/disconnect feature is really cool, but IMHO should go in a more universal spot. I think the scan logic would be a good fit as it already has a error disconnect counter.

Do the changes fix your compile issues? Haven't tested with a real compilation on non stm32 platforms.

@firetech
Copy link
Contributor

firetech commented Jun 4, 2021

The connect/disconnect feature is really cool, but IMHO should go in a more universal spot. I think the scan logic would be a good fit as it already has a error disconnect counter.

I think the main issue is that there are often more than one transaction per scan cycle (for LED/RGB Matrix, WPM, etc.), and I think stopping all of them would be quite messy if not done somewhere within the actual transport/transaction code (i.e. in transport_execute_transaction or the functions it calls), especially when you consider transaction_rpc calls for custom transactions, which are run outside the normal transport_master function.

Do the changes fix your compile issues? Haven't tested with a real compilation on non stm32 platforms.

Haven't tried yet, but it looks like they should. I'll try it out later today or during the weekend.

@KarlK90
Copy link
Member Author

KarlK90 commented Jun 4, 2021

This is my attempt to support disconnected peripheral sides.. But you are right, this should probably set a global flag and stop all transactions regardless where they are started from. I also think that could be done in a rather clean way, should attempt that after the split transactions PR is merged.

The other reason why I'm opposed to do this in the driver is that this would have to be done for all drivers like serial and i2c which is duplicated efforts as well and it is in the hot-path of the application which should do as little logic as possible. 🤔

@firetech
Copy link
Contributor

firetech commented Jun 4, 2021

The other reason why I'm opposed to do this in the driver is that this would have to be done for all drivers like serial and i2c which is duplicated efforts as well and it is in the hot-path of the application which should do as little logic as possible. 🤔

That is a good point. I didn't think that much about where it should be implemented since I put it in a completely custom driver when I came up with it.

It might be an idea to implement it in a driver-independent wrapper around transport_execute_transaction (also known as transport_read and transport_write). Then it should affect all transactions (including RPC) in any driver, right?

drashna added a commit to drashna/qmk_firmware that referenced this pull request Jul 27, 2021
@KarlK90 KarlK90 deleted the unite-serial-driver branch November 30, 2021 21:41
nhongooi pushed a commit to nhongooi/qmk_firmware that referenced this pull request Dec 5, 2021
* Unite half-duplex and full-duplex serial driver.

* Add full duplex operation mode to the interrupt based driver
* Delete DMA UART based full duplex driver
* The new driver targets qmk#11930

* Fix freezes with failing transactions in half-duplex

* Increase default serial TX/RX buffer size to 128 bytes

* Correctly use bool instead of size_t

Co-authored-by: Nick Brassel <nick@tzarc.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants