New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stm32 eth remove tx rx locking interrupt perforation #6610

Merged
merged 1 commit into from May 7, 2018

Conversation

Projects
None yet
6 participants
@pauluap

pauluap commented Apr 11, 2018

Description

This set of changes resulted from an investigation into Ethernet issues with the F769 Discovery board. There is a long thread of discussion at #6262

One tangent was that the HAL locking approach for the eth implementation (at least) was questioned. Since there is no mutex involved, the locking is not thread safe.

What makes it strange is that there are a couple of places where the HAL lock is unlocked without a corresponding lock. @kjbracey-arm called that a 'perforation' and I like the term. The interrupt handler always perforates the HAL lock. From the perspective of user code, that would be akin to an act of nature in the cases of when the HAL lock disappears from under them.

Further, the rationale is suspect in the context of the RX and TX operations, DMA channels are independent of each other.

The issue thread has a lot of logic analyzer traces examining the locking behavior. The bottom line was that I ended up removing the locks in the TX and RX functions as well as the ISR. I didn't touch the remaining locks because I'm presuming that there is at least some rationale for locking which needs to be maintained.

The "fix" here is the removal of the necessity to have the ISR perforate the HAL lock

@jeromecoutant Also related to the ethernet support in F769/etc

This branch also throws in some memory barriers prior to DMA release just for good measure

Pull request type

[X] Fix
[ ] Refactor
[ ] New target
[ ] Feature
[ ] Breaking change

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Apr 12, 2018

Destination branch - master should be set

@ARMmbed/team-st-mcd Please review

@pauluap pauluap force-pushed the pauluap:stm32_eth_remove_tx_rx_locking_interrupt_perforation branch from ca4211d to 412ff2f Apr 12, 2018

@pauluap pauluap changed the base branch from mbed-os-5.8 to master Apr 12, 2018

@jeromecoutant

This comment has been minimized.

Contributor

jeromecoutant commented Apr 13, 2018

Hi

I would like also to get review from @kjbracey-arm
or from other Ethernet user ?

@pauluap

This branch also throws in some memory barriers prior to DMA release just for good measure

This means that it is not really needed ?

Thx

@pauluap

This comment has been minimized.

pauluap commented Apr 13, 2018

@kjbracey-arm made a comment regarding the DMBs here - #6262 (comment)

In my application, when I don't have the DMBs in place, things don't break. That doesn't mean it's right though.

But it's correct that it's not relevant to the main point of HAL locking. I could drop the DMB stuff so that it's not a distraction.

@kjbracey-arm

This comment has been minimized.

Contributor

kjbracey-arm commented Apr 13, 2018

Agree that the DMBs don't belong in this commit - if you were going to put them in, the complete set should go in with corresponding upgrades to DSB as discussed for some of the others.

@pauluap pauluap force-pushed the pauluap:stm32_eth_remove_tx_rx_locking_interrupt_perforation branch from 412ff2f to c67f535 Apr 13, 2018

@pauluap

This comment has been minimized.

pauluap commented Apr 13, 2018

Right. Removed.

@0xc0170 0xc0170 removed the do not merge label Apr 13, 2018

@0xc0170 0xc0170 requested a review from kjbracey-arm Apr 13, 2018

@kjbracey-arm

This comment has been minimized.

Contributor

kjbracey-arm commented Apr 13, 2018

I'm not really sure about this - I think it's more for the ST guys to think about. There are two issues as I see it - one is the perforation of the locks by interrupt. I think that should be removable - maybe separate that.

The other issue is that RX and TX lock against each other, unnecessarily. But conceivably they might still want to lock against config-type operations, or two people doing TX. I'm don't fully grasp the locking concept here, but if you want to stop them locking, I'm not sure just removing the locks from RX and TX calls is consistent with the HAL API as a whole, unless specifically documented. It might be that it should actually have two separate locks for Ethernet RX and TX, and config-type calls lock both. I'm not sure.

Basically, this is core STM HAL code, not really part of mbed OS, and if it's not definitely causing problems, I'm wary about patching mbed OS's version. The investigates were so protracted that I forget - did this specific fix improve your performance?

@pauluap

This comment has been minimized.

pauluap commented Apr 13, 2018

Yes, certainly, resolving that issue was very protracted.

Going through the comments, I now realize that I didn't document test cases for exactly the scenario when TX would have overlapped with RX, I only tested for the lock busyness, which was being affected by the ISR perforation.

Some discussion is in this comment - #6262 (comment)

image
This image shows that there are occurrences that either TX or RX holds a lock when the ISR is handling an interrupt, perforating the lock.

I seem to recall more personal testing that looks for occurrences when there "would have been" contention if the ISR perforation wasn't present. I found that such occasions were rare in the first place - which could be related to the workload that I utilized. However, over the amount of time I've spent running tests, the total number of "would-have-blocked" events is not insignificant. Admittedly, the actual performance improvement in my workload is probably not measurable.

Also, I would hope that the PRs I have made against STCube would be integrated upstream as well.

@pauluap

This comment has been minimized.

pauluap commented Apr 14, 2018

Here's some concrete analysis.

I created a branch, https://github.com/pauluap/mbed-os/tree/stm32_hal_eth_perforation

In it, I replaced the HAL lock and unlock in the TX and RX functions with ones that reported if a conflict was present, but also to let the function continue rather than returning on a HAL lock contention.

I also put in trace outputs for when the TX and RX functions are running. I also tested for TX and RX overlap, independent of the HAL lock.

Finally, I put in a check in the ISR for when the HAL lock is perforated. The linchpin of this analysis is whether the perforation is actually performed. I controlled this by commenting/uncommenting the perforation.

https://github.com/pauluap/mbed-os/blob/f0ff823814cd0f84d658c5695df3b056967af32c/targets/TARGET_STM/TARGET_STM32F7/device/stm32f7xx_hal_eth.c#L993

The terminology that I'll use are

  • Tripwire - A signal that goes high when the condition is asserted. Further occasions will be marked by a brief dip
  • Scope - A signal that defines entry and exit of a state of interest (function, lock, etc)
  • Pulse - A signal that goes high momentarily then low again.

The signals are:

  • ISR Perforation - Tripwire for when the HAL lock is held at time of ISR execution
  • TX & RX Lock - Tripwire for when there is contention between TX and RX for the HAL lock
  • Tx & Rx Operation - Tripwire for when the TX and RX scopes overlap
  • Tx | Rx Lock - Scope of the HAL lock
  • RX Operation - Scope of the RX function
  • Tx Operation - Scope of the TX function

My first test run was with allowing the ISR to perforate the HAL lock.
image

Here, with a test run of 2+ hours, it can be seen that HAL lock contention never occurs. The Tx and Rx scope overlap occurs relatively frequently though (as opposed to what I reported in an earlier comment). This suggests the hypothesis that the ISR perforation is "protecting" HAL lock contention from occuring.

In my second test run, I suppressed the ISR from perforating the HAL lock
image

This run is also 2+ hours. Here, it can be seen that it is definitely the ISR perforation that is allowing the TX and RX functions to coexist without running into HAL lock contention.

So now, I can positively report on the earlier question on the performance impact. For my workload, there is no performance impact - because the act of ISR HAL perforation is preventing HAL lock contention from happening.

While this patch results in a no-effect for my workload, I don't know if that's generally true. All of the ethernet traffic the application is seeing are composed of single MTUs. The turnaround processing of replies have minimal latency. I don't plan to perform such testing, but for applications with communications that require multiple MTUs or larger latency in responding to packets may see lock contention and drop packets unnecessarily.

While the intent of the locking may be allowing multiple transmitters, locking against config, etc, I would assert that none of those are guaranteed unless all of the interrupts that the ISR handles are deactivated.

There is also another perforation that I did not remove in this patch.

https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/TARGET_STM32F7/device/stm32f7xx_hal_eth.c#L1114

I have had private communications saying that ST tracks mbed-os and would link issues to their internal tracking system, but I do agree with the sentiment that this PR belongs to an upstream repository. Especially since current analysis seems to show that things work in practice from the viewpoint of mbed-os. At the moment though, there does not appear to be such an upstream repository.

@cmonr

This comment has been minimized.

Contributor

cmonr commented Apr 17, 2018

@jeromecoutant

I am wondering if I need to make this same correction to the other STM32 families ?

@pauluap

This comment has been minimized.

pauluap commented Apr 30, 2018

I can't comment to the other families. Theoretically, it makes sense, since RX by nature is asynchronous, then asynchronous TX and RX makes sense. However, since this PR isn't about code correctness but rather about hardware interaction, I don't have enough experience with the MII/RMII interface or the MAC implementations on other chips to say anything about whether their interfaces imposes some kind of serialization or the impact of DMA accesses.

If there is access to the source or engineers for the history of the decision, that may help. Was the HAL locking added to the TX and RX functions to adhere to coding requirements and then the perforation occurred later on to work around, or is there another reason that would be applicable to other families?

@0xc0170 0xc0170 added needs: CI and removed needs: review labels May 3, 2018

@0xc0170

This comment has been minimized.

Member

0xc0170 commented May 3, 2018

/morph build

@mbed-ci

This comment has been minimized.

mbed-ci commented May 3, 2018

Build : SUCCESS

Build number : 1915
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/6610/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

@mbed-ci

This comment has been minimized.

@mbed-ci

This comment has been minimized.

@0xc0170

This comment has been minimized.

Member

0xc0170 commented May 7, 2018

/morph mbed2-build

@cmonr

cmonr approved these changes May 7, 2018

@cmonr cmonr merged commit 0f51ea0 into ARMmbed:master May 7, 2018

12 checks passed

AWS-CI uVisor Build & Test Success
Details
ci-morph-build build completed
Details
ci-morph-exporter build completed
Details
ci-morph-mbed2-build build completed
Details
ci-morph-test test completed
Details
continuous-integration/jenkins/pr-head This commit looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
travis-ci/docs Local docs testing has passed
Details
travis-ci/events Passed, runtime is 10113 cycles (+1153 cycles)
Details
travis-ci/gitattributestest Local gitattributestest testing has passed
Details
travis-ci/littlefs Passed, code size is 10092B (+0.00%)
Details
travis-ci/tools Local tools testing has passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment