-
Notifications
You must be signed in to change notification settings - Fork 996
runtime: make channels work in interrupts #1142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
82294f6 to
da83522
Compare
|
@aykevl Is this design more of what you were looking for, rather than the other scheduler design? |
aykevl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to look at this properly (and I intend to do so very soon), but I quickly skimmed through the code.
|
I don't want to scope-creep this PR any further, so I would suggest adding support for other boards in separate PRs after this is merged, because otherwise we would basically have to test every board to merge this. |
|
@jaddr2line please resolve merge conflict. |
|
Rebased. |
|
Rebased again after the go mod tidy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have taken a bit more in-depth look, I need to look at the write barrier next.
Sorry, lots of things going on right now in my life so don't have as much time as I would hope.
Sidenote (semi-related): the functionality I'm most waiting for is some sort of simple condition variable that one goroutine can wait on and an interrupt can signal. That's what is needed by the vast majority of hardware, for example to queue a DMA SPI transfer and wait for it to finish. Or, in my case, wait for a BLE operation to complete (in the nrf SoftDevice) while another goroutine is allowed to run. Right now I just wait for it to complete, which is a waste:
// Wait for the next advertisement packet to arrive.
// TODO: use some sort of condition variable once the scheduler supports
// them.
arm.Asm("wfe")
if gotScanReport.Get() == 0 {
// Spurious event. Continue waiting.
continue
}
gotScanReport.Set(0)Of course, that should happen after this PR is merged (which lays the foundation for more interrupt <-> goroutine communication).
|
I think the write barrier system is solid. I am however not sure whether disabling interrupts in the GC is ever a good idea. Even the possibility of it happening may be a problem for real time applications, while endless GC loops seem rather unlikely to me (except for really badly designed interrupt handlers). I'm inclined to say it's better to risk unbounded GC than to disable interrupts in the GC. I'll think a bit about the best strategy. |
|
Actually, it seems to me that every heap pointer move would be a task pointer move from some arbitrary location (for example the blocked list in a channel) to the runqueue. Therefore it might be enough to scan the runqueue with interrupts disabled, or rather (after the heap has been scanned) do something like this: while True:
task = nil
disable interrupts
for t in runqueue:
if t not marked:
task = t
break
enable interrupts
if task == nil:
break # finished scanning
scan taskMaybe we have already discussed this, but I don't see an obvious flaw here. The advantage here is that the number of goroutines is limited, especially on bare metal systems it may be just a few or only one. Certainly iterating through the runqueue will be much more predictable than rescanning the entire heap. |
|
@aykevl I have removed the write barrier with a pattern like what you suggested. |
|
Alright, this should finally be done! |
|
I am doing some testing with this PR now... |
|
So this PR currently causes the ItsyBitsy-M4 SAMD51 board to become unresponsive. I could recover by using the manual bootloader and then flashing code from the previous version... |
|
Unresposive when running what? |
|
Once flashed, the MCU "locks up" and even the USB CDC code no longer works. The IRQ that is uses to detect the USB CDC is no longer called, so the MCU no longer goes into bootloader when switching to 1200 baud. |
|
When I was finally able to get GDB to connect (had a bad solder joint on my connector) here is the stack trace I was able to obtain: I discovered that the same problem of MCU locks up even on |
|
The issue I reported regarding M4 processors was corrected by #1182 which has now been merged into |
|
Rebased. |
|
I did a size comparison before and after this PR:
|
This probably has to do with the GC changes. |
|
Right, that makes sense. For amd64 I don't really care but for wasm it's a bit unfortunate that it adds extra overhead for something that doesn't happen on WebAssembly. |
|
I have now put parts of the conservative collector behind the baremetal build tag, and they will only build if the scheduler is enabled The test failure is completely unrelated, and is an instance of #1185 |
|
@jaddr2line if you rebase against dev the tests should pass now due to 1ad6953 |
|
Rebased. |
aykevl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did another size comparison. The last commit (to reduce size) did not have an effect on amd64 but did reduce code size on wasm. Still there is a small (much more acceptable) size increase on wasm. Interestingly, it did sometimes reduce .bss size by 4 bytes on baremetal targets.
Size results (for diffing):
https://gist.github.com/aykevl/ea54c616224e5f8af55545cef7387d77
This is ready to go from my POV. @deadprogram any last comments before merging?
(some thoughts below inline, probably not worth changing)
|
|
||
| // Push a task onto the queue. | ||
| func (q *Queue) Push(t *Task) { | ||
| i := interrupt.Disable() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more efficient to put all these locks where the queue/stack are used.
| if baremetal && hasScheduler { | ||
| // Channel operations in interrupts may move task pointers around while we are marking. | ||
| // Therefore we need to scan the runqueue seperately. | ||
| var markedTaskQueue task.Queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping you could just go through the runqueue instead of (essentially) rewriting it, but it seems fine. And perhaps safer.
|
From my POV this PR is ready to go. The small comments made by @aykevl are either not worth changing, or can be looked at for future refinements. Now merging, incredible work again from @jaddr2line thank you!! 🍰 🎆 |
This works right now, but there are some pieces here and there which are sub-optimal or have limited support. This is mostly a proof of concept that I hacked together tonight.