Rewrite implementation ideas #29
Comments
Hi, thanks for the feedback, appreciate it :) This is a good opportunity to write down the story behind the rewrite. I have been asked by other people offline as well.
How did the v1 tutorials come to be?
I started the v1 tutorials by rewriting existing C bare-metal tutorials in Rust. And rewriting was quite literally the case. At the time I started there were almost no Over time, my personal focus shifted towards making those tutorials Operating System tutorials, and I drifted away from the bare-metal only origins of the C tutorials. This caused three major issues:
Hence I decided to start a rewrite.
What is the plan for v2
With the v2 tutorials, I have set myself four main requirements:
The current outline
I have an outline in my head, but I am a bit reluctant to put it out there wholly, because several times already I realized that I had to shift it around quite a lot when I actually started implementing a lesson. The whole thing is a big moving target, and I still learn a lot and have to adapt when I am actually coding it. But the general idea is this: I wanted to be able to arrive at a globally usable The story I will tell from there on will go something like this:
From there on, in an order I have yet to decide:
Completely agree. The reason it is implemented the way it is right now: I did it very early and was basically just translating C to Rust. I will definitely revisit your suggestions when the introduction of the Mailbox is due, and implement it more idiomatically wrt Rust.
I hope the intention is answered already by my previous statements. It is this way for both didactic and practical reasons and will be corrected by the tutorials that follow.
I want to build and explain synchronization primitives all by ourselves. Just using Hope that helps. |
@andre-richter thanks for providing a detailed answer and also thanks for the tutorial(s) in general. |
Yes, RPi3 will stay the main target platform. Anyways, don't expect the rewrite to finish anytime soon. My free time is scarce at the moment... |
Looks like the DeviceDriver went in yesterday and it looks good. I got an assembly-free version of raspbootin here: https://github.com/r0ck3tAKATrashPanda/raspbootin-rs Obviously this could use some work, and I had a lot of issues with the rebasing of the image since the linker script thinks it is in the wrong location. I am probably missing something that would make this first memcpy easier, but without assembly it is a major hassle (especially with the linker thinking the binary is loaded somewhere other than 0x80_000). Changes that should probably be made:
I am not sure whether this implementation saves you any time on your own implementation, but I just wanted to leave it here. Additionally, have you considered something like a Discord channel (or the embedded-rust channel) for people who are going through this or are interested in this type of work? It may make collaboration easier, and make it easier to discuss features other people are working on that could be contributed. |
Hi @r0ck3tAKATrashPanda, thanks for the raspbootin link, that is just in time, because this is one of the very next ones I will tackle. Thanks! Regarding Discord: One issue where I'd be happy to get input is the current folder structure. I am not happy with the current state. It is way better than the organically grown v1, but it still features rather poor composability. I am tackling this early now because I really want to encourage people to contribute other BSPs in the future and it should be without overhead. Raspberry Pi 4 (I don't have one yet) is the logical first addition. Right now, architecture code and drivers hide in the rpi3 bsp folder. Adding an rpi4 folder given the current structure would mean copying most of the code. Doesn't make sense.
I'll continue to experiment with it, but feel free to give your thoughts. |
I like the idea of moving architecture specific code out to its own folder
structure. In terms of the drivers I don't know that drivers should move
out past the BSP in their current form. Personally I think if I have a BSP
it should contain the full driver for peripherals. I don't want to have to
`use bsp::*` and then also `use driver::miniuart-rpi3::*`. An option,
though, with the current format is to move drivers into a trait for each
type of driver, move the abstraction out past the BSP folder, and then
leave the `Inner` implementation within the BSP itself. This would seem to
lend itself well to the structure Mutex format you currently have. I think
that is similar to the approach of `embedded-hal` work, but I haven't been
too deep in the weeds on that project. It would make sense to have a UART
driver that needs to read and write, and then each BSP will implement the
Inner form and the specific addresses and interactions with hardware.
Thoughts?
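To make the proposal above more concrete, here is a minimal host-side sketch of "trait outside the BSP, `Inner` inside the BSP". Everything in it is an assumption made for illustration: the names `Uart`, `Rpi3UartInner` and `greet`, the placeholder base address, and the in-memory mock that stands in for real MMIO registers. It is not code from the tutorials or from `embedded-hal`.

```rust
use std::collections::VecDeque;

/// Generic interface that would live outside any BSP folder (hypothetical).
pub trait Uart {
    fn write_byte(&mut self, byte: u8);
    fn read_byte(&mut self) -> Option<u8>;

    /// Provided once for every board: built only on `write_byte`.
    fn write_str(&mut self, s: &str) {
        for byte in s.bytes() {
            self.write_byte(byte);
        }
    }
}

/// Hypothetical BSP-side `Inner`. A real one would access MMIO registers at
/// `base_addr`; this mock keeps bytes in memory so it runs on a host.
pub struct Rpi3UartInner {
    base_addr: usize,
    sent: Vec<u8>,
    pending: VecDeque<u8>,
}

impl Rpi3UartInner {
    pub fn new(base_addr: usize) -> Self {
        Self { base_addr, sent: Vec::new(), pending: VecDeque::new() }
    }

    pub fn base_addr(&self) -> usize {
        self.base_addr
    }

    /// Bytes "transmitted" so far (mock only).
    pub fn sent(&self) -> &[u8] {
        &self.sent
    }

    /// Simulate a byte arriving on the wire (mock only).
    pub fn inject_rx(&mut self, byte: u8) {
        self.pending.push_back(byte);
    }
}

impl Uart for Rpi3UartInner {
    fn write_byte(&mut self, byte: u8) {
        self.sent.push(byte);
    }

    fn read_byte(&mut self) -> Option<u8> {
        self.pending.pop_front()
    }
}

/// Board-independent code only ever sees the trait.
pub fn greet(uart: &mut dyn Uart) {
    uart.write_str("hello\n");
}
```

With this split, a second board would only add its own `Inner` type and `impl Uart`, while `greet` and `write_str` stay untouched.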
…On Thu, Oct 10, 2019, 2:18 PM Andre Richter ***@***.***> wrote:
Hi @r0ck3tAKATrashPanda <https://github.com/r0ck3tAKATrashPanda>,
thanks for the raspbootin link, that is just in time, because this is one
of the very next ones I will tackle. Thanks!
Regarding Discord:
There is the embedded WG's Matrix channel at #rust-embedded:matrix.org
<https://matrix.to/#/!BHcierreUuwCMxVqOf:matrix.org> which would be the
logical channel to get in contact. Let me know if that works for you.
I usually cannot respond during my daytime though.
One issue where I'd be happy to get input is the current folder structure.
I am not happy with the current state. It is way better than the
organically grown v1, but it still features rather poor composability. I am
tackling this early now because I really want to encourage people to
contribute other BSPs in the future and it should be without overhead.
Raspberry Pi 4 (I don't have one yet) is the logical first addition.
Right now, architecture code and drivers hide in the rpi3 bsp folder.
Adding an rpi4 folder given the current structure would need to copy most
of the code. Doesn't make sense.
I just uploaded a first shot at restructuring that I had lying around on
my hard drive for some days now at:
https://github.com/rust-embedded/rust-raspi3-OS-tutorials/tree/folder_restruct/06_drivers_gpio_uart
- Architecture code is separated into its own folder now.
- There is an rpi_common folder now which aims to include
commonalities between rpi3 and rpi4, which are pulled in by the respective
BSPs then, e.g. the drivers.
- I am still pondering if device drivers should be outside the BSP
folders as well though. I tend to say yes. Would need to find a folder
structure then where BSPs can easily pull them in and instantiate them. The
actual BSP folders would be rather minimalistic afterwards. Pulling in
drivers that are featured on the board, instantiating them with the correct
base MMIO addresses and/or any other BSP-specific parameters and exporting
some bsp calls. Question is if rpi_common is needed anymore when
drivers live outside the BSP folder.
I'll continue to experiment with it, but feel free to give your thoughts.
|
The trait for drivers is something that will definitely happen. I envision it something like follows: For example, there is a My primary concern is really duplication of code. I would very much like to avoid that. How would you tackle that So my idea was to have driver folder featuring, for example:
And the |
Update: |
I haven't gotten a chance to mess with the changes, but it looks good. My only suggestion would be to maybe add subfolders under BSP/drivers so that all the bcm drivers are together, and if you branch out to another board/CPU you don't have a ton of files lying flat in that driver dir. |
Good point. I added that 👍 |
@r0ck3tAKATrashPanda Spent some time today on a pure Rust raspbootin. Ran into the same problem as you with the compiler generating a relative jump (your weird if). I think I found a really neat solution that doesn't even require transmute. One can take advantage of the fact that vtables always store the absolute addresses (linktime address), so by indirecting through a trait object, you can elegantly jump to the relocated code. I'll upload it on the rewrite branch as soon as I've cleaned up stuff and finished the code. |
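The shape of the trait-object indirection described above can be sketched on a host as follows. This is only an illustration under assumptions: the names `RunTimeInit`, `Traitor`, `get` and `jump_via_vtable` are hypothetical, the method returns a marker string where the real code would never return, and a host build does not relocate anything. An optimizer is also free to devirtualize such a call, so the sketch shows the call structure, not a guarantee about the generated jump.

```rust
/// Stand-in for the code that should execute at its link-time address.
pub trait RunTimeInit {
    fn runtime_init(&self) -> &'static str {
        "running at link-time address"
    }
}

/// Zero-sized type whose only purpose is to own a vtable for `RunTimeInit`.
struct Traitor;

impl RunTimeInit for Traitor {}

/// Hand out a trait object. Calls made through it are dispatched via the
/// vtable entry for `runtime_init`, i.e. through a stored function pointer,
/// instead of a PC-relative branch to a known symbol.
pub fn get() -> &'static dyn RunTimeInit {
    &Traitor
}

/// What the relocated boot code would do after copying itself.
pub fn jump_via_vtable() -> &'static str {
    get().runtime_init()
}
```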
Wow. This is a really cool solution, what was the process for getting to
this point? Both methods seem heavy handed for what is a non-issue in C
(it almost seems like a compiler issue, as if there should be an attribute for
minimizing optimizations, although most people would probably just use
assembly). That solution would make every binary dynamically relocatable,
not just at boot, but throughout runtime for minimal overhead, very cool!
I spent some time with the DTB for the Pi3 and built out more generic
drivers for the mailbox as well as PL011 UART.
https://github.com/r0ck3tAKATrashPanda/raspi-os
Feel free to cherry pick code you find useful, hopefully it saves you some
time in development. In the mailbox implementation, I actually separated the
mail from the mailbox since there is no real need for them to be
coupled together. I think implementing all the mailbox
functionality at the BSP is probably the next step when I get some time. I also
tried to beef up the implementation for the MiniUart, separating out the
AuxRegisters from MiniUart since they don't actually belong together (Aux controls
SPI as well as UART). Additionally, I added all the remaining registers and
bitfields, obviously not required, but could be useful in the future. This
implementation has MiniUart and Uart0 both working, and you can switch at
runtime between them, but Uart0 still has some bugs in terms of using it as
the main console immediately from boot, and I have yet to track that down.
…On Sun, Oct 13, 2019, 6:25 PM Andre Richter ***@***.***> wrote:
@r0ck3tAKATrashPanda <https://github.com/r0ck3tAKATrashPanda> Spent some
time today on a pure Rust raspbootin. Ran into the same problem as you with
the compiler generating a relative jump (your weird if).
I think I found a really neat solution that doesn't even require
transmute. One can take advantage of the fact that vtables always store the
absolute addresses (linktime address), so by indirecting through a trait
object, you can elegantly jump to the relocated code.
https://github.com/rust-embedded/rust-raspi3-OS-tutorials/blob/chainloader_wip/07_uart_chainloader/src/runtime_init.rs#L7
https://github.com/rust-embedded/rust-raspi3-OS-tutorials/blob/chainloader_wip/07_uart_chainloader/src/bsp/rpi3.rs#L31
|
I agree it's a bit heavy in contrast to just pulling in a tiny piece of assembly, but we're here for the challenge, aren't we? 😎 Thanks for the mailbox code. Will definitely look at it when Mailbox is due. I haven't yet decided when to bring it into the picture. Most likely alongside introducing virtual memory / MMU and heap allocation. It can then serve as a lesson for setting the correct page table attributes (non-cacheable) for mailbox memory. |
Struggling a bit to get the chainloader working with the MiniUart. Apparently I get RX overruns on. Bummer... 😩 |
I think if you want to use MiniUart you are going to have to implement PC
based flow control. One of the downsides of MiniUart is its smaller
FIFOs and the lack of real flow control. Usually you don't have to deal with it because
in a console you are never (rarely) sending characters that fast.
You might get away with MiniUart if you can:
A) Implement an artificial flow control ( ack every byte which isn't
officially the protocol ) this wouldn't be very hard to do, I have a python
version of the raspbootin loader in my raspi-os repo that could be fixed to
use ACKs but will probably be slower. I also never actually used RTS/CTS on
PL011, so the larger FIFOs might be enough.
B) Instead of writing to memory on every received character, maybe collect
sets of 8 bytes then write? Or possibly push to the stack then write it
later? I would have to look at instruction timings to see if that is
actually faster.
C) Move to PL011, but implement the absolute bare bones required which is
the mailbox, and the set_clock_rate mail.
I think it is difficult to keep the chainloader directly inline with the
tutorials because you have different goals. You want the loader to be small
and fast, but also work all the time. I almost see it as a fork in
development albeit a small fork. The chainloader (might) need PL011, but as
you mentioned you wanted to wait for the mess that is the mailbox, and it
doesn't seem that there is a need for the mailbox until after the MMU
likely in the "real" version of the OS.
I think overall for ease of tutorial, change the raspbootin protocol to ack
bytes (or sets of 8 bytes). You will lose speed, but it is easier to
implement without extra features. Then you can either handwave the
improved version, or wait until later when you implement the mailbox or the
other UART to talk about pros and cons of both, and revisit to an improved
chainloader.
Python Rasploader :
https://github.com/r0ck3tAKATrashPanda/raspi-os/blob/master/raspbootcomm.py
Edit: That script has some hard-coded parameters that may need to be changed, since it was just something I threw together after not trusting the Java applet's results.
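The "ack bytes (or sets of 8 bytes)" idea above could look roughly like the receiver below. This is a sketch under assumptions, not the raspbootin protocol: the `Serial` trait, the `ACK` value `0x06`, and `receive_with_acks` are made up for illustration, and the mock returns `None` when no data is queued, where real hardware would spin until a byte arrives.

```rust
use std::collections::VecDeque;

/// Minimal byte-oriented serial interface (hypothetical).
pub trait Serial {
    fn read_byte(&mut self) -> Option<u8>;
    fn write_byte(&mut self, byte: u8);
}

/// Hypothetical acknowledge byte; not part of the original protocol.
pub const ACK: u8 = 0x06;

/// Fill `dest` from `io`, sending one ACK after every `chunk` bytes and one
/// after the final (possibly shorter) chunk. The host would wait for each
/// ACK before sending more, so the small RX FIFO cannot overrun.
///
/// Returns the number of ACKs sent, or `Err(n)` if the input ended after
/// `n` bytes.
pub fn receive_with_acks<S: Serial>(
    io: &mut S,
    dest: &mut [u8],
    chunk: usize,
) -> Result<usize, usize> {
    assert!(chunk > 0, "chunk size must be non-zero");
    let total = dest.len();
    let mut acks = 0;

    for (i, slot) in dest.iter_mut().enumerate() {
        *slot = io.read_byte().ok_or(i)?;
        let received = i + 1;
        if received % chunk == 0 || received == total {
            io.write_byte(ACK);
            acks += 1;
        }
    }
    Ok(acks)
}

/// In-memory stand-in for the UART so the sketch runs on a host.
pub struct MockSerial {
    pub rx: VecDeque<u8>,
    pub tx: Vec<u8>,
}

impl Serial for MockSerial {
    fn read_byte(&mut self) -> Option<u8> {
        self.rx.pop_front()
    }

    fn write_byte(&mut self, byte: u8) {
        self.tx.push(byte);
    }
}
```

A larger `chunk` means fewer round trips but needs more FIFO headroom, which is the speed trade-off mentioned above.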
|
So it turns out that I was fooled by a spurious RX byte... Adding a FIFO clear just before starting with the raspbootin protocol does the trick (it's more robust this way anyways, one should always do it). It works fine now with the |
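A FIFO clear of the kind mentioned above amounts to reading and discarding until the receiver reports empty. The sketch below assumes a hypothetical non-blocking `try_read` and uses an in-memory queue in place of the UART registers; none of these names come from the tutorials.

```rust
use std::collections::VecDeque;

/// Hypothetical non-blocking view of a UART receive FIFO.
pub trait RxFifo {
    /// `Some(byte)` if data is queued, `None` if the FIFO is empty.
    fn try_read(&mut self) -> Option<u8>;
}

/// Discard everything currently queued and report how many bytes were
/// dropped. Calling this right before the handshake keeps a stale byte from
/// being mistaken for the first byte of the transfer.
pub fn clear_rx<F: RxFifo>(fifo: &mut F) -> usize {
    let mut dropped = 0;
    while fifo.try_read().is_some() {
        dropped += 1;
    }
    dropped
}

/// In-memory stand-in so the sketch runs on a host.
pub struct MockFifo(pub VecDeque<u8>);

impl RxFifo for MockFifo {
    fn try_read(&mut self) -> Option<u8> {
        self.0.pop_front()
    }
}
```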
And by the way, I really really like the idea of having the host program in an interpreted language and not the heavy C++ that is currently used. Ideally in |
Awesome, that is way better than my solutions then for MiniUart. Means you
don't need the mailbox at all, which is even better.
In terms of the interpreted loader, yeah, it would be very nice to drop into
interactive serial. I meant to do that at some point but haven't gotten
around to it. For Ruby, though, you may be on your own; I never got to be a big
fan :P
I ended up adding the timers and hardware RNG, I guess the next step is
either USB or MMU. Ideally getting to the NIC sooner rather than later just
for fun.
…On Wed, Oct 16, 2019, 1:07 PM Andre Richter ***@***.***> wrote:
And by the way, I really really like the idea of having the host program
in an interpreted language and not the heavy C++ that is currently used.
Ideally in Ruby for the tuts though, don't want too many different languages
mixed in there.
It would heavily depend on having functionality for switching into a
terminal-mode after sending the kernel, like the current raspbootcom that
is used does.
So that the user can seamlessly start interacting with the chainloaded
binary.
|
So I worked on the python script a little : https://github.com/r0ck3tAKATrashPanda/raspi-os/blob/master/raspbootcomm.py now it actually does like you suggested where it drops you into a miniterm immediately after uploading the kernel, it is super nice. If you want to port it to Ruby that would make the loading/interacting much simpler I think. Additionally, I started the very very beginning of the DWC USB 2.0 driver: https://github.com/r0ck3tAKATrashPanda/raspi-os/blob/master/src/bsp/driver/dwc_usb_2_0_hs_otg.rs I am not sure how far I will get without dynamic memory, so I may end up stalling in the middle in order to add an allocator before going back to it, but this should have most of the mindless registers done. There is so much code required for this, I don't even know if it has any place in the tutorials at all, no time soon, that is for sure. |
Will definitely have a look at your python script, thanks! re drivers: Yeah, probably stuff for the later tutorials. But it is good to have you laying the groundwork and gaining first experience. Once the tuts get there in the distant future, we can surely reuse and learn from your experience. |
@r0ck3tAKATrashPanda, I added support for the Raspberry Pi 4. The MiniUart just wouldn't function. Some dynamic frequency change stuff was going on that I just could not get rid of. Googling indicated it might be an issue with firmware in early batches, but I wasn't happy to boot Raspbian on it to update it (it is on-chip now, not fetched from SD card anymore). Threw it out for good now. The Videocore sets the PL011 UART frequency now before kernel load thanks to |
The only downside of not utilizing MiniUart that I could find is that it will make using bluetooth harder. I don't know if that is ever in the plan for the tutorial, but my understanding is that the PL011 UART is connected to the Bluetooth chip and is the only method to communicate with it on the board. This was just from piecing together a few sources on how the UARTs were split, so feel free to correct that if it is wrong. |
True, I also read that you can attach the MiniUart to BT though, which then only yields reduced bandwidth compared to PL011. |
Closing this for now. Feel free to reopen whenever new questions arise. |
First of all these tutorials are excellent. They are an excellent resource overall for Rust embedded work so thank you for that!
I have been working through some of these examples and I wanted to provide my 2 cents for implementation details for v2.
There are two main details that I found confusing or inefficient that I thought would be good to at least have documented somewhere so people understand trade offs.
I was curious about the design choice of using modules and set values instead of `enum`s that hold all the variants for things such as Tags within the mailbox API (ex. https://github.com/r0ck3tAKATrashPanda/raspi-os/blob/master/src/bsp/rpi3/mbox.rs#L77). Additionally, not that it affects anything, but mbox.buffer[4] (the 5th element of the buffer) doesn't need to be filled out; apparently it is used by the VideoCore to send back the size of the response buffer. It should be parsed in order to determine if the size changed when the mail was responded to. Again, this isn't an issue by any means, just hoping to make more information available to people about the mailbox API.
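As a rough illustration of the `enum` alternative, the sketch below models a few property tags as a `#[repr(u32)]` enum and decodes the response word discussed above. The type and function names are assumptions, only three tags are listed, and the numeric values and the "bit 31 marks a response, the lower bits carry the length" layout follow the publicly documented mailbox property interface as I understand it, so please check them against the firmware documentation before relying on them.

```rust
/// A few mailbox property tags as an enum instead of loose `u32` constants.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tag {
    GetBoardSerial = 0x0001_0004,
    SetClockRate = 0x0003_8002,
    /// End-of-message marker.
    Last = 0,
}

impl Tag {
    /// Map a raw word back to a known tag; unknown values are rejected
    /// instead of being silently passed around as integers.
    pub fn from_u32(raw: u32) -> Option<Tag> {
        match raw {
            0x0001_0004 => Some(Tag::GetBoardSerial),
            0x0003_8002 => Some(Tag::SetClockRate),
            0 => Some(Tag::Last),
            _ => None,
        }
    }
}

/// Decode the per-tag request/response word (the `mbox.buffer[4]` slot for
/// the first tag). Returns the response length in bytes if the VideoCore
/// marked the word as a response, or `None` if it is still a request.
pub fn response_len(code: u32) -> Option<u32> {
    const RESPONSE_BIT: u32 = 1 << 31;
    if code & RESPONSE_BIT != 0 {
        Some(code & !RESPONSE_BIT)
    } else {
        None
    }
}
```

The enum lets `match` check exhaustiveness over the tags, while `Tag::SetClockRate as u32` still yields the raw value needed when filling the buffer.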
`println!` macro and `console()` - Currently in v2 you are using the magical QEMU structure to print. In the current implementation we have a `console()` function that is building the magical structure each time it is called, which is every time `print!` or `println!` is called. Obviously this isn't an issue in QEMU, where it is an empty struct, but this design has major performance implications when you move to hardware and start to use UART: the overhead of re-initializing and the risk of having multiple consumers attempting to access the same peripheral are quite high. Instead this should be moved to a single static ref (maybe even for the QEMU struct, for good practice early on in the process of learning).
If we implement UART something like
We A) don't initialize UART until we need it B) We have a mutexed (spin) UART access so we can println! from anywhere without worry (sorta).
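The snippet originally posted at this point is not preserved here, so the following is only a guess at the general shape: a lazily initialized, spin-locked console behind a single static. All names (`SpinLock`, `MockUart`, `console`, `print_line`, `init_calls`) are hypothetical, the UART is an in-memory mock, and `std::sync::OnceLock` is used so the sketch runs on a host; a `no_std` kernel would need a different one-time initialization mechanism.

```rust
use std::cell::UnsafeCell;
use std::fmt::{self, Write};
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::OnceLock;

/// Tiny spinlock: busy-waits until the flag can be flipped to `true`.
pub struct SpinLock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Safety: all access to `data` is serialized by `locked`.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(data: T) -> Self {
        Self { locked: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    /// Run `f` with exclusive access to the protected value.
    pub fn lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // Safety: the lock is held, so this is the only live reference.
        let result = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Counts how often the (mock) hardware init ran.
static INIT_CALLS: AtomicU32 = AtomicU32::new(0);

/// Mock UART that collects output in a `String`.
pub struct MockUart {
    pub output: String,
}

impl MockUart {
    fn init() -> Self {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        MockUart { output: String::new() }
    }
}

impl Write for MockUart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.output.push_str(s);
        Ok(())
    }
}

static CONSOLE: OnceLock<SpinLock<MockUart>> = OnceLock::new();

/// A) The UART is initialized on first use only.
/// B) Every caller goes through the same spin-locked instance.
pub fn console() -> &'static SpinLock<MockUart> {
    CONSOLE.get_or_init(|| SpinLock::new(MockUart::init()))
}

/// What a `println!`-style macro could expand to.
pub fn print_line(args: fmt::Arguments<'_>) {
    console().lock(|uart| {
        let _ = uart.write_fmt(args);
        let _ = uart.write_str("\n");
    });
}

pub fn init_calls() -> u32 {
    INIT_CALLS.load(Ordering::SeqCst)
}
```

Calling `print_line(format_args!("x = {}", 1))` from several places reuses the one initialized instance instead of rebuilding the console on every print.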
Again, these tutorials are excellent and I am not trying to nit pick, but these are some pain points I ran into, that would be great if we could get covered in one way or another. It would be great to hear your thoughts and let me know if there is anything I can do to contribute!