Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"mem" feature doesn't affect analogous AEABI symbols #253

Closed
ketsuban opened this issue Aug 8, 2018 · 8 comments
Closed

"mem" feature doesn't affect analogous AEABI symbols #253

ketsuban opened this issue Aug 8, 2018 · 8 comments

Comments

@ketsuban
Copy link
Contributor

ketsuban commented Aug 8, 2018

On the platform I'm targeting, byte writes to some parts of memory (palette data at 0x5000000, tile data at 0x6000000 and OAM at 0x7000000) are invalid - writing a byte will copy it over both bytes of the halfword. As such I need to use custom implementations of memcpy (etc.) which operate on halfword or word data so as not to silently corrupt data copied to these sections of memory.

I disabled the mem feature, but that only covers memcpy (etc.) themselves; __aeabi_memcpy (etc.) still exist, and since they call memcpy directly whether or not the names are mangled makes no difference - I end up with a VRAM-unsafe memcpy in my program.

@parched
Copy link
Contributor

parched commented Aug 8, 2018

I suspect what you are trying to do maybe undefined behaviour. For anything but normal memory you will need volatile writes. How exactly is memcpy being called from your code?

@ketsuban
Copy link
Contributor Author

ketsuban commented Aug 8, 2018

I defined a Tile struct, and copying one into VRAM gets automatically lowered to a __aeabi_memcpy call by the compiler.

@parched
Copy link
Contributor

parched commented Aug 8, 2018

Yeah I think that's probably UB. You will have to write your own memcpy like function that uses volatile stores and explicitly call it to do the copy.

@ketsuban
Copy link
Contributor Author

ketsuban commented Aug 8, 2018

I have to profess ignorance here. I don't understand why supplying a memcpy that doesn't silently corrupt graphical assets because of a hardware quirk the compiler can't be made aware of can constitute undefined behaviour. I thought volatile reads and writes were to stop the compiler optimising those accesses out, but here it's the one adding memcpy calls to my code.

Meanwhile your suggestion is to do things in a completely un-Rustlike manner - check on each compile for any time the compiler might sneak a __aeabi_memcpy or __aeabi_memset into my code and replace it with manually calling __aeabi_memcpy_not_secretly_broken, complete with casting references to raw pointers and unsafe blocks.

@parched
Copy link
Contributor

parched commented Aug 8, 2018

I don't understand why supplying a memcpy that doesn't silently corrupt graphical assets because of a hardware quirk the compiler can't be made aware of can constitute undefined behaviour.

Replacing memcpy isn't UB but you can't rely on the compiler calling it. The compiler could just as easily generate a sequence of byte stores because it assumes normal memory.

I thought volatile reads and writes were to stop the compiler optimising those accesses out.

Yes that is one thing volatile does but the other thing it does it ensures the compiler generates a load/store of the width you specify (if possible). e.g. if you do a volatile store of i32 it will generate exactly one 32-bit store but if it wasn't volatile the compiler could split it into 4 8-bit stores. (aside: I think volatile is horribly overloaded thing in C and Rust. Rust should really replace it with separate operations for each of the things it does)

@japaric
Copy link
Member

japaric commented Aug 8, 2018

@ketsuban could you provide more details about the complete operation you want to do? It sounds like you are dealing with MMIO and that will require volatile operation to prevent the compiler from reordering memory operations and from optimizing them away.

I haven't done graphical stuff before but I imagine that the operation you are doing is actually a two-step operation:

  • Copy some data (&[u8]) into a buffer at a known memory address
  • Signal the peripheral that the buffer has been loaded and that it can e.g. update the screen. This probably involves writing a byte / word to another know memory address.

Both of these operations need to be volatile or the compile is free to reorder or eliminate them.

You also mention a Tile struct. If you need precise control over the layout of that struct you should be using #[repr(C)] and may need to check the alignment of the struct as the compiler can call __aebi_memcpy or __aeabi_mempy4 depending on the alignment (one does a bytewise copy; the other does a wordwise copy).

@ketsuban
Copy link
Contributor Author

ketsuban commented Aug 8, 2018

could you provide more details about the complete operation you want to do?

It sounds flippant, but I really am just copying some data into memory which happens to have a particular bus characteristic. There are some memory-mapped IO registers, but they're in a different part of the address space, and I already have a system in place for those which uses volatile writes.

There's no two-step operation; the screen is always drawing regardless of what the processor does (unless disabled by a bit in the display control register). There's a number of background layers, each of which take tilemap data from part of VRAM based on the value of a control register (one for each background layer) and use tiles from the rest of VRAM based on the values in the tilemap. Tile data is 4bpp (or 8bpp, but there's only 512 bytes of palette RAM for backgrounds, and sixteen palettes of sixteen is more flexible), so my Tile struct is just a newtype for [u8; 32].

@josephlr
Copy link
Contributor

josephlr commented Oct 2, 2020

It sounds flippant, but I really am just copying some data into memory which happens to have a particular bus characteristic.

@ketsuban on x86_64, we deal with something similar, the VGA memory buffer. This isn't MMIO per-se, but still needs volatile writes. For your example, you could make your Tile be a newtype for [Volatile<u16>; 16], using the volatile::Volatile helper type.

As a general rule, if the memory isn't normal RAM, you must use read_volatile and write_volatile (potentially through a helper type), or LLVM will generate the wrong code. This would also be true of C/C++ code. Basically, you shouldn't have to worry if llvm emits memcpy/__aebi_memcpy/__aebi_memcpy4, as your read/writes to VRAM should never be using normal memory accesses.

@japaric I think this can be closed, as it's not a problem compiler-builtins can solve.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants