
Reading/writing to flash #257

Open
gambrose opened this issue Jan 4, 2022 · 30 comments
Labels
enhancement New feature or request

Comments

@gambrose

gambrose commented Jan 4, 2022

Is it possible to use this hal to read/write to the flash storage on the pico?
I would like to use the flash space, which is not my program, to store and retrieve data.

@WeirdConstructor
Contributor

Same question crossed my mind recently. That is why I looked into using an SD card with my Raspberry Pi Pico board - for stuff I would usually put into the EEPROM of my Arduino.

Flash has a limited number of write cycles, but that seems fine for occasionally storing user settings (logging data is a different matter, although it could also be implemented with some care).

The main downside, of course, is that every firmware update could move the first unused byte offset of the flash, so keeping settings across firmware updates takes some extra care.

I would love to have some control here, reserving the first 1MB of the flash for program data and the remaining space for storing data.

@gambrose
Author

gambrose commented Jan 9, 2022

As far as I can see, you can use memory.x to get some control over the memory layout.

I was planning on using a tiny board which has 8MB of flash so if I can restrict the size of the program to say 2MB it would leave plenty of room for data.

I would prefer to restrict the number of components I use and I don't need that many write cycles so using the flash seems like the optimal solution.

@thejpster
Member

To read from flash, you can just core::ptr::read_volatile() the appropriate address as a pointer of the appropriate type (e.g. *const u32). The start of flash is at 0x1000_0000.
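As a minimal sketch of that read path: the standard library function is `core::ptr::read_volatile`, and the helper names `flash_word_ptr` and `read_word` below are made up for illustration (they are not part of the HAL). On a host machine the pointer from `flash_word_ptr` must not be dereferenced; it only makes sense on the RP2040.

```rust
use core::ptr;

/// Start of the memory-mapped (XIP) flash window on the RP2040.
pub const XIP_BASE: usize = 0x1000_0000;

/// Hypothetical helper: turn a byte offset into flash into a word pointer.
/// The offset should be a multiple of 4 so the pointer is aligned.
pub fn flash_word_ptr(offset: usize) -> *const u32 {
    (XIP_BASE + offset) as *const u32
}

/// Hypothetical helper: volatile read of a single `u32`.
///
/// # Safety
/// `addr` must be valid for reads and 4-byte aligned.
pub unsafe fn read_word(addr: *const u32) -> u32 {
    unsafe { ptr::read_volatile(addr) }
}
```

On the target, `unsafe { read_word(flash_word_ptr(0x100)) }` would read the word at flash offset 0x100.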

The HAL currently doesn't have support for writing or erasing flash. You could look at the pico-sdk for inspiration, but I believe the steps would be:

  1. Jump to a function in RAM
  2. Disable interrupts
  3. Disable the XIP engine
  4. Send the appropriate flash write commands over QSPI (they vary depending on the chip and its size)
  5. Flush the XIP caches
  6. Re-enable the XIP
  7. Re-enable interrupts
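The ordering of steps 2 to 7 can be sketched on a host with a mock. This is only an illustration of the sequence: the `FlashBackend` trait and the `Recorder` type are hypothetical, not something the HAL provides, and step 1 (running from RAM) cannot be modelled this way.

```rust
/// Hypothetical abstraction over the low-level steps; not a real HAL trait.
pub trait FlashBackend {
    fn disable_interrupts(&mut self);
    fn exit_xip(&mut self);
    fn program(&mut self, offset: u32, data: &[u8]);
    fn flush_cache(&mut self);
    fn enter_xip(&mut self);
    fn enable_interrupts(&mut self);
}

/// Runs the steps in the order listed above. On real hardware this function,
/// and everything it calls, would have to be located in RAM.
pub fn write_sequence<B: FlashBackend>(backend: &mut B, offset: u32, data: &[u8]) {
    backend.disable_interrupts();
    backend.exit_xip();
    backend.program(offset, data);
    backend.flush_cache();
    backend.enter_xip();
    backend.enable_interrupts();
}

/// Test double that records which step ran, so the order can be checked.
#[derive(Default)]
pub struct Recorder {
    pub steps: Vec<&'static str>,
}

impl FlashBackend for Recorder {
    fn disable_interrupts(&mut self) { self.steps.push("irq off"); }
    fn exit_xip(&mut self) { self.steps.push("xip off"); }
    fn program(&mut self, _offset: u32, _data: &[u8]) { self.steps.push("program"); }
    fn flush_cache(&mut self) { self.steps.push("flush"); }
    fn enter_xip(&mut self) { self.steps.push("xip on"); }
    fn enable_interrupts(&mut self) { self.steps.push("irq on"); }
}
```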

An alternative to disabling all interrupts would be to re-enable during the (relatively expensive) erase cycle, by temporarily replacing the interrupt vector table with a copy in RAM where every vector:

  1. Suspends the erase operation
  2. Flushes the XIP cache
  3. Re-enables XIP
  4. Jumps to the original IRQ handler
  5. Disables XIP
  6. Resumes the erase operation

This assumes that QSPI flash chips can suspend erase operations. I haven't checked, but I know that parallel NOR flash chips can, and I've seen this approach used on other MCUs that have NOR flash (albeit only on a chip that had only two IRQ handlers).

Yes, you would want to change memory.x to ensure that at least part of the flash chip is unaffected by programming your application, and is guaranteed to not contain program data. I guess in theory you could read/modify/write the currently running program, but that's probably not advisable.
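One way to keep the application code and memory.x in agreement is to mirror the split as constants and let the compiler check it. The numbers below are assumptions taken from the 8MB / 2MB example earlier in the thread, and `data_sector_offset` is a hypothetical helper; the program size would have to match the FLASH length actually used in memory.x.

```rust
/// Assumed total flash size (8 MiB board from the example above).
pub const FLASH_SIZE: u32 = 8 * 1024 * 1024;
/// Assumed space reserved for the program; must match memory.x.
pub const PROGRAM_SIZE: u32 = 2 * 1024 * 1024;
/// Smallest erasable unit.
pub const SECTOR_SIZE: u32 = 4096;

/// The data region starts right after the program region.
pub const DATA_START: u32 = PROGRAM_SIZE;
pub const DATA_SIZE: u32 = FLASH_SIZE - PROGRAM_SIZE;

// Compile-time sanity checks: the data region is sector-aligned and fits.
const _: () = assert!(DATA_START % SECTOR_SIZE == 0);
const _: () = assert!(DATA_START + DATA_SIZE <= FLASH_SIZE);

/// Hypothetical helper: flash offset of the n-th data sector, or `None`
/// if that sector would lie outside the chip.
pub fn data_sector_offset(index: u32) -> Option<u32> {
    let offset = DATA_START.checked_add(index.checked_mul(SECTOR_SIZE)?)?;
    if offset.checked_add(SECTOR_SIZE)? <= FLASH_SIZE {
        Some(offset)
    } else {
        None
    }
}
```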

@thejpster
Member

@thejpster
Member

@9names
Member

9names commented Jan 9, 2022

Yeah, they still have a function in RAM for coordinating the whole thing though. Also note it's pretty much impossible to do this safely while using both cores unless we have some way of ensuring that the second core is parked. Ditto for DMA accessing flash.

@gambrose
Author

Thanks, I have had a look at what you linked. If I understand correctly, I can use something like this

unsafe {
   hal::rom_data::flash_range_program(addr, data.as_ptr(), data.len());
}

to write data, a u8 array whose length is a multiple of 256, to location addr, which must be aligned to a 256-byte boundary.

That function should handle the XIP and cache flushing. I would still be responsible for disabling interrupts and not using two cores.
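The 256-byte constraints are easy to get wrong, so a small check before the call can help. This is a sketch only: `check_program_args` and `ProgramArgError` are hypothetical names, and treating an empty buffer as an error is my own choice rather than something the ROM requires.

```rust
/// Flash page size that programming operates on.
pub const PAGE_SIZE: usize = 256;

#[derive(Debug, PartialEq, Eq)]
pub enum ProgramArgError {
    /// The flash offset is not aligned to a 256-byte page.
    UnalignedOffset,
    /// The buffer is empty or not a multiple of 256 bytes long.
    BadLength,
}

/// Hypothetical pre-flight check for a page-program call.
pub fn check_program_args(offset: u32, data: &[u8]) -> Result<(), ProgramArgError> {
    if (offset as usize) % PAGE_SIZE != 0 {
        return Err(ProgramArgError::UnalignedOffset);
    }
    if data.is_empty() || data.len() % PAGE_SIZE != 0 {
        return Err(ProgramArgError::BadLength);
    }
    Ok(())
}
```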

@jannic
Member

jannic commented Jan 10, 2022

hal::rom_data::flash_range_program(addr, data.as_ptr(), data.len());

That function should handle the XIP and cache flushing. I would still be responsible for disabling interrupts and not using two cores.

That's not enough. See the datasheet, section 2.8.3.1.3. Flash Access Functions. You need to call more than one function, and: "Note that, in between the first and last calls in this sequence, the SSI is not in a state where it can handle XIP accesses, so the code that calls the intervening functions must be located in SRAM. The SDK hardware_flash library hides these details."

@gambrose
Author

Thanks, I was getting confused thinking I was calling this function rather than the actual function in the ROM.

So I would need something more like this:

unsafe {
    let connect_internal_flash = hal::rom_data::connect_internal_flash;
    let flash_exit_xip = hal::rom_data::flash_exit_xip;
    let flash_range_program = hal::rom_data::flash_range_program;
    let flash_flush_cache = hal::rom_data::flash_flush_cache;
    let flash_enter_cmd_xip = hal::rom_data::flash_enter_cmd_xip;

    connect_internal_flash();
    flash_exit_xip();
    flash_range_program(addr, data.as_ptr(), data.len());
    flash_flush_cache();
    flash_enter_cmd_xip();
}

The datasheet says that I should avoid calling flash_enter_cmd_xip, as it is very slow, and should instead call into the flash second stage. That looks to be board-specific, though, as it depends on the flash chip. I think I need to do some more reading.

@thejpster
Member

Yes, that's why there are multiple boot2 binaries. Their job is to enable high-speed read mode and XIP.

@9names 9names added the enhancement New feature or request label Jan 23, 2022
@9names
Member

9names commented Jan 31, 2022

I spent a little bit of time reading how the pico-sdk handles the second core during flash writes.
They hook the SIO interrupt up to a RAM function!
On receipt of a magic lockout value, they disable interrupts, write the magic value back to the sender over the FIFO to let it know they're blocked, then loop until they receive an unlock message or a timeout occurs.
Pretty clever!
https://github.com/raspberrypi/pico-sdk/blob/2062372d203b372849d573f252cf7c6dc2800c0a/src/rp2_common/pico_multicore/multicore.c#L171

@thejpster
Member

scratches head

Wait, how does this work? I don't get it.

@jannic
Member

jannic commented Feb 2, 2022

Wait, how does this work? I don't get it.

Does https://raspberrypi.github.io/pico-sdk-doxygen/group__multicore__lockout.html help?

@thejpster
Member

Oh, ok. So Core B hooks the SIO FIFO interrupt with a ram func, and when Core A wants to enter a critical section it writes to the FIFO, which triggers an interrupt on Core B. The IRQ handler on Core B pops a reply in the FIFO and spins until it gets an "all clear" at which point the interrupt ends and Core B resumes what it was doing.

The bit I was missing was that A can trigger an interrupt on B with a FIFO write. Got it!
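The handshake can be simulated on a host to make the ordering concrete. In this sketch two threads stand in for the cores and two channels stand in for the inter-core FIFOs. The magic values and the `lockout_demo` function are hypothetical, and the timeout the pico-sdk uses is left out.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical magic values; the real ones are defined by the pico-sdk.
const LOCKOUT_START: u32 = 0x1;
const LOCKOUT_END: u32 = 0x2;

/// Simulates the lockout handshake and returns the observed event order.
pub fn lockout_demo() -> Vec<&'static str> {
    let log: Arc<Mutex<Vec<&'static str>>> = Arc::new(Mutex::new(Vec::new()));
    let (to_b, b_inbox) = mpsc::channel::<u32>(); // FIFO from A to B
    let (to_a, a_inbox) = mpsc::channel::<u32>(); // FIFO from B to A

    let log_b = Arc::clone(&log);
    let core_b = thread::spawn(move || {
        // Stands in for the FIFO interrupt handler (a RAM function) on core B.
        if b_inbox.recv() == Ok(LOCKOUT_START) {
            log_b.lock().unwrap().push("B: parked");
            to_a.send(LOCKOUT_START).unwrap(); // acknowledge: "I am blocked"
            loop {
                // Spin until the all-clear arrives (or the sender goes away).
                match b_inbox.recv() {
                    Ok(LOCKOUT_END) | Err(_) => break,
                    Ok(_) => {}
                }
            }
            log_b.lock().unwrap().push("B: resumed");
        }
    });

    // Core A: request the lockout, wait for the ack, do the critical work.
    to_b.send(LOCKOUT_START).unwrap();
    assert_eq!(a_inbox.recv(), Ok(LOCKOUT_START));
    log.lock().unwrap().push("A: flash operation");
    to_b.send(LOCKOUT_END).unwrap();
    core_b.join().unwrap();

    let events = log.lock().unwrap().clone();
    events
}
```

The flash operation on A is always bracketed by "parked" and "resumed" on B, because A only proceeds after the acknowledgement and B only resumes after the release message.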

@thejpster
Member

Also, that is totally going to knock out the video on a Neotron. Note to self - the screen will go blank during a self-update of the firmware!

@jannic
Member

jannic commented Feb 3, 2022

You could replace the lockout function on the second core with something that provides video, as long as it runs from RAM, instead of busy-looping while waiting for the release message.

@thejpster
Member

Do we have a good mechanism for ensuring an entire call stack is in RAM, and not just the top function?

@riskable

Just as an FYI: I tried putting something together and it runs without hanging/panicking but it doesn't actually seem to write anything:

pub const BLOCK_SIZE: u32 = 65536;
pub const SECTOR_SIZE: usize = 4096;
pub const PAGE_SIZE: u32 = 256;
pub const SECTOR_ERASE: u8 = 0x20;
pub const BLOCK32_ERASE: u8 = 0x52;
pub const BLOCK64_ERASE: u8 = 0xD8;
pub const FLASH_START: u32 = 0x1000_0000;
pub const FLASH_END: u32 = 0x1020_0000; // It's a 2MByte flash chip

#[inline(never)]
#[link_section = ".data.ram_func"]
fn write_flash() {
    // Temp hard-coded locations for testing purposes:
    let addr = FLASH_END - 4096;
    let encoded: [u8; 4] = 22_u32.to_le_bytes(); // Just a test
    let mut buf = [200; 4096];
    buf[0] = encoded[0];
    buf[1] = encoded[1];
    buf[2] = encoded[2];
    buf[3] = encoded[3];
    unsafe {
        cortex_m::interrupt::free(|_cs| {
            rom_data::connect_internal_flash();
            rom_data::flash_exit_xip();
            rom_data::flash_range_erase(addr, SECTOR_SIZE, BLOCK_SIZE, SECTOR_ERASE);
            rom_data::flash_range_program(addr, buf.as_ptr(), buf.len());
            rom_data::flash_flush_cache(); // Get the XIP working again
            rom_data::flash_enter_cmd_xip(); // Start XIP back up
        });
    }
    defmt::println!("write_flash() Complete"); // TEMP
}

#[inline(never)]
#[link_section = ".data.ram_func"]
fn read_flash() -> &'static mut [u8] {
    // Temp hard-coded locations for testing purposes:
    let addr = (FLASH_END - 4096) as *mut u8;
    let my_slice = unsafe { slice::from_raw_parts_mut(addr, 256) };
    my_slice
}

When I was fooling around I swear I got it to write stuff but that was back when it was crashing/hanging like crazy. Now I can't seem to get it to write any data at all. I even verified by dumping the entire flash to a file using picotool (doesn't seem to be writing out my little buf data).

@thejpster
Member

Are you sure the rom_data::X stuff is inlined? Maybe grab all the function pointers first, then use those inside the critical section.

@thejpster
Member

Also, the read_flash function doesn't need to be in RAM.

@thejpster
Member

Sorry, me again:

flash_range_erase(addr, SECTOR_SIZE, BLOCK_SIZE, SECTOR_ERASE);

Is that right? I think you're telling ROM there's a special way to erase BLOCK_SIZE bytes at once, which is to use the SECTOR_ERASE command? Pretty sure SECTOR_ERASE is only going to erase SECTOR_SIZE bytes, which is the default. Also, a block erase not on a block boundary is not going to work.

@riskable

I've got it working! My problem was that when you use rp2040-hal::rom_data::flash_range_*() functions it expects the address space to start at 0x0000_0000 but if you want to read that data in using something like slice::from_raw_parts_mut() you have to use 0x1000_0000 (aka "XIP base"). Man that was confusing! Wish the docs were more clear about that. Actually, just a working example would be great haha.
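To keep the two address spaces apart, the conversion can be made explicit. A minimal sketch, assuming the 2MByte chip from this thread; `offset_to_xip` and `xip_to_offset` are hypothetical helper names:

```rust
/// Base of the memory-mapped XIP window (used for reads).
pub const XIP_BASE: u32 = 0x1000_0000;
/// Assumed chip size: 2 MiB, as in this thread.
pub const FLASH_SIZE: u32 = 0x0020_0000;

/// Offset as passed to the ROM flash functions -> address used for reading.
pub fn offset_to_xip(offset: u32) -> Option<u32> {
    if offset < FLASH_SIZE {
        Some(XIP_BASE + offset)
    } else {
        None
    }
}

/// Address used for reading -> offset as passed to the ROM flash functions.
pub fn xip_to_offset(addr: u32) -> Option<u32> {
    let offset = addr.checked_sub(XIP_BASE)?;
    if offset < FLASH_SIZE {
        Some(offset)
    } else {
        None
    }
}
```

Returning `None` for out-of-range values catches the mistake of passing an XIP address where an offset is expected, or the other way round.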

Anyway, here's the code that works:

pub const BLOCK_SIZE: u32 = 65536;
pub const SECTOR_SIZE: usize = 4096;
pub const PAGE_SIZE: u32 = 256;
// These _ERASE commands are highly dependent on the flash chip you're using
pub const SECTOR_ERASE: u8 = 0x20; // Tested and works with W25Q16JV flash chip
pub const BLOCK32_ERASE: u8 = 0x52;
pub const BLOCK64_ERASE: u8 = 0xD8;
/* IMPORTANT NOTE ABOUT RP2040 FLASH SPACE ADDRESSES:
When you pass an `addr` to a `rp2040-hal::rom_data` function it wants
addresses that start at `0x0000_0000`. However, when you want to read
that data back using something like `slice::from_raw_parts()` you
need the address space to start at `0x1000_0000` (aka `FLASH_XIP_BASE`).
*/
pub const FLASH_XIP_BASE: u32 = 0x1000_0000;
pub const FLASH_START: u32 = 0x0000_0000;
pub const FLASH_END: u32 = 0x0020_0000;
pub const FLASH_USER_SIZE: u32 = 4096; // Amount dedicated to user prefs/stuff

#[inline(never)]
#[link_section = ".data.ram_func"]
fn write_flash(data: &[u8]) {
    let addr = FLASH_END - FLASH_USER_SIZE;
    unsafe {
        cortex_m::interrupt::free(|_cs| {
            rom_data::connect_internal_flash();
            rom_data::flash_exit_xip();
            rom_data::flash_range_erase(addr, SECTOR_SIZE, BLOCK_SIZE, SECTOR_ERASE);
            rom_data::flash_range_program(addr, data.as_ptr(), data.len());
            rom_data::flash_flush_cache(); // Get the XIP working again
            rom_data::flash_enter_cmd_xip(); // Start XIP back up
        });
    }
    defmt::println!("write_flash() Complete"); // TEMP
}

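fn read_flash() -> &'static [u8] {
    // Reads go through the memory-mapped XIP window, so add FLASH_XIP_BASE here.
    let addr = (FLASH_XIP_BASE + FLASH_END - FLASH_USER_SIZE) as *const u8;
    // A shared slice is enough: flash cannot be written through this pointer anyway.
    unsafe { slice::from_raw_parts(addr, FLASH_USER_SIZE as usize) }
}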

...and here's the code I was using to test it out (I bound it to a keystroke on my numpad):

let data = crate::read_flash();
defmt::println!("Flash data[0]: {:?}", data[0]);
defmt::println!("Incrementing data[0] by 1...");
let mut buf = [0; 256];
// Wraps from u8::MAX back to 0, same as an explicit check would.
buf[0] = data[0].wrapping_add(1);
crate::write_flash(&buf);
let data2 = crate::read_flash();
defmt::println!("Flash data[0]: {:?}", data2[0]);

The output of which looks like this:

Flash data[0]: 137
Incrementing data[0] by 1...
write_flash() Complete
Flash data[0]: 138

...and I confirmed that the data survives reboots/power cycle (so it wasn't just a trick of optimization). Speaking of optimization, I had a lot of trouble trying to get this to work until I specified lto = 'fat' in my Cargo.toml:

[profile.release]
codegen-units = 1
debug = 2
debug-assertions = false
incremental = false
lto = 'fat' # <-- HERE
opt-level = 3
overflow-checks = false

However, to be thorough I just tested all the lto options:

  • lto = 'thin': Works
  • lto = false: Causes hang
  • lto = true: Works (it's the same as 'fat')
  • lto = 'off': Causes hang

Note that you can put defmt::println!() calls inside of write_flash() but not if you use format strings. So printing static text like, "foo" would work fine but trying to print out a variable, "foo {:?}" would cause it to hang indefinitely.

Other notes:

  • I'm using RTIC and my keystroke-bound function is actually calling write_flash() from within a spawn_at() call (and I'm using a monotonic timer a la rp2040-monotonic). When I first started I was getting panics until I put #[link_section = ".data.ram_func"] in front of all the dispatchers and the function that calls write_flash() but now that I've worked everything out that doesn't seem to be necessary (I've since removed those lines that force functions into RAM).
  • I'm using PIO (ws2812-pio) in the background while these write_flash() calls are taking place and it doesn't seem to be bothered. Not getting any flickering or anything like that either (nice and smooth 👍)

@riskable

Sorry, me again:

flash_range_erase(addr, SECTOR_SIZE, BLOCK_SIZE, SECTOR_ERASE);

Is that right? I think you're telling ROM there's a special way to erase BLOCK_SIZE bytes at once, which is to use the SECTOR_ERASE command? Pretty sure SECTOR_ERASE is only going to erase SECTOR_SIZE bytes, which is the default. Also, a block erase not on a block boundary is not going to work.

Well you have to pass something as the 3rd and 4th argument and that's what worked 🤷 . Don't assume I know what I'm doing haha.

@jannic
Member

jannic commented Apr 15, 2022

Well you have to pass something as the 3rd and 4th argument and that's what worked 🤷 . Don't assume I know what I'm doing haha.

The comment in the bootrom source code explains those parameters:

// block_size must be a power of 2.
// Generally block_size > 4k, and block_cmd is some command which erases a block
// of this size. This accelerates erase speed.
// To use sector-erase only, set block_size to some value larger than flash,
// e.g. 1ul << 31.
// To override the default 20h erase cmd, set block_size == 4k.
void __noinline flash_range_erase(uint32_t addr, size_t count, uint32_t block_size, uint8_t block_cmd) {
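To see what those parameters mean for the call discussed above, here is a host-side model of the erase loop. It is my reading of the quoted comment, not a copy of the bootrom source, and `erase_plan` is a hypothetical name: a block erase is used only when the current address is block-aligned and at least one full block remains, otherwise a 4k sector erase with the default 20h command.

```rust
pub const SECTOR_SIZE: u32 = 4096;
pub const SECTOR_ERASE_CMD: u8 = 0x20;

/// Returns the (command, flash offset) pairs such a loop would issue.
pub fn erase_plan(addr: u32, count: u32, block_size: u32, block_cmd: u8) -> Vec<(u8, u32)> {
    assert!(block_size.is_power_of_two());
    let goal = addr + count;
    let mut addr = addr;
    let mut plan = Vec::new();
    while addr < goal {
        let block_aligned = (addr & (block_size - 1)) == 0;
        if block_aligned && goal - addr >= block_size {
            // A whole block fits: use the faster block command.
            plan.push((block_cmd, addr));
            addr += block_size;
        } else {
            // Otherwise fall back to a plain sector erase.
            plan.push((SECTOR_ERASE_CMD, addr));
            addr += SECTOR_SIZE;
        }
    }
    plan
}
```

Under this model the call from the thread, `erase_plan(0x1F_F000, 4096, 65536, 0x20)`, yields a single sector erase, which would explain why it worked even though the block arguments do not describe a real 64k erase command. With a 64k-aligned address and a count of 64k or more, passing 0x20 as the block command would issue the wrong command for the block step.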

@MathiasKoch

Perhaps it would be possible to add abstractions based on https://github.com/rust-embedded-community/embedded-storage for this, to make it a bit easier for everyone to use?

@jannic
Member

jannic commented Jul 8, 2022

I am working on some functions which cover the 'needs to run from RAM' requirement:
https://github.com/jannic/rp2040-flash/
Just a work in progress, and still missing documentation. But perhaps it's already useful?

@afaber999

  • lto = 'thin': Works
  • lto = false: Causes hang
  • lto = true: Works (it's the same as 'fat')
  • lto = 'off': Causes hang

I think the issue is that all functions have to be executed from RAM. However, depending on your optimization settings, the HAL definition of rom_table_lookup can be compiled into flash, since it has no #[inline(always)] (the same holds for the rom_hword_as_ptr function).

When I compile the following code snippet:

#[inline(never)]
#[link_section = ".data.ram_func"]
fn flash_experiment() {
    unsafe {
        flash_enter_cmd_xip();
    }
}

With lto = 'off' it hangs, since the code in RAM jumps to code in flash:

20000000 <__sdata>:
20000000: 80 b5 push {r7, lr}
20000002: 00 af add r7, sp, #0
20000004: 00 f0 02 f8 bl 0x2000000c <__Thumbv6MABSLongThunk_rp2040_hal::rom_data::flash_enter_cmd_xip::he084f9a4ab71acef> @ imm = #4
20000008: 80 bd pop {r7, pc}
2000000a: d4 d4 bmi 0x1fffffb6 <__veneer_limit+0xfff8c76> @ imm = #-88

2000000c <__Thumbv6MABSLongThunk_rp2040_hal::rom_data::flash_enter_cmd_xip::he084f9a4ab71acef>:
2000000c: 03 b4 push {r0, r1}
2000000e: 01 48 ldr r0, [pc, #4] @ 0x20000014 <$d>
20000010: 01 90 str r0, [sp, #4]
20000012: 01 bd pop {r0, pc}

20000014 <$d>:
20000014: b1 3d 00 10 .word 0x10003db1

With lto = 'thin', there is no jump to flash memory:

20000000 <__sdata>:
20000000: 80 b5 push {r7, lr}
20000002: 00 af add r7, sp, #0
20000004: 18 20 movs r0, #24
20000006: 02 88 ldrh r2, [r0]
20000008: 14 20 movs r0, #20
2000000a: 00 88 ldrh r0, [r0]
2000000c: 01 49 ldr r1, [pc, #4] @ 0x20000014 <$d.24>
2000000e: 90 47 blx r2
20000010: 80 47 blx r0
20000012: 80 bd pop {r7, pc}

20000014 <$d.24>:
20000014: 43 58 00 00 .word 0x00005843

So I think this has to be fixed in the HAL by adding the #[inline(always)] attribute.

@jannic
Member

jannic commented Aug 11, 2022

I don't think inline attributes can guarantee that no flash accesses are inserted by the compiler. Rust currently just doesn't provide a way to say "this function must be in RAM and must not depend on any other memory section".
Of course #[inline(always)] may work (it usually does) - but there is no guarantee that it always will.

That's why I implemented the relevant parts in assembly: https://github.com/jannic/rp2040-flash/blob/master/src/lib.rs#L189

@werediver

werediver commented Aug 11, 2022

@jannic Regarding the higher-level API of your rp2040-flash crate (I'm using it in an ongoing project and appreciate your work): I wrote a slightly more ergonomic wrapper based on your original example. I can prepare a pull request that either updates the example or adds the suggested interface to the crate, if you find it suitable. I'm open to discussing any improvements you find necessary.

https://github.com/werediver/escale/blob/b0fb37f120edd2cc3f8145f326f218a94ad06d69/escale_fw_rs/app/src/flash.rs

@jannic
Member

jannic commented Aug 15, 2022

Hi @werediver,
as I haven't used the rp2040-flash library in a real application context yet, any feedback on its usability is very welcome! Pull requests, ideas, anything.
We are currently discussing the topic on the Matrix channel, https://matrix.to/#/#rp-rs:matrix.org (start of discussion); join in if you like.


9 participants