-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DHARA+FATFS: boot sector is remapped without a copy #23
Comments
On Tue, Jun 22, 2021 at 06:28:35AM -0700, gedworker wrote:
I have faced with a problem which i can't solve for more than two weeks.
I am using dhara with FatFS file system and a nand flash.
I wrote a simple test which opens a file writes 64 bytes of data and closes it.
The test spins for about 1000-3000 times. The flash is large enough to hold all written data.
Everything works fine except that sometimes the fat fs function open file (f_open) crashes.
After that I can't mount the file system with f_mount function.
I have started digging into dhara and found more or less what is going on.
The f_mount function fails because it can't find the boot sector for the fat file system.
It is the zero sector which has 0xAA55 signature.
I added several logging functions into my test to track what happens with zero sector.
I have added an output log to dhara_nand_copy function.Also into dhara_nand_prog function to track all write to pages.
Also i check the validity of the zero sector reading it content through dhara_map_find function.
So my test sequence looks like the following.
1. format flash
2. open file
3. write 64bytes
4. close file
5. read zero sector and check that it has 0xAA55 signature
6. goto 2
So i see that on f_close function dhara starts copying pages. And sometimes i see that boot sector is copied also.
After that i see that boot sector is mapped to a new location.
Everything looks fine but when error happens the boot sector is mapped to a new location but I don't see that it was copied.
And after that my boot sector check function fails and file system is crashed.
But if i read the sector from using old location it is there.
So dhara remaps the boot sector but does not copy it to that location.
How that could happen? Maybe my configuration is incorrect.
What should I check?
Here is a log example:
`---<info> File closed
---<info> BOOT SECTOR is mapped to page 31334
---<info> file opened
---<info>write 64 bytes
---<info>close file
---<info> -copy from ---- to ---- (A LOT OF OTHER PAGES ARE COPYED)
---<info> -copy from ---- to ----
---<info> -copy from ---- to ----
---<info> -copy from 31334 to 31484 (HERE BOOT SECTOR IS COPYED)
---<info> -copy from ---- to ----
---<info> File closed
---<info> BOOT SECTOR is mapped to page 31484 (HERE BOOT SECTOR IS MAPPED TO A NEW LOCATION)
---<info> file opened
---<info>write 64 bytes
---<info>close file
---<info> -copy from ---- to ---- (A LOT OF OTHER PAGES ARE COPYED)
---<info> -copy from ---- to ----
---<info> -copy from ---- to ----
---<info> -copy from ---- to ---- (BOOTSECTOR is not copyed here)
---<info> -copy from ---- to ----
---<info> -copy from ---- to ----
---<info> File closed
---<info> BOOT SECTOR is mapped to page 31498 (HERE BOOT SECTOR IS MAPPED TO A NEW LOCATION)
---<info> BOOT SECTOR is invalid`
As you see from the last log the boot sector is at the new location but was not copied.
Although if i read the previous sector (31484) the boot sector will be there.
Why dhara maps a page but does not copy it?
Thanks,
Copying is not the only way that a sector can be remapped -- it could
occur with it being rewritten, and in that case you wouldn't see a call
to dhara_nand_copy.
The next thing I would try would be to log calls to dhara_nand_prog as
well and see if anything turns up for the final location.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
Hello Daniel, Thanks for reply. [00:34:21.724,609] file_srv: Write 64 bytes on file [123.bin] [00:34:22.044,921] file_srv: write string to file done, writen 64 bytes [00:34:22.054,687] file_srv: Write 64 bytes on file [123.bin] Logs dropped (1) So the boot sector was mapped to 41394. After file close it became mapped to page 41353. |
On Wed, Jun 23, 2021 at 03:43:09AM -0700, gedworker wrote:
[00:34:22.041,992] <info> MEM_TST: DHARA:journal root is: 41566
[00:34:22.042,968] <info> MEM_TST: BOOTSECTOR IS MAPPED TO PAGE 41394
...
[00:34:22.373,046] <info> MEM_TST: DHARA:journal root is: 41582
Logs dropped (1)
[00:34:22.374,023] <info> MEM_TST: BOOTSECTOR IS MAPPED TO PAGE 41353
This looks like a problem -- during a period in which the root changes
from 41566 to 41582, and the journal doesn't wrap, there is no way that
the root can change to a page which is not something in that range.
Next steps for debugging might be:
* Have you verified your NAND driver and configuration? Use a PRNG
seeded with a unique number for each page to fill every page on the
chip, and then read back and check that all the data is there
(you're looking not just for read/write failures, but also
addressing errors that land multiple pages in the same place).
* If you're quite sure that the NAND driver is ok, I would then start
to try to reproduce the scenario on a PC using a simulated NAND
driver. If you've ruled out the NAND driver as a problem and it's a
bug in Dhara, this will make it easy to fix.
Cheers,
Daniel
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
On Thu, Jun 24, 2021 at 06:48:17AM +1200, Daniel Beer wrote:
This looks like a problem -- during a period in which the root changes
from 41566 to 41582, and the journal doesn't wrap, there is no way that
the root can change to a page which is not something in that range.
Sorry, I meant to say "there is no way that the boot sector can
change...". Any change to any mapping, if the journal hasn't wrapped,
must point to a page in the range between the old and new roots.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
Hello, |
On Fri, Jun 25, 2021 at 07:53:15AM -0700, gedworker wrote:
I have noticed a strange thing. When i create several files so that almost 80% of the space is used i delete one file.
So there is more that 50% free space available.
After that i create a new file and write to it.
But writing now takes much longer than it was after format.
Is that relates to dhara gc or something else?
When you format, do you also reinitialize the Dhara map? If so, that may
explain it. Dhara can avoid doing GC if there are sufficiently many
unallocated sectors.
What do you have set for GC ratio? Is there any opportunity to invoke
trim() on sectors that are known to be unused in your filesystem (e.g.
the sectors occupied by a deleted file)?
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
Hello,
The file creation after delete takes 34 seconds. |
On Sun, Jun 27, 2021 at 12:53:15PM -0700, gedworker wrote:
I have tested today with simulator on Visual studio.
My test sequence looks the following
1. format_flahs with DHARA initialization
2. create_file_100MB --------------------takes 610 MS
3. create_file_100MB --------------------takes 609 MS
4. create_file_100MB --------------------takes 609 MS
5. delete file 1 file using f_unlink
6. create_file_100MB --------------------takes 34313 MS
The file creation after delete takes 34 seconds.
I initialize Dhara map on format.
The garbage ratio is set to 4.
The size of the flash is 385 MB.
So that happens because GC is working? Because when 3 files are created they occupy 300MB. And there 85MB left.
I have checked that when i create only two files and delete one and create another. It is created fast enough.
I will check whether trim can be used.
Yes, that is likely a consequence of GC. You could also set GC_RATIO to
something much lower, but then you will have a corresponding drop in
usable space.
A read cache will also likely increase performance significantly (many
partial reads are performed during both writes and GC).
Cheers,
Daniel
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
Hello,
I see that when i set larger value for GC ratio (It is the last parameter in dhara_map_init function) the file is created much faster after delete. Why do you recommend to set lower ratio?
Does that mean that on initialization more space will be reserved for dhara structures or more space will be consumed durint writing and deletion. There is not any good detailed description on dhara internal mechanism so i am a bit confused how everything works under the hood. |
On Mon, Jun 28, 2021 at 01:21:09PM -0700, gedworker wrote:
I see that when i set larger value for GC ratio (It is the last
parameter in dhara_map_init function) the file is created much faster
after delete. Why do you recommend to set lower ratio?
Then that's a bit strange, and I'm not sure what's going on there. Did
the file creation actually succeed (remember that you will have less
available space)?
The GC ratio determines the amount of GC work done per write when the
map size is near capacity.
Does that mean that on initialization more space will be reserved for
dhara structures or more space will be consumed durint writing and
deletion.
It means that more of the raw flash is allowed to be garbage at any
given time. This, combined with a lower number of usable sectors, means
that less work is required to keep up.
There is not any good detailed description on dhara internal mechanism
so i am a bit confused how everything works under the hood.
Have you read map_internals.txt in the repository?
Cheers,
Daniel
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
@gedworker did you ever figure out what was happening? I was struggling with what seems to be the exact same problem this week. I create a FAT32 partition, write a file close the file, unmount and remount. It works sometimes but eventually it fails to read the boot sector. If you found a solution that would be wonderful. I added a bunch of debugging including verifying all writes/ and copies by reading back and from what I can tell my NAND driver is working fine. Thanks, -Brett |
Hello bbrother , |
Hello,
I have faced with a problem which i can't solve for more than two weeks.
I am using dhara with FatFS file system and a nand flash.
I wrote a simple test which opens a file writes 64 bytes of data and closes it.
The test spins for about 1000-3000 times. The flash is large enough to hold all written data.
Everything works fine except that sometimes the fat fs function open file (f_open) crashes.
After that I can't mount the file system with f_mount function.
I have started digging into dhara and found more or less what is going on.
The f_mount function fails because it can't find the boot sector for the fat file system.
It is the zero sector which has 0xAA55 signature.
I added several logging functions into my test to track what happens with zero sector.
I have added an output log to dhara_nand_copy function.Also into dhara_nand_prog function to track all write to pages.
Also i check the validity of the zero sector reading it content through dhara_map_find function.
So my test sequence looks like the following.
So i see that on f_close function dhara starts copying pages. And sometimes i see that boot sector is copied also.
After that i see that boot sector is mapped to a new location.
Everything looks fine but when error happens the boot sector is mapped to a new location but I don't see that it was copied.
And after that my boot sector check function fails and file system is crashed.
But if i read the sector from using old location it is there.
So dhara remaps the boot sector but does not copy it to that location.
How that could happen? Maybe my configuration is incorrect.
What should I check?
Here is a log example:
--- File closed
--- BOOT SECTOR is mapped to page 31334
--- file opened
---write 64 bytes
---close file
--- -copy from ---- to ---- (A LOT OF OTHER PAGES ARE COPYED)
--- -copy from ---- to ----
--- -copy from ---- to ----
--- -copy from 31334 to 31484 (HERE BOOT SECTOR IS COPYED)
--- -copy from ---- to ----
--- File closed
--- BOOT SECTOR is mapped to page 31484 (HERE BOOT SECTOR IS MAPPED TO A NEW LOCATION)
--- file opened
---write 64 bytes
---close file
--- -copy from ---- to ---- (A LOT OF OTHER PAGES ARE COPYED)
--- -copy from ---- to ----
--- -copy from ---- to ----
--- -copy from ---- to ---- (BOOTSECTOR is not copyed here)
--- -copy from ---- to ----
--- -copy from ---- to ----
--- File closed
--- BOOT SECTOR is mapped to page 31498 (HERE BOOT SECTOR IS MAPPED TO A NEW LOCATION)
--- BOOT SECTOR is invalid
As you see from the last log the boot sector is at the new location but was not copied.
Although if i read the previous sector (31484) the boot sector will be there.
Why dhara maps a page but does not copy it?
Thanks,
The text was updated successfully, but these errors were encountered: