Skip to content

fatfs: Update error code mapping #6120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 26, 2018
Merged

fatfs: Update error code mapping #6120

merged 2 commits into from
Feb 26, 2018

Conversation

geky
Copy link
Contributor

@geky geky commented Feb 16, 2018

This is in response to #6072

A lot of the error codes in fatfs were mapped incorrectly. This patch revisits the error code mapping to try to correct these mistakes. I'm open to feedback on these.

ChanFS was should be ChanFS description
FR_OK 0 0 Succeeded
FR_DISK_ERR EIO EIO A hard error occurred in the low level disk I/O layer
FR_INT_ERR EBADF -1 Assertion failed
FR_NOT_READY EIO EIO The physical drive cannot work
FR_NO_FILE ENOENT ENOENT Could not find the file
FR_NO_PATH ENOENT ENOTDIR Could not find the path
FR_INVALID_NAME ENOENT EINVAL The path name format is invalid
FR_DENIED EACCES EACCES Access denied due to prohibited access or directory full
FR_EXIST EEXIST EEXIST Access denied due to prohibited access
FR_INVALID_OBJECT EFAULT EBADF The file/directory object is invalid
FR_WRITE_PROTECTED EACCES EACCES The physical drive is write protected
FR_INVALID_DRIVE ENOENT ENODEV The logical drive number is invalid
FR_NOT_ENABLED ENXIO ENODEV The volume has no work area
FR_NO_FILESYSTEM ENOENT EINVAL There is no valid FAT volume
FR_MKFS_ABORTED EBADF EIO The f_mkfs() aborted due to any problem
FR_TIMEOUT EBADF ETIMEDOUT Could not get a grant to access the volume within defined period
FR_LOCKED EACCES EBUSY The operation is rejected according to the file sharing policy
FR_NOT_ENOUGH_CORE ENOMEM ENOMEM LFN working buffer could not be allocated
FR_TOO_MANY_OPEN_FILES ENFILE ENFILE Number of open files > FF_FS_LOCK
FR_INVALID_PARAMETER ENOEXEC EINVAL Given parameter is invalid

cc @kegilbert, @deepikabhavnani
intended for minor release
should resolve #6072
note, #5653 is related but not fixed by this pr (requires changes in ChanFS).

A lot of the error codes in fatfs were mapped incorrectly. This patch
revisits the error code mapping to try to correct these mistakes
@kegilbert
Copy link
Contributor

kegilbert commented Feb 16, 2018

Should FR_TIMEOUT be tied to try again (EAGAIN) or timer expired (ETIME)? It's not really a timer expiring so not a great match either but ¯\(ツ)

@geky
Copy link
Contributor Author

geky commented Feb 16, 2018

I was thinking EAGAIN to align with the netsocket APIs. But it looks like this error comes from the possibility of failing to get a mutex. If we base this off open group's sem_timedwait, I believe this should actually be ETIMEDOUT. The only place I could find ETIME was in ioctl calls, but I'm not sure why it's used there over ETIMEDOUT.

Will update to ETIMEDOUT.

kegilbert
kegilbert previously approved these changes Feb 16, 2018
Copy link
Contributor

@kegilbert kegilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@deepikabhavnani
Copy link

deepikabhavnani commented Feb 20, 2018

FR_LOCKED | EACCES | EBUSY | The operation is rejected according to the file sharing policy

Non blocking comment: Operation rejection is because of access permissions, should it be EACCES?

@geky
Copy link
Contributor Author

geky commented Feb 20, 2018

I'm glad you asked. I think some of the ChanFS error descriptions are a bit misleading. I ended up looking at their use in the codebase which helped clarify things:

  • FR_LOCKED

    /* The object was opened. Reject any open against writing file and all write mode open */
    return (acc != 0 || Files[i].ctr == 0x100) ? FR_LOCKED : FR_OK;

    From it's use, it looks like FR_LOCKED indicates the file is already in use. ChanFS doesn't handle concurrent modification of files so instead it returns FR_LOCKED in this case. Similar situations (such as removing a directory that's in use) return EBUSY in POSIX.

case FR_TIMEOUT: /* (15) Could not get a grant to access the volume within defined period */
default: /* Bad file number */
return -EBADF;
case FR_OK: return 0; // (0) Succeeded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you fix the formatting (as it was)

case FR_OK:                   
    return 0;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the new style, makes it more compressed and a bit easier to read. Not sure of the precedence here though just my 2 cents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xc0170, #6014 will update this anyways won't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geky Just because a PR fixes style issues, doesn't mean we now get to freely make them ;)

@kegilbert I believe the new line is already a precedent. I also think that if someone were doing a quick scan of the code, the indented new line sticks out, whereas this formatting is easy to miss (imo).

Copy link
Contributor Author

@geky geky Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmonr Feel free to update

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough
image

@cmonr cmonr dismissed stale reviews from deepikabhavnani and kegilbert via b40ff8a February 23, 2018 17:23
Copy link
Contributor

@kegilbert kegilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (although could use a squash)

@cmonr
Copy link
Contributor

cmonr commented Feb 23, 2018

/morph build

@mbed-ci
Copy link

mbed-ci commented Feb 23, 2018

Build : SUCCESS

Build number : 1236
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/6120/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

@mbed-ci
Copy link

mbed-ci commented Feb 23, 2018

@mbed-ci
Copy link

mbed-ci commented Feb 23, 2018

@cmonr cmonr merged commit b7c2b1f into master Feb 26, 2018
@JanneKiiskila
Copy link
Contributor

Do you have a label for API breakages? You should have.

@geky
Copy link
Contributor Author

geky commented Mar 2, 2018

Thanks @kjbracey-arm

From a different conversation #6072 (comment):

@0xc0170, quick question, do we consider changing error codes to match documentation acceptable for patch releases? My gut says no, but at the same time these may be considered bugs?

If API specifies error codes and those are correct, it is a bug in the implementation. Should be fine for patch release (assuming it has clear commit msg and gets to the release notes).

If in doubt, we are close to feature release so can be fixed there

The key here was changing error codes to match documentation. It's a gray area, but we were under the impression this was acceptable. Let us know if we're wrong though. I'm fine with holding error code fixes until minor releases to keep dependencies moving smoothly.

@kjbracey
Copy link
Contributor

kjbracey commented Mar 5, 2018

I didn't see that discussion.

I agree it's a grey area, but in this case I did get my ear pulled by the client guys - this did break their apps. I'd be inclined to try to make the change asap on a patch if I thought it was unlikely anyone was relying on the broken behaviour (or if we knew someone was already relying on correct behaviour due to using another FS, and this would stop them using our FS). This late in the 5.7 cycle though, and given that it did trip them, I think we should wait for 5.8.0.

If the documentation was clear, I'm not sure why they wouldn't write to cope with the documented values as well as the incorrect ones they must have found by inspection/testing. Did we actually specify it locally, or are we just deferring to POSIX?

@0xc0170
Copy link
Contributor

0xc0170 commented Mar 5, 2018

Do you have a label for API breakages? You should have.

Yes, it is in the PR type template and that should signal this is for a minor version. It is now labeled for the next upcoming minor

If the documentation was clear, I'm not sure why they wouldn't write to cope with the documented values as well as the incorrect ones they must have found by inspection/testing. Did we actually specify it locally, or are we just deferring to POSIX?

@geky Looking at this change, where are tests that would need to be fixed?

@geky
Copy link
Contributor Author

geky commented Mar 5, 2018

This specific pr landing on 5.7 was a mistake, I was more asking about future changes of this sort. It sounds like the answer is to only allow change on patch if there's a good reason.

I did note this should be on a minor release in the original pr. See the pr message. I wasn't following the merge closely enough to notice it landed on a patch:

intended for minor release


@kjbracey-arm, We are just defering to POSIX. Though this was raised as a concern elsewhere and I'm planning on adding error codes to the FileSystem doxygen when I can.

@0xc0170, This pr is a part of the work required to run the filesystem tests on FAT. The tests expect the correctly error codes. Currently we aren't running FAT filesystem tests in CI.

@cmonr
Copy link
Contributor

cmonr commented Mar 5, 2018

I did note this should be on a minor release in the original pr. See the pr message.

Yup, I missed that.

@0xc0170, This pr is a part of the work required to run the filesystem tests on FAT. The tests expect the correctly error codes. Currently we aren't running FAT filesystem tests in CI.

Sounds like quite the chicken and egg problem. Does this mean we're close to being able to run the tests in CI?

@kjbracey-arm Do we have an idea of what features client is/are depending on and/or contact we could ping with PRs that might affect them? I wasn't expecting a change like this would have tripped anyone up.

@kjbracey
Copy link
Contributor

kjbracey commented Mar 6, 2018

Not sure what specifically tripped the client guys - I expect it was some sort of device update initialisation code doing something like if (result == XXXX) { reformat_partition(); }.

Basically this sort of change could trip up anyone who actually cares about return values (but not enough to actually check for standard POSIX ones as well as what fatfs was currently returning). Code tends not to inspect error values, but sometimes they matter.

Client code is maybe a bit more susceptible to error return changes due to having the "PAL" glue layer using conversions from native error codes to PAL_XXX values. That gives an extra layer that can make an error mapping mistake. Often seems to need updating for a different network interface/stack/filing system - this effectively starts acting like a different filing system.

@geky geky deleted the g-fat-errors branch April 20, 2018 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FAT Filesystem error code maps to incorrect errors
8 participants