-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fdtable: fix longstanding layering violations #75348
base: main
Are you sure you want to change the base?
fdtable: fix longstanding layering violations #75348
Conversation
twister -p native_sim -p qemu_riscv64 -p qemu_x86 -p qemu_x86_64 -p qemu_cortex_a53 -p mps2/an385 -T tests/posix
...
INFO - Total complete: 270/ 270 100% skipped: 84, failed: 0, error: 0
INFO - 45 test scenarios (270 test instances) selected, 84 configurations skipped (58 by static filter, 26 at runtime).
INFO - 186 of 270 test configurations passed (100.00%), 0 failed, 0 errored, 84 skipped with 0 warnings in 403.48 seconds
INFO - In total 7029 test cases were executed, 3759 skipped on 6 out of total 21 platforms (28.57%)
INFO - 181 test configurations executed on platforms, 5 test configurations were only built. A lot of build-only targets being skipped here.. twister --build-only -T tests/net/ -T tests/lib/c_lib
...
INFO - Total complete: 7042/7042 100% skipped: 5895, failed: 0, error: 0
INFO - 250 test scenarios (7042 test instances) selected, 5895 configurations skipped (5868 by static filter, 27 at runtime).
INFO - 1147 of 7042 test configurations passed (100.00%), 0 failed, 0 errored, 5895 skipped with 0 warnings in 2589.39 seconds
INFO - 0 test configurations executed on platforms, 1147 test configurations were only built. twister -i -p mimxrt1170_evk/mimxrt1176/cm7 -p mm_feather -p nrf9131ek/nrf9131/ns -s tests/posix/common/portability.posix.common.newlib -s tests/posix/common/portability.posix.common.tls.newlib -s tests/posix/fs/portability.posix.fs.newlib -s tests/posix/fs/portability.posix.fs.tls.newlib -s tests/posix/headers/portability.posix.headers.newlib.with_posix_api -s tests/lib/c_lib/thrd/libraries.libc.c11_threads.newlib -s tests/lib/c_lib/thrd/libraries.libc.c11_threads.newlib_nano
...
INFO - Total complete: 21/ 21 100% skipped: 2, failed: 0, error: 0
INFO - 7 test scenarios (21 test instances) selected, 2 configurations skipped (0 by static filter, 2 at runtime).
INFO - 19 of 21 test configurations passed (100.00%), 0 failed, 0 errored, 2 skipped with 0 warnings in 127.21 seconds
INFO - In total 696 test cases were executed, 210 skipped on 3 out of total 7 platforms (42.86%)
INFO - 0 test configurations executed on platforms, 19 test configurations were only built. |
468003b
to
398403c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I understand what and why you are doing this I don't think this is the solution.
Defining a type in the "wrong" header to avoid including another header from that one will end up with an even more messy situation, as now issues will depend on the order of header inclusion by users of those headers.
It is not safe to redefine these types (as you do not know what the C library is defining those to), and attempting to guard those definitions with the C library internal guard names is going to not work with other libraries or when those guards names change.
@aescolar - this was the "path of least resistance" change that is non-destructive. The better change would be to simply not use inappropriate types in the base OS (i.e. simply change return values from |
While painful, I think the only 'conforming' solution is to avoid POSIX types entirely and either use existing C types or create your own Zephyr-specific ones. For For C libraries for Zephyr could then adopt this new OS type for their own Reading through the picolibc code, I'm afraid it defines |
Thanks for your help with this @keith-packard 🪁 |
398403c
to
55290f0
Compare
55290f0
to
f633095
Compare
@cfriedt the nature of this change which involves changing types, touching 400 files and changing sginatures of many APIs requires more discussion and following the process outlined above, i.e. discussion and consesnus in architecture working group and wide broadcasting of the change to the community and those affected. This would have been the case if we were not in the middle of a release. The fact that we are in the middle of a release and closing on a hard freeze milestone in 2 days makes it a bit more critical to get a signoff from stakeholders, not only rely on CI passing to get it merged. The arch. wg was cancelled yesterday, so I can give you a slot in today's TSC to make the case for this to go into 3.7, if you wish to do that please mark this issue with the TSC label and join the TSC today. |
@nashif - I have an appointment during the TSC call unfortunately. It's not something I usually plan on attending these days. Additionally, I would be advocating for the same things I advocated for the last time I attended the TSC meeting. I.e. putting v3.7-branch into a better state for additional bug fixes and stabilization fixes. At this point, it's unlikely that POSIX will be able to receive any significant backports in v3.7-branch. There are also other fairly significant bugs that would be part of the release at this point (such as #75402). Personally, I wouldn't consider tagging v3.7.0 without additional bug fixes, but that isn't my decision to make. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to fixing the layering issue, it is unfortunate that there are a lot of affected files and we didn't get to stop this from snowballing much earlier.
It is pretty late in the current release cycle, but for users that jump from one LTS release to the next, getting this sorted out before the release means that LTS v3 users will be better positioned for the next LTS release.
Not only is it late in the cycle but it still is a tree-wide change so I believe the process linked in a previous comment would still need to be followed, and this potential change communicated to a wider audience ASAP. |
Yes, the process requires the arch wg to approve, and then when it is stamped, an email should be sent out to the devel mailing list before it is merged, etc. If the arch wg does not come to a consensus it goes to the TSC, etc. For the arch wg to approve, there would need to be consensus, and I think that would be the challenging part; some people really like defending bad decisions. It would also need some buy-in from the release manager... but... 🤷🏼♂️ It is very late in the release cycle for all of this. Of course; this should have been fixed many, many, releases ago 🤷🏼♂️ |
Define k_off_t to have native support for measuring file offsets within any kind of file that is supportable in Zephyr. CONFIG_OFFSET_64BIT is added to allow 32-bit users to force 64-bit offsets, when trading space and speed for flexibility makes sense for the application. Define k_ssize_t to have native support for reporting file I/O sizes along with negative error codes. Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
Update sof to use k_ssize_t instead of the POSIX ssize_t, thus avoiding layering violations in core parts of Zephyr. Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
A shortcut was taken many years ago by "cherry-picking" the POSIX off_t and ssize_t types for use throughout Zephyr. Additionally, the POSIX header fcntl.h, as well as constants defined in that header, were being used inappropriately throughout Zephyr. Doing so created a dependency cycle: below POSIX's position in the stack, code depended on POSIX; above POSIX's position in the stack, code depends on the lower parts. Such dependency cycles usually result in fragility and instability of the software stack. Use the newly defined k_off_t and k_ssize_t types throughout Zephyr's stack, where off_t and ssize_t were previously used inappropriately. Additionally, use ZVFS-prefixed constants instead of their POSIX counterparts. Additionally, where appropriate, ensure the POSIX fcntl.h header is prefixed with <zephyr/posix/fcntl.h>. We effectively create a mutual dependency to resolve the cyclic dependency, as described in GitHub issue zephyrproject-rtos#51211. For clarification, it is inappropriate to use POSIX types or functions within the kernel, core OS, OS services, or really anywhere that is not equal to or above POSIX's position in the software stack. In other words, if a library uses POSIX, Zephyr's POSIX implementation cannot depend on that library. Similarly, if a system service uses POSIX, Zephyr's POSIX implementation cannot depend on that system service. Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
502a493
to
641a504
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From ESP32 side and memory discussion, LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooph, that's a lot of change. Might as well +1 just so we can get this in. In the longer term, maybe we should look at moving usage from ssize_t (AIUI a POSIX type) to ptrdiff_t (ISO C). Not sure whether off_t has an equivalent in the standard library, though? Probably not, so I guess a k_off_t is unavoidable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops forgot vote
ssize_t len; | ||
k_ssize_t len; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really correct? This driver is for native_sim, and some 10 lines later the return value from the host libc's read()
is assigned to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The definitions should be identical (as far as the underlying type goes). Otherwise, there would be warnings / errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't doubt that. I was mainly wondering where a native_sim
driver like this conceptually sits in the software stack. It's calling the host OS APIs (like read()
) so in that sense it could be seen to be above POSIX (in which case ssize_t
would be more appropriate), however the driver also accesses Zephyr kernel APIs. Anyway, this isn't necessarily that critical (as you say, the types should be the same), which is why I didn't want to block this PR by requesting changes, but I do think it would be good to understand where native_sim
driver code sits in the software stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it's kind of a recursion. I would still say those drivers sit below (Zephyr's) POSIX API.
The namespace issue is kind of an unfortunate side-effect (not without solutions, ofc)
Duh, I forgot stdio. So interestingly fread/fwrite() use |
Technically size_t would be the same already for 64 bit (without the sign) |
Not at all -- On 32-bit Linux systems, |
Aren't standards wonderful?? 😂🙃 |
The PR is reaching the limit of what GH can handle, opening it in the browser just takes forever :( from the commit message of the largest commit:
Can this be done in its own commit?
ditto
ditto |
@@ -866,6 +866,23 @@ config XIP | |||
supply a linker command file when building your image. Enabling this | |||
option increases both the code and data footprint of the image. | |||
|
|||
config OFFSET_64BIT | |||
bool "Use 64-bit offsets" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be great if this went into its own section in Kconfig (menu), i.e. do not put alongside other features and make it obvious is it a type, maybe others will be added to the same menu later.
OFFSET_64BIT
can be confusing, maybe make it obvious we are talking about a type here: TYPE_OFFSET_64BIT
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. The closest thing I found to this so far is TIMEOUT_64BIT
. Should that be moved as well? It also has to do with a type (k_timeout_t
).
|
Although fdtable sat a layer below posix, as part of the base os, it was reaching up a layer for POSIX types, creating another dependency cycle.
Instead of forcing POSIX types to be used, simply add the native Zephyr types,
k_off_t
andk_ssize_t
. Remove<sys/types.h>
and<fcntl.h>
inclusion below the POSIX layer, where appropriate.Generally, it would be best to use only ISO or native zephyr types below POSIX.
(Would have fixed #75205 with a very small workaround)
Fixes #75513
Fixes #10436
Update:
To future-proof Zephyr in terms of file, disk, and device compatibility, a Kconfig option was added that allows the user to explicitly use a 64-bit
k_off_t
, even if the underlying hardware is only 32-bit.