Skip to content

[libc][stdio] Add fopen_s and bootstrap annex k. #152248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

bassiounix
Copy link
Contributor

@bassiounix bassiounix commented Aug 6, 2025

RFC https://discourse.llvm.org/t/rfc-bounds-checking-interfaces-for-llvm-libc/87685

TL;DR

Add support for C11 Annex K "Bounds-checking interfaces" to LLVM libc.

What changed?

This PR adds initial support for C11 Annex K "Bounds-checking interfaces" to LLVM libc, including:

  1. Added new types required by Annex K:

    • errno_t
    • rsize_t
    • constraint_handler_t
  2. Implemented constraint handler functions:

    • set_constraint_handler_s
    • abort_handler_s (default handler)
    • ignore_handler_s
  3. Added helper macros and utilities for constraint violation handling

  4. Implemented the first Annex K function:

    • fopen_s - a safer version of fopen
  5. Added necessary infrastructure for future Annex K function implementations

How to test?

  1. Build LLVM libc with the new Annex K functions enabled
  2. Write a test program that uses fopen_s and the constraint handler functions
  3. Verify that constraint violations are properly handled by the default abort_handler_s
  4. Test changing the constraint handler with set_constraint_handler_s

Example:

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
#include <stdlib.h>

int main() {
  FILE *fp;
  errno_t err = fopen_s(&fp, "test.txt", "r");
  if (err != 0) {
    printf("Failed to open file\n");
    return 1;
  }
  // Use file...
  fclose(fp);
  return 0;
}

Why make this change?

C11 Annex K provides bounds-checking interfaces that help improve security and safety in C programs. These functions are designed to prevent buffer overflows and other common security vulnerabilities. Adding support for these interfaces in LLVM libc makes it more compliant with the C11 standard and provides developers with safer alternatives to traditional C library functions.

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@bassiounix bassiounix force-pushed the users/bassiounix/spr/08-06-_libc_add_fopen_s branch from 2802a39 to 0466b6a Compare August 6, 2025 05:03
@bassiounix bassiounix requested a review from frobtech August 6, 2025 05:37
@bassiounix bassiounix changed the title [libc] Add fopen_s. [libc][stdio] Add fopen_s and bootstrap annex k. Aug 6, 2025
@bassiounix bassiounix force-pushed the users/bassiounix/spr/08-06-_libc_add_fopen_s branch from 66ec35a to 90fc6a7 Compare August 6, 2025 08:40
@bassiounix bassiounix force-pushed the users/bassiounix/spr/08-06-_libc_add_fopen_s branch 2 times, most recently from ec4924b to 2b051fe Compare August 8, 2025 13:16
Comment on lines 12 to 13
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
defined(__STDC_WANT_LIB_EXT1__) && __STDC_WANT_LIB_EXT1__ == 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want the check for __STDC_VERSION__? I'm thinking about the case where someone in C++ is setting the __STDC_WANT_LIB_EXT1__ macro before including a C standard library header; that should still work for them, right?

(https://eel.is/c++draft/library#headers-10 relates)

Copy link
Contributor Author

@bassiounix bassiounix Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they should be able to include it yes. this could conflict with the compiler and make it unavailable. What I'm trying to do is to make sure no one before C11 can use them, if the first check is removed, then it's available in every C version.

If there is a C++ equivalent macro I'd put it here.

Also since this is C++ related, I wonder whether we should make this change here regarding C++ or in libcxx. I can't answer this question, but if it's the second case, we have to share code with libcxx as with #147386 (or this is my analogy anyway)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a deeper question for llvm-libc: what is the policy regarding symbols and language standard versions? Will the symbols only be declared for the standardized range of standards versions? Or will the symbols be declared in all standard versions because the symbol will exist within the binary regardless (we aren't vending a C11-specific version of the library that's different from the C23-specific version, right?).

Copy link
Contributor Author

@bassiounix bassiounix Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the only 3 places where I could find __STDC_VERSION__:

It's macro related stuff only, not core code.

So from what I found, I think the answer is: no, not really.

I could remove this specific check, but isn't this against the specification? shouldn't we be as compatible with the standard as possible?
We talked earlier about "standard-compliant" implementation and MSVC, that's why I am bringing it here again.
Michael said that the goal is not to provide the most compatible implementation libc but a useful libc.

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could remove this specific check, but isn't this against the specification? shouldn't we be as compatible with the standard as possible?

I think it's not actually against the specification, but that doesn't mean it's a good idea. :-D Because standards before C11 had no notion of __STDC_WANT_LIB_EXT1__, and because that macro is in the reserved identifier space, we can decide that the user defining that macro to 1 means they get Annex K in modes before C11. The user is explicitly opting into a language extension in that case.

Where it gets tricky is, if the user doesn't define that macro then fopen_s is not a reserved identifier in modes before C11, which means the user can have an external symbol with that name. So having the symbols in the library at all make for the potential to "steal" an identifier from the user.

But the reason I brought this up in regards to C++ is because __STDC_VERSION__ is not defined in C++ and so having a check for that alone would mean you can't get to these interfaces from C++. We could also address that by adding || defined(__cplusplus) though.

Copy link
Contributor Author

@bassiounix bassiounix Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha.

For the latter, I'll let @michaelrj-google and other code owners decide on this matter, I can't say which one should it be atm.

Also it'd be nice to have a versioned libc for each language version. If the community wants it and it has a use case , I can start looking into achieving that! This makes me curious on how other libcs implements this and is it a specified behavior in the standard or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also address that by adding || defined(__cplusplus) though.

Btw is it specified which version of C++ it's available on? like can we say C++03 has access to annex k?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I'm in favor of having one library and only using C version selection where necessary. I believe that the reason for the version checks for the char types is due to them being undefined in some versions of C but primitive types in recent C++.

If a user wants to filter by C standard version or potentially by POSIX we'd probably need to use what headergen has: a list of functions tagged by standard and version. I think that's doable, but I'm not sure it's useful right now. I don't expect that many programs to define their own versions of functions from later C standards, then complain when it doesn't work.

@bassiounix bassiounix force-pushed the users/bassiounix/spr/08-06-_libc_add_fopen_s branch from 1128759 to 1e8e8df Compare August 8, 2025 16:28
@bassiounix bassiounix marked this pull request as ready for review August 8, 2025 17:24
Copy link

github-actions bot commented Aug 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@bassiounix bassiounix force-pushed the users/bassiounix/spr/08-06-_libc_add_fopen_s branch from ef58e7b to d0931ec Compare August 9, 2025 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants