Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extension: allow recursive macros #65851

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

kelbon
Copy link
Contributor

@kelbon kelbon commented Sep 9, 2023

Add 'define2' directive, which works as 'define', but allows recursive macros

Motivation:

  • There are huge amount of code, which uses code generation/ boiler plate macros /misused terrible templates which is basically 'for each token' or somehow may be solved with this new feature
  1. Nlohmann json:
    https://github.com/nlohmann/json/blob/836b7beca4b62e2a99465edef44066b7401fd704/include/nlohmann/detail/macro_scope.hpp#L320

  2. boost preprocessor:
    https://github.com/boostorg/preprocessor/blob/develop/include/boost/preprocessor/seq/detail/limits/split_1024.hpp

  3. boost pfr:
    (codegen)
    https://github.com/boostorg/pfr/blob/develop/include/boost/pfr/detail/core17_generated.hpp

  4. data_parallel_vector:
    https://github.com/kelbon/AnyAny/blob/4b056be2b6cbcfa1a407f7ee75279af414e390e4/include/anyany/noexport/data_parallel_vector_details.hpp#L62

  • Its easily may be used for what 'magic enum' do, in many cases it can replace reflection ( because many who want reflection actually just want to create a JSON schema without specifying names twice )

  • C++20 adds __VA_OPT__, which is designed for recursive macros, but there are no such thing in C++!

Examples:

fold
#define2 $fold_right(op, head, ...) ( head __VA_OPT__(op $fold_right(op, __VA_ARGS__)) )
#define2 $fold_left(op, head, ...) ( __VA_OPT__($fold_left(op, __VA_ARGS__) op) head )

static_assert($fold_right(+, 1, 2, 3) == 6);
// error: static assertion failed due to requirement '((((4) + 3) + 2) + 1) == 4'
static_assert($fold_left(+, 1, 2, 3, 4) == 4);
reverse token stream
#define2 $reverse(head, ...) __VA_OPT__($reverse(__VA_ARGS__) , ) head

// works as expected
constexpr int A[] = { $reverse($reverse($reverse(1, 2, 3))) };
constexpr int B[] = { 3, 2, 1 };
static_assert(A[0] == B[0] && A[1] == B[1] && A[2] == B[2]);
transform token stream ( literaly for each )
#define2 $transform(macro, head, ...) macro(head) __VA_OPT__($transform(macro, __VA_ARGS__))

#define $to_string(tok) #tok,

constexpr const char* names[] = {
  $transform($to_string, a, b)
#undef $to_string
};
static_assert(names[0][0] == 'a' && names[1][0] == 'b');
calculate count of tokens
#define2 TOKCOUNT_IMPL(head, ...) (1 __VA_OPT__(+ TOKCOUNT_IMPL(__VA_ARGS__)))
// works for zero args too
#define $tok_count(...) (0 __VA_OPT__(+ TOKCOUNT_IMPL(__VA_ARGS__)) )

static_assert($tok_count() == 0);
static_assert($tok_count(1, 2, (4, 5, 6)) == 3);
boost pfr without code generation
// placeholders for actual calculations
template<typename T>
consteval int aggregate_size() { return 3; }
constexpr int tie(auto&... args) { return sizeof...(args); }

#define2 $try_expand(value, head, ...)                                            \
if constexpr (aggregate_size<decltype(value)>() == $tok_count(+1, __VA_ARGS__)) { \
  auto [head __VA_OPT__(,) __VA_ARGS__] = value;                                  \
  return tie(head __VA_OPT__(,) __VA_ARGS__);                                     \
}                                                                                 \
__VA_OPT__($try_expand(value, __VA_ARGS__))

constexpr auto magic_get(auto aggregate) {
  $try_expand(aggregate, _3, _2, _1);
}

struct abc { int a, b, c; };
static_assert(magic_get(abc{}) == 3);

Here magic get expands to (screenshot from clangd builded with this patch)

image

infinite recursion macro:
#define2 A A
// produces 'error: unknown type name 'A'' (expanded to 'A')
// A

image

@kelbon kelbon requested review from a team as code owners September 9, 2023 14:59
@llvmbot llvmbot added clang Clang issues not falling into any other category clang-format clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 9, 2023
@kelbon kelbon changed the title add define2 pp directive Extension: allow recursive macros Sep 9, 2023
@cor3ntin
Copy link
Contributor

cor3ntin commented Sep 9, 2023

Compiler extensions are best proposed through a RFC, ie a discourse post here:

https://discourse.llvm.org/c/clang/6

Explaining the motivation for the feature, which the pr description already does a good job.

There are questions about the name, theorically this is an identifier that c and c++ committees may want to use in the future.

You should also consider whether this needs extensions flags and extension warnings.

Is this something you plan to propose to other compilers? Standardization?

It does sound like an interesting feature though, and the implementation appear reasonable.

You should also consider recursion limits and other edge cases.

In terms of the code itself, it's missing unit tests

@kelbon
Copy link
Contributor Author

kelbon commented Sep 9, 2023

There are questions about the name, theorically this is an identifier that c and c++ committees may want to use in the future.

Yes, the name define2 is a placeholder, but the more I look at it, the more it looks appropriate

Is this something you plan to propose to other compilers? Standardization?

Maybe, but I dont think standard committee will consider the proposal without reference implementation(if its not a modules ofc)) and, in case of preprocessor, usage in 'real world'

You should also consider whether this needs extensions flags and extension warnings.
missing unit tests

i will research how to do it in llvm

@rymiel
Copy link
Member

rymiel commented Sep 9, 2023

How would you deal with the issue of infinite loops? The following program:

#define2 boom(X) boom(X)

boom(0)

Will continue to infinitely(?) consume memory until it is killed, either by the user or due to running out of memory on the system

@kelbon
Copy link
Contributor Author

kelbon commented Sep 9, 2023

How would you deal with the issue of infinite loops? The following program:

#define2 boom(X) boom(X)

boom(0)

Will continue to infinitely(?) consume memory until it is killed, either by the user or due to running out of memory on the system

As for other similar things like templates/inheriting/includes must be recursion depth limit, but in this case "MacroInfo" required to be 40 bytes(and code dont says for what, but i expect some runtime sized structs etc), so now i think how to do it better

@AaronBallman
Copy link
Collaborator

I'd like to echo what @cor3ntin suggested above -- this needs an RFC to be posted to Discourse so the community is aware of the potential new language extension. We have a set of criteria for what we're looking for in such an RFC: https://clang.llvm.org/get_involved.html#criteria -- having the patch is a great start though as you can link to the patch from the RFC to give people a more concrete idea of what's being proposed.

Yes, the name define2 is a placeholder, but the more I look at it, the more it looks appropriate

I'm not certain it's a particularly good name in terms of standardization as there's no way for users to distinguish between define and define2.

Maybe, but I dont think standard committee will consider the proposal without reference implementation(if its not a modules ofc)) and, in case of preprocessor, usage in 'real world'

There's a chicken-and-egg problem. The committee wants to see implementation experience and the community wants to see support from the committee for larger ideas (we don't really want to add language extensions over objections from a standards committee unless there's sufficiently compelling reasons to do so).

I have not yet had the chance to review the code changes, but I do have some concerns at a high-level. Preprocessor language extensions are a bit difficult because of adoption issues. Unless other implementations also implement the same extension, a significant number of folks wind up needing to use preprocessor conditionals to skip the new feature and write fallback code so their macros remain portable, but once you write that fallback code, you no longer need the extension code in the first place. When writing your RFC, it would be good to mention whether you've been talking to other implementations and what their thoughts, opinions, and concerns are. It's also worth noting that you can write recursive macros in C, though it's not easy without use of a macro library like P99.

@kelbon
Copy link
Contributor Author

kelbon commented Sep 11, 2023

this needs an RFC to be posted to Discourse so the community is aware of the potential new language extension

I will create it soon

I'm not certain it's a particularly good name in terms of standardization as there's no way for users to distinguish between define and define2.

I think this should be determined during the discussion, another possible name is define_recursive (but its not requireid to be recursive..)

Unless other implementations also implement the same extension, a significant number of folks wind up needing to use preprocessor conditionals to skip the new feature and write fallback code so their macros remain portable

Yes, thats problem, every extension even not in preprocessor have same problems, but I think there are many projects now which use only clang and if people like the idea possibly it will be gcc extension too

It's also worth noting that you can write recursive macros in C

As far as I know, the maximum that can be done is to code-generate many very difficult-to-understand constructs that allow you to repeat your code up to N times, creating the illusion of recursion and I would like to fix it

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions
Copy link

github-actions bot commented Sep 24, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@kelbon
Copy link
Contributor Author

kelbon commented Sep 24, 2023

Version with __THIS_MACRO__ instead of new preprocessor directive is ready, so actual examples are like

#define reverse(head, ...) __VA_OPT__(__THIS_MACRO__(__VA_ARGS__) , ) head

__THIS_MACRO__ behaves exactly as macro name(reverse in this case), but its allowed to be expanded recursively

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants