Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8260976: investigate ways to filter jextract output #469

Closed

Conversation

@mcimadamore
Copy link
Collaborator

@mcimadamore mcimadamore commented Mar 12, 2021

This patch adds new filtering capabilities to jextract, along the lines of what discussed in [1] (thanks Duncan for all the insightful comments on the topic!).

The patch adds some new options:

  • --include-{union, struct, var, function, typedef} = <symbol-name>

When one (or more) options like these are specified, jextract goes into some kind of avdanced filtering mode, and will only emit bindings for the specified symbols. Note: when no --include-xyz option is used, jextract behaves exactly like before - e.g. everything is extracted.

To help the user with the filtering process, another option has been added:

  • --dump-includes <file-name>

When this option is used, jextract will not emit any bindings - instead, it will dump all the included symbol on the specified file. The file is a simple text file, organized as follows (the following has been obtained extracting OpenGL):

#### Extracted from: /usr/include/KHR/khrplatform.h

--include-macro KHRONOS_BOOLEAN_ENUM_FORCE_SIZE    # header: /usr/include/KHR/khrplatform.h
--include-macro KHRONOS_FALSE                      # header: /usr/include/KHR/khrplatform.h
--include-macro KHRONOS_MAX_ENUM                   # header: /usr/include/KHR/khrplatform.h
--include-macro KHRONOS_SUPPORT_FLOAT              # header: /usr/include/KHR/khrplatform.h
--include-macro KHRONOS_SUPPORT_INT64              # header: /usr/include/KHR/khrplatform.h
--include-macro KHRONOS_TRUE                       # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_boolean_enum_t           # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_float_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_int16_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_int32_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_int64_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_int8_t                   # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_intptr_t                 # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_ssize_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_stime_nanoseconds_t      # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_uint16_t                 # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_uint32_t                 # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_uint64_t                 # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_uint8_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_uintptr_t                # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_usize_t                  # header: /usr/include/KHR/khrplatform.h
--include-typedef khronos_utime_nanoseconds_t      # header: /usr/include/KHR/khrplatform.h

#### Extracted from: /usr/include/alloca.h

--include-function alloca    # header: /usr/include/alloca.h
--include-macro _ALLOCA_H    # header: /usr/include/alloca.h

#### Extracted from: /usr/include/endian.h

--include-macro BIG_ENDIAN       # header: /usr/include/endian.h
--include-macro BYTE_ORDER       # header: /usr/include/endian.h
--include-macro LITTLE_ENDIAN    # header: /usr/include/endian.h
--include-macro PDP_ENDIAN       # header: /usr/include/endian.h
--include-macro _ENDIAN_H        # header: /usr/include/endian.h

In other words, the generated file contains all the symbols that have been inclided, in the form of "options". In fact, the generated file can be played back into jextract - assuming foo.conf is the name of our file, the we can do the following:

jextract @foo.conf Foo.h

Multiple @file can be used with jextract, and can also be mixed with explicit command-line options (all this support predates the work discussed in this patch, but makes it very worthwhile); this is also the same syntax supported by other launchers like javac.

The output file should be stable - e.g. all entries are grouped by header file (and header files are sorted alphabetially); whithin each header, entries are sorted by category (e.g. variable vs. function) and then also alphabetically within same category. This should help users finding things quickly.

In addition, each line has a comment which states from which header is the symbol coming from. I found this super helpful to define custom header-based filtering scheme using grep and regex (similar to what --filter used to do).

The implementation is relatively straightforward - other than adding support for the new option, it was mostly matter of making sure that OutputFactory does notgenerate the filtered out symbols.

[1] - https://mail.openjdk.java.net/pipermail/panama-dev/2021-March/012429.html


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Issue

  • JDK-8260976: investigate ways to filter jextract output

Reviewers

Download

$ git fetch https://git.openjdk.java.net/panama-foreign pull/469/head:pull/469
$ git checkout pull/469

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Mar 12, 2021

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into foreign-jextract will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Mar 12, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 12, 2021

Webrevs

Copy link
Member

@JornVernee JornVernee left a comment

LGTM.

One minor suggestion inline.

@openjdk
Copy link

@openjdk openjdk bot commented Mar 12, 2021

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8260976: investigate ways to filter jextract output

Reviewed-by: jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 88 new commits pushed to the foreign-jextract branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the foreign-jextract branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Mar 12, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 14, 2021

Mailing list message from Duncan Gittins on panama-dev:

Maurizio

This works very well and I've abandoned my local jextract edits as I can
use jextract as-is to filter out unused symbols. The per line #header
comments are very helpful to filter with findstr/grep and to understand
where symbols come from.

As you mentioned in JDK-8260976, point #7 will help to warn if there are
un-used --includes, plus I think it would useful in all cases if
jextract reported a one line totals summary "Generated X symbols of Y,
filtered out (Y-X)" to give quick / easy feedback on the impact after a
command line or config file change.

Kind regards

Duncan

On 12/03/2021 14:14, Maurizio Cimadamore wrote:

@mcimadamore
Copy link
Collaborator Author

@mcimadamore mcimadamore commented Apr 8, 2021

/integrate

@openjdk openjdk bot closed this Apr 8, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Apr 8, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Apr 8, 2021

@mcimadamore Since your change was applied there have been 467 commits pushed to the foreign-jextract branch:

Your commit was automatically rebased without conflicts.

Pushed as commit 04160de.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants