Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification on scan inclusions and exclusions #422

Closed
MikeEdgar opened this issue Apr 19, 2020 · 9 comments · Fixed by #533
Closed

Clarification on scan inclusions and exclusions #422

MikeEdgar opened this issue Apr 19, 2020 · 9 comments · Fixed by #533

Comments

@MikeEdgar
Copy link
Member

MikeEdgar commented Apr 19, 2020

The MP OpenAPI specification currently defines four configuration keys that deal with including or excluding packages or classes during annotation scanning.

  1. mp.openapi.scan.packages - the list of packages to scan
  2. mp.openapi.scan.classes - the list of classes to scan
  3. mp.openapi.scan.exclude.packages - the list of packages to exclude from scans
  4. mp.openapi.scan.exclude.classes - the list of classes to exclude from scans

I would be helpful to (1) explicitly define the precedence for these configurations (and enforce with TCK tests) and (2) define whether the inclusion of a package or exclusion of a package is recursive. Leaving the detail of package recursion as an implementation decision is a problem for portability.

Relates to smallrye/smallrye-open-api#289

@EricWittmann
Copy link
Contributor

@arthurdm Any chance you can track down the other implementations so they can describe how they implement this feature. That way we can determine what the current behavior is across the implementations. Perhaps we can converge (or have already converged) and thus codify that behavior in the spec.

That said, on the hangout call today it was suggested that we allow regular expressions in these properties. That seems like a very good idea - then we just need to make it clear the order of operations (presumably include first then exclude).

@arthurdm
Copy link
Member

hey @EricWittmann / @MikeEdgar - sorry for the long delay in replying.

The Open Liberty (Swagger-based) relevant code is here.

From that we can infer:

  • exclusion takes precedence over inclusion (since it alters the list after)
  • packages are recursive (uses "startsWith" on class names)

@MikeEdgar
Copy link
Member Author

Payara also seems to use startsWith for packages also, but filters first using full class names, which is different.

Payara code

Here are my thoughts for how this should be logically defined in the specification and tested with TCK cases. These should be handled sequentially, and exit as soon as one is found to be true.

  1. A class is excluded if listed in mp.openapi.scan.exclude.classes
  2. A class is included if listed in mp.openapi.scan.classes
  3. A class is excluded if its package or any parent package is listed in mp.openapi.scan.exclude.packages unless a more complete package or parent package is named in mp.openapi.scan.packages
  4. A class is included if its package or any parent package is listed in mp.openapi.scan.packages
  5. A class is included if mp.openapi.scan.classes and mp.openapi.scan.packages are both empty/undeclared.

@EricWittmann
Copy link
Contributor

My only issue with this is that it may be mutually exclusive with the idea of using regular expressions. Because I don't think you can do (3) with regular expressions - it's not possible to determine "a more complete package". But if we decode to not support regex then I'm cool with this approach.

@MikeEdgar
Copy link
Member Author

Do you think it would make sense to introduce two additional properties for regex? I really like the idea of supporting regex, but my concern would be around introducing breaking behavior for applications and/or requiring implementations to figure out if a property value is a pattern or just a comma-separated list of package names. If the value contains a comma, should it be split into multiple regex patterns for matching(?), is the presence of a comma an indicator that it is not a regex(?), etc.

  • mp.openapi.scan.exclude.pattern: pattern to match against a class's fully qualified name for exclusion
  • mp.openapi.scan.pattern: pattern to match against a class's fully qualified name for inclusion

I would expect these properties to be handled somewhere in (3) of the steps I listed above. Perhaps there would need to be some rules around the "quality" as determined by java.util.regex.Matcher#group where the longest matching section wins (just brain storming).

CC: @phillip-kruger in case you're not seeing this thread and want to weigh in.

@phillip-kruger
Copy link
Member

How about we treat it as a string except if it explicitly starts with ^ (https://docs.oracle.com/javase/tutorial/essential/regex/bounds.html) then it's a regex ?

@MikeEdgar
Copy link
Member Author

That sounds like a nice idea - and I would even add "or ends with $". That would give both "starts with" or "ends with" functionality. This can be something we experiment with in SmallRye as well as I don't think it should break backward compatibility.

Azquelt pushed a commit to Azquelt/microprofile-open-api that referenced this issue Mar 17, 2022
…lrye.config-smallrye-config-1.8.5

Bump smallrye-config from 1.8.4 to 1.8.5
@MikeEdgar
Copy link
Member Author

We discussed this issue in a recent team call and the agreement was that the current behavior of the SmallRye scanner will be documented in the spec and a TCK test added to verify the order of handling the include/exclude configuration properties. Comments/suggestions for changes in the PR (once opened) will certainly be considered.

@MikeEdgar MikeEdgar added this to the MP OpenAPI 3.1 milestone Mar 26, 2022
@Azquelt
Copy link
Member

Azquelt commented May 16, 2022

What was the use case for wanting to be able to use regex here?

My initial thought is that regex seems like a bad fit for matching package names:

  1. The . character has a different special meaning in each case
  2. Package names are hierarchical and when you're selecting packages, it's generally going to be picking subtrees out of this hierarchy, whereas regex is designed for doing complex matching on all sorts of text

I particularly wouldn't want support for regex to come at the cost of not allowing a more specific match to take priority over a less specific one. Consider the following rules:

  • include com.example.a
  • exclude com.example.a,b
  • include com.example.a.b.c

I think it's fairly important that these rules have the following result:

  • com.example.a.MyClass is included
  • com.example.a.b.MyClass is excluded
  • com.example.a.b.c.MyClass is included

Implementations are always free to define some way of supporting the use of regex here but I don't see any need to add it to the spec unless there are clear use cases for it. When we look at the use cases, we might decide for example that a simpler and more restrictive wildcard syntax is more appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants