Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance the expressiveness of buildExtensions with allowing directory moves #1689

Closed
natebosch opened this issue Jul 27, 2018 · 22 comments · Fixed by #3152
Closed

Enhance the expressiveness of buildExtensions with allowing directory moves #1689

natebosch opened this issue Jul 27, 2018 · 22 comments · Fixed by #3152
Labels
P2 A bug or feature request we're likely to work on type-enhancement A request for a change that isn't a bug

Comments

@natebosch
Copy link
Member

Build extensions covers a large set of use cases and has a simple implementation, but doesn't cover more complex situations.

We should consider selectively expanding what we can express with specific use cases that are likely to happen. One that has come up a couple times is the idea of parallel directory structure. For instance writing outputs to /web/docs when the inputs are in /docs - but keeping the remaining paths identical.

When we originally proposed the build extension scheme we did suggest the configuration language could be expanded - our criteria was just that it was static configuration over a snippet of code.

Pros:

  • Not overly complicated to support for a big jump in possibilities.
  • Still high on convention so does not drastically increase how hard it is to understand the build (with the exception of generate_for/sources interaction noted below). May make some builds easier to understand if they are today working around this with hacks.

Cons:

  • Would be a breaking change to introduce a new field on Builder to hold this information.
  • Will be an extra configuration option that most builders probably won't use - expanding the configuration surface area is always costly.
  • We deal with build extensions in a few different places and we'd need to be careful that they are all updated in sync.
  • The interaction between generate_for and a target's sources is already hard to understand - this would make it even harder.

I think if we can find a few use cases that would be solved by something simple here we should consider it.

@bagintz
Copy link

bagintz commented Oct 11, 2018

I would like this, and if I ever have child I will name them "Dart" after you guys if you can make it happen......

@natebosch
Copy link
Member Author

@bagintz - can you describe your use case?

@bagintz
Copy link

bagintz commented Oct 12, 2018

It is simply to keep all the generated files in a single, separate folder to keep my models "clean" and easier to read. It is not a show stopper, but a matter of convenience.

@natebosch
Copy link
Member Author

Thanks for the info. Note that if we are able to deliver the feature as described in this specific issue it would be a configuration made by the builder author and not a setting for the end user. A configuration to allow you to choose to do this with builders you didn't write would require #683 as well.

There was a similar request made and a longer discussion at google/json_serializable.dart#306

@jakemac53
Copy link
Contributor

Alright, I was thinking about this some recently, and I have a very rough idea that I think would probably be expressive enough to be pretty general, while not being to complex or difficult to implement.

Essentially, the idea is to allow capture groups inside the input extensions, which can be filled in arbitrarily into the output extensions.

I haven't thought much about the syntax, but lets say for example you mark a capture group with {{<identifier>}}, and you impose the following rules:

  • Identifiers have to be unique
  • Every capture group in the input extensions has to appear in all output extensions
  • Capture groups can't appear at the beginning or end of the input or output extensions

So, given one of the initial examples, to output to web/docs with inputs from docs, you could do something like the following:

builders:
  my_builder:
    build_extensions:
      docs/{{0}}.dart:
        - web/docs/{{0}}.md

I believe these could directly translate to a regex, so the above for instance would translate to this regex docs\/(.*).dart$. We can then run that regex on all inputs instead of doing a suffix match.

This essentially begs two fairly obvious questions (and more I'm sure):

  • Is it fast enough
  • How do we do builder ordering (today we use the input/output extensions)

In terms of speed I think we will just have to measure. It should only really affect initial asset graph build time so that shouldn't be difficult.

For builder ordering I am thinking we could just use the last token as the extension, and treat that the same way as normal input/output extensions today. So in the above example it would run after any generator that produces .dart file, and before anything that consumes .md files.

cc @natebosch @nshahan @grouma

@natebosch
Copy link
Member Author

I really like the idea of having a capture group in the extensions config. It buys a lot of expressiveness for not too much conceptual complexity and I think we can smoothly upgrade to that without impacting existing config.

I don't think we can use regexes for this, they don't exist in starlark as far as I know and I'd still like for this to be compatible with bazel. If we limit it to one capture group then we can implement it with a prefix/suffix match easily enough. I think we could use {{}} as the placeholder on both sides, and only allow the one.

Some questions:

  1. Would we allow the capture group to contain directory separators? Can docs/{{}}.dart match docs/something/foo.dart and output to web/docs/something/foo.dart?
  2. If you have a prefix does your match implicitly start at the root of the package or can it be nested? Can docs/{{}}.dart match something/docs/foo.dart and output to something/web/docs/foo.dart?

Also a minor nit:

in the above example it would run after any generator that produces .dart file, and before anything that consumes .md files.

That's not the way it works today, your input extensions don't impact ordering, only required_inputs as a separate config.

@jakemac53
Copy link
Contributor

  1. Would we allow the capture group to contain directory separators? Can docs/{{}}.dart match docs/something/foo.dart and output to web/docs/something/foo.dart?

Yes, I think its critical to support this to support entire directory moves (you can't know the full structure as a builder, and only supporting top level dirs would be unfortunate).

2. If you have a prefix does your match implicitly start at the root of the package or can it be nested? Can docs/{{}}.dart match something/docs/foo.dart and output to something/web/docs/foo.dart?

Maybe we should allow the user to choose, possibly by following regex syntax (preceding $ to indicate it should be the root?)

@jakemac53
Copy link
Contributor

That's not the way it works today, your input extensions don't impact ordering, only required_inputs as a separate config.

Oh, I totally thought we did that... 🤔.

@SAGARSURI
Copy link

Any update on this? Spreading generated code throughout the main project is not maintainable and looks messy.

@natebosch
Copy link
Member Author

There is no update yet and we don't have any planned work to address this.

As a reminder, this particular issue impacts capability as a builder author. If you are hoping for the ability as a user to be able to shift around generated code that will not be feasible even if we add the feature requested in this issue.

@smkhalsa
Copy link

smkhalsa commented Nov 16, 2020

Hi there. I'm the author of https://github.com/gql-dart/ferry, and this feature would add a lot of value for our users.

We currently generate several files per user-generated .graphql file. Our users generally want to keep their GraphQL operations next to the widgets that use those operations, and the source directory can quickly become cluttered.

As a workaround, we've been recommending that users define their .graphql files in a /graphql subdirectory to keep the generated files separate:

lib/
  src/
    components/
      graphql/
        all_pokemon.graphql
        all_pokemon.ast.gql.dart # generated
        all_pokemon.data.gql.dart # generated
        all_pokemon.data.gql.g.dart # generated
        all_pokemon.req.gql.dart # generated
        all_pokemon.req.gql.g.dart # generated
        all_pokemon.var.gql.dart # generated
        all_pokemon.var.gql.g.dart # generated
      all_pokemon.dart

However, if a user wants to edit the .graphql file, they still have to wade through all the generated files to find it.

Ideally, we'd keep the Widget and the .graphql file in the same directory and build all the generated files to a subdirectory.

@jakemac53
Copy link
Contributor

@smkhalsa would the "capture groups" proposal I had above support what you want to do? What would your ideal file layout be?

@smkhalsa
Copy link

smkhalsa commented Nov 16, 2020

@jakemac53 yes, I believe so.

Here's an example of one of our builder's build_extensions:

build_extensions:
  '.graphql':
    - '.data.gql.dart'

Which results in the following directory structure:

lib/
  src/
    components/
      graphql/
        all_pokemon.graphql
        all_pokemon.data.gql.dart # generated
        all_pokemon.data.gql.g.dart # generated
      all_pokemon.dart

So by modifying our build_extensions like so...

build_extensions:
  '{{0}}.graphql':
    - 'graphql/{{0}}.data.gql.dart'

...and pulling our all_pokemon.graphql source file out of the subdirectory, we could achieve our desired structure:

lib/
  src/
    components/
      graphql/
        all_pokemon.data.gql.dart # generated
        all_pokemon.data.gql.g.dart # generated
      all_pokemon.dart
      all_pokemon.graphql

@jakemac53
Copy link
Contributor

That mostly sounds good, I did have a restriction in place originally that you couldn't start with a capture group, although to be completely honest I can't remember why 😆 . It does seem necessary for what you would want to do, which also seems like a common thing to want to solve with this.

I think this basically boils down to the 2nd question that @natebosch had originally.

@smkhalsa
Copy link

smkhalsa commented Nov 16, 2020

For us it would have to match nested directory structures, not just the root, since our objective is to colocate graphql operations (and their generated data classes) next to widgets.

@smkhalsa
Copy link

@jakemac53 anything I can do to push this forward?

@jakemac53
Copy link
Contributor

@smkhalsa I think we have some general agreement here on at least a path forward. I think what needs to happen next is somebody working up a prototype that we could play with.

I am not sure when myself or other owners here would have the time to do that. We do generally accept contributions, but this would be a fairly large feature with a lot of risk in terms of it never landing, and needing thorough testing. So it wouldn't be for the faint of heart haha.

@smkhalsa
Copy link

It seems like this issue keeps coming up not only for users of the build package itself (#1689, #2732, #3126, #1132), but also for users of many downstream packages.

For example:
dart-lang/source_gen#508
dart-lang/webdev#1200
gql-dart/ferry#3
simolus3/drift#836
google/json_serializable.dart#484
google/json_serializable.dart#370
google/json_serializable.dart#347
google/json_serializable.dart#306

Any chance this could get added to the roadmap for an upcoming release?

@simolus3
Copy link
Contributor

I started playing around with this, but I wasn't sure if I got the semantics right. Maybe someone can help clarify if this is the behavior we want, I can then start working on a draft PR to evaluate.

The way I see it is that we essentially split build inputs into an (optional) directory matcher and a file extension matcher, right? So lib/{{}}.dart matches all Dart files recursively under lib/. Here, the directory matcher is lib/{{}} and the extension matcher is *.dart, easy enough. But what if there's no capture group? Is something like lib/.dart a valid build input that matches all Dart files a folder named lib? If so, would that break backwards-compatibility? If not, how do we determine what parts of the build input are responsible for matching directories?

Also, should {{}} match greedily? Let's say I declare build_outputs: {lib/{{}}.dart: [docs/{{}}.md]}. And let's say there's an asset with the path lib/src/lib/foo.dart. Does the builder output docs/src/lib/foo.md or lib/src/docs/foo.md? Should we let a builder author choose?

Do we want to allow a directory part in the build outputs if no directory matcher is present in the build input? E.g, is build_outputs: {'.dart': ['generated/.dart']} allowed? It could match lib/src/foo.dart and expect lib/src/generated/foo.dart as an output.

Do we want to allow a directory part in the build input if none is used in the build output? Could a builder specify build_outputs: {'lib/{{}}.dart': ['.g.dart']} to have something equivalent to {.dart: [.g.dart]} and a generate_for option? It was mentioned that "Every capture group in the input extensions has to appear in all output extensions" but I don't see why that's strictly necessary? I can see why something like $lib/{{}}.dart: [$docs/.md] should be forbidden because it can lead to the same output having multiple primary inputs, but maybe we can be more lenient?

@jakemac53
Copy link
Contributor

The way I see it is that we essentially split build inputs into an (optional) directory matcher and a file extension matcher, right? So lib/{{}}.dart matches all Dart files recursively under lib/. Here, the directory matcher is lib/{{}} and the extension matcher is *.dart, easy enough.

I wouldn't think of it as a directory/extension matcher really. You could for instance have extensions like: lib/foo_{{}}.dart: [docs/foo_{{}}.dart]. This is really just a substring match for lib/foo_, followed by a capture group, and a suffix match of .dart. The substring match can be anything, not just a directory, and its not a strict prefix match either (there can be stuff before it, so for instance bar/lib/foo_test.dart would match that input.

Basically, we retain the principle that the entire build extension key is still a suffix match, we just allow some wildcard capture groups within it.

As discussed above we probably do want to consider adding a special identifier (maybe ^) to indicate that it should be a prefix match as well as a suffix match (basically it needs to match the entire path). So you could do ^lib/foo_{{}}.dart. This would be usable from any build extension as a general feature, and it would enforce that there must be a prefix match as well as a suffix match for the extension.

But what if there's no capture group? Is something like lib/.dart a valid build input that matches all Dart files a folder named lib? If so, would that break backwards-compatibility? If not, how do we determine what parts of the build input are responsible for matching directories?

Those should work the same as normal extensions today, so lib/.dart would only match files whose path ends with exactly lib/.dart (probably none). The goal here is to have no changes at all for existing build extension configuration, and make this just a natural extension of what you are allowed to do in them.

Also, should {{}} match greedily? Let's say I declare build_outputs: {lib/{{}}.dart: [docs/{{}}.md]}. And let's say there's an asset with the path lib/src/lib/foo.dart. Does the builder output docs/src/lib/foo.md or lib/src/docs/foo.md? Should we let a builder author choose?

This should output docs/src/lib/foo.md I think. But that is a good question. I don't think we really want to make it configurable, we could always add that later if necessary.

Do we want to allow a directory part in the build outputs if no directory matcher is present in the build input? E.g, is build_outputs: {'.dart': ['generated/.dart']} allowed? It could match lib/src/foo.dart and expect lib/src/generated/foo.dart as an output.

I don't quite understand this one, maybe you were just missing the capture groups. So for instance I do think '{{}}.dart': ['generated/{{}}.dart'] should probably be allowed. However I did have this rule - capture groups can't appear at the beginning or end of the input or output extensions - which would preclude that, but I don't remember why... .

Do we want to allow a directory part in the build input if none is used in the build output? Could a builder specify build_outputs: {'lib/{{}}.dart': ['.g.dart']} to have something equivalent to {.dart: [.g.dart]} and a generate_for option?

The equivalent of this would be {'lib/{{}}.dart': ['lib/{{}}.g.dart']}. You do need to use the capture group in the output extension to get unique names. Remember that output extensions completely replace the matching input extensions, so what you had for an output extension would just create a top level file with a path of .g.dart.

It was mentioned that "Every capture group in the input extensions has to appear in all output extensions" but I don't see why that's strictly necessary? I can see why something like $lib/{{}}.dart: [$docs/.md] should be forbidden because it can lead to the same output having multiple primary inputs, but maybe we can be more lenient?

This goes to the explanation of the last one, if you don't use them in the output extensions you will end up with conflicting files. I don't know of any valid use case for it.

jakemac53 pushed a commit that referenced this issue Aug 2, 2021
Support capture groups for build extensions.

Closes #1689.
@jakemac53
Copy link
Contributor

This is now released with package:build version 2.1.0 and package:build_runner version 2.1.0 (you will also need package:build_runner_core 7.1.0 but that should be handled for you by other constraints).

Note that if you use build_test, you will also need to update to version 2.1.3 of that to get support.

Big thanks to @simolus3 for the implementation! 🥳 🚀

@jakemac53
Copy link
Contributor

Btw the docs for the new feature are here https://github.com/dart-lang/build/blob/master/docs/writing_a_builder.md#capture-groups

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 A bug or feature request we're likely to work on type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants