Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GenerateProtoTask should support incremental build and more thorough clean #33

Closed
Shad0w1nk opened this issue Aug 9, 2015 · 17 comments · Fixed by #636
Closed

GenerateProtoTask should support incremental build and more thorough clean #33

Shad0w1nk opened this issue Aug 9, 2015 · 17 comments · Fixed by #636
Assignees
Milestone

Comments

@Shad0w1nk
Copy link

The current implementation of GenerateProtoTask will not support scenario where a proto file is renamed or even deleted. In that scenario, the old generated file will be left behind and may cause unwanted build success or failure (by duplication). The task should take an IncrementalTaskInputs parameters to remove stalled output. It will also be able to build only the proto file that as changed - very useful in big proto source set.

@zhangkun83
Copy link
Collaborator

We should probably add @Input and @OutputDirectory annotations to the two tasks too.

@zhangkun83
Copy link
Collaborator

@brunobowden

@zhangkun83 zhangkun83 self-assigned this Aug 20, 2015
@zhangkun83
Copy link
Collaborator

WRT deleted or moved source files, currently protoc doesn't provide any information on what files it will generate, and I don't foresee this will happen easily. We cannot precisely delete the corresponding output files, and we can only delete the whole output directory. This leads to my concern about the safety of deleting the whole directory (#24). @ejona86 what's your take on this?

@brunobowden
Copy link

Gradle does provide information like this. We already use it in the J2ObjC Gradle Plugin:
https://github.com/j2objc-contrib/j2objc-gradle/blob/master/src/main/groovy/com/github/j2objccontrib/j2objcgradle/tasks/TranslateTask.groovy#L100 - though to do this properly with the plugin, you also need to know what are the expected output files and how to remove them.

My suggestion is that you start with a system where you just do a clean build if there's been a significant enough change. Most proto changes are within a file or a new file, which should be fine. It's removing a file that'll likely cause issues.

Bruno

@Shad0w1nk
Copy link
Author

Several ways can give us the knowledge. The easiest way would be providing a patch to protoc to give it to us. I don't see any verbose flag, could it be something the guys at protoc would be willing to add or accept a pull request for?

If that is out of the question, we can always use hooking in the process to figure out the output. However this is hard to deal with the various OS.

We could also output the files into a known to be clean folder and move them to the target folder. If we build each file individually (invoking the compiler for each file) then we could figure out what are the output for each input files.

Solution 1 would be ideal and optimal. Solution 3 is a quick and easy hack. I would try to avoid solution 2 as it contains too much support burden. Let me know what you guys think.

@brunobowden
Copy link

I was a little surprised that the output isn't that predictable. Can each
.proto file have multiple message's and hence multiple output java files?

My concern is that getting any information from the protoc could be flaky.
Outputting the files to a known clean folder would always be robust. It
gets trickier within Gradle as the outputs should point to the final
destination. So if any file is altered after the build, then doing another
build will restore the file to its' original state.

On Thu, Aug 20, 2015 at 7:39 PM Daniel Lacasse notifications@github.com
wrote:

Several ways can give us the knowledge. The easiest way would be providing
a patch to protoc to give it to us. I don't see any verbose flag, could it
be something the guys at protoc would be willing to add or accept a pull
request for?

If that is out of the question, we can always use hooking in the process
to figure out the output. However this is hard to deal with the various OS.

We could also output the files into a known to be clean folder and move
them to the target folder. If we build each file individually (invoking the
compiler for each file) then we could figure out what are the output for
each input files.

Solution 1 would be ideal and optimal. Solution 3 is a quick and easy
hack. I would try to avoid solution 2 as it contains too much support
burden. Let me know what you guys think.


Reply to this email directly or view it on GitHub
#33 (comment)
.

@Shad0w1nk
Copy link
Author

A verbose flag from protoc could help us to piece together what happen to each input file. This can be flaky as we depend on purely informational output from the compiler.

The clean folder output will work best with least work and wouldn't affect the up to date information as we don't tag the clean temporary folder with Gradle as output. We only tag the target folder or the target file (final destination) as output and the move occurs as the task action instead of a seperate task. Gradle execute the @TaskAction then snapshot the output. The intermediate destination would be invisible to Gradle. However, this method do introduce more IO corresponding to the move. Although, as long as we do an actual move as opposed to writing a new file and deleting the old one, the performance impact should be very minimal as the file system will optimize the IO operations.

Without depending on a modification to protoc, the clean folder technique would fix the problem. I would probably put the associating knowledge base file in the target output folder as it shield against someone manually deleting the build directory when the output dir is not in that directory. So the clean and incremental compilation will keep on working.

Don't hesitate to ask any questions.

@zhangkun83 zhangkun83 changed the title GenerateProtoTask should support incremental build GenerateProtoTask should support incremental build and more thorough clean Aug 28, 2015
@zhangkun83
Copy link
Collaborator

Just confirmed with protobuf team. Adding the "verbose flag" to protoc is not a trivial change, as it requires changes to all code generators and plugins.

However, I think this could work:

When build:

  1. Generate to a clean temporary directory
  2. Save the list of generated files in a manifest file under build
  3. Move the generated files to the target directory

When clean:

  1. Find the manifest file under build, and delete the generated files in the target directory
  2. Run built-in clean task, which deletes build

It gets trickier within Gradle as the outputs should point to the final
destination. So if any file is altered after the build, then doing another
build will restore the file to its' original state.

@brunobowden I am not sure I get it. Is my proposal affected by the issue you described?

@Shad0w1nk
Copy link
Author

Your proposition is very good and quite easy to implement. I would suggest to put the manifest in the target directory instead. This way if someone deletes the build directory manually (without going through gradle), no generated files will be leak as the next time gradle runs, it will be able to figure out which file were not deleted by the manual cleanup. The plugin will then be able to recover and remove dangling files. Typical scenario would be:
1- Run the generation task once which create files inside a target folder outside of the build directory.
2- Manually delete the build directory but not the external target folder with the generated files.
3- Rename or delete a proto file.
4- Run the generation task again which should pickup the change and remove the dangling files.

This way the plugin will never fall into a unknown state where it could be difficult for a new user to debug to the expense of creating an extra metadata file in the target folder. I think it's a fair trade for stability.

@brunobowden
Copy link

Sorry for the belated follow up on this. Firstly, this system would work easily without a manifest file, if everything was isolated to the build directory. Is that possible? That way, it's much easier to manage state. We do this with the J2ObjC Gradle Plugin for example. The idea is that the proto files are the only thing checked in the repository and then everything else is generated from that.

Here's a few things to help:

  1. Firstly use Gradle's built in system for task specific temporary directories:
AbstractTask.getTemporaryDir()
// Points to: build/tmp/<TaskName>
  1. Secondly, Gradle appears to keep track of all the output files. It gets the correct list, even if there are other tasks or content written to the @OutputDirectory by other tasks. Since it's internal state, I'm not sure we can rely on it with confidence.
AbstractTask.getOutputs().getPreviousFiles()

IncrementalTaskInputs will certainly improve the performance but doesn't solve the wider issue.

@brunobowden
Copy link

That last link is where I've asked a question publicly about this issue:
http://stackoverflow.com/questions/33358115/tracking-and-deleting-output-files-from-a-gradle-task

@brunobowden
Copy link

There's now an answer to the post that I agree with:

http://stackoverflow.com/questions/33358115/tracking-and-deleting-output-files-from-a-gradle-task/33361143#33361143

The plugin will work best if all output is done to the build directory, as specified by the AbstractTask.getTemporaryDir(). For example with Dagger2, it populate the generated code to build/classes/main where it's automatically used.

For the protobuf plugin, if you get IncrementalTaskInputs and nothing has been deleted (i.e. only edit and new protos), can you be efficient and recalculate only those outputs? I'm wondering if just an edit can delete a generated file, which is a hassle. If you get an incremental delete... or no incremental information, then you have to delete everything and regenerate.

It should be a requirement that the target folder contains nothing except for the output. Though this can be configured to be outside the build folder, I think that should get a warning and encouragement to change it.

vjeyakum pushed a commit to TetrationAnalytics/protoc-jar-maven-plugin that referenced this issue May 10, 2016
- Generate to a temp directory
- Recursively copy files whose contents have changed

This was the suggestion here, to get around the difficulty of figuring
out dependencies of a protobuf, and its target output files, as
described here:
google/protobuf-gradle-plugin#33
vjeyakum added a commit to TetrationAnalytics/protoc-jar-maven-plugin that referenced this issue May 10, 2016
- Generate to a temp directory
- Recursively copy files whose contents have changed

This was the suggestion here, to get around the difficulty of figuring
out dependencies of a protobuf, and its target output files, as
described here:
google/protobuf-gradle-plugin#33
@shevek
Copy link

shevek commented Dec 2, 2017

You don't even have to do incremental to fix the primary issue: don't get ratholed on it. Just making it specify its output files, and have stable generation will mean that even if protoc reruns, gradle will see that the output files haven't changed and won't recompile them unnecessarily.

Right now, a global clean and recompile of the protoc files would be better than the current situation - this is the only gradle plugin in a VERY LARGE project which can't elegantly update its outputs based on a change of inputs, at any cost.

@shevek
Copy link

shevek commented Jan 1, 2018

Somewhat related to #180 - correcting the gradle annotations will fix (most of) both.

@shevek
Copy link

shevek commented Jan 4, 2018

Possible workaround, although highly non-ideal in a Gradle world:

        generateProtoTasks {
                all().each { task ->
                        task.doFirst {
                                task.allOutputDirs*.delete()
                        }
                }
        }

@NikolayMetchev
Copy link

Possible workaround, although highly non-ideal in a Gradle world:

        generateProtoTasks {
                all().each { task ->
                        task.doFirst {
                                task.allOutputDirs*.delete()
                        }
                }
        }

with the latest Gradle 6.4.1 I get the following error if I try and use the workaround:

Could not get unknown property 'allOutputDirs' for task ':proto:generateProto' of type com.google.protobuf.gradle.GenerateProtoTask.

@NikolayMetchev
Copy link

This worked for me:

generateProtoTasks {
        all().each { task ->
            task.doFirst {
                delete(task.outputs)
            }
        }
}

ejona86 added a commit to ejona86/protobuf-gradle-plugin that referenced this issue Nov 5, 2022
Fixes google#33 for users not changing generatedFilesBaseDir. Stop documenting
generatedFilesBaseDir since it produces poor results, but keep it
functional with its existing Copy semantics. Fixes google#332
ejona86 added a commit that referenced this issue Jan 5, 2023
Fixes #33 for users not changing generatedFilesBaseDir. Stop documenting
generatedFilesBaseDir since it produces poor results, but keep it
functional with its existing Copy semantics. Fixes #332

This deprecates the setter, but I don't know how to make this trigger a warning,
for either Groovy and Kotlin. Both seem to ignore it.

Turn off UnnecessaryObjectReferences because it isn't providing value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants