Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove shared getDockerBuildTask to enable parallel docker builds #16384

Merged
merged 18 commits into from
Sep 10, 2022

Conversation

evantahler
Copy link
Contributor

@evantahler evantahler commented Sep 6, 2022

After a hot tip from the creator of the Docker Gradle plugin we are using, I removed our shared getDockerBuildTask gradle task and moved it into each project. This did allow our docker builds to run in parallel!
* Before - 18:30
* After - 8:40

Note how in the build scans the docker tasks in "after" ran in parallel 🥳

To achieve ^ cleanly, we define logic in root build.gradle to create a task in each subproject if the subproject contains dockerImageName in the gradle.properties file.

Some caveats:

  • we also define a copyGeneratedTar task that is applied to all subprojects with the dockerImageName property. Although this do not need to be used in each subproject.
  • we leave it up to each subproject to define what task is depends on since some projects have more custom copy logic.

@github-actions github-actions bot added area/platform issues related to the platform area/server area/worker Related to worker area/frontend Related to the Airbyte webapp labels Sep 6, 2022
@evantahler
Copy link
Contributor Author

evantahler commented Sep 6, 2022

@davinchia I tried to sort out how we might share this Gradle method outside of the main build.gradle file (https://stackoverflow.com/questions/18715137/extract-common-methods-from-gradle-build-script#18718802) and I really can't get anything to work... any tips?

@davinchia
Copy link
Contributor

Some notes:

  • I think this is happening because we are lazily creating a task, which is not evaluated until Gradle is in the execution phase. Because of this, Gradle doesn't know it can parallelise the task.
  • I think the way around this is to create a task in the buildSrc that isn't lazily evaluated. Didn't have time today to look into that.

@evantahler
Copy link
Contributor Author

@davinchia I've hit the end of my Gradle abilities (I tried messing around in buildSrc for a while). I've cleaned things up a bit, but there's still a lot of repeated code.

I suggest we merge this in now, so we get the speed-up, and then in the future when you've got some time, a refactor can take place.

@evantahler evantahler marked this pull request as ready for review September 8, 2022 17:08
@evantahler evantahler requested review from a team as code owners September 8, 2022 17:08
@davinchia
Copy link
Contributor

@evantahler i'm pairing with @jdpgrailsdev later. I want to spend some time tomorrow trying something before we merge this in. That sound ok?

@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 17:10 Inactive
@evantahler evantahler self-assigned this Sep 8, 2022
@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 22:36 Inactive
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Sep 9, 2022
@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 04:07 Inactive
@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 04:10 Inactive
buildArgs.put('JDK_IMAGE', openjdkImage)
buildArgs.put('VERSION', buildVersion)
}
localDockerBuildTask.dependsOn(copyScripts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdpgrailsdev I believe what we did today works!

One hiccup is we have submodules whose copy tasks are on scripts instead of the generated tar. I have some idea on how to fix this but want to do in person as it's faster.

@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 04:14 Inactive
build.gradle Outdated
Comment on lines 289 to 293
// If subprojects have the dockerImageName property configured in their gradle property file,
// register:
// 1) A task to copy the generated tar.
// 2) A task to build the docker image that depends on the generated tar.
// 3) Make the docker image task depend on the default assemble task.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fancy!

@github-actions github-actions bot removed the area/documentation Improvements or additions to documentation label Sep 9, 2022
@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 21:19 Inactive
@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 21:26 Inactive
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Sep 9, 2022
@davinchia davinchia temporarily deployed to more-secrets September 9, 2022 21:34 Inactive
Copy link
Contributor

@git-phu git-phu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

the old def Task getDockerBuildTask always felt like a hack anyway

@evantahler
Copy link
Contributor Author

Thanks for getting this done @davinchia!

method takes 2 arguments:
- The image name, for example if `foo` is given as an image name, the image `airbyte/foo` will be created
- The project directory folder. It is needed because the `getDockerBuildTask` is declared in the rootProject
The top level `build.gradle` file defines several convenient tasks for building a docker image.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@git-phu how does this look?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! Just one comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@davinchia davinchia temporarily deployed to more-secrets September 10, 2022 00:54 Inactive
Copy link
Contributor

@git-phu git-phu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! Just one comment

docs/contributing-to-airbyte/developing-on-docker.md Outdated Show resolved Hide resolved
method takes 2 arguments:
- The image name, for example if `foo` is given as an image name, the image `airbyte/foo` will be created
- The project directory folder. It is needed because the `getDockerBuildTask` is declared in the rootProject
The top level `build.gradle` file defines several convenient tasks for building a docker image.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! Just one comment

Co-authored-by: Peter Hu <peter@airbyte.io>
@davinchia davinchia merged commit 834ac1a into master Sep 10, 2022
@davinchia davinchia deleted the evan/gradle-docker-speedup branch September 10, 2022 01:40
@davinchia davinchia temporarily deployed to more-secrets September 10, 2022 01:42 Inactive
@subodh1810
Copy link
Contributor

subodh1810 commented Sep 15, 2022

@evantahler @davinchia I think this PR broke my local build, for all the gradle modules which dont have the application plugin configured (Applying the Application plugin also implicitly applies the Distribution plugin), the task distTar is unknown and since the new subProject logic makes the copyGeneratedTar task depend on the task distTar (which is not available), the build is fails. I dont know if its just me or others are facing this as well and how is this working on the CI.

    if (subproj.hasProperty('dockerImageName')) {
        project.logger.lifecycle("configuring docker task for $subproj.name")

        // Although this task is defined for every subproject with the dockerImageName property,
        // It is not necessarily used for all subprojects. Non-TAR producing projects can ignore this.
        tasks.register("copyGeneratedTar", Copy) {
            dependsOn copyDocker
            dependsOn distTar

The error that am getting (this is just for airbyte-cli but the same error pops up for all the modules which dont have the application plugin)

Could not create task ':airbyte-cli:copyGeneratedTar'.
> Could not get unknown property 'distTar' for task ':airbyte-cli:copyGeneratedTar' of type org.gradle.api.tasks.Copy.

I fixed it by adding the distribution plugin to all such modules (see PR #16781) , but am not sure if thats the right thing to do. I think we might need to re-tweak the logic of copyGeneratedTar and not make it depend on the distTar task for all the modules.

Let me know what you think of it.

@davinchia
Copy link
Contributor

@subodh1810 this isn't showing up for me or on the build runners. Wonder if it's specific to you. Can you try ./gradlew clean build to see if that fixes it?

robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
…irbytehq#16384)

Define logic in root build.gradle to create a task in each subproject if the subproject contains dockerImageName in the gradle.properties file.

Some caveats:
- We also define a copyGeneratedTar task that is applied to all subprojects with the dockerImageName property.
- This does not need to be used in each subproject. We leave it up to each subproject to define what task is depends on since some projects have more custom copy logic.
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
…irbytehq#16384)

Define logic in root build.gradle to create a task in each subproject if the subproject contains dockerImageName in the gradle.properties file.

Some caveats:
- We also define a copyGeneratedTar task that is applied to all subprojects with the dockerImageName property.
- This does not need to be used in each subproject. We leave it up to each subproject to define what task is depends on since some projects have more custom copy logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/documentation Improvements or additions to documentation area/frontend Related to the Airbyte webapp area/platform issues related to the platform area/server area/worker Related to worker team/platform-move
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants