Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-5785] fmppMain tasks use absolute paths for input hashes and don't work with remote build caching #3272

Closed
wants to merge 1 commit into from

Conversation

akshayabd
Copy link
Contributor

The :babel:fmppMain and :server:fmppMain tasks currently do not work with remote build caching.

One of the inputs is a directory and currently it's configured to look at the absolute path instead of the relative path. This causes different input hashes to be computed on different machines, hence the tasks do not work correctly with remote build caching.

This PR addresses the issue by making the hashes for input directories look at the relative path.

@akshayabd
Copy link
Contributor Author

cc @clayburn @runningcode

@akshayabd akshayabd changed the title [CALCITE-5785] Fix fmppMain tasks to work with remote build caching [CALCITE-5785] Update fmppMain tasks to work with remote build caching Jun 17, 2023
@akshayabd
Copy link
Contributor Author

I've updated the commit message to match guidelines - is there a way to re-trigger the check?

@akshayabd akshayabd changed the title [CALCITE-5785] Update fmppMain tasks to work with remote build caching [CALCITE-5785] [CALCITE-5785] Update fmppMain tasks to use relative paths for input hashes so they work with remote build caching Jun 17, 2023
@akshayabd akshayabd changed the title [CALCITE-5785] [CALCITE-5785] Update fmppMain tasks to use relative paths for input hashes so they work with remote build caching [CALCITE-5785] Update fmppMain tasks to use relative paths for input hashes so they work with remote build caching Jun 17, 2023
@ILuffZhe
Copy link
Contributor

I've updated the commit message to match guidelines - is there a way to re-trigger the check?

Hi @akshayabd . You can ask on the dev list to get a CI approval for your first contribution.

@libenchao
Copy link
Member

I've approved the CI for this PR. To re-trigger the CI if there is no code changes, you can add a empty commit to your branch, see more in the contributing guide

@libenchao
Copy link
Member

The intention of this PR is clear, and it looks great to me. However, I'm not familiar with gradle build caches, I'm curious that is there anything that could show the effect of this change?

@akshayabd
Copy link
Contributor Author

Currently you can reproduce this issue by doing the following:

  1. Clone the project in 2 different directories
  2. Disable local caching for both clones and ensure remote caching is enabled
  3. Execute a build in the first clone
  4. Execute a build in the second clone at the same commit

You will notice that most tasks that are cacheable are not executed in the 2nd build, the outputs are fetched from the remote cache. However the fmppMain tasks, though they are marked as cacheable, will re-execute and the outputs are not fetched from the remote cache.

This is because the input cache key uses the absolute paths of the codegen directory, which will be different for the 2 clones. The cache key should use the relative paths. You can read more about this below in the Gradle docs:

https://docs.gradle.org/current/userguide/build_cache.html#header
"Wrongly declared task inputs can lead to cache misses especially when containing volatile data or absolute paths."

And here:
https://docs.gradle.org/current/userguide/build_cache.html#using_annotations
"However, input files are identified by default by their absolute path. So if the cache needs to be shared between several developers or machines using different paths, that won’t work as expected. So we also need to set the path sensitivity. In this case, the relative path of the input files can be used to identify them."

I've also attached a couple of screenshots from build scans in the JIRA issue that show the input cache key differences:
https://issues.apache.org/jira/browse/CALCITE-5785

--

At the end of the day this PR impacts build performance only. The fmppMain tasks will not be executed if the outputs can be gotten from the cache - which will now correctly happen because of how the input cache key is computed.

@akshayabd
Copy link
Contributor Author

@ILuffZhe @libenchao I've sent a message on the dev list (dev@calcite.apache.org) to introduce myself

@julianhyde
Copy link
Contributor

@akshayabd, You changed 'fix' to 'update' but you missed the point. The point is to describe the problem, not the fix, in your jira summary and commit message.

The problem, is that remote caching doesn't work, right?

Can a committer please fix the commit message and jira summary?

@akshayabd akshayabd changed the title [CALCITE-5785] Update fmppMain tasks to use relative paths for input hashes so they work with remote build caching [CALCITE-5785] fmppMain tasks use absolute paths for input hashes and don't work with remote build caching Jun 19, 2023
@akshayabd
Copy link
Contributor Author

@julianhyde Thanks - I've amended the commit message and PR title - please let me know if there are other changes that need to be made

@sonarcloud
Copy link

sonarcloud bot commented Jun 19, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@akshayabd
Copy link
Contributor Author

Looks like the change has been merged in:
f48393c

I'm going to close this PR.

@akshayabd akshayabd closed this Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants