Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare task input properties as binary #962

Closed
wants to merge 23 commits into from
Closed

Conversation

lacasseio
Copy link
Contributor

@lacasseio lacasseio commented Dec 2, 2016

There are various use cases where user defined Serializable task input
properties caused ClassNotFoundException (#784). This is
solved by hashing the task input properties and saving those hash for
the up-to-date check during future build session (#919).
This new behavior deprecates equals method on any user-defined type
used as task input properties.

Open Issues

  • How do we announce the breaking change regarding equals method on user-defined types?

This change is Reviewable

Copy link
Member

@adammurdoch adammurdoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't just silently ignore equals() on custom types, as this is a serious breaking change that people won't notice. Things will just silently sometimes be out-of-date.

To deal with this, for types (or every type) that have a custom equals() implementation, we should store the serialized object (not the hash). When we need to compare 2 values, we would deserialise the value using the new value's Classloader and compare both the binary form and call equals(). We would warn the user when these give different results, but honour the result of equals().

In 5.0 we would stop using equals() and use the binary form only.


println "Flushing InMemoryTaskArtifactCache"
gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't use internal API in these tests. Instead, we should reproduce the issue in the same way a user would, which is to change the classpath for the build script and then run the build again. The simplest option is to simply change the build script to add/remove/change some code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried appending code to the build script among other similar change but I couldn't reproduce what InMemoryTaskArtifactCache.invalidateAll does. In this case, I want to force Gradle to grab what is on disk and deserialize everything again in order to catch the issues where the wrong class loader is used. I copied over these line from a previous test.

We could use --no-daemon flag, however, I wasn't able to make it work correctly. This issue will usually happen when the cache is flushed by the daemon which I haven't found a trivial deterministic way to execute.


InputPropertiesSerializer(ClassLoader classloader) {
this.serializer = new MapSerializer<String, Object>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<Object>(classloader));
this.serializer = new MapSerializer<String, HashCode>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<HashCode>(classloader));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to use DefaultSerializer any more, as it uses java serialisation which is really inefficient. We should use the more efficient serialisation we use elsewhere for hashes, implemented in HashCodeSerializer.

@lacasseio
Copy link
Contributor Author

Thanks @adammurdoch for the very good code review. I'm thinking of implementing a wrapper for the input property to take care of the logic for choosing the binary form vs the actual Object serialization/deserialization.

My only concern is how to correctly detect that a type chain overrides the equals method. For example, a String class overrides it but it's fine alright so we shouldn't emit a warning or care about. Suppose class Foo doesn't override the equals method but his parent class, which happen to be another custom type, override the equals method, how can we make sure we detect this case. Also, if we parent chain goes all the way up to a class outside of the user's control, how do we deal with this case.

Because of this concern, I'm leaning toward wrapping the input property in a class that would deserialize as needed using the classloader of the object we are trying to compare. It was one of the solutions we discuss.

What do you think should be the way to go given my concern above?

@adammurdoch
Copy link
Member

We don't really care whether equals() is overridden or not, and who overrides it. What we care about is the case where a.equals(b) and hash(a).equals(hash(b)) return a different value. So, we could just do this check for all values regardless of their equals() implementation.

For certain types, we already know that a.equals(b) and hash(a).equals(hash(b)) return the same result for all values: instances of String, Number, File, collections and maps (but not arrays) of these things. So for these types we can short-circuit the deserialization, which is expensive both in time and memory usage and just store and compare the hash.

A wrapper to take care of the type specific behaviour makes a lot of sense here.

Thinking about the overridden equals() case, it actually wouldn't make sense to use the default Object.equals() for a value you are using as a task input, as if you did not the task would always be out-of-date. So let's not bother checking whether equals() is overridden or not, as it almost always will be.

@adammurdoch
Copy link
Member

Actually, collections and maps don't always return the same value for equals and comparing the hash. Only those that maintain order do.

@eriwen eriwen modified the milestones: 4.0, 3.3 RC1 Dec 9, 2016
@eriwen eriwen assigned lacasseio and unassigned eriwen Dec 12, 2016
Daniel Lacasse added 4 commits December 27, 2016 13:26
There is various use case where user defined Serializable task input
properties caused ClassNotFoundException (#784). This is
solved by hashing the task input properties and saving those hash for
the up-to-date check during future build session (#919).
This new behavior deprecates equals method on any user-defined type
used as task input properties.
The wrapper object handle the serialization as well as deserialization
of the task input property. It also choose between a binary only
implementation or a full type implementation for equality.
Copy link
Contributor Author

@lacasseio lacasseio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change requested have been done except for the cache invalidation. I would need some help to correctly fix.


println "Flushing InMemoryTaskArtifactCache"
gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried appending code to the build script among other similar change but I couldn't reproduce what InMemoryTaskArtifactCache.invalidateAll does. In this case, I want to force Gradle to grab what is on disk and deserialize everything again in order to catch the issues where the wrong class loader is used. I copied over these line from a previous test.

We could use --no-daemon flag, however, I wasn't able to make it work correctly. This issue will usually happen when the cache is flushed by the daemon which I haven't found a trivial deterministic way to execute.

@lacasseio
Copy link
Contributor Author

As discussed, the implementation wraps the task input properties inside a class that either saves only the hash of the property or also save the serialized form of the property object. During the equality check, the property from the previous execution is deserialized and compared with it's matching property from the current run. If the hash and the Objects.equals value differs, a deprecation warning is printed.

@lacasseio
Copy link
Contributor Author

@bmuschko Would you mind having a look at this PR for any low hanging fix to let @adammurdoch focus on the higher level review of the fix?

enum FooType { FOO }

rootProject {
afterEvaluate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need afterEvaluate here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Replaced with tasks.matching.


given:
initScript << """
enum FooType { FOO }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe enums still have a problem when used as input property. Maybe we can just use class FooType implements Serializable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure which problem you are referring regarding enums as input property. This test does fail without the fix and succeeds after. Could you elaborate on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to have an issue with enums for input properties (see https://issues.gradle.org/browse/GRADLE-3537 and https://issues.gradle.org/browse/GRADLE-3018). Even though these issues are fixed, @eriwen recently mentioned that we have some sort of issue lurking. Maybe he can elaborate on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree they are linked to what this PR is trying to fix. One issue got fixed by a previous work that addressed buildSrc custom type. This PR will go one step further and address the custom type declared in build.gradle script. Using either enum or Serializable class will give the same result. I choose enum as it was more compact.

Would you want me to add an additional test specifically for Serializable class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An enum implements Serializable. So in general I'd say: "not needed". However, given the history of issues we had with enums I'd like to see another tests with a class that explicitly implements Serializable.


task createFile(type: MyTask) {
ext.outputFile = file('output.txt')
outputs.file(outputFile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define the output as property in MyTask + annotation as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.


foo = FooType.FOO

doLast {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a @TaskAction in MyTask to have all the imperative code in the task definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

buildFile << """
import org.gradle.api.internal.changedetection.state.InMemoryTaskArtifactCache

println "Flushing InMemoryTaskArtifactCache"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for println.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and removed from the other test too.

}
}

task createFile {
Copy link
Contributor

@bmuschko bmuschko Dec 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest breaking out the definition of the task into a task type and the task usage. You can represent both build script snippets by a method and reuse them across multiple test cases as you see fit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is the advantage of what you are suggesting. Although the type has the same name, they have a different purpose. The FooType in this test exercise a custom equals implementation which is irrelevant and possibly erroneous for the tests above. The same goes for the task type and usage. I do want to configure a task that exercises the up-to-date check but I don't necessary want a custom type as the task type class loader did have an effect on the ClassNotFoundException issue we are trying to solve. I think it would be wise to leave it as is. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly thinking about the task type here. If we can't use a task type, would it be possible to reuse some of the task code you have in place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hesitant to reuse the task code as I have the impression it would make the code less readable to save 6 lines of code in total. The overhead of the generic function would be greater that the complexity and number of line saved. I'm open to doing the work, I just can't see how to do it in a clean and useful way.

To put it differently, what was the one thing you found was harder to understand in the new test and, once understood, what could have made it easier to understand the first time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to strike the right balance between reusability and readability. If you feel that it makes the test code much harder to read then we can keep as is.

- Remove technical detail from deprecation warning
- Move test helper function into abstract fixture
- Fix failure for TestReportIntegrationTest
- Simplify InputProperties.create logic
Copy link
Contributor Author

@lacasseio lacasseio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed another commit addressing the code review.

@@ -66,6 +67,15 @@ private static boolean isBinaryComparableProperty(Object inputProperty) {
Object item = processingQueue.pop();

Class cls = item.getClass();
try {
// Given the equals wasn't override, use binary comparison.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

try {
// Given the equals wasn't override, use binary comparison.
if (cls.getMethod("equals", Object.class).getDeclaringClass().equals(Object.class)) {
continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea and I feel the same about continue. I did a small refactor to make the code cleaner and address your comment.

return 0
}
}

def "create a full type wrapper for custom Serializable input property"() {
static class SerializableTypeWithCustomEquals implements Serializable, Comparable<SerializableTypeWithCustomEquals> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -398,6 +398,7 @@ apply from:'scriptPlugin.gradle'
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -544,6 +544,8 @@ apply from:'scriptPlugin.gradle'
then:
executedTasks == [':createFile']
skippedTasks.empty // The equals implementation for custom type always return false
outputContains "Custom equals implementation on task input properties has been deprecated and is scheduled to be removed in Gradle 4.0. " +
"In the future, Gradle will be hashing the input property object and comparing this hash"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. I removed the technical details.

// Helper function that generate the FooType definition that can be used in the exact same way
private static String createFooTypeDefinitionAsEnum() {
return """
|enum FooType {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They just serve for whitespace management, the stripMargin remove them and every whitespace in front. I will remove them as it cause confusion.


// Helper function that generate the FooType definition that can be used in the exact same way
private static String createFooTypeDefinitionAsEnum() {
return """
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -537,4 +545,36 @@ apply from:'scriptPlugin.gradle'
executedTasks == [':createFile']
skippedTasks.empty // The equals implementation for custom type always return false
}

// Helper function that generate the FooType definition that can be used in the exact same way
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@lacasseio
Copy link
Contributor Author

Looking more into the logic to support Sets and Maps, there is a slim chance that we may wrongly detect a type to be binary comparable compatible if the first entry contains objects that don't override equals but other class hierarchy compatible to the generic type of the collections does override equals. It could cause issues when comparing for equality. I will push a commit to print the deprecation warning only if the equals of the object is overridden. This will ensure that custom type is always correctly compared.

Daniel Lacasse added 3 commits January 11, 2017 12:21
Types with default equals won't be binary compared.
The warning logging is quite complex to get right. Let's have a seperate
PR to address this specific feature.
Copy link
Contributor

@bmuschko bmuschko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fixes!

Copy link
Member

@adammurdoch adammurdoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some further changes. We need to make sure that the actual input property value is discarded as soon as we inspect the input properties at the start of the up-to-date checks. From there on, we only work with the hash and/or the serialized form. This includes:

  • not using the input property value to build the cache key
  • not retaining the input property value in the in-memory cache of the task history.

Also, as part of this change, we need to issue a deprecation warning when newInputValue.equals(deserializedPreviousValue) and newInputValueHashCode.equals(previousValueHashCode) give different results.

Can you also remove the AsyncCacheAccessContext thing, as this should no longer be required.

We should also add some test coverage that these custom types can be compared across builds where the build script changes, as this is the original issue.

@@ -39,8 +37,7 @@ class TaskInputPropertiesIntegrationTest extends AbstractIntegrationSpec {
"""

when: fails "foo"
then: failure.assertHasDescription("Could not add entry ':foo' to cache taskHistory.bin")
then: failure.assertHasCause("Unable to store task input properties. Property 'b' with value 'xxx' cannot be serialized.")
then: failure.assertHasDescription("Unable to hash task input properties. Property 'b' with value 'xxx' cannot be serialized.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also get rid of the call to TaskHistoryStore#flush() in DefaultGradleLauncher, as this should no longer be required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling that removing the call to TaskHistoryStore#flush() isn't as simple as it may be. Given the following comment in the code:

// This is not strictly necessary, as the caches are closed immediately after this. Calling flush here rethrows any write failures inside the context of the build
// Instead, failures thrown when stopping or closing a service should be treated as build failures
gradle.getServices().get(TaskHistoryStore.class).flush();

It seems to imply the flush() is required to collect any write failures and treat them as build failures. Removing that call may result in some feature loose. I would suggest addressing this independently of this PR as more code would need to be changed somewhat unrelated to this code.

given:
buildFile << """
import org.gradle.api.internal.changedetection.state.InMemoryTaskArtifactCache

println "Flushing InMemoryTaskArtifactCache"
gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not need this any more (otherwise we haven't actually fixed the problem).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the cache invalidation. Without the fix and the cache invalidation, the test won't fail. Thanks to the invalidation, we are able to make those test fail. That's the reason why I feel uneasy with removing these lines.

import org.gradle.api.internal.changedetection.state.InMemoryTaskArtifactCache

gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not need this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Same concern as above.

tryDeserialize(this, rhs);
} catch (IOException e) {
// In presence of corruption, both object cannot assert to be equal
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't discard the exceptions here, they perhaps should be logged in some form. It would be good if this information ended up in the 'running task a because ...' reason message we log (and send elsewhere), eg 'running task a because the previous value of property b could not be deserialized'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good idea. I will shelve this for today and will look more in depth how to achieve this.

import java.io.Serializable;

class DefaultInputProperty implements InputProperty {
private transient Object inputProperty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not retain the input property value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using it as a cache for 2 reasons:

  1. We don't need to deserialize the object if it was already deserialized before.
  2. In order to deserialize the input property, we need the right class loader (which was the original issue). In the implemented solution, we know that at least one of the two DefaultInputProperty will have the instance of a deserialized input property (as it's the current build session). We simply use the class loader of that object to deserialize the other one.

Reason 1 is probably negligible. Could you elaborate on how to solve reason 2 without causing ClassNotFoundException?

}
}

private static boolean isBinaryComparableProperty(Object inputProperty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do this when inspecting the task property, so that we do the inspection once, rather than once per task property value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I wasn't sure how to cleanly organize two distinct logic for equality inside a single class. The binary comparison logic is straightforward, however, the default logic is quite complex. Given that you suggest not keeping an instance of inputProperty inside DefaultInputProperty, how can we move the isBinaryComparableProperty logic inside DefaultInputProperty.equals without having access to inputProperty.getClass()?


InputPropertiesSerializer(ClassLoader classloader) {
this.serializer = new MapSerializer<String, Object>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<Object>(classloader));
this.serializer = new MapSerializer<String, InputProperty>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<InputProperty>(classloader));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not use default java serialization any more, as it's very, very inefficient. We simply need to serialize the hash and serialized bytes for each entry

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, since I looked a lot at the Serializer lately, I know what you mean. I will make the change.


import java.io.Serializable;

public interface InputProperty extends Serializable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not Serializable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I will make the change.

try {
HasherUtil.putObject(hasher, value);
} catch (NotSerializableException e) {
throw new UncheckedIOException(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should't happen. we should be appending the hash of the property value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on this, I'm not sure I fully understand?

The previous code would simply try to cast all Object to Serializable and use SerializationUtils.serialize as a last resort action. However, if the class isn't Serializable it will throw a ClassCastException. I simply tried to mimic the behavior from how input property was previously serialized without changing the behavior for BuildCacheKey. If the class isn't Serializable it throws a NotSerializableException so we can handle it as needed for BuildCacheKey and InputProperty.

@adammurdoch
Copy link
Member

Looks like we removed the deprecation warning. The logic is actually quite simple:

  1. we have the new value.
  2. deserialise the previous value
  3. call equals() on the new and previous values
  4. call equals() on the new and previous hashes (or binary content or both)
  5. if the result of 3 and 4 are not the same, complain.

@wolfs
Copy link
Member

wolfs commented Jan 15, 2017

@lacasseio We want to extract the calculation of the build cache key anyway. See #1175. We should probably sync on Monday.

@adammurdoch
Copy link
Member

The key concept to keep in mind is that the hash of the input property value is input to both the build cache key calculation and the incremental build state. The input property value is input to neither.

In other words, we need to use exactly the same logic to decide whether a property value has changed over time, regardless of whether we're using the build output cache or the output from a previous local build.

@lacasseio
Copy link
Contributor Author

Given the amount of work left to do to deliver this PR and how unsure I am with the exact impact of the issue we are trying to fix, there are three choices I see to avoid dragging this work any longer:

  1. We shelve the work. The ClassNotFoundException issue seems to impact a minority of the users as there isn't much activity on the bug report. It would enable us to focus on higher priority work. We can revisit this issue if it becomes a problem for more users or it may naturally get resolved as more development in that area is done.
  2. We strictly address the ClassNotFoundException issue by deserializing the input data only during the equality check. As we don't try to do any strategic work, the scope becomes more manageable and stay focus on fixing the original issue. Any strategical work is deprioritized and scheduled at a later date.
  3. We deem the strategical work as high priority/value and fix it now. We move forward by addressing all the issue raised here and merge this PR with its present scope.

Strategic work is important but I can't assert how much time is left on this PR as I have been wrong multiple time since this work started. Could @bmuschko and @eriwen (anyone else is welcome to pitch in too) comment on the priority this work should be attributed and what solution strategy we should be looking at?

@bmuschko
Copy link
Contributor

@lacasseio I think it would be crucial to know how much effort is left to get this PR production-ready to come to a conclusion. Maybe @adammurdoch can give some insight here.

What we do know is that we have higher priority items in the queue that need attention. I'd suggest we identify if we can improve the situation for the end user right now with the code we have in place. That would give us the opportunity to merge and follow up with a new issue to address any additional concerns. If the code in the PR is not production-ready then we might have to shelve it or find a new owner.

Copy link
Contributor Author

@lacasseio lacasseio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adammurdoch for the awesome code review. Could you elaborate a bit more on some point so I can better estimate how much more work is required on this PR. Thanks!

given:
buildFile << """
import org.gradle.api.internal.changedetection.state.InMemoryTaskArtifactCache

println "Flushing InMemoryTaskArtifactCache"
gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the cache invalidation. Without the fix and the cache invalidation, the test won't fail. Thanks to the invalidation, we are able to make those test fail. That's the reason why I feel uneasy with removing these lines.

import org.gradle.api.internal.changedetection.state.InMemoryTaskArtifactCache

gradle.taskGraph.whenReady {
gradle.services.get(InMemoryTaskArtifactCache).invalidateAll()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Same concern as above.

import java.io.Serializable;

class DefaultInputProperty implements InputProperty {
private transient Object inputProperty;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using it as a cache for 2 reasons:

  1. We don't need to deserialize the object if it was already deserialized before.
  2. In order to deserialize the input property, we need the right class loader (which was the original issue). In the implemented solution, we know that at least one of the two DefaultInputProperty will have the instance of a deserialized input property (as it's the current build session). We simply use the class loader of that object to deserialize the other one.

Reason 1 is probably negligible. Could you elaborate on how to solve reason 2 without causing ClassNotFoundException?

tryDeserialize(this, rhs);
} catch (IOException e) {
// In presence of corruption, both object cannot assert to be equal
return false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good idea. I will shelve this for today and will look more in depth how to achieve this.

}
}

private static boolean isBinaryComparableProperty(Object inputProperty) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I wasn't sure how to cleanly organize two distinct logic for equality inside a single class. The binary comparison logic is straightforward, however, the default logic is quite complex. Given that you suggest not keeping an instance of inputProperty inside DefaultInputProperty, how can we move the isBinaryComparableProperty logic inside DefaultInputProperty.equals without having access to inputProperty.getClass()?


InputPropertiesSerializer(ClassLoader classloader) {
this.serializer = new MapSerializer<String, Object>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<Object>(classloader));
this.serializer = new MapSerializer<String, InputProperty>(BaseSerializerFactory.STRING_SERIALIZER, new DefaultSerializer<InputProperty>(classloader));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, since I looked a lot at the Serializer lately, I know what you mean. I will make the change.


import java.io.Serializable;

public interface InputProperty extends Serializable {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I will make the change.

try {
HasherUtil.putObject(hasher, value);
} catch (NotSerializableException e) {
throw new UncheckedIOException(e);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on this, I'm not sure I fully understand?

The previous code would simply try to cast all Object to Serializable and use SerializationUtils.serialize as a last resort action. However, if the class isn't Serializable it will throw a ClassCastException. I simply tried to mimic the behavior from how input property was previously serialized without changing the behavior for BuildCacheKey. If the class isn't Serializable it throws a NotSerializableException so we can handle it as needed for BuildCacheKey and InputProperty.

@@ -39,8 +37,7 @@ class TaskInputPropertiesIntegrationTest extends AbstractIntegrationSpec {
"""

when: fails "foo"
then: failure.assertHasDescription("Could not add entry ':foo' to cache taskHistory.bin")
then: failure.assertHasCause("Unable to store task input properties. Property 'b' with value 'xxx' cannot be serialized.")
then: failure.assertHasDescription("Unable to hash task input properties. Property 'b' with value 'xxx' cannot be serialized.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling that removing the call to TaskHistoryStore#flush() isn't as simple as it may be. Given the following comment in the code:

// This is not strictly necessary, as the caches are closed immediately after this. Calling flush here rethrows any write failures inside the context of the build
// Instead, failures thrown when stopping or closing a service should be treated as build failures
gradle.getServices().get(TaskHistoryStore.class).flush();

It seems to imply the flush() is required to collect any write failures and treat them as build failures. Removing that call may result in some feature loose. I would suggest addressing this independently of this PR as more code would need to be changed somewhat unrelated to this code.

@adammurdoch
Copy link
Member

adammurdoch commented Jan 16, 2017

@lacasseio some comments:

I have the feeling that removing the call to TaskHistoryStore#flush() isn't as simple as it may be.

I added that comment. That call was added to make a test pass, and with the changes in the PR this call will not be required, for us to report problems serializing user provided values, which is the failure this call is trying to deal with.

Other failures, which we're not concerned about here, will still be reported.

I removed the cache invalidation. Without the fix and the cache invalidation, the test won't fail. Thanks to the invalidation, we are able to make those test fail.

This means we're missing some other coverage. We shouldn't have functional tests that use internal APIs and do not exercise Gradle in a way that a user would not.

This is the basic problem here:

  • Run the build using the daemon
  • Change script or plugin
  • Run the build using the same daemon
  • Depending on the implementation of the value's equals() method fails with a ClassCastException or other linkage exception.

This is the test case I would use.

Reason 1 is probably negligible. Could you elaborate on how to solve reason 2 without causing ClassNotFoundException?

Exactly the way you have, just without retaining the deserialized value in the entry.

Given that you suggest not keeping an instance of inputProperty inside DefaultInputProperty, how can we move the isBinaryComparableProperty logic inside DefaultInputProperty.equals without having access to inputProperty.getClass()?

Don't use equals() for the comparison.

InputPropertiesTaskStateChanges is the class ultimately responsible for doing the comparison. I'd do the work there.

The previous code would simply try to cast all Object to Serializable and use SerializationUtils.serialize as a last resort action.

This is the wrong place to do the serialization, it's too late and it duplicates the other work that happens earlier. The serialization and hashing of arbitrary objects should happen as part of calculating the input properties, so that this result can be reused both as part of the cache key and in the task history for later local builds. When we're building the cache key, we already have a hash for the input property value, and this is what we should be adding to the cache key.

Daniel Lacasse added 4 commits January 17, 2017 07:51
A new API for DiffUtil was added to support this move. This new API
abstract the equality check into an Equalizer interface.
@lacasseio
Copy link
Contributor Author

Thanks for all the detail you added in the code review @adammurdoch. I pushed most of the change to address your code review comment. I didn't polish the code and test as I want to double check with you to see if those changes align with what you had in mind.

With those changes, HasherUtil will be removed and all related change reverted. A new API was added to DiffUtil to abstract the equality check through a new interface Equalizer. The equality logic was moved to a custom Equalizer inside InputPropertiesTaskStateChanges. The InputProperty are simple container for the hash, serialized bytes and, only for the current execution, the input property Object. If you give me the ok on those changes, I will move forward with polishing and we should be able to finish the PR.

Copy link
Member

@adammurdoch adammurdoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A couple of things I think we should change.

return inputProperty.getRawValue();
}

private static boolean isEqualsMethodOverriden(Object obj) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going to be a performance issue, as lookup in reflection is quite slow, and we're doing this check for both values for every input property of every task. We could either cache the result of the lookup, or remove the check (and just warn people that 'equals()' returns a different result to comparing the hashes).


@Override
public InputProperty read(Decoder decoder) throws Exception {
byte[] serializedBytes = SERIALIZER.read(decoder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be decoder.readBinary()

@Override
public InputProperty read(Decoder decoder) throws Exception {
byte[] serializedBytes = SERIALIZER.read(decoder);
HashCode hash = HashCode.fromBytes(SERIALIZER.read(decoder));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should reuse HashCodeSerializer

@eriwen eriwen removed this from the 4.0 milestone Feb 3, 2017
@adammurdoch
Copy link
Member

This can be closed now. I ended up implementing this fix in a somewhat different way, closer to what I needed for caching dependency transforms.

@lacasseio
Copy link
Contributor Author

Thanks a lot @adammurdoch, let's close it now.

@lacasseio lacasseio closed this Feb 27, 2017
@lacasseio lacasseio deleted the dl-issue-919 branch May 31, 2017 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in:core DO NOT USE
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants