Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Class name handling for JDK unmodifiable Collection types changed in 2.9.3, 2.8.11 #1868

Closed
rwinch opened this issue Dec 20, 2017 · 20 comments
Closed
Milestone

Comments

@rwinch
Copy link

rwinch commented Dec 20, 2017

It appears that the default typing in jackson-databind-2.8.11 and jackson-databind-2.9.3 is inconsistent with prior versions. For example, the following test works in 2.9.2 and fails in 2.9.3 and 2.9.4-SNAPSHOT. The test also works in 2.8.10 but fails in 2.8.11

@Test
public void passesWith2dot9dot2AndFailsWith2dot9dot3() throws Exception {
	ObjectMapper mapper = new ObjectMapper();
	mapper.enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL, JsonTypeInfo.As.PROPERTY);

	Set<String> theSet = Collections.unmodifiableSet(Collections.singleton("a"));

	String actualJson = mapper.writeValueAsString(theSet);

	assertEquals("[\"java.util.Collections$UnmodifiableSet\",[\"a\"]]", actualJson);
}

The failure is:

org.junit.ComparisonFailure: 
Expected :["java.util.Collections$UnmodifiableSet",["a"]]
Actual   :["java.util.HashSet",["a"]]

This issue appears to be related to the build failures in spring-projects/spring-security#4918

@rwinch rwinch changed the title Jackson's default typing breaks passivity in 2.9.3 Jackson's default typing made non-passive changes in 2.9.3 Dec 20, 2017
@cowtowncoder
Copy link
Member

cowtowncoder commented Dec 20, 2017

I presume this is due to fix #1823 that is required to solve a different problem.
No tests exist to try to determine that unmodifiable set (list, map) would be retained as such, but if this used to work (surprising in a way, wouldn't have assumed it did, being internal JDK class), it'd make sense to try to make it work again.

Fortunately fix referenced only has to handle singleton case specifically, but not sure what good heuristic would be here. Chances are something else could yet fail with changes.

Going forward I would strongly recommend against counting on ability to retain specific aspects like this, given that JDK may choose to represent unmodifiable collections using some type that can not be instantiated from outside (for example, by not exposing zero-arg constructor method).

@rwinch
Copy link
Author

rwinch commented Dec 21, 2017

Thanks for the fast reply.

Going forward I would strongly recommend against counting on ability to retain specific aspects like this, given that JDK may choose to represent unmodifiable collections using some type that can not be instantiated from outside (for example, by not exposing zero-arg constructor method).

As I'm sure you know, passivity is important to ensure that serialized/deserialized values continue to work consistently. Overall the goal is to ensure if the user serializes Collections.unmodifiableSet that deserializing it produces an unmodifiable Set. The original test I produced for this issue is my attempt to simplify the problem.

More concretely, my goal is to ensure that this test will pass.

@Test
public void passesWith2dot9dot2AndFailsWith2dot9dot3() throws Exception {
	ObjectMapper mapper = new ObjectMapper();
	mapper.enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL, JsonTypeInfo.As.PROPERTY);
	SimpleModule module = new SimpleModule();
	Class type = Collections.unmodifiableSet(Collections.emptySet()).getClass();
	module.addDeserializer(type, new UnmodifiableSetDeserializer());
	mapper.registerModule(module);

	Set<String> theSet = Collections.unmodifiableSet(Collections.singleton("a"));

	String actualJson = mapper.writeValueAsString(theSet);

	Set<String> result = mapper.readValue(actualJson, Set.class);

	try {
		result.add("should fail");
		fail("Expected Exception");
	} catch (UnsupportedOperationException  success) {}
}

static class UnmodifiableSetDeserializer extends JsonDeserializer<Set> {

	@Override
	public Set deserialize(JsonParser jp, DeserializationContext ctxt) throws IOException,
			JsonProcessingException {
		ObjectMapper mapper = (ObjectMapper) jp.getCodec();
		JsonNode node = mapper.readTree(jp);
		Set<Object> resultSet = new HashSet<Object>();
		if (node != null) {
			if (node instanceof ArrayNode) {
				ArrayNode arrayNode = (ArrayNode) node;
				Iterator<JsonNode> nodeIterator = arrayNode.iterator();
				while (nodeIterator.hasNext()) {
					JsonNode elementNode = nodeIterator.next();
					resultSet.add(mapper.readValue(elementNode.traverse(mapper), Object.class));
				}
			} else {
				resultSet.add(mapper.readValue(node.traverse(mapper), Object.class));
			}
		}
		return Collections.unmodifiableSet(resultSet);
	}
}

Do you have suggestions on how to do this better?

@cowtowncoder
Copy link
Member

Just to make it clear, I don't argue against reasonable desire to preserve immutability as a property. It is clear user would want that. It is more that this is difficult for Jackson to support (more on this below).
if I was to suggest a way to achieve the goal of retaining immutability itself I would have setter take Collection (or List or Set etc) and create immutable instance, and not rely on Jackson being able to retain exact specific implementation dependent collection (etc) type:

public void setValues(Set<ValueType> v) {
    this.values = Collections. unmodifiableSet(v);
}

But be that as it may -- I think it is possible to make behavior work in 2.9.4 like it did in 2.9.2.

What follows is longer explanation of my thinking; feel free to read or ignore. I wrote it to clarify my thinking for myself. :)

So why is it difficult for Jackson to retain unmodifiability wrt unmodifiable sets created this way?
Unfortunately the way JDK chose to do this is to essentially hide all traces at runtime -- it is only implied by users calling a method to create an instance, and invisible after that.
There is nothing on declaration of the type, properties or attributes that indicates desired property of immutability -- while users intent is clear at source code, given an arbitrary collection there is no good way of figuring this out. It is basically an implementation detail, and such may or may not be preserved.

For example of something that is properly exposed, consider TreeSet and TreeMap -- in these cases property is indicated by type.
This can be (and is) supported as expected, even without polymorphic handling.

For polymorphic handling it is possible to retain exact impl class name, and if JDK maintainers keep no-args constructor (... which they have no real requirement to do; it's not instantiated directly by users), it can work. It's just that it seems fragile.

@rwinch
Copy link
Author

rwinch commented Dec 21, 2017

Just to make it clear, I don't argue against reasonable desire to preserve immutability as a property. It is clear user would want that.

Thanks. If I did imply otherwise, it was not my intention (sorry if it came across wrong). Thank you again for your responsiveness.

and not rely on Jackson being able to retain exact specific implementation dependent collection (etc) type:

The problem is that I don't know if it is always immutable or not. I need it to work for immutable and not immutable collections. As a framework, Spring Security does not know what the user is serializing and what they want.

So why is it difficult for Jackson to retain unmodifiability wrt unmodifiable sets created this way?

I full agree with this difficulty. The only way I could figure out how to do this was to rely on the implementation specific class. Obviously this is not ideal, but it seems the only way to do this since the JDK doesn't support

@cowtowncoder
Copy link
Member

Hmmh. After trying this out, I am not 100% sure that I should change the behavior, after all.

The problem here is this: although it is possible to force use of the original class, as-is, this will then mean that by default deserialization will fail, as there is no zero-arg constructor available.
So the change would mean that without custom deserializers there will be a new type of failure.
This could be overcome, perhaps, by yet another tweak to default handlers.

On the other hand I guess above problem is not new in the sense that up until 2.9.2 such deserialization would fail regardless. It just seems wrong to cause another kind of failure, just to support use case where a custom deserializer was used -- something that was unlikely to have been widely used.

So I will have to think about what to do here. I can honestly say that feature (ability to be able to deserialize unmodifiable collections as deemed by JDK) was not something that was planned.
That it happened to work is almost unfortunate since I feel torn about what is the best behavior.

@rwinch
Copy link
Author

rwinch commented Jan 3, 2018

Thanks for the feedback.

I'll stay tuned for your additional thoughts. If you decide to go against restoring the previous behavior, perhaps you can help guide me to a solution that preserves backward compatibility across jackson versions?

@rwinch
Copy link
Author

rwinch commented Jan 3, 2018

To be explicit, I also wanted to point out that the issue also appears to impact 2.8.11 (I also updated the original comment to include that information).

@rwinch rwinch changed the title Jackson's default typing made non-passive changes in 2.9.3 Jackson's default typing made non-passive changes in 2.8.11 & 2.9.3 Jan 3, 2018
@cowtowncoder
Copy link
Member

@rwinch thanks; it does make sense 2.8.11 affected since the same fix to allow working of singleton sets/lists was backported.

At this point, I think I will make the change, considering that your use case was affected (even if it could be seen as essentially unintentional support :) ), and since I do not anyone could have relied on new behavior of 2.9.3 wrt immutable sets/lists. So I think it is reasonable to do this for 2.9.

There is the bigger question here, I think, of the fact that sometimes exact specific class is not what is desired, but rather preservation of logical type that is more granular than declared (static type that property has), but not the same as actual runtime implementation type. I will think about what could be done for 3.0.

And last point I think I'll mention is that this might be something where use of Guava (or one of other container libs) might help: they tend to have explicit immutable types. Those would be retained.

Anyway: I hope to get the fix in 2.9.4, having thought it through. Thank you for your help here and apologies for breakage.

@cowtowncoder cowtowncoder changed the title Jackson's default typing made non-passive changes in 2.8.11 & 2.9.3 Class name handling for JDK unmodifiable Collection types changed in 2.9.3, 2.8.11 Jan 4, 2018
@cowtowncoder cowtowncoder added this to the 2.9.4 milestone Jan 4, 2018
@cowtowncoder
Copy link
Member

Fixed to the degree that serialization now works as expected (I think); let me know if you find issues with the fix wrt 2.9.4-SNAPSHOT (to become 2.9.4).

I did not backport this in 2.8 since it is unlikely there will be new full 2.8 releases, only possible micro-patches.

cowtowncoder added a commit that referenced this issue Jan 4, 2018
@wilkinsona
Copy link

I did not backport this in 2.8 since it is unlikely there will be new full 2.8 releases, only possible micro-patches.

Without a 2.8 release, users of 2.8.x face a tough choice of staying on 2.8.10 and avoiding the problem described in this issue, or upgrading to 2.8.11 and hoping that everything else they use can tolerate the behaviour change. Perhaps there could be a micro patch, or perhaps there's a way to modify libraries such as Spring Security so that they're compatible with 2.8.10 and 2.8.11?

@cowtowncoder
Copy link
Member

@wilkinsona at this point I'd have to consider whether users are actually relying on this behavior. It is theoretically of course possible to do micro-patches, it is just that the whole reliance on behavior is unfortunate, as per earlier discussion.

I am not 100% sure if there is a way for libraries to easily override behavior. I know it is possible to override ClassNameIdResolver itself, just not sure if that can be done globally.

@rwinch
Copy link
Author

rwinch commented Jan 5, 2018

I can confirm that this specific issue has been resolved, but I am now running into other issues with 2.9.4-SNAPSHOT that were not present in 2.9.2. I will follow up with additional issues as I sort through the problems.

I'd like to echo @wilkinsona sentiment. Serialization/Deserialization really needs to be extremely passive. If there are applications writing JSON using jackson 2.9.2 and update to a patch release, I'd expect that the newer patch is able to deserialize the values serialized by jackson 2.9.2. This is necessary when the JSON is stored in data stores and deserialized later such as when using Jackson for writing session data using Spring Session.

If users try and update to a newer patch or micro patch and it is broken due to passivity concerns they won't be able to update. This puts them in a pinch if a vulnerability is found. Rather than just updating the patch release, they must first figure out how to resolve the breaking change.

it is just that the whole reliance on behavior is unfortunate, as per earlier discussion.

This does seem to be the crux of the problem. Is there a way we could resolve this? I'm fine if we need to adjust how things work, but I want to ensure passivity for users. Part of my concern is that just because Jackson can serialize deserialize the result doesn't make it passive. Once the data is written, the result might be parsed using another library and expect a specific format.

@cowtowncoder
Copy link
Member

@rwinch I agree in that patches should not break things. This case is unusually complex due to various reasons discussed already. If I had known of side effects I would have been more hesitant with the original fix; but as usual since specific behavior wrt unmodifiable collection is not defined specifically as supported, nor are there unit tests to verify they would work in specific way.
This is also why test contributions are so important.

But at this point I guess I am then bit more interested in one other tangent: as you probably know, use of class-name - based polymorphic type handling is a security risk. I am curious as to what spring-security uses it for? Or is usage by users?
So is the problem only with tests, with something framework itself requires, or something reported by users?

@rwinch
Copy link
Author

rwinch commented Jan 5, 2018

But at this point I guess I am then bit more interested in one other tangent: as you probably know, use of class-name - based polymorphic type handling is a security risk. I am curious as to what spring-security uses it for? Or is usage by users?

Spring Security provides Jackson Module to register serialization support for Spring Security classes that are typically stored in session. This is so that users can easily persist session information using Jackson when using distributed sessions (i.e. Spring Session).

The support is intended to be for scenarios where the user is in control of serialization and deserialization of the data (i.e. session information cannot be read/written by an untrusted user). In the event users want to take advantage of the support, they explicitly register the Modules provided by Spring Security. Spring Security takes extra precautions by registering a TypeIdResolver that only allows white listed class names to be deserialized.

Since the data is persisted, we have lots of tests to ensure that serializing / deserializing with an older version of Spring Security / Spring Session produces compatible results. We need to ensure that this is the case since there may be instances of writing the data with one version of Jackson and reading with another. We have also had reports of users writing with Jackson and reading with other libraries all together, so we try to be very strict on compatibility.

@cowtowncoder
Copy link
Member

Ok. In specific case of unmodifiable sets, lists, JDK does not really guarantee that implementation might not change. So that could be bit brittle. Are the classname(s) dynamically detected by creating instances (similar to how test does it)? Or put another way, are these types automatically white-listed?
I guess they must be, since by default deserialization would fail so there needs to be custom deserializer registered too.

I think I am open to a micro-patch here, just want to understand specific problem case.

@rwinch
Copy link
Author

rwinch commented Jan 5, 2018

Are the classname(s) dynamically detected by creating instances (similar to how test does it)? Or put another way, are these types automatically white-listed?

Thanks for the reply. Yes they are detected at startup and a custom deserializer is registered much like the example in #1868 (comment)

cowtowncoder added a commit that referenced this issue Jan 5, 2018
@cowtowncoder
Copy link
Member

Ok. Due to totality of concerns, that other bug, I did go ahead and add patch in 2.8 after all.
I could use help in verifying this, whenever you get a chance. But until proven otherwise I think backport works.

@cowtowncoder
Copy link
Member

Reopening as I am reworking the fix. Updated version now works for empty, singleton; need to finish for unmodifiable.

@rwinch
Copy link
Author

rwinch commented Jan 30, 2018

@cowtowncoder Sorry for the delayed response. With some modifications, I am able to get things working with both 2.9.4 and 2.8.11.1-SNAPSHOT

@cowtowncoder
Copy link
Member

@rwinch Ok good. I'll have to start thinking of 2.8.11.1 at some point. Thank you for verifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants