New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.lang.String and java.lang.Integer are reported as not immutable. #4

Open
Grundlefleck opened this Issue Dec 5, 2012 · 11 comments

Comments

Projects
None yet
3 participants
@Grundlefleck
Contributor

Grundlefleck commented Dec 5, 2012

Original author: Grundlefleck@gmail.com (December 21, 2009 00:39:30)

java.lang.String is not reported as immutable because it lazily initialises
a field, which is therefore not declared as final. This could be quite a
tricky thing to detect correctly. I believe an adequate way to detect this
pattern is:
when a field is not private
find where it can be reassigned
check it can only be assigned once
search for a conditional that will assess whether to assign
if condition is checking it against the default value for its type
declare it as immutable

This is probably the most complicated issue for detecting mutability, and
this particular logic is probably full of holes.

java.lang.Integer is not reported as immutable because it has a primitive
array for a field. Currently, for a class to be declared immutable one of
the conditions is that each field is an immutable type. I decided that
primitive arrays are inherently mutable, which causes the result on
Integer. The solution for this specific issue would be to scan the class
and check that the primitive array is not mutated outwith the constructor.
A more general technique would be scanning the class to check that the
field is not mutated - immutable or not. This is more complex than is
necessary, I feel.

There can never be a 1.0 release while these classifications remain.

Original issue: http://code.google.com/p/mutability-detector/issues/detail?id=3

@ghost ghost assigned Grundlefleck Dec 5, 2012

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Dec 5, 2012

From Maaarti...@gmail.com on July 01, 2010 19:31:15

java.lang.Integer is not reported as immutable because it has a primitive
array for a field.

Not really, the field is static, so there's no way to change the object itself. You can safely skip over all statics.

A more general technique would be scanning the class to check that the
field is not mutated - immutable or not. This is more complex than is
necessary, I feel.

Sure, for final immutable fields you don't need to do anything. If there is any non-private field which is non-final or mutable, you're done.

For non-final fields you need to check all assignments outside of the constructors and initializers. If it happens to be in a private method, than you can still check (recursively) if this method can be called after initialization.

For mutable field you need to check if they can be actually mutated, which is the most complicated part since you'd need to know what methods are mutators. For arrays, it's simple.

There's a lot of gotchas there:

  • mutable field exposed to the outside
  • the same for parts of it
  • mutable field initialized from the outside
  • a non-final immutable class subclassed as mutable
  • surely more
@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Dec 5, 2012

From Grundlefleck@gmail.com on July 01, 2010 20:44:52
Firstly, thank you for taking the time to comment. It's much appreciated.

Not really, the field is static, so there's no way to change the object itself. You can safely skip over all statics.

You're absolutely right, I'm surprised I missed that. Changes to enact that should be minimal, and Integer will be deemed immutable without too much work. It does raise the issue of instances accessing static objects in their methods, but that's an issue for another day, I think :-)

For non-final fields you need to check all assignments outside of the constructors and initializers. If it happens to be in a private method, than you can still check (recursively) if this method can be called after initialization.

Assigning to a field outwith the constructor is already checked for. I have yet to introduce checking in the initialisation block. Assigning a non-final field in a private method is definitely okay, but I would hazard a guess it's not a very common use case. I fear detecting this recursively may be a bit more tricky than it sounds, with a couple of gotchas. It should be checked for though.

For mutable field you need to check if they can be actually mutated, which is the most complicated part since you'd need to know what methods are mutators. For arrays, it's simple.

Once the mutations on primitive arrays can be detected, I don't think it will be too difficult to store whether a method is a mutator are not. Eventually it'll boil down to a field reassignment or an array mutation, or something similar which can be picked up, it's just a case of building up the map of method->mutator? as the analysis is done.

There's a lot of gotchas there:

  • mutable field exposed to the outside
  • the same for parts of it
  • mutable field initialized from the outside
  • a non-final immutable class subclassed as mutable
  • surely more

Some of these are checked for already (publishing mutable references) while others are slightly more tricky (inheritance provides a few tricky problems). Subclassing is something I'm going to have to think over a bit more deeply - all suggestions are welcome :-)

There's still the issue of lazy initialisation to consider as well, as with j.l.String.

Thanks for your input - it is appreciated.

Best regards,
Graham

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Dec 5, 2012

From Maaarti...@gmail.com on July 02, 2010 09:19:08

Not really, the field is static, so there's no way to change the object itself. You can safely skip over all statics.

You're absolutely right, I'm surprised I missed that. Changes to enact that should be minimal, and Integer will be deemed immutable without too much work. It does raise the issue of instances accessing static objects in their methods, but that's an issue for another day, I think :-)

I wasn't as right as I thought. Consider this:

public final class WannabeImmutable {
public String getValue() {
return map.get(this);
}
public void setValue(String value) {
map.put(this, value);
}

private final static Map<WannabeImmutable, String> map = new HashMap<WannabeImmutable, String>();

}

The WannabeImmutable is just a overcomplicated implementation of a single property bean. It's not immutable and even less thread-safe. Those static fields in Integer are not even private, so you could change the behavior of Integer if you could place your class in java.lang. You can't, but you can use reflection; however, using reflection you can do everything, so it doesn't count.

The immutability is a complicated concept and has many possible definitions.

For non-final fields you need to check all assignments outside of the constructors and initializers. If it happens to be in a private method, than you can still check (recursively) if this method can be called after initialization.

Assigning to a field outwith the constructor is already checked for. I have yet to introduce checking in the initialization block. Assigning a non-final field in a private method is definitely okay,

Unless this method gets called in a non-private one, possibly indirectly.

but I would hazard a guess it's not a very common use case.

I use initializer helpers when things get too complicated.

I fear detecting this recursively may be a bit more tricky than it sounds, with a couple of gotchas. It should be checked for though.

I don't think it's that complicated. First, use the bytecode analysis to find out what methods modifies fields, exposes them to the outside, and what other methods they call. Second, propagate it through the call graph.

For mutable field you need to check if they can be actually mutated, which is the most complicated part since you'd need to know what methods are mutators. For arrays, it's simple.

Once the mutations on primitive arrays can be detected, I don't think it will be too difficult to store whether a method is a mutator are not. Eventually it'll boil down to a field reassignment or an array mutation, or something similar which can be picked up, it's just a case of building up the map of method->mutator? as the analysis is done.

Agreed.

There's a lot of gotchas there: ...

Some of these are checked for already (publishing mutable references) while others are slightly more tricky (inheritance provides a few tricky problems).

I add some more things to think about:

  • Nested classes
  • Static methods
  • Native methods

Subclassing is something I'm going to have to think over a bit more deeply - all suggestions are welcome :-)

For a non-final class you can at best say "immutable unless subclassed at runtime", which is important information, too.

There's still the issue of lazy initialisation to consider as well, as with j.l.String.

Right. IMHO the only possible solution is to simply declare String as immutable. Unless you're going to make order of magnitude more complicated analysis.

Don't miss what Reiner wrote in the Lombok discussion.

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Dec 5, 2012

From Grundlefleck@gmail.com on July 04, 2010 14:01:31

[snip] using reflection you can do everything, so it doesn't count.

Totally agree, when reflection comes into it, all bets are off. If you can change the String "abc" to actually contain "xyz" then what chance is have I got? :-P

Assigning to a field outwith the constructor is already checked for. I have yet
to introduce checking in the initialization block. Assigning a non-final field in a private method is definitely okay,

Unless this method gets called in a non-private one, possibly indirectly.

but I would hazard a guess it's not a very common use case.

I use initializer helpers when things get too complicated.

I would say that if you're pursuing immutability, you always want to assign to fields within the constructor. Even if you want to delegate some of the complexity to helper methods, you still don't want to assign to fields within them. I.e.: instead of:

private String someField;
public ConstructorOfMyClass() {
assignOneOfMyFields();
}
private void assignOneOfMyFields() {
this.someField = "initialised";
}

The better solution is:

private final String someField;
public ConstructorOfMyClass() {
someField = valueToInitialiseSomeFieldWith();
}
private String valueToInitialiseSomeFieldWith() {
return "initialised";
}

I find this solution better because someField can be declared as final. This has better semantics for an immutable class, both for developers and the Java memory model. Of the immutable implementations I've seen, the latter is much more common than the former. Though I recognise that may not be the case for others. I had once considered the notion of reporting on bad practice or smells - do you think this kind of thing would be a candidate?

I add some more things to think about:

  • Nested classes
  • Static methods
  • Native methods

Yep, these are all subtle points. Nested classes expose their containing class. Static methods may depend on mutable global state. Native methods are totally undiscoverable. Lots to consider :-)

Subclassing is something I'm going to have to think over a bit more deeply - all suggestions are welcome :-)

For a non-final class you can at best say "immutable unless subclassed at runtime", which is important information, too.

This is not quite what I meant (though I did talk about this over at the lombok mailing list). What I mean is, you have a class A which is mutable for some reason, there's potential to have a class B which "cancels out" the mutability of A, and is in fact immutable. Consider if A is a data class (all getters and setters) and B is a nasty subclass which throws exceptions for all setter methods - that's effectively immutable. It's this kind of relationship (and there are many more like it) which is complicated to automatically recognise.

Right. IMHO the only possible solution is to simply declare String as immutable. Unless you're going to make order of magnitude more complicated analysis.

I'd really like to avoid that. Mainly because what String does is not uncommon, or tricky - it's just lazily loading and caching a method result. A tool which couldn't recognise this pattern in a developer's code is always going to be limited. For me, though I envisage there is an order of magnitude more complexity as you say, it's the path I want to take. That's why I say there can't be a 1.0 release until this can be handled to some degree.

Don't miss what Reiner wrote in the Lombok discussion.

Thanks, I had almost forgotten that I posted there, and there was some very interesting points made.

Thanks again,
Graham

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Dec 5, 2012

From motlin on August 08, 2012 04:18:59
Although I agree that it would be best to detect the lazy initialization pattern more generally, I'd recommend adding a hard-coding for String now. String is by far the most common use of this pattern, so it would make the library more useful in tests now, and you can take your time figuring out the more general algorithm.

@jufickel

This comment has been minimized.

Contributor

jufickel commented Jan 16, 2013

Regarding the lazy initialization issue I accept the dare. Where would be the best place to integrate the check (org.mutabilitydectector.checkers) and what would be the best approach to commence the implementation?

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Jan 16, 2013

Hi Juergen,

I'm working from memory, but I believe the best place is in "SetterMethodChecker". This is the one that raises the warning about reassignment. "NonFinalFieldChecker" will also raise a warning, but will result in EFFECTIVELY_IMMUTABLE, rather than NOT_IMMUTABLE. A fork with pull requests would be awesome!

However, more generally, I would advise against developing code against benchmark test classes for this case. I recommend trying to think about the problem in a more abstract, "academic" way before writing a line of code. It may not need to be implemented as a "sound" analysis, but it should give you background on the kind of false positives and false negatives you can get. Particularly if this is in Master's thesis territory, your paper may want to be more abstract. But you'll be the best judge of that :)

I would say the workflow should be:

  • add a class to src/test/benchmarks/ (some appropriate package in there)
  • write a failing unit test for SetterMethodChecker which runs on the benchmark class
  • implement the code in SetterMethodChecker

I'm not sure how familiar with TDD you are, but any code contribution will be expected to come with a high level of test coverage.

In terms of the implementation of a Checker, they are all tightly bound to ASM's visitor model. This implies that ASM will read the classfile, and your implementation of the visitor will respond to particular values in the classfile. For example, you can write a visitor method which will be invoked when ASM comes across a PUTFIELD instruction. Either before that point, or in response to it, you will want to perform some kind of data flow analysis to figure out what was put in the field, how it was constructed, etc.

This is very high level, not too detailed, but will hopefully let you get started. Happy to continue answering your questions :)

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented May 18, 2013

Hey @jufickel, have you been able to make any progress? Be good to hear how your work is going :)

@jufickel

This comment has been minimized.

Contributor

jufickel commented May 18, 2013

Hi Graham! Indeed there is some progress. I will check in my work during next week; then we should have a look at it. :-)

@mvanotti

This comment has been minimized.

mvanotti commented Mar 14, 2016

@jufickel @Grundlefleck what's the status of this issue? There is still no fix for the lazy initialization case?

@Grundlefleck

This comment has been minimized.

Contributor

Grundlefleck commented Mar 15, 2016

Hey @mvanotti,

There is still no fix for this case. There was a contribution made by @jufickel ages ago, and for some reason I can't remember, it is not enabled by default. java.lang.String/Integer are hardcoded as immutable, but without enabling a specific code path, there is no attempt to detect lazy initialisation.

Is this issue causing you problems as a user, or do you have another interest that caused you to enquire?

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment