New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid returning null lists #106
Conversation
Consistency wise, I'd prefer to either have
Seeing many places with
|
Another idea (sorry for not coming up with this earlier) is
and then we end up with
which is shorter and better when it comes to communicating the intention. |
I'm sorry for bringing this up again, but I just realized and can't remember if we fully discussed this: with this solution, we're introducing an inconsistency in that with those changes, you cannot add new elements to a list you did not set explicitly. |
The reason for the different patterns is because some parameters are final, in that case I go for the if/else Don't be sorry for improving things :) Yes, it is true that you cannot add elements to a list that you did not set, however you probably should not expect this to work: you cannot be sure we return the internal list instead of a copy. If we want to give the possibility for add to work I think we can instantiate a new mutable list instead of using Collections.emptyList, however there were some concerns with that in the past (lot of allocations). |
In the meantime I will introduce the method:
|
But the I'm fine with returning immutable lists; but then let's document that. |
About documenting the fact we could return an immutable list: would be enough to do that in the wiki or in the documentation at package level or do we have to repeat that for each list setters? P.S. thanks for helping in improving this patch! |
Judging from me, anything else than at each getter (<- not setter) is 2015-02-23 16:07 GMT+01:00 Federico Tomassetti notifications@github.com:
|
Ok, would it work to add this comment on each setters for lists?
|
I would omit the brackets. Like Federico Tomassetti notifications@github.com schrieb am Mon Feb 23 2015
|
As I explained on the chat, I think it is counter-intuitive (and inconsistent) to return mutable list and immutable empty lists. The point of returning something else than null is to avoid some null check to users.
to
I find the latter worse for the reason that it is an non-natural idiom for a java coder whereas the first one is (unfortunately, but that's life...). So I would advise to either:
|
I would vote for the mutable lists, possibly lazy initialized in the getters |
I was thinking about externalizing all the new XXXNode(...) of the grammar in a factory that the parser have a pointer to. This is easy to do, just a little painful. However, we could add a postProcessList(...) method to the factory (by default noop) called for every list arguments before calling the constructors and have two subclasses one that ensureNotNull the lists with empty immutable lists and one that ensureNotNull the lists with empty mutable lists. Would that fit the bill ? Edit: I must add that my idea for the node factory comes from the fact that in the next major release we could have two AST, an immutable and a mutable one. Details to be thought about but voilà! |
Can someone show how that node factory is supposed to work? |
After half a year of thinking about this issue, I would suggest to go for immutable lists lazily created (so created when someone call the getter). In our own code we had several bugs caused by this null lists and I think the same apply to the code written by users, so my personal opinion is that we should get rid of null lists. However this is a potential concern for performance (we are instantiating a lot of lists potentially). Returning the immutable list could reduce the performance hit but it is true that it could be confusing so we should probably avoid that. Do we want to have a last round of opinion and close this PR if we do not find an acceptable compromise? |
I feel it would lead to more complex code than simply initializing everything with an empty list, but I'm glad to pull anything that will kill those nulls. Nulls must die. |
Is it possible to quantify the performance degradation similar to in #113 ? We're just guessing otherwise, there may be an obvious answer with some form a of bench marking. |
I measured performance like this: public static void main(String[] args) throws IOException, ParseException {
long a = System.currentTimeMillis();
for (int i=0;i<50;i++) {
JavaParser.parse(new File("javaparser-core/src/main/java/com/github/javaparser/JavaParser.java"));
JavaParser.parse(new File("javaparser-core/src/main/java/com/github/javaparser/ASTHelper.java"));
JavaParser.parse(new File("javaparser-core/src/main/java/com/github/javaparser/ast/visitor/CloneVisitor.java"));
}
long b = System.currentTimeMillis();
System.out.println(b-a);
} With master I get: 1544, 1608, 1622 |
Note that some tests are broken, so if we think this is the way to go I will fix them and squash some commits before merging it. |
Now we only need some power users to tell us what this increase would mean to them :) Or maybe we can decide on how fast we want to be. Just to make this thing SMART. |
Assuming these measurements are meaningful (and they are probably not, we did not consider the initial "heating" of the JVM) we have an increase in time ranging from 4% to 14% (considering the best and the worst combinations of measures). To me it seems adequate given the amount of possible issues solved. We could maybe carve out some improvements with lazy initialization. |
Now tests are passing. Note that the issue was due to the fact that DumpVisitor is checking for null instead of null or empty list. While we fixed several of these issues in the past there were still a few here and there. I suspect similar errors could be done by our users. |
e286d81
to
9a06349
Compare
9a06349
to
73cd6c2
Compare
I would perform some measurement again with lazy instantiation |
I measured the last version, which uses mutable lists and lazy loading (i.e., we check for null before returning the value in the getter). I used this program: https://gist.github.com/ftomassetti/74217a4063da29bf3ced With this patch it parsed 44340 in:
|
@ftomassetti Maybe you should also measure printing in your micro-benchmark. As lazy initialization will be triggered at tree traversal, printing performance might reflect it. |
@ptitjes this is a fair point, but if IO is involved I think it is slow enough that the creation of lists should not be relevant. I will measure again, dumping to file. There were also setters where the call to |
@ftomassetti You're right! I forgot that. Maybe tweaking a little SourcePrinter to use a Writer and giving it a noop Writer might be worth trying ? |
I updated the script and dumped to file (I imagine this is a more typical scenario that dumping and throwing away the result). It took 26000 on master and 26513 on this branch. |
Well, IMO this is really not a big overhead! This lazy-initializing solves the null encountering danger without sacrificing the mutability of the AST in JavaParser. This is worth it! And you can still choose to later specialize the list by adding the parent as an argument to (maybe renamed) |
Merging with the approval of @matozoid who is hopefully winning his match of Exploding Kittens |
Issue103: apply JavaParser code style to all source code
We have discussed about it on Gitter, I hope I did not forget any of the advices I received if I did please forgive me and remember me and I will update the pull requests.