Optimize find ancestor #4294

magicwerk · 2024-01-18T23:35:10Z

HasParentNode.findAncestor() uses Arrays.stream() to handle the varargs types parameter.
This may be convenient, but it is inefficient in terms of both performance and memory consumption.
The proposed implementation uses a plain for loop instead, which roughly doubles throughput and cuts allocation from 568 to 328 B/op; see the JMH benchmark below.

Eval_FindAncestor.testImproved                     thrpt    2  34631766.865           ops/s
Eval_FindAncestor.testImproved:gc.alloc.rate       thrpt    2     10830.080          MB/sec
Eval_FindAncestor.testImproved:gc.alloc.rate.norm  thrpt    2       328.000            B/op
Eval_FindAncestor.testImproved:gc.count            thrpt    2        34.000          counts
Eval_FindAncestor.testImproved:gc.time             thrpt    2        24.000              ms

Eval_FindAncestor.testCurrent                      thrpt    2  18087357.962           ops/s
Eval_FindAncestor.testCurrent:gc.alloc.rate        thrpt    2      9795.026          MB/sec
Eval_FindAncestor.testCurrent:gc.alloc.rate.norm   thrpt    2       568.000            B/op
Eval_FindAncestor.testCurrent:gc.count             thrpt    2        36.000          counts
Eval_FindAncestor.testCurrent:gc.time              thrpt    2        22.000              ms
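
For readers without the JavaParser sources at hand, the two approaches can be sketched on plain objects. This is a minimal, hypothetical sketch: the class TypeMatchSketch and the methods matchWithStream and matchWithLoop are invented for illustration and are not the library's code. They only mirror the shape of the type check described above, without the walk up the parent chain.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration, not JavaParser code.
final class TypeMatchSketch {

    // Stream-based variant: a stream pipeline is set up on every call.
    @SafeVarargs
    static <N> Optional<N> matchWithStream(Object candidate, Predicate<N> predicate, Class<N>... types) {
        return Arrays.stream(types)
                .filter(type -> type.isInstance(candidate) && predicate.test(type.cast(candidate)))
                .findFirst()
                .map(type -> type.cast(candidate));
    }

    // Loop-based variant: same check, written as a plain iteration over the array.
    @SafeVarargs
    static <N> Optional<N> matchWithLoop(Object candidate, Predicate<N> predicate, Class<N>... types) {
        for (Class<N> type : types) {
            if (type.isInstance(candidate) && predicate.test(type.cast(candidate))) {
                return Optional.of(type.cast(candidate));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<CharSequence> viaStream = matchWithStream("foo", s -> s.length() == 3, CharSequence.class);
        Optional<CharSequence> viaLoop = matchWithLoop("foo", s -> s.length() == 3, CharSequence.class);
        Optional<CharSequence> noMatch = matchWithLoop(42, s -> true, CharSequence.class);
        System.out.println(viaStream); // Optional[foo]
        System.out.println(viaLoop);   // Optional[foo]
        System.out.println(noMatch);   // Optional.empty
    }
}
```

Both variants return the same result for the same input; they differ only in how the array of types is traversed.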

jlerbsc · 2024-01-28T08:43:40Z

Thank you for this suggestion. Before accepting, could you share your test application?

magicwerk · 2024-01-28T19:47:25Z

Sure, here we go!

Generally, streams should be used with care: they are not only slower than the traditional alternatives, but also create temporary objects that the JIT still cannot optimize away (as of Java 21).

	public static class Eval_FindAncestor {

		@State(Scope.Benchmark)
		public static class CheckState {
			CompilationUnit cu = parse(
					"class Foo {\n" +
							"    void foo() {\n" +
							"        try {\n" +
							"        } catch (Exception e) {\n" +
							"        } finally {\n" +
							"            try {\n" +
							"            } catch (Exception e) {\n" +
							"                foo();\n" +
							"            } finally {\n" +
							"            }\n" +
							"        }\n" +
							"\n" +
							"    }\n" +
							"}\n");

			// find the method call expression foo()
			MethodCallExpr methodCallExpr = cu.findFirst(MethodCallExpr.class).orElse(null);
		}

		@Benchmark
		public Object testCurrent(CheckState state) {
			BlockStmt block = findAncestorCurrent(state.methodCallExpr, x -> true, BlockStmt.class).orElse(null);
			return block;
		}

		@Benchmark
		public Object testImproved(CheckState state) {
			BlockStmt block = findAncestorImproved(state.methodCallExpr, x -> true, BlockStmt.class).orElse(null);
			return block;
		}

		<N> Optional<N> findAncestorCurrent(Node node, Predicate<N> predicate, Class<N>... types) {
			if (!node.hasParentNode())
				return Optional.empty();
			Node parent = node.getParentNode().get();
			Optional<Class<N>> oType = Arrays.stream(types).filter(type -> type.isAssignableFrom(parent.getClass()) && predicate.test(type.cast(parent)))
					.findFirst();
			if (oType.isPresent()) {
				return Optional.of(oType.get().cast(parent));
			}
			return parent.findAncestor(predicate, types);
		}

		<N> Optional<N> findAncestorImproved(Node node, Predicate<N> predicate, Class<N>... types) {
			if (!node.hasParentNode())
				return Optional.empty();
			Node parent = node.getParentNode().get();
			for (Class<N> type : types) {
				if (type.isAssignableFrom(parent.getClass()) && predicate.test(type.cast(parent))) {
					return Optional.of(type.cast(parent));
				}
			}
			return parent.findAncestor(predicate, types);
		}
	}

jlerbsc · 2024-01-29T07:56:59Z

You highlight the result of a micro-benchmark which shows an overhead linked to the use of streams. But I don't think that eliminating the use of streams in this case will significantly improve the way JP works.

Furthermore, although the results seem correct, your test case isn't, because in the case of the "current" implementation the JP logic is tested twice, and in the case of the "improved" implementation the JP logic is systematically executed.

We would like to thank you for your investigative work, but we are not going to accept your proposal until we can demonstrate a significant improvement in JP's performance. In the case you highlight, the use of streams makes the processing a little more readable.

However, I'll leave this PR open if you'd like to add to your demonstration or correct your test case.

magicwerk · 2024-01-29T08:52:19Z

First of all, I have to say that I am somewhat embarrassed that the benchmark I sent you was not fully correct; sorry for that.
Second, it is true that eliminating the use of streams will not significantly improve the way JP works.
However, since JP is a library designed for general use, I think it should do its work in the most efficient way possible, e.g. reduce the allocation load and the pressure on the GC as best it can, which is what the benchmark demonstrates.
When I use JP to analyze a huge number of classes, I can see quite some pressure on the GC, and the proposed change would help alleviate this a little.

You can find the corrected benchmark appended, together with the updated performance figures. As I did not fully understand all your comments, I also added testCurrentJavaparser to prove that findAncestorCurrent and the implementation actually in the library behave identically.

After the correction, the performance and allocation load are improved even more compared to the current implementation, now by more than a factor of 4. Of course these numbers depend on the code snippet used etc., but they nevertheless show that an improvement is possible by changing a single line of code.
You can find a lot of discussions about how streams should be used, but it seems to be generally accepted that they should not be used in constructs like tight loops.
I think it will be hard to come up with a demonstration which is not a micro-benchmark, so I ask you to review the corrected test case again.
As a user of JP, I would definitely prefer the implementation which excels in performance and memory consumption, even if the internals end up a little less readable.

Eval_FindAncestor.testCurrent                               thrpt    2  10152629.747           ops/s
Eval_FindAncestor.testCurrent:gc.alloc.rate                 thrpt    2      5497.955          MB/sec
Eval_FindAncestor.testCurrent:gc.alloc.rate.norm            thrpt    2       568.000            B/op
Eval_FindAncestor.testCurrent:gc.count                      thrpt    2        25.000          counts
Eval_FindAncestor.testCurrent:gc.time                       thrpt    2        12.000              ms

Eval_FindAncestor.testCurrentJavaparser                     thrpt    2  10117285.996           ops/s
Eval_FindAncestor.testCurrentJavaparser:gc.alloc.rate       thrpt    2      5478.827          MB/sec
Eval_FindAncestor.testCurrentJavaparser:gc.alloc.rate.norm  thrpt    2       568.000            B/op
Eval_FindAncestor.testCurrentJavaparser:gc.count            thrpt    2        25.000          counts
Eval_FindAncestor.testCurrentJavaparser:gc.time             thrpt    2        12.000              ms

Eval_FindAncestor.testImproved                              thrpt    2  49480564.416           ops/s
Eval_FindAncestor.testImproved:gc.alloc.rate                thrpt    2      4906.273          MB/sec
Eval_FindAncestor.testImproved:gc.alloc.rate.norm           thrpt    2       104.000            B/op
Eval_FindAncestor.testImproved:gc.count                     thrpt    2        22.000          counts
Eval_FindAncestor.testImproved:gc.time                      thrpt    2        10.000              ms
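
The improvement factors can be recomputed directly from the figures above. The following is only an arithmetic sketch; the class name SpeedupSketch is hypothetical, and the inputs are the testCurrent and testImproved scores copied from the table.

```java
import java.util.Locale;

// Hypothetical helper that recomputes the ratios from the JMH scores above.
final class SpeedupSketch {

    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // ops/s: testImproved divided by testCurrent
        double throughput = ratio(49480564.416, 10152629.747);
        // B/op (gc.alloc.rate.norm): testCurrent divided by testImproved
        double allocation = ratio(568.0, 104.0);

        System.out.println(String.format(Locale.ROOT, "throughput: %.2fx", throughput)); // throughput: 4.87x
        System.out.println(String.format(Locale.ROOT, "allocation: %.2fx", allocation)); // allocation: 5.46x
    }
}
```

Note that gc.alloc.rate in MB/sec drops only slightly, because the improved version performs far more operations per second; the per-operation figure (B/op) is the one that reflects the reduced allocation.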

	public static class Eval_FindAncestor {

		@State(Scope.Benchmark)
		public static class CheckState {
			CompilationUnit cu = parse(
					"class Foo {\n" +
							"    void foo() {\n" +
							"        try {\n" +
							"        } catch (Exception e) {\n" +
							"        } finally {\n" +
							"            try {\n" +
							"            } catch (Exception e) {\n" +
							"                foo();\n" +
							"            } finally {\n" +
							"            }\n" +
							"        }\n" +
							"\n" +
							"    }\n" +
							"}\n");

			// find the method call expression foo()
			MethodCallExpr methodCallExpr = cu.findFirst(MethodCallExpr.class).orElse(null);
		}

		@Benchmark
		public Object testCurrentJavaparser(CheckState state) {
			BlockStmt block = state.methodCallExpr.findAncestor(x -> true, BlockStmt.class).orElse(null);
			return block;
		}

		@Benchmark
		public Object testCurrent(CheckState state) {
			BlockStmt block = findAncestorCurrent(state.methodCallExpr, x -> true, BlockStmt.class).orElse(null);
			return block;
		}

		@Benchmark
		public Object testImproved(CheckState state) {
			BlockStmt block = findAncestorImproved(state.methodCallExpr, x -> true, BlockStmt.class).orElse(null);
			return block;
		}

		<N> Optional<N> findAncestorCurrent(Node node, Predicate<N> predicate, Class<N>... types) {
			if (!node.hasParentNode())
				return Optional.empty();
			Node parent = node.getParentNode().get();
			Optional<Class<N>> oType = Arrays.stream(types).filter(type -> type.isAssignableFrom(parent.getClass()) && predicate.test(type.cast(parent)))
					.findFirst();
			if (oType.isPresent()) {
				return Optional.of(oType.get().cast(parent));
			}
			return findAncestorCurrent(parent, predicate, types);
		}

		<N> Optional<N> findAncestorImproved(Node node, Predicate<N> predicate, Class<N>... types) {
			if (!node.hasParentNode())
				return Optional.empty();
			Node parent = node.getParentNode().get();
			for (Class<N> type : types) {
				if (type.isAssignableFrom(parent.getClass()) && predicate.test(type.cast(parent))) {
					return Optional.of(type.cast(parent));
				}
			}
			return findAncestorImproved(parent, predicate, types);
		}
	}
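
The visible difference between the first benchmark and the corrected one is the recursive call: in the first version both helpers hand the search over to parent.findAncestor(...) after checking a single level, while in the corrected version each helper calls itself. The effect can be sketched on a tiny hypothetical tree; RecursionSketch, TreeNode and the two counters are invented for illustration and do not exist in JavaParser.

```java
import java.util.Optional;

// Hypothetical illustration of why a benchmark helper should recurse into itself.
final class RecursionSketch {

    static int candidateLevels; // levels handled by the code under test
    static int libraryLevels;   // levels handled by the stand-in "library" method

    // Minimal stand-in for an AST node: a name and a parent link.
    static final class TreeNode {
        final String name;
        final TreeNode parent;

        TreeNode(String name, TreeNode parent) {
            this.name = name;
            this.parent = parent;
        }

        // Stand-in for the library's own ancestor lookup.
        Optional<TreeNode> findAncestor(String wanted) {
            libraryLevels++;
            if (parent == null) return Optional.empty();
            if (parent.name.equals(wanted)) return Optional.of(parent);
            return parent.findAncestor(wanted);
        }
    }

    // Shape of the first benchmark: one level here, the rest in the library method.
    static Optional<TreeNode> mixed(TreeNode node, String wanted) {
        candidateLevels++;
        if (node.parent == null) return Optional.empty();
        if (node.parent.name.equals(wanted)) return Optional.of(node.parent);
        return node.parent.findAncestor(wanted);
    }

    // Shape of the corrected benchmark: every level runs the code under test.
    static Optional<TreeNode> selfRecursive(TreeNode node, String wanted) {
        candidateLevels++;
        if (node.parent == null) return Optional.empty();
        if (node.parent.name.equals(wanted)) return Optional.of(node.parent);
        return selfRecursive(node.parent, wanted);
    }

    public static void main(String[] args) {
        TreeNode method = new TreeNode("method", null);
        TreeNode tryStmt = new TreeNode("try", method);
        TreeNode catchClause = new TreeNode("catch", tryStmt);
        TreeNode call = new TreeNode("call", catchClause);

        mixed(call, "method");
        System.out.println(candidateLevels + " / " + libraryLevels); // 1 / 2

        candidateLevels = 0;
        libraryLevels = 0;
        selfRecursive(call, "method");
        System.out.println(candidateLevels + " / " + libraryLevels); // 3 / 0
    }
}
```

With the mixed shape, most of the walk is performed by the library method regardless of which helper is being measured, which would explain why the gap between the two variants widened once the helpers recursed into themselves.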

jlerbsc · 2024-01-29T10:44:56Z

Your test case is still not correct. It seems to me that this is what you want to test on the current implementation.

	<N> Optional<N> findAncestorCurrent(Node node, Predicate<N> predicate, Class<N>... types) {
		return node.findAncestor(predicate, types);
	}

We are ready to improve the operation of JP as soon as it makes sense and the improvement is clearly perceptible. In the case you present, the pressure on memory is marginal (gc.count and gc.time), and the memory allocation rate is not a determining factor in the choice of optimisation either. All that's left is the rate of operations per second, which is much higher in the version you're proposing than in the current implementation.

This is probably due to the use of streams. But once again the micro-benchmark raises a problem of consistency, because this method is rarely used in loops. Furthermore, when used in a loop, the JIT compiler improves the efficiency of the method from the second iteration onwards.

My feeling is that this improvement is very marginal and will not bring any visible results to JP users. If you want to improve JP, I suggest you use a profiler on a concrete case and identify hotspots (memory, cpu, etc.). We can then see how to improve JP's behaviour in these use cases.

magicwerk · 2024-01-29T11:04:37Z

IMHO the test case is correct now: testCurrentJavaparser() tests the current implementation contained in the JP library, while testCurrent() tests the same code copied into the test class as findAncestorCurrent() so it can easily be compared with findAncestorImproved().

As I said, it will be hard to come up with a more meaningful benchmark.
The JIT does an awful lot of optimizations, but even with Java 21 it cannot optimize the overhead of streams away.
The micro-benchmark shows that with the proposed change performance will be better and less memory will be allocated.
I agree that the improvement will be marginal and hardly noticeable unless the method is used heavily.
On the other hand, the improvement can be realized by changing 3 lines of code and has no negative effect.

In the end, it's up to you to make the decision, so feel free to close the PR.

jlerbsc · 2024-01-29T13:42:29Z

Thank you for your proposal and the time you've devoted to it, but as I've already told you, given that the improvement is marginal, I prefer to focus on readability rather than the performance of an algorithm.

jlerbsc · 2024-01-29T13:59:22Z

Finally, I'm reconsidering my initial position, even if the improvement is marginal. In fact, the current code is no more readable than the one in your proposal, so we can accept this one. Thank you for your contribution.

magicwerk · 2024-01-29T14:32:55Z

That was really an unexpected change after the discussion, but thanks for accepting.

jlerbsc · 2024-01-29T14:44:15Z

I simply reread the current code and it didn't seem any easier to read than your proposal. I'm just waiting to be convinced by the new proposals. Yours was just on the edge. Thank you for your insistence, which has enabled me to challenge my position.

magicwerk added 2 commits January 19, 2024 00:21

optimize HasParentNode.findAncestor

dfd8dfb

fix formatting

91eb0fc

jlerbsc closed this Jan 29, 2024

jlerbsc reopened this Jan 29, 2024

Merge branch 'master' into OptimizeFindAncestor

30c7d1a

jlerbsc merged commit ae0a80e into javaparser:master Jan 29, 2024
37 of 38 checks passed

jlerbsc added this to the next release milestone Jan 29, 2024

jlerbsc added the "PR: Changed" label (a PR that changes implementation without changing behaviour, e.g. performance) Jan 29, 2024
