This repository was archived by the owner on May 16, 2025. It is now read-only.

Identify performance bottlenecks #207

@dr0i

@blackwinter observed a serious performance decline when comparing a comprehensive ETL workflow (~20M documents) using fix with a similar workflow using morph. The fix workflow was slower by a factor of between 34 and 100.
It is very likely that there are a few bottlenecks which we should be able to get rid of once they are identified.
A common approach is to use a profiler to find them.

With #198 it is possible to get a memory snapshot (hprof), which can then be analyzed with further tools such as MAT.
Unfortunately, the hprofs produced via #198 are somehow faulty and cannot be loaded into MAT (or the native jhat). It is not clear whether this is caused by using gradle (e.g. the writing of the hprof being cut off prematurely), by a bug in Java 1.8, or by a configuration error (I tried several configurations, among others of course the binary format and heap=all). However, to identify a bottleneck we have to use real workflows with (lots of) real data. I am confident that a tool such as MAT can process a memory snapshot of that size.
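
As a possible workaround for the faulty agent-written hprofs, a binary heap dump can also be triggered from inside the running JVM. The following is only a minimal sketch, assuming a HotSpot JVM; it is not what #198 does. The class name `HeapDumper` and the file name `workflow.hprof` are made up for illustration; `HotSpotDiagnosticMXBean.dumpHeap` is the standard JDK management call.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the project: writes a heap dump of the current JVM.
public class HeapDumper {

    /**
     * Writes a binary heap dump (hprof format) of the current JVM to {@code target}.
     *
     * @param target   output file; should end in ".hprof" (newer JDKs require this)
     * @param liveOnly if true, only objects that are still reachable are dumped
     */
    public static Path dump(Path target, boolean liveOnly) throws IOException {
        // dumpHeap refuses to overwrite an existing file, so remove any old dump first.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; in a real run this would be called while the workflow is active.
        Path out = dump(Paths.get("workflow.hprof"), true);
        System.out.println("Wrote " + out + " (" + Files.size(out) + " bytes)");
    }
}
```

Calling `HeapDumper.dump(...)` at a chosen point in the workflow (for instance after a few million records) would write the dump from within the process itself, which might sidestep a truncated write at shutdown if that is indeed the cause. Whether the resulting file loads into MAT would still have to be checked.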
