
1.2.7

@codingchili codingchili released this 28 Apr 14:14
· 76 commits to master since this release

Background

Changes to performance

Slightly reduces memory consumption by not parsing the full Excel file into JSON objects at once. With this release the Excel file is parsed twice. The first pass only verifies that the file is well formatted before the import starts; to save memory, it does not create any JSON objects on the heap. The gain is minimal, however, as Apache POI, which is used to parse xlsx/xls files, consumes the majority of the available memory.
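The two passes can be illustrated with a minimal sketch in plain Java. This is not the project's parser: the class name `TwoPassImport`, the row source (a `Supplier` of iterators standing in for re-reading the file) and the "same width as the header" check are all assumptions made for illustration, and Apache POI is left out entirely.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class TwoPassImport {

    /** Pass 1: check every row against the expected width; builds no row objects. */
    public static int validate(Supplier<Iterator<String[]>> source, int columns) {
        int rows = 0;
        Iterator<String[]> it = source.get();
        while (it.hasNext()) {
            String[] cells = it.next();
            rows++;
            if (cells.length != columns) {
                throw new IllegalArgumentException(
                        "row " + rows + " has " + cells.length + " cells, expected " + columns);
            }
        }
        return rows;
    }

    /** Pass 2: build one row object at a time and hand it straight to the sink. */
    public static void importRows(Supplier<Iterator<String[]>> source, String[] header,
                                  Consumer<Map<String, String>> sink) {
        Iterator<String[]> it = source.get();
        while (it.hasNext()) {
            String[] cells = it.next();
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length; i++) {
                row.put(header[i], cells[i]);
            }
            sink.accept(row); // not retained here, so it can be collected after use
        }
    }

    public static void main(String[] args) {
        String[] header = {"name", "price"};
        // Hypothetical in-memory sheet; a real source would re-open the file per pass.
        List<String[]> sheet = List.of(new String[]{"apple", "3"}, new String[]{"pear", "5"});
        int rows = validate(sheet::iterator, header.length);
        importRows(sheet::iterator, header, row -> System.out.println(row));
        System.out.println(rows + " rows validated and imported");
    }
}
```

The cost of this design is reading the file twice; the benefit is that a malformed file is rejected before anything has been indexed.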

Other performance improvements include concurrent parsing and indexing. While we are waiting for a response we parse the next N rows to be indexed. When the request completes (one request per 128 imported items) we check the response code and start indexing the next N items, which are already parsed. Additionally, while parsing, each produced JSON object is streamed into a chunked connection to the ElasticSearch server. This means we can parse the Excel file in buckets and still only need to reference one JSON object at a time. Finally, the header that is required for each imported element is now generated only once per import.
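The overlap between parsing and indexing can be sketched as follows. This is a simplified illustration, not the release's code: `PipelinedImport`, the `send` function and the index name are hypothetical, the per-item header is assumed to be the action line of the ElasticSearch bulk format (older server versions also expect a type in it), and each batch is buffered into a string here rather than streamed over a chunked connection.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class PipelinedImport {

    /** Builds the per-item action line once; it is reused for every imported item. */
    public static String actionLine(String index) {
        return "{\"index\":{\"_index\":\"" + index + "\"}}\n";
    }

    /** Consumes up to batchSize rows and returns them as one bulk request body. */
    public static String nextBatch(Iterator<String> jsonRows, String action, int batchSize) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < batchSize && jsonRows.hasNext(); i++) {
            body.append(action).append(jsonRows.next()).append('\n');
        }
        return body.toString();
    }

    /** Sends one batch at a time and prepares the next one while the request is in flight. */
    public static int run(Iterator<String> jsonRows, String index, int batchSize,
                          Function<String, CompletableFuture<Integer>> send) {
        String action = actionLine(index);
        int requests = 0;
        String batch = nextBatch(jsonRows, action, batchSize);
        while (!batch.isEmpty()) {
            CompletableFuture<Integer> inFlight = send.apply(batch);
            batch = nextBatch(jsonRows, action, batchSize); // overlaps with the request
            int status = inFlight.join();
            if (status != 200) {
                throw new IllegalStateException("bulk request failed with status " + status);
            }
            requests++;
        }
        return requests;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 300; i++) {
            rows.add("{\"id\":" + i + "}");
        }
        // Stand-in for an HTTP client: always answers 200 from another thread.
        int requests = run(rows.iterator(), "imports", 128,
                body -> CompletableFuture.supplyAsync(() -> 200));
        System.out.println(requests + " bulk requests sent");
    }
}
```

With 300 rows and a batch size of 128 the sketch issues three requests, and only one prepared batch is held in memory next to the one in flight.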

In order to accomplish this we have significantly simplified the source code and documented it accordingly. We added RxJava and turned the FileParser into an observable. We also added a new event bus codec so that the new ImportEvent type can be passed over the event bus without serializing it.
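To show what turning a parser into an observable means in practice, here is a hand-rolled sketch in plain Java. It does not use RxJava and is not the project's FileParser: `ObservableParser` and its `RowObserver` interface are hypothetical stand-ins that only mirror the familiar onNext/onError/onComplete shape, and the JSON building skips escaping for brevity.

```java
import java.util.List;

public class ObservableParser {

    /** Hypothetical observer contract with the usual three callbacks. */
    public interface RowObserver {
        void onNext(String json);
        void onError(Throwable cause);
        void onComplete();
    }

    private final String[] header;
    private final List<String[]> rows;

    public ObservableParser(String[] header, List<String[]> rows) {
        this.header = header;
        this.rows = rows;
    }

    /** Pushes each row to the observer as soon as it is parsed; nothing is buffered. */
    public void subscribe(RowObserver observer) {
        try {
            for (String[] cells : rows) {
                observer.onNext(toJson(cells));
            }
        } catch (RuntimeException e) {
            observer.onError(e); // terminal: no onComplete after an error
            return;
        }
        observer.onComplete();
    }

    /** Builds a flat JSON object from one row; values are not escaped in this sketch. */
    private String toJson(String[] cells) {
        StringBuilder json = new StringBuilder("{");
        for (int i = 0; i < header.length; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append('"').append(header[i]).append("\":\"").append(cells[i]).append('"');
        }
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        List<String[]> sheet = List.of(new String[]{"apple", "3"}, new String[]{"pear", "5"});
        new ObservableParser(new String[]{"name", "price"}, sheet).subscribe(new RowObserver() {
            @Override public void onNext(String json) { System.out.println(json); }
            @Override public void onError(Throwable cause) { System.err.println(cause); }
            @Override public void onComplete() { System.out.println("done"); }
        });
    }
}
```

The consumer never sees the whole file, only one row at a time, which is what lets the indexing side start working before parsing has finished.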

A summary of changes

  • Added RxJava and turned FileParser into an observable.
  • Renamed 'parsing' to 'uploading' in the UI - 90% of the time is actually spent uploading the file.
  • Moved all logging statements into a single class for readability.
  • Replaced the custom atomic reference with Java's AtomicReference.
  • Moved the CommandLine importer into its own controller.
  • Encapsulated each request in an ImportEvent that is passed over the bus with a custom codec.
  • Cleaned up the code and added javadoc to all classes.

And a summary of the summary

This release includes performance improvements as well as improvements to code quality.