
Releases: codingchili/excelastic

1.3.8

20 Dec 13:49
  • Verified with ElasticSearch 7.10.1.
  • Support for mapping Boolean cells in Excel files.
  • Show the correct index name after importing completes.
  • Added type inference for plain text fields in Excel files.
  • Handle integers without adding decimal places for Excel files (see the sketch below).
  • Replaced Maven with Gradle 3.7.1.
  • Target JVM 11; no support for earlier versions.
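
A minimal sketch of the kind of cell handling the bullets above describe, assuming Apache POI 4.x; the helper is hypothetical and not the project's actual code:

    import org.apache.poi.ss.usermodel.Cell;

    class CellValueSketch {
        // hypothetical helper: converts a cell into a JSON-friendly value, keeping
        // booleans as booleans and whole numbers free of a trailing ".0".
        static Object cellValue(Cell cell) {
            switch (cell.getCellType()) {
                case BOOLEAN:
                    return cell.getBooleanCellValue();
                case NUMERIC:
                    double number = cell.getNumericCellValue();
                    // emit whole numbers as integers, without decimal places
                    return (number == Math.rint(number) && !Double.isInfinite(number))
                            ? (Object) (long) number
                            : (Object) number;
                case STRING:
                    return cell.getStringCellValue();
                default:
                    return cell.toString();
            }
        }
    }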

1.3.7

26 Nov 08:19

Fixes

  • #86: feature to specify an ES pipeline, contributed by @octaavio.

Now tested on ElasticSearch 7.4.0 / 7.4.2.

1.3.6

19 Feb 17:48

Fixes

  • CSVParser: cast ByteBuffers to 'Buffer' to avoid JDK 8/9 incompatibilities (see the sketch below).

Now works on JRE 8 through JRE 11.
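
For context, the incompatibility comes from JDK 9 adding covariant overrides such as ByteBuffer.flip(): code compiled on JDK 9+ then fails with NoSuchMethodError on JRE 8. A minimal sketch of the cast-based workaround (class and method names are illustrative):

    import java.nio.Buffer;
    import java.nio.ByteBuffer;

    class BufferCompatSketch {
        static void resetForReading(ByteBuffer buffer) {
            // casting to Buffer pins the JDK 8-compatible Buffer.flip()
            // signature into the bytecode, so the call resolves on JRE 8 too
            ((Buffer) buffer).flip();
        }
    }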

1.3.5

11 Feb 15:49

New Features

  • none.

Fixes

  • Upgraded Apache POI from 3.17 to 4.0.1.
  • Upgraded Vert.x from 3.5.4 to 3.6.3.
  • New theme.
  • New Docker image (upgraded from 1.3.3 to 1.3.5).

1.3.4

29 Nov 21:22

New Features

  • The import index can be locked in the web interface through configuration.

Fixes

  • Fixed some issues with line endings for CSV imports.

1.3.3

18 Nov 11:07

Verified support for ElasticSearch 7.0.0-alpha1.

New features

  • Support all configuration options as environment variables for the Docker image.
  • The default index is now configurable.
  • Support for base path / reverse proxy: updated resource and websocket URLs to be relative.

Fixes

  • Now uses LF instead of CR to identify line breaks in CSV (see the sketch below).
  • Exceptions when there is no desktop environment are logged as a warning instead of a severe error with a stack trace.

CSV imports should now work much better, tested with a sample insurance portfolio.
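
A minimal illustration of the LF-based line-break handling mentioned above; the helper is hypothetical and not the project's parser:

    class LineEndingSketch {
        // rows are delimited on LF, and a trailing CR from CRLF files is trimmed,
        // so both Unix and Windows line endings import cleanly
        static String[] splitRows(String csv) {
            String[] rows = csv.split("\n");
            for (int i = 0; i < rows.length; i++) {
                if (rows[i].endsWith("\r")) {
                    rows[i] = rows[i].substring(0, rows[i].length() - 1);
                }
            }
            return rows;
        }
    }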

1.3.2

30 Oct 19:44

New features

  • Now shows "verifying" on the website when running the verification task (*).
  • Improved performance of CSV verification.
  • No longer deletes Excel files after the import is complete.
  • Hide Excel options on the website.
  • Show registered file extensions on the website.
  • New colorful theme.

Issues resolved

  • Fixed some bugs in the new CSV parser that caused buffer overflows, among other issues.
  • Removed the upper limit on CSV file size by using an array of memory maps (see the sketch below).
  • When running the verification task the system parses the whole file once, as fast as it can, to validate that
    the file is properly formatted. This is done before the import starts, to make sure it is able to parse the whole file.
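
A minimal sketch of the memory-map approach referenced above: a single MappedByteBuffer is capped at Integer.MAX_VALUE bytes, so a large CSV file can be mapped as an array of consecutive regions. Names are illustrative, not the project's actual code:

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    class MappedRegionsSketch {
        // maps the file as consecutive read-only regions, each at most 2 GB
        static MappedByteBuffer[] mapFile(Path csv) throws IOException {
            try (FileChannel channel = FileChannel.open(csv, StandardOpenOption.READ)) {
                long size = channel.size();
                int regions = (int) ((size + Integer.MAX_VALUE - 1L) / Integer.MAX_VALUE);
                MappedByteBuffer[] maps = new MappedByteBuffer[regions];
                for (int i = 0; i < regions; i++) {
                    long position = (long) i * Integer.MAX_VALUE;
                    long length = Math.min(Integer.MAX_VALUE, size - position);
                    maps[i] = channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                }
                return maps;
            }
        }
    }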

1.3.1

28 Oct 19:42

New features

  • Docker support: now on Docker Hub as codingchili/excelastic.
  • Deletes xlsx/xls files on the server after parsing. *
  • Closes workbooks/files when the import is completed.

1.3.0

28 Oct 14:17

New Features

  • Support for importing CSV files.
  • Support for registering custom parsers through ParserFactory.
  • Upgraded the Vert.x dependency from 3.5.1 to 3.5.4.

1.2.7

28 Apr 14:14

Background

Changes to performance

Slightly reduces memory consumption by not parsing the full Excel file into JSON objects at once. With this release we parse the Excel file twice. The first pass makes sure that the file is well formatted before we start importing it, and to save memory it does not actually create any JSON objects on the heap. The gain is minimal however, as Apache POI, which is used to parse xlsx/xls files, consumes the majority of the available memory.

Other performance improvements include concurrent parsing and indexing. While we are waiting for a response we parse the next N rows to be indexed. When the request completes (one request per 128 imported items) we check the response code and start indexing the next N items, which have already been parsed. Additionally, while parsing, each produced JSON object is streamed into a chunked connection to the ElasticSearch server. This means we can parse the Excel file in buckets and still only need to reference one JSON object at a time. Finally, the header that is required for each imported element is generated only once per import (see the sketch below).
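
A minimal sketch of the chunked bulk upload described above, assuming the Vert.x 3.x HttpClient and a local ElasticSearch on port 9200; the class, method, and index names are illustrative and not the project's actual code:

    import io.vertx.core.Vertx;
    import io.vertx.core.http.HttpClientRequest;
    import io.vertx.core.json.JsonObject;

    import java.util.List;

    class BulkUploadSketch {
        // hypothetical helper: streams one batch of parsed rows as an ES _bulk
        // request, reusing a single action header instead of rebuilding it per row
        static void sendBatch(Vertx vertx, List<JsonObject> parsedRows) {
            HttpClientRequest request = vertx.createHttpClient()
                    .post(9200, "localhost", "/_bulk", response -> {
                        // check response.statusCode() here, then start the next batch
                    });
            request.putHeader("Content-Type", "application/x-ndjson");
            request.setChunked(true);

            String header = "{\"index\":{\"_index\":\"imports\"}}\n"; // built once, not per row

            for (JsonObject row : parsedRows) { // e.g. 128 rows per request
                request.write(header);
                request.write(row.encode() + "\n");
            }
            request.end();
        }
    }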

To accomplish this we have significantly simplified the source code and documented it accordingly. We added RxJava and turned the FileParser into an observable, and added a new event bus codec so that we can pass the new ImportEvent type over the event bus without serializing it (see the sketch below).
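
A minimal sketch of such a codec, assuming the Vert.x MessageCodec interface; the ImportEvent stub and codec name here are placeholders rather than the project's actual classes:

    import io.vertx.core.buffer.Buffer;
    import io.vertx.core.eventbus.MessageCodec;

    class ImportEvent { /* fields omitted in this sketch */ }

    class ImportEventCodec implements MessageCodec<ImportEvent, ImportEvent> {

        @Override
        public ImportEvent transform(ImportEvent event) {
            return event; // same instance on the local event bus, no serialization
        }

        @Override
        public void encodeToWire(Buffer buffer, ImportEvent event) {
            // only needed for a clustered event bus; the import runs in one JVM
            throw new UnsupportedOperationException("local delivery only");
        }

        @Override
        public ImportEvent decodeFromWire(int pos, Buffer buffer) {
            throw new UnsupportedOperationException("local delivery only");
        }

        @Override
        public String name() {
            return "import-event"; // codec name is illustrative
        }

        @Override
        public byte systemCodecID() {
            return -1; // always -1 for user codecs
        }
    }

A codec like this would be registered once with vertx.eventBus().registerDefaultCodec(ImportEvent.class, new ImportEventCodec()), after which ImportEvent instances can be sent directly over the local event bus.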

A summary of changes

  • Added RxJava and turned FileParser into an observable.
  • Renamed 'parsing' to 'uploading' in the UI, since 90% of the time is actually spent uploading the file.
  • Moved all logging statements into a single class for readability.
  • Replaced a custom atomic reference with Java's AtomicReference.
  • Moved the CommandLine importer into its own controller.
  • Encapsulated requests in an ImportEvent and passed them over the event bus with a custom codec.
  • Cleaned up the code and added javadoc to all classes.

And a summary of the summary

This release includes performance improvements as well as improvements to code quality.