Elasticsearch backed sail store #1662

Closed
hmottestad opened this issue Nov 4, 2019 · 3 comments
Labels
📶 enhancement issue is a new feature or improvement M1 Fixed in milestone 1

hmottestad commented Nov 4, 2019

Create a sail store that stores and queries triples in Elasticsearch 6.x.

Consider:

  • bulk loading
  • optimizing for "hasStatement"
  • using a cursor (scroll) when reading results
  • caching the transaction in a local memory store
  • caching in general for read operations
  • targeting higher performance for read operations than for writes
  • the intended use: storing small datasets where transaction support is not needed, such as reference data used in conjunction with your existing use of Elasticsearch
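One possible way to make "hasStatement" cheap, offered here as an assumption and not as a description of the eventual implementation, is to derive each document ID from the statement itself, so that an existence check can be a lookup by ID instead of a search. The class and method names below are hypothetical, and the statement parts are passed as plain strings to keep the sketch free of dependencies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a stable document ID from the parts of a statement.
final class StatementIds {

    private StatementIds() {
    }

    // Returns the SHA-256 digest of the statement's parts as 64 hex characters.
    // A null context is treated as an empty string (assumed to mean "no context").
    static String idFor(String subject, String predicate, String object, String context) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String part : new String[] { subject, predicate, object, context }) {
                byte[] bytes = (part == null ? "" : part).getBytes(StandardCharsets.UTF_8);
                // The length prefix keeps ("ab", "c") distinct from ("a", "bc").
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(idFor("urn:s", "urn:p", "urn:o", null));
    }
}
```

Because the ID is a pure function of the statement, indexing the same statement twice under that ID would overwrite one document instead of creating a duplicate. The fixed-length ID also stays small when the object is a very large literal.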

Use the following for testing:

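        <dependency>
            <groupId>pl.allegro.tech</groupId>
            <artifactId>embedded-elasticsearch</artifactId>
            <version>2.10.0</version>
            <scope>test</scope>
        </dependency>

import java.io.File;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Assumption: Files is AssertJ's utility class, which provides newTemporaryFolder().
import org.assertj.core.util.Files;
import org.junit.AfterClass;
import org.junit.BeforeClass;

import pl.allegro.tech.embeddedelasticsearch.EmbeddedElastic;
import pl.allegro.tech.embeddedelasticsearch.PopularProperties;

// The class name is illustrative; place these members in your own test class.
public class EmbeddedElasticsearchTest {

	private static EmbeddedElastic embeddedElastic;

	private static File installLocation = Files.newTemporaryFolder();

	@BeforeClass
	public static void beforeClass() throws IOException, InterruptedException {

		String version = "6.8.3";

		embeddedElastic = EmbeddedElastic.builder()
			.withElasticVersion(version)
			.withSetting(PopularProperties.TRANSPORT_TCP_PORT, 9350)
			.withSetting(PopularProperties.CLUSTER_NAME, "cluster")
			.withInstallationDirectory(installLocation)
			.withDownloadDirectory(new File("tempDownloads"))
			// Optional plugin, index, and mapping setup, left disabled:
//			.withPlugin("analysis-stempel")
//			.withIndex("cars", IndexSettings.builder()
//				.withType("car", getSystemResourceAsStream("car-mapping.json"))
//				.build())
//			.withIndex("books", IndexSettings.builder()
//				.withType(PAPER_BOOK_INDEX_TYPE, getSystemResourceAsStream("paper-book-mapping.json"))
//				.withType("audio_book", getSystemResourceAsStream("audio-book-mapping.json"))
//				.withSettings(getSystemResourceAsStream("elastic-settings.json"))
//				.build())
			.withStartTimeout(5, TimeUnit.MINUTES)
			.build();

		// Installs and starts the embedded Elasticsearch node.
		embeddedElastic.start();
	}

	@AfterClass
	public static void afterClass() {
		// Stop the embedded node so it does not outlive the test run.
		embeddedElastic.stop();
	}
}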
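The setup above binds the transport port to a fixed value (9350), which can collide with another process or a parallel test run. A minimal sketch of one way around that, using only the JDK, is to ask the operating system for a currently free port. The `FreePorts` class is hypothetical, and the approach has a small race: the port could be claimed by another process between this call and the moment Elasticsearch binds it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper for picking a free TCP port in tests.
final class FreePorts {

    private FreePorts() {
    }

    // Binding to port 0 lets the operating system choose an unused port.
    // The socket is closed again before returning, so the port is only
    // "probably free" by the time the caller uses it.
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("free port: " + findFreePort());
    }
}
```

The returned value could then be passed in place of the literal in `.withSetting(PopularProperties.TRANSPORT_TCP_PORT, 9350)`.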
@hmottestad hmottestad added the 📶 enhancement issue is a new feature or improvement label Nov 4, 2019
hmottestad added 3 commits that referenced this issue Nov 6, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added 2 commits that referenced this issue Nov 7, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad commented

We can now add triples and query for them. Next will be support for blank nodes and for values. After that, maybe bulk loading, scroll for getting statements, and removal of statements.
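Bulk loading generally comes down to buffering writes and sending them in batches instead of one request per statement. The sketch below shows only that buffering step and is independent of any Elasticsearch client. `BatchingBuffer` and its sink callback are hypothetical names, and a real sink would issue a bulk request with the batch it receives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical write buffer: collects items and hands them to a sink in batches.
final class BatchingBuffer<T> implements AutoCloseable {

    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Adds one item and flushes as soon as a full batch has accumulated.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is buffered, if anything, as a single batch.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }

    // Closing flushes the final, possibly partial, batch.
    @Override
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        try (BatchingBuffer<String> buffer =
                new BatchingBuffer<>(2, batch -> System.out.println("sending " + batch))) {
            buffer.add("statement 1");
            buffer.add("statement 2");
            buffer.add("statement 3");
        }
    }
}
```

Using try-with-resources means the trailing partial batch is sent when the buffer is closed, which would fit flushing at commit time.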

hmottestad added 2 commits that referenced this issue Nov 7, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Nov 8, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added 5 commits that referenced this issue Nov 10, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added 7 commits that referenced this issue Nov 11, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Nov 13, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added 6 commits that referenced this issue Nov 14, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added 15 commits that referenced this issue Nov 21, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Nov 24, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Nov 29, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Nov 29, 2019
…eration wrapper

Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Dec 1, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
hmottestad added a commit that referenced this issue Dec 3, 2019
* #1662 benchmarks for memory and native stores to see if changes in the other 1662 branch make things faster
* #1662 initial commit
* #1662 initial framework
* now allows data to be added
* #1662 run up elasticsearch for testing
* #1662 saving to elasticsearch works now
* #1662 retrieve statements
* fixed test
* #1662 support for bnodes in subject position
* #1662 support for literals
* formatter
* trying to use a shared connection
* #1662 use scroll and bulk for performance
* various tweaks
* formatting
* #1662 benchmark and client pool
* #1662 cleanup on GC
* #1662 sparql test
* #1662 faster delete and fast clear
* #1662 considerably faster deletes now
* fixed compile issues
* #1662 support read uncommitted, which is needed to support rollback
* #1662 rollback not working as expected
* #1662 fixed readcommittedwrapper
* #1662 hash based object, since they can be very large
* #1662 a couple more tests and sleep if elasticsearch fails
* #1662 configurable elasticsearch scroll timeout
* #1662 handle duplicates
* #1662 started integrating tests and namespace store
* #1662 integrating more compliance tests
* fixed notifying aspect of sail
* #1662 more passing tests
* more fixes
* #1662 introduced new interface for SailSink and reverted the old one
* #1662 faster delete by query because we don't need to flush
* trying out a write cache
* #1662 throw exception on init after shutdown
* #1662 persistent namespacestore
* fix null
* integrate more tests
* extracted a new extensible store
* default namespace store
* simplified code
* more code simplification
* more cleanup
* changes after review
* fixes from review
* more changes
* more fixes
* more deprecate by query
* use random ports for elasticsearch
* another benchmark
* update benchmarks
* nativestore performance improvements for isolation none
* update benchmarks
* fix
* updated benchmark
* updated elasticsearch benchmark
* fix
* updated benchmarks
* faster delete by query for elasticsearch
* comments
* removed isolation level check
* faster tests
* updated benchmark and renamed some files
* more benchmarks
* allow parallel transactions
* fixed test
* deprecate the deprecate method in favour of the pure statement based method
* #1662 more efficient deprecate in MemorySailStore
* benchmark
* performance optimization
* fixes for explicit and snapshot
* native sail store deprecate by query
* #1662 added config options for the elasticsearch store
* #1662 marked more code as experimental and implemented a FiltertingIteration wrapper
* #1662 created WAL test
* WAL test for data removal
* wip
* minor cleanup and renaming
* review fixes

Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
@hmottestad hmottestad added the 📝 Needs documentation Issue requires updates to the project documentation label Dec 3, 2019
@hmottestad hmottestad added this to the 3.1.0 milestone Dec 3, 2019
@hmottestad hmottestad self-assigned this Dec 18, 2019
hmottestad added 3 commits to eclipse-rdf4j/rdf4j-doc that referenced this issue Dec 18, 2019
Signed-off-by: Håvard Ottestad <hmottestad@gmail.com>
@hmottestad hmottestad removed the 📝 Needs documentation Issue requires updates to the project documentation label Dec 18, 2019
@abrokenjester abrokenjester added the M1 Fixed in milestone 1 label Dec 19, 2019
3 participants