
Refactor pluginservice #12367

Status: Closed (wanted to merge 32 commits)
Conversation

rmuir (Contributor) commented Jul 21, 2015

Today plugins are all dumped into the main classloader and can easily have jar hell dependencies with each other (a great example is the python scripting plugin). Also, pluginservice is messy: there is way too much leniency, and most logic is accomplished with heuristics.

Instead we need mandatory metadata that is explicit: enough to easily iterate/manipulate/validate plugins from the filesystem. This simplifies the code and we can move forward to make this stuff really reliable. There are currently also a ton of ways to do things, I removed any esoteric options/unnecessary stuff from this code here: back to basics guys.

Instead of dumping all code into one classloader, plugins can just have a standard URLClassLoader with ES loader as the parent. We don't need any special classloader because we simply don't allow conflicts with the parent (jarhell will simply fail).
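The isolated-loader idea above can be sketched in plain Java. This is a hypothetical illustration, not the actual PluginsService code: the class and method names here (`PluginLoaderSketch`, `loaderFor`) are invented for the example. The key point is that a stock `URLClassLoader` with the Elasticsearch loader as parent gives parent-first delegation, so a plugin cannot shadow core classes; conflicting jars are meant to be rejected up front by the jar hell check rather than papered over by a custom loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class PluginLoaderSketch {
    // Hypothetical: build an isolated classloader for one plugin from its
    // jar URLs, with the Elasticsearch (application) classloader as parent.
    // Normal parent-first delegation applies, so core classes always come
    // from the parent and a plugin cannot redefine them.
    static ClassLoader loaderFor(URL[] pluginJars, ClassLoader esLoader) {
        return new URLClassLoader(pluginJars, esLoader);
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = PluginLoaderSketch.class.getClassLoader();
        ClassLoader plugin = loaderFor(new URL[0], parent);
        // Parent-first delegation: a core class resolves to the exact same
        // Class object that the parent loader sees.
        Class<?> c = plugin.loadClass("java.lang.String");
        System.out.println(c == String.class); // true
    }
}
```

Because delegation is strictly parent-first, two isolated plugins can carry different versions of the same third-party jar without seeing each other at all, which is exactly what the shared-classloader model could not guarantee.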

Unfortunately some plugins might not be so clean, and share hard dependencies with each other. For now those plugins can set isolated to false until they clean their act up. All plugins that do this just share a loader with each other. We should consider this deprecated, since it makes things more complicated than they need to be. Loose coupling, please.

In all cases we aggressively jarhell check, and all plugin manipulation is underneath securitymanager protection.

With this branch all plugin integration tests are passing, and moreover the python script that loads all of them at once finally passes:

[INFO ][plugins                  ] [smoke_tester] loaded [lang-python, analysis-kuromoji, analysis-smartcn, cloud-gce, analysis-stempel, delete-by-query, cloud-aws, analysis-phonetic, cloud-azure, lang-javascript, analysis-icu], sites []

I disabled most of the unit tests, but they can be fixed / new ones written after sleep. There are also a lot of checks we should add in the future, and PluginManager should reuse some of this logic to prevent installing a plugin that isn't going to work or somehow screw the installation.

Closes #11917

@rmuir rmuir added the WIP label Jul 21, 2015
uschindler (Contributor) commented:

+1 Nice work!

rjernst and others added 26 commits July 21, 2015 14:42
…ransportInstanceSingleOperationAction and rely on transport service to execute locally. (forking thread etc.)

Change TransportInstanceSingleOperationAction to have shardActionHandler too, so we can execute locally without endless spinning.
In order to unify the handling and reuse the CLITool infrastructure
the plugin manager should make use of this as well.

This obsoletes the -i and --install options but requires the user
to use `install` as the first argument of the CLI.

This is basically just a port of the existing functionality, which
is also the reason why this is not a refactoring of the plugin manager,
which will come in a separate commit.
The new plugin manager parser was not called correctly in the scripts.
In addition the plugin manager now creates a plugins/ directory in case
it does not exist.

Also the integration tests called the plugin manager in the deprecated way.
The issue asks to double-check thread pool rejection. I did, and don't see problems with it.
…lled from the update cluster state thread

This is important for correct handling of the joining thread. This causes assertions to trip in our test runs. See http://build-us-00.elastic.co/job/es_g1gc_master_metal/11653/ as an example

Closes elastic#12372
The release and smoke test python scripts used to install
plugins in the old fashion.

Also the BATS testing suite installed/removed plugins in that
way. Here the marvel tests have been removed, as marvel currently
does not work with the master branch.

In addition documentation has been updated as well, where it was
still missing.
There is no need to maintain additional state on the index routing table as to whether a primary was allocated after API creation; we already hold all this information in the UnassignedInfo class.
closes elastic#12374
- field names cannot be mapped with `.` in them
- fixed asciidoc issue where the list was not recognized as a list
…ge path

In preparation for the move to building the core zip, tar.gz, rpm, and deb as separate modules, refactored check_license_and_sha.pl to:

* accept a license dir and path to the package to check on the command line
* to be able to extract zip, tar.gz, deb, and rpm
* all packages except rpm will work on Windows
The help files use a Unix-based file separator, whereas the test
relies on the help being based on the file-system separator.

This commit fixes the test to remove all `\r` characters before
comparing strings.

The test has also been moved into its own CliToolTestCase, as it does
not need to be an integration test.
instead fail the failsafe plugin, which means the external cluster will still get shut down
With the `recover_on_any_node` setting, these unit tests check that the
correct node list and versions are returned.
…and do not unnecessarily duplicate Lucene segments in Azure Storage

This commit addresses an issue that was leading to snapshot corruption for snapshots stored as blobs in Azure Storage.

The underlying issue is that in cases when multiple snapshots of an index were taken and persisted into Azure Storage, snapshots subsequent
to the first would repeatedly overwrite the snapshot files. This issue renders all snapshots except the final one useless.

The root cause of this is due to String concatenation involving null. In particular, to list all of the blobs in a snapshot directory in
Azure the code would use the method listBlobsByPrefix where the prefix is null. In the listBlobsByPrefix method, the path keyPath + prefix
is constructed. However, per 5.1.11, 5.4 and 15.18.1 of the Java Language Specification, the reference null is first converted to the string
"null" before performing the concatenation. This leads to no blobs being returned and therefore the snapshot mechanism would operate as if
it were writing the first snapshot of the index. The fix is simply to check if prefix is null and handle the concatenation accordingly.
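The null-concatenation bug described above is easy to reproduce in isolation. The sketch below is a hypothetical reconstruction (the helper names `buggyPrefix`/`fixedPrefix` are invented for illustration; the real code builds the prefix inside `listBlobsByPrefix`), but the language behavior it demonstrates is exactly the one cited from the Java Language Specification: concatenating `null` yields the literal string "null".

```java
public class NullPrefixDemo {
    // Hypothetical reconstruction of the bug: the blob-listing prefix was
    // built as keyPath + prefix. When prefix is null, Java string
    // concatenation converts the null reference to the four characters
    // "null" (JLS 5.1.11 / 15.18.1), so the listing matched no blobs and
    // every snapshot behaved like the first one.
    static String buggyPrefix(String keyPath, String prefix) {
        return keyPath + prefix; // null becomes "null"
    }

    // The fix: only append the prefix when it is non-null.
    static String fixedPrefix(String keyPath, String prefix) {
        return prefix == null ? keyPath : keyPath + prefix;
    }

    public static void main(String[] args) {
        System.out.println(buggyPrefix("indices/myindex/0/", null)); // indices/myindex/0/null
        System.out.println(fixedPrefix("indices/myindex/0/", null)); // indices/myindex/0/
    }
}
```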

Upon fixing this issue so that subsequent snapshots would no longer overwrite earlier snapshots, it was discovered that the snapshot metadata
returned by the listBlobsByPrefix method was not sufficient for the snapshot layer to detect whether or not the Lucene segments had already
been copied to the Azure storage layer in an earlier snapshot. This led the snapshot layer to unnecessarily duplicate these Lucene segments
in Azure Storage.

The root cause of this is due to known behavior in the CloudBlobContainer.getBlockBlobReference method in the Azure API. Namely, this method
does not fetch blob attributes from Azure. As such, the lengths of all the blobs appeared to the snapshot layer to be of length zero and
therefore they would compare as not equal to any new blobs that the snapshot layer is going to persist. To remediate this, the method
CloudBlockBlob.downloadAttributes must be invoked. This will fetch the attributes from Azure Storage so that a proper comparison of the
blobs can be performed.

Closes elastic/elasticsearch-cloud-azure#51, closes elastic/elasticsearch-cloud-azure#99
First batch of unit tests to verify the behavior of replica allocator
When performing an operation on a primary, the state is captured and the
operation is performed on the primary shard. The original request is
then modified to increment the version of the operation as preparation
for it to be sent to the replicas.

If the request first fails on the primary during the translog sync
(because the Engine is already closed due to shadow primaries closing
the engine on relocation), then the operation is retried on the new primary
after being modified for the replica shards. It will then fail due to the
version being incorrect (the document does not yet exist but the request
expects a version of "1").

Order of operations:

- Request is executed against primary
- Request is modified (version incremented) so it can be sent to replicas
- Engine's translog is fsync'd if necessary (failing, and throwing an exception)
- Modified request is retried against new primary

This change ignores the exception where the engine is already closed
when syncing the translog (similar to how we ignore exceptions when
refreshing the shard if the ?refresh=true flag is used).
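The shape of that fix can be sketched as follows. This is a simplified stand-in, not the actual Elasticsearch code: `SyncSketch`, its `Translog` interface, and the local `AlreadyClosedException` class are all invented for the example (the real exception is Lucene's `org.apache.lucene.store.AlreadyClosedException`). The point is only the control flow: swallow the already-closed case during the fsync so the already-modified request is never retried against the new primary with the wrong version.

```java
public class SyncSketch {
    // Hypothetical stand-in for Lucene's AlreadyClosedException.
    static class AlreadyClosedException extends RuntimeException {}

    // Minimal stand-in for the translog's sync operation.
    interface Translog {
        void sync();
    }

    // Attempt the translog fsync, but ignore the case where the engine was
    // already closed (e.g. a shadow primary closing its engine on
    // relocation). Returns whether the sync actually ran to completion.
    static boolean syncIgnoringClosed(Translog translog) {
        try {
            translog.sync();
            return true;
        } catch (AlreadyClosedException e) {
            // Engine already closed: treat as a no-op instead of failing
            // the request, mirroring how ?refresh=true failures are ignored.
            return false;
        }
    }
}
```

Any other exception still propagates; only the benign already-closed race is suppressed.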
rjernst (Member) commented Jul 22, 2015:

LGTM!

@rmuir rmuir added v2.0.0-beta1 and removed WIP labels Jul 22, 2015
@rmuir rmuir closed this in 4040f19 Jul 22, 2015
@clintongormley clintongormley added >breaking :Core/Infra/Plugins Plugin API and infrastructure labels Jul 23, 2015
@clintongormley clintongormley changed the title refactor pluginservice Refactor pluginservice Jul 23, 2015
imotov added a commit to imotov/elasticsearch-native-script-example that referenced this pull request Jul 23, 2015
imotov added a commit to imotov/elasticsearch-native-script-example that referenced this pull request Jul 28, 2015
imotov added a commit to imotov/elasticsearch-analysis-morphology that referenced this pull request Jul 28, 2015
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Aug 4, 2015
Successfully merging this pull request may close these issues:

- bin/elasticsearch should include plugins in classpath