Add Azure ARM discovery plugin #22679

Open
wants to merge 26 commits into
base: master
from

Member

dadoonet commented Jan 18, 2017

Supported settings so far:

cloud:
    azure-arm:
        client_id: FILL_WITH_YOUR_CLIENT_ID
        secret: FILL_WITH_YOUR_SECRET
        tenant_id: FILL_WITH_YOUR_TENANT
        subscription_id: FILL_WITH_YOUR_SUBSCRIPTION_ID

discovery:
    zen.hosts_provider: azure-arm
    azure-arm:
        host:
            type: private_ip
            name: azure-esnode-master-*
            group_name: azure-preprod
            region: westeurope
        refresh_interval: 10s

Closes #19146

/**
 * Azure ARM secret
 */
public static final Setting<String> SECRET_SETTING =
Setting.simpleString("cloud.azure-arm.secret", Property.NodeScope, Property.Filtered);

rjernst Jan 18, 2017

Member

I don't think we should be adding new secret settings that are plain strings.

dadoonet Jan 18, 2017

Member

Agreed.
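
To illustrate the concern about plain-string secrets, here is a minimal, stdlib-only sketch of a secret holder that keeps the value in a `char[]`, hides it from `toString()`, and wipes it on close. The class name `SecretChars` and its methods are hypothetical and are not the Elasticsearch API; in the plugin itself, the natural route would be a keystore-backed secure setting rather than `Setting.simpleString`.

```java
import java.util.Arrays;

/** Hypothetical holder for a secret value; NOT an Elasticsearch class. */
final class SecretChars implements AutoCloseable {
    private final char[] chars;
    private boolean closed;

    SecretChars(char[] source) {
        // Copy so the caller can wipe its own buffer independently.
        this.chars = Arrays.copyOf(source, source.length);
    }

    /** Returns a copy of the secret; the caller is responsible for wiping it. */
    char[] copyChars() {
        if (closed) {
            throw new IllegalStateException("secret already closed");
        }
        return Arrays.copyOf(chars, chars.length);
    }

    boolean isClosed() {
        return closed;
    }

    /** Never prints the secret, so it cannot leak through logging. */
    @Override
    public String toString() {
        return "SecretChars[***]";
    }

    /** Overwrites the secret in memory. */
    @Override
    public void close() {
        Arrays.fill(chars, '\0');
        closed = true;
    }
}
```

Used in a try-with-resources block, the secret is wiped as soon as the credentials have been handed to the client.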

You can filter virtual machines you would like to connect to by entering a name here. It can be a wildcard
like `azure-esnode-*`.

`discovery.azure-arm.host.group_name`::

Mpdreamz Jan 19, 2017

Member

Can we rename this to discovery.azure-arm.resource_group? "group" by itself is a tad ambiguous, and resource group is a well-understood Azure concept.

dadoonet Jan 19, 2017

Member

Sure! Will do!

Member

dadoonet commented Jan 20, 2017

@Mpdreamz I updated the repo with new changes:

  • Fixed the missing lib issue
  • Renamed group_name to resource_group
  • Fixed an issue when using a wildcard in resource_group

Member

Mpdreamz commented Jan 20, 2017

Now getting security manager exceptions @dadoonet

[2017-01-20T12:52:57,233][WARN ][o.e.d.z.ZenDiscovery ] [data-0] Ping execution failed

java.util.concurrent.ExecutionException: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")

Member

jasontedor commented Jan 20, 2017

We need the entire stack trace here.

Member

dadoonet commented Jan 20, 2017

@Mpdreamz Interesting. Can you start elasticsearch with this option -Djava.security.debug="access,failure"?

ES_JAVA_OPTS='-Djava.security.debug="access,failure"' ./bin/elasticsearch

Member

Mpdreamz commented Jan 20, 2017

My bad @jasontedor: click to expand

[2017-01-20T12:52:57,233][WARN ][o.e.d.z.ZenDiscovery ] [data-0] Ping execution failed
java.util.concurrent.ExecutionException: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_121]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_121]
at org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:1000) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:860) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:372) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.ZenDiscovery.access$3800(ZenDiscovery.java:80) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1176) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:1.8.0_121]
at java.security.AccessController.checkPermission(AccessController.java:884) ~[?:1.8.0_121]
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549) ~[?:1.8.0_121]
at java.lang.Class.checkMemberAccess(Class.java:2348) ~[?:1.8.0_121]
at java.lang.Class.getEnclosingMethod(Class.java:1037) ~[?:1.8.0_121]
at sun.reflect.generics.scope.ClassScope.computeEnclosingScope(ClassScope.java:50) ~[?:?]
at sun.reflect.generics.scope.AbstractScope.getEnclosingScope(AbstractScope.java:78) ~[?:?]
at sun.reflect.generics.scope.AbstractScope.lookup(AbstractScope.java:96) ~[?:?]
at sun.reflect.generics.factory.CoreReflectionFactory.findTypeVariable(CoreReflectionFactory.java:110) ~[?:?]
at sun.reflect.generics.visitor.Reifier.visitTypeVariableSignature(Reifier.java:165) ~[?:?]
at sun.reflect.generics.tree.TypeVariableSignature.accept(TypeVariableSignature.java:43) ~[?:?]
at sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68) ~[?:?]
at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138) ~[?:?]
at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49) ~[?:?]
at sun.reflect.generics.repository.ClassRepository.getSuperclass(ClassRepository.java:90) ~[?:?]
at java.lang.Class.getGenericSuperclass(Class.java:777) ~[?:1.8.0_121]
at com.fasterxml.jackson.core.type.TypeReference.<init>(TypeReference.java:33) ~[jackson-core-2.8.6.jar:2.8.6]
at com.microsoft.rest.serializer.JacksonMapperAdapter$1.<init>(JacksonMapperAdapter.java:179) ~[?:?]
at com.microsoft.rest.serializer.JacksonMapperAdapter.deserialize(JacksonMapperAdapter.java:179) ~[?:?]
at com.microsoft.rest.ServiceResponseBuilder.buildBody(ServiceResponseBuilder.java:289) ~[?:?]
at com.microsoft.rest.ServiceResponseBuilder.build(ServiceResponseBuilder.java:141) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesInner.listDelegate(VirtualMachinesInner.java:1101) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesInner.access$600(VirtualMachinesInner.java:46) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesInner$36.call(VirtualMachinesInner.java:1091) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesInner$36.call(VirtualMachinesInner.java:1087) ~[?:?]
at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:69) ~[?:?]
at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$RequestArbiter.request(RxJavaCallAdapterFactory.java:173) ~[?:?]
at rx.Subscriber.setProducer(Subscriber.java:211) ~[?:?]
at rx.internal.operators.OnSubscribeMap$MapSubscriber.setProducer(OnSubscribeMap.java:102) ~[?:?]
at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$CallOnSubscribe.call(RxJavaCallAdapterFactory.java:152) ~[?:?]
at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$CallOnSubscribe.call(RxJavaCallAdapterFactory.java:138) ~[?:?]
at rx.Observable.unsafeSubscribe(Observable.java:9861) ~[?:?]
at rx.internal.operators.OnSubscribeMap.call(OnSubscribeMap.java:48) ~[?:?]
at rx.internal.operators.OnSubscribeMap.call(OnSubscribeMap.java:33) ~[?:?]
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[?:?]
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[?:?]
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[?:?]
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[?:?]
at rx.Observable.subscribe(Observable.java:9957) ~[?:?]
at rx.Observable.subscribe(Observable.java:9924) ~[?:?]
at rx.observables.BlockingObservable.blockForSingle(BlockingObservable.java:445) ~[?:?]
at rx.observables.BlockingObservable.single(BlockingObservable.java:342) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesInner.list(VirtualMachinesInner.java:1006) ~[?:?]
at com.microsoft.azure.management.compute.implementation.VirtualMachinesImpl.listByGroup(VirtualMachinesImpl.java:68) ~[?:?]
at org.elasticsearch.cloud.azure.arm.AzureManagementServiceImpl.lambda$getVirtualMachines$1(AzureManagementServiceImpl.java:103) ~[?:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_121]
at org.elasticsearch.cloud.azure.arm.AzureManagementServiceImpl.getVirtualMachines(AzureManagementServiceImpl.java:92) ~[?:?]
at org.elasticsearch.discovery.azure.arm.AzureArmUnicastHostsProvider.buildDynamicNodes(AzureArmUnicastHostsProvider.java:100) ~[?:?]
at org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:302) ~[elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:279) ~[elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:993) ~[elasticsearch-6.0.0-alpha1-SNAPSHOT.jar:6.0.0-alpha1-SNAPSHOT]
... 8 more

Member

dadoonet commented Jan 20, 2017

I opened Azure/autorest-clientruntime-for-java#136 on the Azure side.
The problem here is that Jackson is loaded under the core security manager policy, which does not allow java.lang.RuntimePermission "accessDeclaredMembers". So even if I change the policy for the plugin, when the Azure SDK calls Jackson, the check runs against the core policy.

Let's see what the Azure team can do to help us in that context.

If they can't do anything, I believe we would have to somehow shade a version of the Azure SDK and Jackson into a single flat jar, in which we can then relocate the Jackson classes... I'm not a fan of this TBH.

Member

rjernst commented Jan 20, 2017

That looks like an issue in Jackson itself. The TypeReference ctor calls Class.getGenericSuperclass(), which from an anonymous class requires that permission (and the adapter in Azure creates an anonymous TypeReference instance). It seems like a web of complicated design that may not be possible to untangle (even with a flat jar).
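
The pattern described here can be reproduced with the JDK alone. The sketch below uses a hypothetical `TypeRef` base class (a stand-in, not Jackson's `TypeReference`) whose constructor calls `getClass().getGenericSuperclass()` on an anonymous subclass. When the captured type argument is a method-level type variable such as `T`, reflection has to look up the enclosing method to resolve it, which corresponds to the `Class.getEnclosingMethod` frame in the stack trace above. The sketch runs without a security manager, so it only shows which reflective path is taken, not the denial itself.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.List;

final class TypeTokenSketch {
    /** Hypothetical stand-in for a "type reference" base class. */
    abstract static class TypeRef<T> {
        final Type type;

        protected TypeRef() {
            // Reads the generic superclass of the (anonymous) subclass.
            Type superClass = getClass().getGenericSuperclass();
            this.type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
        }
    }

    /** Captures a fully concrete type argument. */
    static Type captureConcrete() {
        return new TypeRef<List<String>>() { }.type;
    }

    /** Captures a type variable declared by this generic method. */
    static <T> Type captureTypeVariable() {
        return new TypeRef<T>() { }.type;
    }

    /** True if the type is a type variable declared on a method. */
    static boolean declaredByMethod(Type t) {
        return t instanceof TypeVariable
            && ((TypeVariable<?>) t).getGenericDeclaration() instanceof Method;
    }

    public static void main(String[] args) {
        System.out.println(captureConcrete().getTypeName());
        System.out.println(declaredByMethod(captureTypeVariable()));
    }
}
```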

Member

dadoonet commented Jan 21, 2017


ejsmith commented Jan 24, 2017

Will this support scalesets?


hglkrijger commented Apr 20, 2017

@dadoonet, just wondering if this is still getting merged?

@clintongormley clintongormley added v6.0.0 and removed v6.0.0-alpha1 labels May 3, 2017

dadoonet added some commits Jul 19, 2016

Add Azure ARM discovery plugin
Supported settings so far:

```yml
cloud:
    azure-arm:
        client_id: FILL_WITH_YOUR_CLIENT_ID
        secret: FILL_WITH_YOUR_SECRET
        tenant_id: FILL_WITH_YOUR_TENANT
        subscription_id: FILL_WITH_YOUR_SUBSCRIPTION_ID

discovery:
    zen.hosts_provider: azure-arm
    azure-arm:
        host:
            type: private_ip
            name: azure-esnode-master-*
            group_name: azure-preprod
            region: westeurope
        refresh_interval: 10s
```

Closes #19146
Add missing commons-codec lib
And cleanup a bit gradle build
Move group_name to resource_group
Also adds more Javadoc on settings.
Also fixed a bug when using wildcards for group names. Azure API does not support wildcards so we need to get all the VMs in that case and filter on our side.
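
The client-side filtering mentioned in this commit message could look roughly like the sketch below. It covers only the filtering step, after all names have been fetched; the class and method names are hypothetical, and treating `*` as the only wildcard character is an assumption rather than something the PR states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class WildcardFilterSketch {
    /** Converts a simple '*' wildcard expression into a regex; all other characters are literal. */
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(quoteIfAny(literal)).append(".*");
                literal.setLength(0);
            } else {
                literal.append(c);
            }
        }
        regex.append(quoteIfAny(literal));
        return Pattern.compile(regex.toString());
    }

    private static String quoteIfAny(StringBuilder literal) {
        return literal.length() == 0 ? "" : Pattern.quote(literal.toString());
    }

    /** Keeps only the names that match the wildcard expression as a whole. */
    static List<String> filter(List<String> names, String wildcard) {
        Pattern pattern = toPattern(wildcard);
        List<String> matches = new ArrayList<>();
        for (String name : names) {
            if (pattern.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```

An expression without `*` behaves as an exact match, so the same code path can serve both cases once the full list is available.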

@dadoonet dadoonet force-pushed the dadoonet:pr/azure-arm branch from 12d6cf3 to db6ea7b May 24, 2017

dadoonet added some commits Jul 26, 2018

Merge branch 'master' into pr/azure-arm
# Conflicts:
#	docs/reference/cat/plugins.asciidoc
#	settings.gradle
Move all classes under a single package
Adapt code for master changes.

We still have issues when running the code because the Azure client
is not Closeable and it fails when running the test:

```
Suite: org.elasticsearch.discovery.azure.arm.AzureArmClientTests
  1> [2018-07-26T17:27:39,577][WARN ][o.e.b.JNANatives         ] Unable to lock JVM Memory: error=78, reason=Function not implemented
  1> [2018-07-26T17:27:39,584][WARN ][o.e.b.JNANatives         ] This can result in part of the JVM being swapped out.
  1> [2018-07-26T11:27:47,547][INFO ][o.e.d.a.a.AzureArmClientTests] [testConnectWithKeySecret]: before test
  2> [pool-2-thread-1] INFO com.microsoft.aad.adal4j.AuthenticationAuthority - [Correlation ID: 3270e9d0-cd58-4fd9-8511-9cad48e2736f] Instance discovery was successful
  1> [2018-07-26T11:27:56,259][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='ELASTIC-SA', name='base6', region='eastus', publicIp='null', privateIp='10.0.0.4', powerState='DEALLOCATED'}
  1> [2018-07-26T11:27:56,259][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='LOGSTASH-DEMO', name='logstash', region='centralus', publicIp='13.89.222.47', privateIp='10.0.1.9', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,259][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='LOGSTASH-DEMO', name='lsdata-0', region='centralus', publicIp='null', privateIp='10.0.1.6', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,259][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='LOGSTASH-DEMO', name='lsdata-1', region='centralus', publicIp='null', privateIp='10.0.1.7', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,260][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='LOGSTASH-DEMO', name='lsdata-2', region='centralus', publicIp='null', privateIp='10.0.1.8', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,260][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='LOGSTASH-DEMO', name='lskibana', region='centralus', publicIp='13.89.232.140', privateIp='10.0.1.5', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,260][INFO ][o.e.d.a.a.AzureArmClientTests]  -> AzureVirtualMachine{groupName='DPI-ARM-TEST', name='dpi-arm-test', region='null', publicIp='40.89.139.46', privateIp='10.0.2.4', powerState='RUNNING'}
  1> [2018-07-26T11:27:56,295][INFO ][o.e.d.a.a.AzureArmClientTests] [testConnectWithKeySecret]: after test
  2> juil. 26, 2018 5:28:56 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> AVERTISSEMENT: Will linger awaiting termination of 2 leaked thread(s).
  2> juil. 26, 2018 5:29:01 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> GRAVE: 2 threads leaked from SUITE scope at org.elasticsearch.discovery.azure.arm.AzureArmClientTests:
  2>    1) Thread[id=21, name=RxIoScheduler-1 (Evictor), state=TIMED_WAITING, group=TGRP-AzureArmClientTests]
  2>         at java.base@10.0.2/jdk.internal.misc.Unsafe.park(Native Method)
  2>         at java.base@10.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
  2>         at java.base@10.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2117)
  2>         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
  2>         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1061)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  2>         at java.base@10.0.2/java.lang.Thread.run(Thread.java:844)
  2>    2) Thread[id=20, name=Okio Watchdog, state=WAITING, group=TGRP-AzureArmClientTests]
  2>         at java.base@10.0.2/java.lang.Object.wait(Native Method)
  2>         at java.base@10.0.2/java.lang.Object.wait(Object.java:328)
  2>         at app//okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:338)
  2>         at app//okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:313)
  2> juil. 26, 2018 5:29:01 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> INFOS: Starting to interrupt leaked threads:
  2>    1) Thread[id=21, name=RxIoScheduler-1 (Evictor), state=TIMED_WAITING, group=TGRP-AzureArmClientTests]
  2>    2) Thread[id=20, name=Okio Watchdog, state=WAITING, group=TGRP-AzureArmClientTests]
  2> juil. 26, 2018 5:29:04 PM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> GRAVE: There are still zombie threads that couldn't be terminated:
  2>    1) Thread[id=21, name=RxIoScheduler-1 (Evictor), state=TIMED_WAITING, group=TGRP-AzureArmClientTests]
  2>         at java.base@10.0.2/jdk.internal.misc.Unsafe.park(Native Method)
  2>         at java.base@10.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
  2>         at java.base@10.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2117)
  2>         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
  2>         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1061)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121)
  2>         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  2>         at java.base@10.0.2/java.lang.Thread.run(Thread.java:844)
  2>    2) Thread[id=20, name=Okio Watchdog, state=WAITING, group=TGRP-AzureArmClientTests]
  2>         at java.base@10.0.2/java.lang.Object.wait(Native Method)
  2>         at java.base@10.0.2/java.lang.Object.wait(Object.java:328)
  2>         at app//okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:338)
  2>         at app//okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:313)
  2> REPRODUCE WITH: ./gradlew :plugins:discovery-azure-arm:test -Dtests.seed=25079E754DA2AE86 -Dtests.class=org.elasticsearch.discovery.azure.arm.AzureArmClientTests -Dtests.security.manager=true -Dtests.locale=fr-FR -Dtests.timezone=Europe/Paris
  2> REPRODUCE WITH: ./gradlew :plugins:discovery-azure-arm:test -Dtests.seed=25079E754DA2AE86 -Dtests.class=org.elasticsearch.discovery.azure.arm.AzureArmClientTests -Dtests.security.manager=true -Dtests.locale=fr-FR -Dtests.timezone=Europe/Paris
  2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=1723, maxMBSortInHeap=7.495275164367896, sim=RandomSimilarity(queryNorm=false): {}, locale=en-NR, timezone=America/Indiana/Vincennes
  2> NOTE: Mac OS X 10.13.6 x86_64/Oracle Corporation 10.0.2 (64-bit)/cpus=4,threads=3,free=416855592,total=536870912
  2> NOTE: All tests run in this JVM: [AzureArmClientTests]
ERROR   0.00s J0 | AzureArmClientTests (suite) <<< FAILURES!
   > Throwable #1: com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.elasticsearch.discovery.azure.arm.AzureArmClientTests:
   >    1) Thread[id=21, name=RxIoScheduler-1 (Evictor), state=TIMED_WAITING, group=TGRP-AzureArmClientTests]
   >         at java.base@10.0.2/jdk.internal.misc.Unsafe.park(Native Method)
   >         at java.base@10.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
   >         at java.base@10.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2117)
   >         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
   >         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1061)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
   >         at java.base@10.0.2/java.lang.Thread.run(Thread.java:844)
   >    2) Thread[id=20, name=Okio Watchdog, state=WAITING, group=TGRP-AzureArmClientTests]
   >         at java.base@10.0.2/java.lang.Object.wait(Native Method)
   >         at java.base@10.0.2/java.lang.Object.wait(Object.java:328)
   >         at app//okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:338)
   >         at app//okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:313)
   >    at __randomizedtesting.SeedInfo.seed([25079E754DA2AE86]:0)Throwable #2: com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie threads that couldn't be terminated:
   >    1) Thread[id=21, name=RxIoScheduler-1 (Evictor), state=TIMED_WAITING, group=TGRP-AzureArmClientTests]
   >         at java.base@10.0.2/jdk.internal.misc.Unsafe.park(Native Method)
   >         at java.base@10.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
   >         at java.base@10.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2117)
   >         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
   >         at java.base@10.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1061)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121)
   >         at java.base@10.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
   >         at java.base@10.0.2/java.lang.Thread.run(Thread.java:844)
   >    2) Thread[id=20, name=Okio Watchdog, state=WAITING, group=TGRP-AzureArmClientTests]
   >         at java.base@10.0.2/java.lang.Object.wait(Native Method)
   >         at java.base@10.0.2/java.lang.Object.wait(Object.java:328)
   >         at app//okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:338)
   >         at app//okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:313)
   >    at __randomizedtesting.SeedInfo.seed([25079E754DA2AE86]:0)
Completed [2/2] on J0 in 86.95s, 1 test, 2 errors <<< FAILURES!
```
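
For context on why a `Closeable` client matters to the thread-leak check above, here is a minimal sketch of a client that owns its worker thread and releases it on `close()`. The class `ClosableClientSketch` is hypothetical and does not reflect the Azure SDK. The leaked threads in the log (`RxIoScheduler`, `Okio Watchdog`) belong to third-party libraries, so this shows the general pattern only, not a fix for this test.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical client that owns a worker thread and releases it on close(). */
final class ClosableClientSketch implements Closeable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Runs a trivial task on the worker thread and waits for its result. */
    String fetchBlocking() {
        try {
            return executor.submit(() -> "ok").get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    boolean isTerminated() {
        return executor.isTerminated();
    }

    /** Stops the worker thread so no thread outlives the client. */
    @Override
    public void close() {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```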
Update to Azure 1.13.0
Reduce the number of dependencies as we can directly call
the ComputeManagement class without needing to call Azure main class
which is calling a lot of features we don't need.

Member

dadoonet commented Jul 27, 2018

@elasticmachine retest this please

Fix missing security policy
Note that this seems to be working correctly with a manual test.

But this is generating this log that we probably would like to avoid:

```
[2018-07-27T14:57:12,316][INFO ][o.e.t.TransportService   ] [ws4qSER] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[pool-2-thread-1] INFO com.microsoft.aad.adal4j.AuthenticationAuthority - [Correlation ID: f0d2f463-36db-4316-8316-f9e6ec6216cd] Instance discovery was successful
[2018-07-27T14:57:28,081][INFO ][o.e.c.s.MasterService    ] [ws4qSER] zen-disco-elected-as-master ([0] nodes joined)[, ], reason: master node changed {previous [], current [{ws4qSER}{ws4qSER2QuajZxlFrIsZGw}{ziPXKAzDQZGO8xx2SL
```

Its format is different from the standard one.

Member

dadoonet commented Jul 27, 2018

This now seems to work well from an Elasticsearch node:

[2018-07-27T15:15:16,608][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] Ignoring machine [base6/10.0.0.4] because of [DEALLOCATED] power status
[2018-07-27T15:15:16,609][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [logstash]: [10.0.1.9]
[2018-07-27T15:15:16,609][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.1.9, transport_address 10.0.1.9:9300
[2018-07-27T15:15:16,609][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [lsdata-0]: [10.0.1.6]
[2018-07-27T15:15:16,610][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.1.6, transport_address 10.0.1.6:9300
[2018-07-27T15:15:16,610][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [lsdata-1]: [10.0.1.7]
[2018-07-27T15:15:16,610][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.1.7, transport_address 10.0.1.7:9300
[2018-07-27T15:15:16,610][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [lsdata-2]: [10.0.1.8]
[2018-07-27T15:15:16,611][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.1.8, transport_address 10.0.1.8:9300
[2018-07-27T15:15:16,611][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [lskibana]: [10.0.1.5]
[2018-07-27T15:15:16,611][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.1.5, transport_address 10.0.1.5:9300
[2018-07-27T15:15:16,611][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] found networkAddress for [dpi-arm-test]: [10.0.2.4]
[2018-07-27T15:15:16,612][TRACE][o.e.d.a.a.AzureArmUnicastHostsProvider] adding 10.0.2.4, transport_address 10.0.2.4:9300
[2018-07-27T15:15:16,612][DEBUG][o.e.d.a.a.AzureArmUnicastHostsProvider] 6 hosts(s) added
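
The behaviour visible in these log lines (skip machines that are not running, pick an address, append the transport port) can be sketched as follows. The `Vm` class and `transportAddresses` method are hypothetical simplifications, and accepting only the `RUNNING` power state is an assumption inferred from the `DEALLOCATED` message rather than taken from the plugin source.

```java
import java.util.ArrayList;
import java.util.List;

final class HostsSketch {
    /** Hypothetical, simplified view of a virtual machine. */
    static final class Vm {
        final String name;
        final String privateIp;
        final String publicIp;
        final String powerState;

        Vm(String name, String privateIp, String publicIp, String powerState) {
            this.name = name;
            this.privateIp = privateIp;
            this.publicIp = publicIp;
            this.powerState = powerState;
        }
    }

    /** Builds "ip:port" transport addresses for machines that are running and have an address. */
    static List<String> transportAddresses(List<Vm> vms, boolean usePrivateIp, int port) {
        List<String> addresses = new ArrayList<>();
        for (Vm vm : vms) {
            if (!"RUNNING".equals(vm.powerState)) {
                continue; // assumption: anything but RUNNING is ignored, e.g. DEALLOCATED
            }
            String ip = usePrivateIp ? vm.privateIp : vm.publicIp;
            if (ip == null) {
                continue; // no address of the requested type
            }
            addresses.add(ip + ":" + port);
        }
        return addresses;
    }
}
```

With `usePrivateIp` set to `true` and port 9300, a deallocated machine such as `base6` would be dropped while `logstash` would yield `10.0.1.9:9300`, in line with the log excerpt.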

Not sure why the build is not working well yet.
Let's figure this out when the review is done.

In the meantime, would someone from the @elastic/microsoft team like to test this new plugin?

Member

dadoonet commented Aug 6, 2018

@elasticmachine retest this please

Member

dadoonet commented Aug 6, 2018

I don't know why org.elasticsearch.index.engine.InternalEngineTests.testSeqNoAndCheckpoints is failing. Does not seem related to my change.

Anyway, would someone from the @elastic/es-distributed team like to review?

Contributor

ywelsch commented Aug 6, 2018

Merging in latest master should fix this (see #32430)

Contributor

russcam commented Aug 6, 2018

@dadoonet I'd like to test this with the Azure ARM template; is it possible to build a version for 6.3.1?

Member

dadoonet commented Aug 7, 2018

@ywelsch I was sure I did. And actually I did not. 😄

I think I have a trickier thing to solve now.
When I run this locally with JDK10, it passes.

./gradlew :plugins:discovery-azure-arm:thirdPartyAudit

But with JDK8, this is failing with:

08:45:41 Execution failed for task ':plugins:discovery-azure-arm:thirdPartyAudit'.
08:45:41 > Invalid exclusions, nothing is wrong with these classes: [javax/activation/ActivationDataFlavor.class, javax/activation/DataContentHandler.class, javax/activation/DataHandler.class, javax/activation/DataSource.class, javax/activation/FileDataSource.class, javax/activation/FileTypeMap.class]

(Source: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request/14717/console)

Do you know how I can make this check optional depending on the JVM version? Unless that's a bad idea and that should be solved in another way?

Contributor

ywelsch commented Aug 7, 2018

Do you know how I can make this check optional depending on the JVM version? Unless that's a bad idea and that should be solved in another way?

Yes, it should be made optional depending on the JVM version; see the build file for discovery-azure-classic (or other build files).
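
The real change belongs in the plugin's Gradle build file, but the underlying logic is small enough to sketch in plain Java. The class and method names below are hypothetical. The `> 8` threshold is an assumption based on the report above that the audit fails on JDK 8 (where the `javax.activation` classes are present, so excluding them is rejected) and passes on JDK 10. Only two of the class paths from the error message are listed, for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AuditExclusionsSketch {
    /** Parses a java.specification.version value: "1.8" gives 8, "10" gives 10. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** Returns the exclusions to apply for the given Java major version. */
    static List<String> exclusions(int javaMajor) {
        List<String> excludes = new ArrayList<>();
        if (javaMajor > 8) {
            // Assumption: only exclude these where the audit would otherwise report them.
            excludes.addAll(Arrays.asList(
                "javax/activation/DataHandler.class",
                "javax/activation/DataSource.class"));
        }
        return excludes;
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.specification.version"));
        System.out.println(exclusions(major));
    }
}
```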

Member

dadoonet commented Aug 7, 2018

Thanks @ywelsch, that worked! Should I rebase and squash before anyone reviews it?

@dadoonet dadoonet requested a review from elastic/es-distributed Aug 7, 2018

@jasontedor jasontedor removed the request for review from elastic/es-distributed Aug 28, 2018

@rjernst rjernst removed the review label Oct 10, 2018


ofer-velich commented Dec 4, 2018

Wondering when the discovery-azure plugin will be available.
What should I do for discovery in the meantime if I'm running Elasticsearch on Azure ARM instances?
Thanks.
