We rented multiple machines on Google Cloud to run the FATE project. In the beginning, we used machines with 2 vCPUs. While running the cluster code for hetero_logistic_regression, we got an error from the arbiter when it sent public keys to the guest and host (console.log in the federation module):
[ERROR] 2019-03-18T07:11:15,361 [transferJobSchedulerExecutor-2] [SendProcessor:94] - java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at com.webank.ai.fate.driver.federation.transfer.service.impl.DefaultProxySelectionService.select(DefaultProxySelectionService.java:81)
at com.webank.ai.fate.driver.federation.transfer.communication.processor.SendProcessor.run(SendProcessor.java:70)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[ERROR] 2019-03-18T07:11:24,396 [transferJobSchedulerExecutor-2] [GrpcChannelFactory:119] - [COMMON][CHANNEL][ERROR] Error getting ManagedChannel after retries
[ERROR] 2019-03-18T07:11:24,397 [transferJobSchedulerExecutor-2] [TransferJobScheduler:127] - [FEDERATION][SCHEDULER] processor failed: transferMetaId: cxz-HeteroLRTransferVariable.paillier_pubkey-HeteroLRTransferVariable.paillier_pubkey.0-2-arbiter-1-guest, exception: java.lang.RuntimeException: should never get here
at com.webank.ai.fate.core.factory.GrpcStubFactory.createGrpcStub(GrpcStubFactory.java:47)
at com.webank.ai.fate.core.factory.GrpcStubFactory.createGrpcStub(GrpcStubFactory.java:56)
at com.webank.ai.fate.core.api.grpc.client.GrpcAsyncClientContext.createStub(GrpcAsyncClientContext.java:207)
at com.webank.ai.fate.core.api.grpc.client.GrpcStreamingClientTemplate.calleeStreamingRpc(GrpcStreamingClientTemplate.java:106)
at com.webank.ai.fate.core.api.grpc.client.GrpcStreamingClientTemplate.calleeStreamingRpcWithImmediateDelayedResult(GrpcStreamingClientTemplate.java:149)
at com.webank.ai.fate.driver.federation.transfer.api.grpc.client.ProxyClient.unaryCall(ProxyClient.java:98)
at com.webank.ai.fate.driver.federation.transfer.api.grpc.client.ProxyClient.requestSendEnd(ProxyClient.java:121)
at com.webank.ai.fate.driver.federation.transfer.communication.processor.SendProcessor.run(SendProcessor.java:98)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
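The first stack trace shows DefaultProxySelectionService.select throwing IndexOutOfBoundsException: Index: 0, Size: 0, which means it called get(0) on an empty list — i.e. no proxy endpoint was available at that moment. The following is a hypothetical sketch (not FATE's actual code) of that failure mode and a guarded alternative; the class and method names here are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the failure: selecting the first entry of a
// proxy list throws when no proxy endpoints have been registered yet, e.g.
// if the proxy module is slow to come up on an under-provisioned machine.
public class ProxySelectionSketch {
    // Unguarded selection, as the stack trace suggests: throws
    // IndexOutOfBoundsException when the list is empty.
    static String selectUnguarded(List<String> proxies) {
        return proxies.get(0);
    }

    // Guarded selection: signals "no proxy available" instead of throwing,
    // so the caller can retry or fail with a clear message.
    static Optional<String> selectGuarded(List<String> proxies) {
        return proxies.isEmpty() ? Optional.empty() : Optional.of(proxies.get(0));
    }

    public static void main(String[] args) {
        List<String> proxies = new ArrayList<>(); // no proxies registered yet
        System.out.println(selectGuarded(proxies).isPresent()); // prints false
        try {
            selectUnguarded(proxies);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught IndexOutOfBoundsException");
        }
    }
}
```

This is consistent with the error disappearing on larger machines: with fewer vCPUs, a startup or registration race (the proxy endpoint list still empty when the send is scheduled) becomes more likely, though that is only a conjecture from the trace.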
We then tried the same code on machines with 8 vCPUs, and the error could not be reproduced.
Here are the configurations of the two groups of machines:
The former: Google Cloud n1-standard-2 machines (2 vCPUs, 7.5 GB RAM)
The latter: Google Cloud n1-standard-8 machines (8 vCPUs, 30 GB RAM)