New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FLUO-1002 Create integration test for `FluoAdmin.remove()` #1043

Merged
merged 9 commits into from Jul 9, 2018

Conversation

Projects
None yet
3 participants
@kpm1985
Contributor

kpm1985 commented Jun 19, 2018

This is a work in progress for #1002 . I added a check to FluoAdminImpl so that in line with the previous discussion it won't let you successfully call remove with the oracle running. Created a utility class.

return curator.checkExists().forPath(zPath) != null
&& !curator.getChildren().forPath(zPath).isEmpty();
} catch (Exception e) {
throw new RuntimeException(e);

This comment has been minimized.

@ctubbsii

ctubbsii Jun 19, 2018

Member

This is a very overly broad catch statement, and a very generic RTE. Is there any way these can be made more narrow? Like, just catch the checked exceptions that are thrown, and throw a specific RTE, like IllegalStateException or IllegalArgumentException?

This comment has been minimized.

@kpm1985

kpm1985 Jun 19, 2018

Contributor

Absolutely, maybe I should refactor FluoAdminImpl.java that I stole that from also?

public static boolean oracleExists(CuratorFramework curator) {
try {
return curator.checkExists().forPath(ZookeeperPath.ORACLE_SERVER) != null
&& !curator.getChildren().forPath(ZookeeperPath.ORACLE_SERVER).isEmpty();
} catch (Exception e) {
throw new RuntimeException(e);
}
}

This comment has been minimized.

@ctubbsii

ctubbsii Jun 20, 2018

Member

Sure. I didn't notice that, but it would be useful there also.

} catch (FluoException e) {
}
// TODO assert null or something below instead of sys.out

This comment has been minimized.

@ctubbsii

ctubbsii Jun 19, 2018

Member

Can these 2 new TODO statements be resolved before adding this code?

This comment has been minimized.

@kpm1985

kpm1985 Jun 19, 2018

Contributor

Absolutely, I want to use asserts and kill the sys.out statements I left from debugging.

/*
* Gets the leading Participant from ZooKeeper at the ZookeeperPath.ORACLE_SERVER
*/
public static Participant getLeadingOracle(FluoConfiguration config) {

This comment has been minimized.

@keith-turner

keith-turner Jun 20, 2018

Contributor

This does not seem to be used.

oserver.start();
// write some data into the table and test remove again

This comment has been minimized.

@keith-turner

keith-turner Jun 20, 2018

Contributor

It would be good to use the Fluo API to read and write data because if for some reason Fluo API calls fail after remove, we want to know. Later in the test, I think it would still be good to use an Accumulo scanner to ensure the table is empty though.

This comment has been minimized.

@kpm1985

kpm1985 Jun 22, 2018

Contributor

This test is throwing IllegalArgumentException after I try to do this using the fluo api.

Received timestamp request with a Fluo application ID [a600fa56-cede-48f0-a06a-5cf41486c119] that does not match the application ID [838c2e17-b9c8-4a06-a8e9-9fcd79e0f9d5] of the Oracle

This is coming from OracleServer.getTimestampImpl(java.238). The oracle is started but is not matching the application id so that seems to be the next thing to investigate.

count = 0; // reset the count
scan = conn.createScanner(getCurTableName(), Authorizations.EMPTY);
for (Entry<Key, Value> entry : scan) {

This comment has been minimized.

@keith-turner

keith-turner Jun 20, 2018

Contributor

This loop could be replaced with count = Iterables.size(scan) using Guava. Also, I don't think entry will or should be null, so do not need to check that.

In addition to this Accumulo scan, it would also be nice to scan using Fluo.

This comment has been minimized.

@kpm1985

kpm1985 Jun 20, 2018

Contributor

@keith-turner Great call! I was just trying to get the size without eclipse complaining I didn't reference 'scan' so I did that stupid check, very much work in progress. Now I know a better way, thanks!

public static boolean pathExist(CuratorFramework curator, String zPath) {
try {
return curator.checkExists().forPath(zPath) != null;
} catch (Exception e) {

This comment has been minimized.

@ctubbsii

ctubbsii Jun 20, 2018

Member

The more specific throw is good, but I think we can have a more specific catch here (and anywhere else where we catch (Exception e). With Java 7 and later, we have the option of catching multiple checked exceptions if we need to (multi-catch).

This comment has been minimized.

@ctubbsii

ctubbsii Jun 20, 2018

Member

If Curator is throwing "Exception", then there's not much we can do, except suggest a patch to Curator.

@kpm1985

This comment has been minimized.

Contributor

kpm1985 commented Jul 9, 2018

@keith-turner I think you're right, things blow up after the oracle server is restarted. Here is a sample of the stack trace, the whole trace is a just multiples of these:

2018-07-09 04:50:57,345 [server.AbstractNonblockingServer$FrameBuffer] ERROR: Unexpected throwable while invoking!
java.lang.IllegalArgumentException: Received timestamp request with a Fluo application ID [9b625f6b-b3cd-4149-a8ec-db67eb06c73c] that does not match the application ID [06f89fc6-c7ff-4afa-b94d-de8c362eed74] of the Oracle
	at org.apache.fluo.core.oracle.OracleServer.getTimestampsImpl(OracleServer.java:238)
	at org.apache.fluo.core.oracle.OracleServer.getTimestamps(OracleServer.java:223)
	at org.apache.fluo.core.thrift.OracleService$Processor$getTimestamps.getResult(OracleService.java:271)
	at org.apache.fluo.core.thrift.OracleService$Processor$getTimestamps.getResult(OracleService.java:254)
	at org.apache.fluo.core.shaded.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.fluo.core.shaded.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.fluo.core.shaded.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
	at org.apache.fluo.core.shaded.thrift.server.Invocation.run(Invocation.java:18)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

@kpm1985 kpm1985 requested a review from keith-turner Jul 9, 2018

admin.initialize(opts);
}
oserver.start();

This comment has been minimized.

@keith-turner

keith-turner Jul 9, 2018

Contributor

@kpm1985 I looked into this and found the problem. The oserver object has an internal Environment object which caches the application ID from before the init. To get around this, need to create a completely new Oracle Server. That can be done with the the following code.

   //this will force reread of all info from zookeeper.
    Environment env2 = new Environment(config);
    OracleServer oserver2 = new OracleServer(env2);
    oserver2.start();

At the end of the test, can do the following :

    oserver2.stop();
    env2.close();

This comment has been minimized.

@kpm1985

kpm1985 Jul 9, 2018

Contributor

@keith-turner That was just what was needed!

@@ -116,6 +116,20 @@ public static CuratorFramework newCurator(String zookeepers, int timeout, String
}
}
public static boolean pathExist(CuratorFramework curator, String zPath) {

This comment has been minimized.

@keith-turner

keith-turner Jul 9, 2018

Contributor

This new method does not seem to be used.

@@ -211,6 +212,10 @@ public void remove() {
"The Zookeeper connection string (set by 'fluo.connection.zookeepers') "
+ " must have a chroot suffix.");
if (OracleServerUtils.oracleExists(getAppCurator())) {

This comment has been minimized.

@keith-turner

keith-turner Jul 9, 2018

Contributor

This is a very nice check. Could drop You from error message Must stop the oracle server to remove an application

@kpm1985

This comment has been minimized.

Contributor

kpm1985 commented Jul 9, 2018

What is the policy or preferences regarding squashing commits?

@keith-turner

This comment has been minimized.

Contributor

keith-turner commented Jul 9, 2018

What is the policy or preferences regarding squashing commits?

I don't think there is a policy. My personal preference is usually to squash to a single commit, unless there are multiple authors.

@kpm1985 kpm1985 merged commit 1cc9276 into apache:master Jul 9, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@kpm1985 kpm1985 deleted the kpm1985:FLUO-1002 branch Jul 9, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment