web search client crashing randomly #135

Closed
jakubkrzywda opened this issue Jun 7, 2018 · 3 comments

@jakubkrzywda

I am experiencing issues when running the newest version of CloudSuite's Web Search benchmark (Docker images from Docker Hub).

All the client runs start in a similar fashion:

Starting Faban Server
Please point your browser to http://b81b4338b185:9980/
Buildfile: /usr/src/faban/search/build.xml

init:
    [mkdir] Created dir: /usr/src/faban/search/build/classes

compile:
    [javac] /usr/src/faban/search/build.xml:35: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 1 source file to /usr/src/faban/search/build/classes
    [javac] 
    [javac]           WARNING
    [javac] 
    [javac] The -source switch defaults to 1.8 in JDK 1.8.
    [javac] If you specify -target 1.5 you now must also specify -source 1.5.
    [javac] Ant will implicitly add -source 1.5 for you.  Please change your build file.
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.5
    [javac] warning: [options] source value 1.5 is obsolete and will be removed in a future release
    [javac] warning: [options] target value 1.5 is obsolete and will be removed in a future release
    [javac] warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.
    [javac] 4 warnings

bench.jar:
    [mkdir] Created dir: /usr/src/faban/search/build/lib
      [jar] Building jar: /usr/src/faban/search/build/lib/search.jar

deploy.jar:
      [jar] Building jar: /usr/src/faban/search/build/search.jar

deploy:

BUILD SUCCESSFUL
Total time: 9 seconds
Print= :1
Jun 06, 2018 1:18:36 PM com.sun.faban.common.RegistryImpl main
INFO: Registry started.
Usage: AgentImpl <driverName> <agentId> <masterMachine>
Jun 06, 2018 1:18:39 PM com.sun.faban.common.RegistryImpl reregister
INFO: Registering Master on 172.19.0.3
Jun 06, 2018 1:18:39 PM com.sun.faban.driver.engine.MasterImpl runBenchmark
INFO: RunID for this run is : 1
Jun 06, 2018 1:18:39 PM com.sun.faban.driver.engine.MasterImpl runBenchmark
INFO: Output directory for this run is : /usr/src/outputFaban/1
Jun 06, 2018 1:18:40 PM com.sun.faban.common.RegistryImpl getServices
INFO: Get services by type: SearchDriverAgent
Jun 06, 2018 1:18:40 PM com.sun.faban.common.RegistryImpl getServices
WARNING: Registry.getServices : Cannot find Service type : SearchDriverAgent
Jun 06, 2018 1:18:40 PM com.sun.faban.driver.engine.MasterImpl configure
WARNING: Cannot find SearchDriverAgents. Not starting SearchDriver.
Jun 06, 2018 1:18:40 PM com.sun.faban.driver.util.Timer idleTimerCheck
INFO: SearchDriverAgent[0]: Performing idle timer check
Jun 06, 2018 1:18:40 PM com.sun.faban.driver.util.Timer idleTimerCheck
INFO: SearchDriverAgent[0]: Idle timer characteristics:
Accuracy=3,
min. invocation cost=32,
med. invocation cost (math)=44.0,
med. invocation cost (phys)=44,
avg. invocation cost=48.354,
max. invocation cost=358,
variance of invocation cost=251.35208400000352.
Jun 06, 2018 1:18:43 PM com.sun.faban.driver.engine.AgentImpl run
INFO: SearchDriverAgent[0]: Successfully started 10 driver threads.

However, only every third or fourth run of the Web Search client then succeeds; the rest fail, seemingly at random, with one of the following three errors:

  1. AgentThread Error
Jun 06, 2018 4:47:15 PM com.sun.faban.driver.engine.TimeThread doRun
SEVERE: SearchDriverAgent[0].7: Error initializing driver object.
java.lang.NullPointerException
	at com.sun.org.apache.xerces.internal.dom.DeferredElementNSImpl.synchronizeData(DeferredElementNSImpl.java:108)
	at com.sun.org.apache.xerces.internal.dom.ElementNSImpl.getNamespaceURI(ElementNSImpl.java:250)
	at com.sun.org.apache.xml.internal.dtm.ref.dom2dtm.DOM2DTM.addNode(DOM2DTM.java:263)
	at com.sun.org.apache.xml.internal.dtm.ref.dom2dtm.DOM2DTM.nextNode(DOM2DTM.java:524)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase._nextsib(DTMDefaultBase.java:567)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase.getNextSibling(DTMDefaultBase.java:1142)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBaseTraversers$ChildTraverser.next(DTMDefaultBaseTraversers.java:465)
	at com.sun.org.apache.xpath.internal.axes.AxesWalker.getNextNode(AxesWalker.java:337)
	at com.sun.org.apache.xpath.internal.axes.AxesWalker.nextNode(AxesWalker.java:365)
	at com.sun.org.apache.xpath.internal.axes.WalkingIterator.nextNode(WalkingIterator.java:197)
	at com.sun.org.apache.xpath.internal.axes.NodeSequence.nextNode(NodeSequence.java:344)
	at com.sun.org.apache.xpath.internal.axes.NodeSequence.item(NodeSequence.java:539)
	at com.sun.org.apache.xpath.internal.objects.XNodeSet.str(XNodeSet.java:281)
	at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.getResultAsType(XPathImpl.java:309)
	at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:274)
	at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:363)
	at com.sun.faban.driver.engine.DriverContext.getXPathValue(DriverContext.java:412)
	at sample.searchdriver.SearchDriver.<init>(SearchDriver.java:86)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:73)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)

Jun 06, 2018 4:47:16 PM com.sun.faban.driver.engine.AgentImpl kill
WARNING: SearchDriverAgent[0]: Killing benchmark run
Jun 06, 2018 4:47:16 PM com.sun.faban.driver.engine.AgentThread run
SEVERE: SearchDriverAgent[0].6: null
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
	at com.sun.faban.driver.engine.AgentThread.waitStartTime(AgentThread.java:470)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:96)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)

  2. TimeThread Error
Jun 07, 2018 8:26:37 AM com.sun.faban.driver.engine.TimeThread doRun
SEVERE: SearchDriverAgent[0].5: Error initializing driver object.
java.io.FileNotFoundException:  (No such file or directory)
	at java.io.FileInputStream.open0(Native Method)
	at java.io.FileInputStream.open(FileInputStream.java:195)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at java.io.FileInputStream.<init>(FileInputStream.java:93)
	at sample.searchdriver.SearchDriver.<init>(SearchDriver.java:90)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:73)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)

Jun 07, 2018 8:26:38 AM com.sun.faban.driver.engine.AgentImpl kill
WARNING: SearchDriverAgent[0]: Killing benchmark run
Jun 07, 2018 8:26:38 AM com.sun.faban.driver.engine.AgentThread run
SEVERE: SearchDriverAgent[0].9: null
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
	at com.sun.faban.driver.engine.AgentThread.waitStartTime(AgentThread.java:470)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:96)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)

  3. MasterImpl Error
Jun 07, 2018 11:49:36 AM com.sun.faban.driver.engine.MasterImpl executeRun
INFO: Started all threads; run commences in 2998 ms
Jun 07, 2018 11:49:41 AM sample.searchdriver.SearchDriver doGet
SEVERE: ERROR!

Jun 07, 2018 11:49:41 AM com.sun.faban.driver.engine.AgentThread validateTimeCompletion
SEVERE: SearchDriverAgent[0].4.doGet: Transport incomplete! Please ensure transport exception is thrown from operation.
Jun 07, 2018 11:49:41 AM com.sun.faban.driver.engine.AgentThread run
SEVERE: SearchDriverAgent[0].4: SearchDriverAgent[0].4.doGet: Transport incomplete! Please ensure transport exception is thrown from operation.
com.sun.faban.driver.FatalException: SearchDriverAgent[0].4.doGet: Transport incomplete! Please ensure transport exception is thrown from operation.
	at com.sun.faban.driver.engine.AgentThread.validateTimeCompletion(AgentThread.java:532)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:173)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)

Jun 07, 2018 11:49:42 AM com.sun.faban.driver.engine.AgentImpl kill
WARNING: SearchDriverAgent[0]: Killing benchmark run
Jun 07, 2018 11:49:42 AM com.sun.faban.driver.engine.AgentThread logError
WARNING: SearchDriverAgent[0].14.doGet: java.lang.RuntimeException: Sleep interrupted. Run terminating.
Note: Error not counted in result.
Either transaction start or end time is not within steady state.
java.lang.RuntimeException: java.lang.RuntimeException: Sleep interrupted. Run terminating.
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1506)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
	at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:3036)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:489)
	at com.sun.faban.driver.transport.sunhttp.SunHttpTransport.readURL(SunHttpTransport.java:177)
	at com.sun.faban.driver.transport.sunhttp.SunHttpTransport.readURL(SunHttpTransport.java:191)
	at com.sun.faban.driver.transport.sunhttp.SunHttpTransport.readURL(SunHttpTransport.java:217)
	at sample.searchdriver.SearchDriver.doGet(SearchDriver.java:197)
	at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.sun.faban.driver.engine.TimeThread.doRun(TimeThread.java:169)
	at com.sun.faban.driver.engine.AgentThread.run(AgentThread.java:202)
Caused by: java.lang.RuntimeException: Sleep interrupted. Run terminating.
	at com.sun.faban.driver.util.Timer.wakeupAt(Timer.java:405)
	at com.sun.faban.driver.engine.DriverContext.recordStartTime(DriverContext.java:599)
	at com.sun.faban.driver.transport.util.TimedOutputStream.write(TimedOutputStream.java:126)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at java.io.PrintStream.flush(PrintStream.java:338)
	at sun.net.www.MessageHeader.print(MessageHeader.java:301)
	at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:644)
	at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:655)
	at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:693)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1585)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
	... 9 more
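
To tell the three failure modes apart across many runs, the client output can be scanned for the strings that appear in the logs above. The following is only a minimal sketch: the signature strings are copied from those logs, while the labels and the function name `classify_run` are made up for illustration, and it assumes the first matching signature in a log is the one that caused the run to be killed.

```python
# Signature strings are copied from the three logs above.
# The labels on the left are hypothetical names chosen for this sketch.
SIGNATURES = [
    ("xpath_npe", "java.lang.NullPointerException"),
    ("missing_file", "java.io.FileNotFoundException"),
    ("transport_incomplete", "Transport incomplete!"),
]


def classify_run(log_text):
    """Classify one client log.

    Returns the label of the known signature that occurs earliest in the
    log, "unknown_failure" if the log has a SEVERE entry but no known
    signature, and "ok" otherwise.
    """
    earliest = None  # (position, label) of the earliest match so far
    for label, needle in SIGNATURES:
        pos = log_text.find(needle)
        if pos != -1 and (earliest is None or pos < earliest[0]):
            earliest = (pos, label)
    if earliest is not None:
        return earliest[1]
    if "SEVERE:" in log_text:
        return "unknown_failure"
    return "ok"
```

The log text could come, for example, from the saved output of `docker logs` for the client container. The later `InterruptedException` entries are ignored on purpose, since in the logs above they only appear after the run is already being killed.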

If I simply remove the client container and run it again, I eventually get a successful run after a few attempts, without changing anything in the setup.
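
The remove-and-rerun workaround described above could be automated roughly as follows. This is a sketch of a retry loop only, not a fix for the underlying problem. `run_once` is a hypothetical callable supplied by the caller: it is assumed to remove the old client container, start a new one, and return `True` only when the run finished without errors. The actual Docker commands are left out because they depend on the local setup.

```python
def run_until_success(run_once, max_attempts=5):
    """Call run_once() until it reports success.

    run_once is a caller-supplied function (hypothetical here) that
    performs one full client run and returns True on success.
    Returns the 1-based attempt number that succeeded, or None if all
    max_attempts attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        if run_once():
            return attempt
    return None
```

Recording the returned attempt number for each experiment gives a rough measure of how often runs fail.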

Any ideas on how to make the environment more reliable? I ran experiments with an older version of the Web Search benchmark in the past and did not experience issues like this.

I'm running the client on:

Hnefi assigned Hnefi and sid1994, then unassigned Hnefi, on Jun 12, 2018
@sid1994
Collaborator

sid1994 commented Jun 13, 2018

Hi @jakubkrzywda

Can you please confirm that after the few unsuccessful runs, once you get the clients to work, they do so for any future runs as well?

I suspect that the client is failing because the server takes a bit more time to answer the initial requests. For the time being, I would suggest you do a couple of extra runs so that the server stabilizes, and then you can run your experiment.
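
If slow initial responses from the server are indeed the cause, one way to act on this suggestion is to wait until the server answers several probe requests in a row before starting the client. The sketch below is one possible approach, not part of the benchmark. The function names, the thresholds, and whatever URL is passed to `http_probe` are assumptions; this thread does not state the server address or port.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout_s=5.0):
    """Return True if url answers with HTTP 200 within timeout_s seconds.

    The URL is a placeholder to be filled in for the local setup.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe, required_successes=3, max_probes=60,
                     pause_s=1.0, sleep=time.sleep):
    """Probe until required_successes consecutive probes succeed.

    Returns True once the streak is reached, or False after max_probes
    probes. A failed probe resets the streak to zero.
    """
    streak = 0
    for _ in range(max_probes):
        streak = streak + 1 if probe() else 0
        if streak >= required_successes:
            return True
        sleep(pause_s)
    return False
```

A caller would pass something like `lambda: http_probe(server_url)` as `probe`, where `server_url` is a search URL on the server, and start the client only if `wait_until_ready` returns `True`.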

We might be facing some timeout issues in the client because of the recent Java version change. We are working on fixing this!

@jakubkrzywda
Author

Hi @sid1994
Thanks for the reply!

Unfortunately, I cannot confirm that. Even after a few successful benchmark runs, I get failing ones again. The benchmark is unstable, and in its current state it is impossible to use it for experiments.

Please let me know when you manage to fix the issue.

@sid1994
Collaborator

sid1994 commented Jun 25, 2018

Hi @jakubkrzywda

We have fixed the client docker images now. Please try out the new ones.
If you have any more concerns, I would be happy to help you with them.
Thanks!
