Fixed issue regarding getting request body on query POST API. #2447

Merged
merged 1 commit into from Jul 14, 2017

Conversation

Projects
None yet
3 participants
Collaborator

francares commented Jul 14, 2017

There is a need to integrate query API with some of the components I'm developing and while I was testing with different versions of Cromwell (0.24, 0.28 and 0.29) I found out the following:

  • 0.24/0.28: when Cromwell is under load and trying to query +100 workflows, Cromwell (Slick) throws stack overflow exception.
  • 0.29: seems to work fine but I found that query POST API was returning everything since query params where empty and request body was parsed.

This is the fix that worked for me.

@francares francares Fixed issue regarding parsing request body on query POST API.
640068f

francares added the in review label Jul 14, 2017

@francares francares requested review from geoffjentry and mcovarr Jul 14, 2017

Member

geoffjentry commented Jul 14, 2017

@francares thanks. for the earlier versions you should be able to fix that by tweaking db-batch-size up higher (try 200 or 500 and see how it goes) in the metadata service conf. Although now that I say it IIRC you all are using a metadata service other than the default? If so this won't help.

The 29 one is just a bug on my part, thanks

Member

geoffjentry commented Jul 14, 2017 edited by briandmaher

👍

Approved with PullApprove

Contributor

mcovarr commented Jul 14, 2017

@francares are you seeing a stack overflow exception or a Slick "task rejected from queue" exception?

Member

geoffjentry commented Jul 14, 2017

@mcovarr oh good catch. Further evidence that perhaps this is something different altogether

Collaborator

francares commented Jul 14, 2017 edited

Uncaught error from thread [cromwell-system-akka.dispatchers.api-dispatcher-57] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.StackOverflowError
	at scala.collection.immutable.Set$EmptySet$.seq(Set.scala:68)
	at scala.collection.SetLike$class.$plus$plus(SetLike.scala:141)
	at scala.collection.AbstractSet.$plus$plus(Set.scala:47)
	at slick.compiler.ExpandSums.slick$compiler$ExpandSums$$tr$1(ExpandSums.scala:27)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.util.ConstArray.endoMap(ConstArray.scala:122)
	at slick.ast.Node$class.mapChildren(Node.scala:51)
	at slick.ast.Apply.mapChildren(Node.scala:547)
	at slick.compiler.ExpandSums.slick$compiler$ExpandSums$$tr$1(ExpandSums.scala:32)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.util.ConstArray.endoMap(ConstArray.scala:122)
	at slick.ast.Node$class.mapChildren(Node.scala:51)
	at slick.ast.Apply.mapChildren(Node.scala:547)
	at slick.compiler.ExpandSums.slick$compiler$ExpandSums$$tr$1(ExpandSums.scala:32)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.compiler.ExpandSums$$anonfun$7.apply(ExpandSums.scala:32)
	at slick.util.ConstArray.endoMap(ConstArray.scala:122)
	at slick.ast.Node$class.mapChildren(Node.scala:51)
	at slick.ast.Apply.mapChildren(Node.scala:547)
	...

It's easy to reproduce, just run +100 hello world workflows and then query for all those workflows ids using query POST API.

Contributor

mcovarr commented Jul 14, 2017 edited by briandmaher

👍

Approved with PullApprove

Member

geoffjentry commented Jul 14, 2017

@francares Thanks. I'm a little concerned that that just magically disappeared with 29 and that perhaps the bug fix was bug, not feature. :)

Collaborator

francares commented Jul 14, 2017

@geoffjentry I tried to reproduce with 29 but I was not able with 500 hello world workflows. Anyway, if it happens again I'll let you know...

Member

geoffjentry commented Jul 14, 2017

@francares Cool. My main concern there was that when i did the akka http conversion that I "fixed" it by giving bad results, so as long as the results look good I'll stand down my fretting :)

Contributor

mcovarr commented Jul 14, 2017

Hmm that is definitely different from the "task rejected from queue" errors. And anyway 28 has the larger default metadata batch size changes, so if this really was a different symptom of that problem it shouldn't be happening on 28.

I don't see much different between develop and 28_hotfix that could legitimately explain fixes in the vicinity of Slick. 😕

Collaborator

francares commented Jul 14, 2017

@mcovarr Yeap, it's weird but I ran +1000 workflows a minute ago and I didn't see issues with 29. I tried the same with 28_hotfix and with the first 100 I got the exception.

Contributor

mcovarr commented Jul 14, 2017

Thanks @francares, @geoffjentry is going to make a ticket to investigate this on our side.

@francares francares merged commit 2e32c41 into develop Jul 14, 2017

3 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
code-review/pullapprove Approved by geoffjentry, mcovarr
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.03%) to 67.47%
Details
Member

geoffjentry commented Jul 14, 2017

mcovarr deleted the fmc_queryPostAPIFix branch Jul 17, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment