Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the remaining odds and ends to Execution[T] #985

Merged
merged 27 commits into from
Aug 4, 2014

Conversation

johnynek
Copy link
Collaborator

I think Execution is in a usable state, and the preferred way to write Library code that needs to do multistep operation.

Please review these last changes.

@johnynek
Copy link
Collaborator Author

This is a very strange failure:

[error]     cascading.tuple.hadoop.TupleSerialization cannot be cast to org.apache.hadoop.io.serializer.Serialization (SerializationFactory.java:64)
[error]     org.apache.hadoop.io.serializer.SerializationFactory.add(SerializationFactory.java:64)
[error]     org.apache.hadoop.io.serializer.SerializationFactory.<init>(SerializationFactory.java:54)
[error]     

But, cascading.tuple.hadoop.TupleSerialization does extend Serialization:

https://github.com/cwensel/cascading/blob/wip-2.6/cascading-hadoop/src/main/shared/cascading/tuple/hadoop/TupleSerialization.java#L85

@johnynek
Copy link
Collaborator Author

BTW, this also passes locally for me.

@ianoc
Copy link
Collaborator

ianoc commented Jul 31, 2014

Fails locally for me:

[error] x An ExecutionJob should
[error] x run correctly
[error] could not build flow from assembly: cascading.tuple.hadoop.TupleSerialization cannot be cast to org.apache.hadoop.io.serializer.Serialization
[error] cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:577)
[error] cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:286)
[error] cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:80)
[error] cascading.flow.FlowConnector.connect(FlowConnector.java:459)
[error] com.twitter.scalding.ExecutionContext$class.buildFlow(ExecutionContext.scala:46)
[error] com.twitter.scalding.ExecutionContext$$anon$1.buildFlow(ExecutionContext.scala:83)
[error] com.twitter.scalding.ExecutionContext$class.run(ExecutionContext.scala:57)
[error] com.twitter.scalding.ExecutionContext$$anon$1.run(ExecutionContext.scala:83)
[error] com.twitter.scalding.Execution$FlowDefExecution$$anonfun$runStats$13$$anonfun$apply$7.apply(Execution.scala:222)

@johnynek
Copy link
Collaborator Author

Current theory: using Execution, which is using threads due to the scala.concurrent.ExecutionContext can expose some classloader issues. I'm trying to explicitly set the classloader. This is a tricky one for me. I have not really debugged classpath issues before.

@cwensel any comments? Is calling cascading from multiple threads going to work given all the classloader work going on internally?

@johnynek
Copy link
Collaborator Author

Keeps passing for me. :/ @ianoc , can you try this patch?

@cwensel
Copy link

cwensel commented Jul 31, 2014

Cascading has been embedded in a few multithreaded contexts. Spring source guys requested some minor changes years ago, and Lingual JDBC runs under threads (connection pooling etc).

Make sure you are setting the Context ClassLoader where appropriate. Cascading either calls it directly, or Hadoop itself when doing class loading (where we delegate too) relies on it.

@johnynek
Copy link
Collaborator Author

@cwensel can you take a look at this stack:
https://travis-ci.org/twitter/scalding/jobs/31297341#L3977

(line 3977)

We don't do much reflection at all (two calls, both get the context loader) in our code. What is looks to me like is that the classloader that hadoop is using is somehow inconsistent. I've tried a few ideas to fix it, but nothing is working out.

The context here is that the job itself is being started in a different thread than all the taps were created.

@cwensel
Copy link

cwensel commented Jul 31, 2014

you can try messing around with having your own UnitOfWorkExecutorStrategy on the Flow and own the context loader the thread executor utilizes.

but to have the problem you are seeing, you would need two peer classloaders loading the same classpath independently. classloaders aren't child first.

maybe try loading org.apache.hadoop.mapred.OutputFormat statically early and force it into a parent to see what happens.

or, you have some messed up dependencies.

fwiw, Lingual jumps through a bunch of hoops to allow for dynamic classloading of Tap/Schemes from remote sources. things work great, even embedded (which is why we have the capability).

ckw

On Jul 30, 2014, at 8:59 PM, P. Oscar Boykin notifications@github.com wrote:

@cwensel can you take a look at this stack:
https://travis-ci.org/twitter/scalding/jobs/31297341#L3977

(line 3977)

We don't do much reflection at all (two calls, both get the context loader) in our code. What is looks to me like is that the classloader that hadoop is using is somehow inconsistent. I've tried a few ideas to fix it, but nothing is working out.

The context here is that the job itself is being started in a different thread than all the taps were created.


Reply to this email directly or view it on GitHub.

Chris K Wensel
chris@concurrentinc.com
http://concurrentinc.com

@johnynek
Copy link
Collaborator Author

Well, setting the context classloader explicitly to the one that created the Configuration in the job works for 2.9.3, but 2.10.4 errors out with this:

/home/travis/build.sh: line 41: 1473 Killed ./sbt -Dlog4j.configuration=file://$TRAVIS_BUILD_DIR/project/travis-log4j.properties ++$TRAVIS_SCALA_VERSION assembly

I tried restarting it twice, and the log ends with that both times.

@johnynek
Copy link
Collaborator Author

I think this works now, but Travis just can't download the jars.

Passes for me and Ian.

I appreciate Travis, but I'd guess the false positive rate is greater than 20%. This dramatically undermines faith in the test failures (and we often ignore real failures because of it).

@@ -136,7 +219,9 @@ object Execution {
def runStats(conf: Config, mode: Mode)(implicit cec: ConcurrentExecutionContext) = {
for {
(flowDef, fn) <- Future(result(conf, mode))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no reason for these first two to be in Future threads.

ianoc added a commit that referenced this pull request Aug 4, 2014
Add the remaining odds and ends to Execution[T]
@ianoc ianoc merged commit 2d8aa06 into develop Aug 4, 2014
@ianoc ianoc deleted the back_to_the_future_201407 branch August 4, 2014 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants