[cascading3] Migrate core, commons and related #1521

Merged: 12 commits, Apr 13, 2016

@rubanm rubanm commented Feb 19, 2016

part of #1465
based on Cyrille's work in #1446

Most of the interesting changes are in:

  • Operations.scala -- handles both the old and the new Cascading aggregate-by thresholds
  • PlatformTest.scala -- some updated tests; hash-joining and then merging the result with one side of the same join is no longer supported in Cascading 3

Cascading fabric selection changes will be sent in a separate PR.

@johnynek
Collaborator

@cchepelov take a look?

@@ -340,7 +336,7 @@ lazy val scaldingCommons = module("commons").settings(
 "com.twitter" %% "bijection-core" % bijectionVersion,
 "com.twitter" %% "algebird-core" % algebirdVersion,
 "com.twitter" %% "chill" % chillVersion,
-"com.twitter.elephantbird" % "elephant-bird-cascading2" % elephantbirdVersion,
+"com.twitter.elephantbird" % "elephant-bird-cascading3" % elephantbirdVersion,
Collaborator

Can we move this to where the versions are, e.g. as an elephant-bird-artifact value, so we can keep all these switches in one place?
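
One possible reading of this suggestion, sketched as an sbt fragment. The name `elephantbirdArtifact` is hypothetical (it is not taken from the PR), and `elephantbirdVersion` is assumed to be the value already defined alongside the other versions in the build:

```scala
// build.sbt fragment (sketch, not the PR's actual change).
// Hypothetical value kept next to the existing *Version definitions, so the
// Cascading 2 vs. Cascading 3 switch lives in one place.
val elephantbirdArtifact = "elephant-bird-cascading3" // or "elephant-bird-cascading2"

// Modules then refer to the value instead of hard-coding the artifact name.
// elephantbirdVersion is assumed to be defined elsewhere in the build.
libraryDependencies +=
  "com.twitter.elephantbird" % elephantbirdArtifact % elephantbirdVersion
```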

@johnynek
Collaborator

Does this pass e2e tests or CI at Twitter?

@cwensel

cwensel commented Feb 19, 2016

If we can get an isolated Cascading3 test case we can take a stab at promoting this from 'no longer supported' to 'bug' and then to 'resolved'.

@cchepelov
Contributor

Hi @posco @rubanm
Great to see a lot of progress! Will have to come back to this next week (away from keyboard this week).

Re. the spurious ".forceToDisk": indeed, the code should do the right thing without it. The transform facility @cwensel wrote about looks like the correct place to put the necessary Boundaries.

  -- Cyrille

rubanm and others added 4 commits March 2, 2016 08:28
Hadoop's -libjars doesn't support wildcards, and with large classpaths it's easy to exhaust the maximum argument length on Linux/OS X when running commands. This acts as a filter above our interaction with the generic options parser to expand wildcards.
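
To make the idea in this commit message concrete, here is a minimal sketch of wildcard expansion for `-libjars`, assuming entries of the form `dir/*` and a comma-separated value. The object and method names (`LibJarsGlobSketch`, `expandEntry`, `expand`) are hypothetical and are not taken from the commit; the actual implementation may differ.

```scala
import java.io.File

// Hypothetical sketch: expand "dir/*" entries in the value following -libjars.
object LibJarsGlobSketch {
  // "dir/*" becomes every .jar file directly inside dir (sorted for stable output);
  // any other entry is passed through unchanged.
  def expandEntry(entry: String): Seq[String] =
    if (entry.endsWith("/*")) {
      val dir = new File(entry.dropRight(2))
      // listFiles returns null when dir is missing or is not a directory
      val files = Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty[File])
      files
        .filter(f => f.isFile && f.getName.endsWith(".jar"))
        .map(_.getPath)
        .sorted
    } else Seq(entry)

  // Rewrite only the argument that follows "-libjars"; leave all others as-is.
  def expand(args: Array[String]): Array[String] = {
    val idx = args.indexOf("-libjars")
    if (idx < 0 || idx + 1 >= args.length) args
    else {
      val expanded = args(idx + 1).split(",").toSeq.flatMap(expandEntry).mkString(",")
      args.updated(idx + 1, expanded)
    }
  }
}
```

With this sketch, the caller can pass a short argument such as `-libjars lib/*` on the command line and expand it in-process before handing the arguments to Hadoop's generic options parser.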
@johnynek
Collaborator

@cwensel about the repro: It should be as easy as a cascading HashJoin followed by Merge followed by GroupBy. Sorry kind of swamped...
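
For readers who want the shape of that repro, here is a rough sketch written against the Scalding typed API rather than raw Cascading pipes. It assumes Scalding is on the classpath; the object name `HashJoinMergeRepro` and the sample types are illustrative only. Per the PR description, this pattern is the one reported as no longer supported in Cascading 3, so it may fail to plan there.

```scala
import com.twitter.scalding.typed.TypedPipe

// Hypothetical repro sketch: HashJoin -> Merge -> GroupBy.
object HashJoinMergeRepro {
  def build(left: TypedPipe[(Int, Int)],
            right: TypedPipe[(Int, Int)]): TypedPipe[(Int, Int)] = {
    // HashJoin: map-side inner join of left against the (small) right side.
    val joined: TypedPipe[(Int, Int)] =
      left.hashJoin(right.group).map { case (k, (l, r)) => (k, l + r) }

    // Merge: union the join output with one side of that same join.
    val merged = joined ++ left

    // GroupBy: group on the key and sum the values.
    merged.group.sum.toTypedPipe
  }
}
```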

@rubanm
Contributor Author

rubanm commented Apr 11, 2016

@johnynek This branch now passes e2e tests at Twitter (with a related EB change twitter/elephant-bird#465). I'm working on piloting some user jobs.

@sriramkrishnan
Collaborator

@rubanm this is pretty amazing work!

@johnynek
Collaborator

Amazing!

@johnynek
Collaborator

looks good to me to merge into cascading3 branch.

does this have all the changes from current develop branch?

@rubanm
Contributor Author

rubanm commented Apr 12, 2016

@johnynek Thanks for the review! RC6 is currently being released to twitter source. I plan to merge develop once that release is done so it's in tandem, with the joinWithTiny fix to follow.

@johnynek
Collaborator

@rubanm sounds good. Way to push through on this!

@cchepelov
Contributor

Wow, @rubanm, this is great, many thanks for doing the hard work!

With apologies for the lack of testing on my side: I've been literally swamped, with no real end in sight… but I'm using cascading3+tez (yes, my old build) on a daily basis. Really looking forward to adopting your QA'd branch.

@rubanm rubanm merged commit 28daec9 into cascading3 Apr 13, 2016
@rubanm rubanm deleted the rubanm/cascading3/core branch April 13, 2016 02:46