Test Cascading Flink on a cluster #12

Closed
fhueske opened this issue Jul 14, 2015 · 3 comments

fhueske commented Jul 14, 2015

Cascading-Flink has only been tested locally. We need to run large-scale cluster tests to verify that it works and to measure how it performs.

fhueske commented Aug 18, 2015

We should benchmark the following Cascading operations against equivalent Flink DataSet programs:

  • Single-input GroupBy
  • Multi-input GroupBy
  • Secondary-sort GroupBy
  • Binary CoGroup InnerJoin
  • Ternary CoGroup InnerJoin
  • Binary CoGroup BufferJoin
  • Ternary CoGroup BufferJoin
  • Binary HashJoin InnerJoin
  • Ternary HashJoin InnerJoin
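
One way to compare each operation in the list above against its Flink counterpart is a small timing harness. The following is only a sketch under stated assumptions: `JobTimer`, `medianMillis`, and `slowdown` are hypothetical names, not part of Cascading or Flink, and each `Runnable` is assumed to wrap whatever code submits the job and blocks until it finishes.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Cascading or Flink. */
public final class JobTimer {

    private JobTimer() {}

    /**
     * Runs the job `warmups` times untimed, then `runs` times timed, and
     * returns the median wall-clock duration in milliseconds
     * (the upper median when `runs` is even).
     */
    public static long medianMillis(Runnable job, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        for (int i = 0; i < warmups; i++) {
            job.run();
        }
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            job.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        Arrays.sort(millis);
        return millis[runs / 2];
    }

    /** A ratio above 1.0 means the candidate was slower than the baseline. */
    public static double slowdown(long candidateMillis, long baselineMillis) {
        // Guard against division by zero for sub-millisecond baselines.
        return (double) candidateMillis / Math.max(1L, baselineMillis);
    }
}
```

With the Cascading flow as the candidate and the hand-written DataSet program as the baseline, `slowdown` would give one number per operation. Taking the median rather than the mean is a deliberate choice here, since single slow runs on a shared cluster would otherwise dominate the result.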

mxm commented Aug 28, 2015

WordCount performance was about equal to that of native Flink.

fhueske commented Oct 20, 2015

Successfully executed some moderately complex jobs on an 8-node YARN cluster.

@fhueske fhueske closed this as completed Oct 20, 2015