Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New operator map partition function #42

Conversation

kfleischmann
Copy link

No description provided.

@@ -23,52 +23,19 @@
import java.util.Map;
import java.util.Set;

import eu.stratosphere.api.common.operators.base.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run "mvn verify" locally. We have maven checkstyle in place. It forbids using star imports.

@asfbot
Copy link

asfbot commented Jun 25, 2014

Ufuk Celebi on dev@flink.incubator.apache.org replies:
You can also do=20

mvn checkstyle:check

if you only want to run checkstyle.
#42 (comment)
stratosphere-compiler/src/main/java/eu/stratosphere/compiler/PactCompiler.=
java ---
It forbids using star imports.
your
feature
please
ticket

@uce
Copy link
Contributor

uce commented Jun 25, 2014

I think this will be a nice addition to the API.

Would it make sense to rename the operator to flatMapPartition? It might be confusing in relation to the existing map and flatMap functions. Map is a 1:1 mapping wheras a flatMap is the 1:n mapping. The new operator is called mapParititons, but is able to collect multiple elements.

*/
@Override
protected void computeOperatorSpecificDefaultEstimates(DataStatistics statistics) {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you leave this method empty?
I think it should be something similar to the regular map:
this.estimatedNumRecords = getPredecessorNode().getEstimatedNumRecords();

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we cannot do any predictions about the output of this operator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In FlatMapNode we use getPredecessorNode().getEstimatedNumRecords() * 5 as an estimate and in MapNode getPredecessorNode().getEstimatedNumRecords(). What's the difference which prevents us from doing the same here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess that a PartitionMap is used when multiple values should be somehow (in a non-deterministic way) aggregated. Otherwise one would (and should!) use a Map- or FlatMapFunction.

An estimate which is based on the input card is not a good idea, IMHO.
I would assume that the output card is more likely to be around 1 * DOP, but this is just a gut feeling ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with fabian. Let's leave this empty, because there are really no good assumptions to make for this function.

@rmetzger
Copy link
Contributor

rmetzger commented Jul 3, 2014

I agree with @uce's argumentation regarding the name.
@kfleischmann: The code seems to contain files without license header.

I would like to merge this change soon.


@Override
public DriverStrategy getStrategy() {
return DriverStrategy.MAP;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be MAP_PARTITION.

In your code, I guess the strategy is never taken from here, but still, this is a setup for future bugs.

@StephanEwen
Copy link
Contributor

This is a good addition. There are some comments that need addressing before this is mergeable.

Also, can you undo changes to files that have nothing to do with the code changes:

  • stratosphere-clients/src/test/java/eu/stratosphere/client/testjar/WordCount.java
  • stratosphere-examples/stratosphere-java-examples/src/main/java/eu/stratosphere/example/java/wordcount/WordCount.java

@StephanEwen
Copy link
Contributor

Any updates here?

@kfleischmann
Copy link
Author

how can i revert my changes from the wordcount example? I just revert to the last version, is that correct?

@rmetzger
Copy link
Contributor

Yes, If you want to undo the changes from the last commit, just nuke it away and force push into the branch that the PR is based on. The commit will then disappear.

StephanEwen added a commit to StephanEwen/flink that referenced this pull request Aug 14, 2014
StephanEwen added a commit to StephanEwen/flink that referenced this pull request Aug 14, 2014
@asfgit asfgit closed this in d0cead7 Aug 15, 2014
vasia added a commit to vasia/flink that referenced this pull request Jan 16, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 22, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 22, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 22, 2015
StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Jan 23, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 24, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 24, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 24, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 24, 2015
vasia added a commit to vasia/flink that referenced this pull request Jan 25, 2015
vasia added a commit to vasia/flink that referenced this pull request Feb 9, 2015
vasia added a commit to vasia/flink that referenced this pull request Feb 9, 2015
StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Feb 11, 2015
vasia added a commit to vasia/flink that referenced this pull request Feb 11, 2015
marthavk pushed a commit to marthavk/flink that referenced this pull request Jun 9, 2015
tillrohrmann pushed a commit to tillrohrmann/flink that referenced this pull request Sep 27, 2018
Borrow some code from openjdk's version of this script (from which
this image is downstream) to automatically determine the architectures
supported by the upstream image.

Closes apache#41
zhijiangW pushed a commit to zhijiangW/flink that referenced this pull request Jul 23, 2019
tzulitai pushed a commit to tzulitai/flink that referenced this pull request Jan 15, 2021
morozov pushed a commit to morozov/flink that referenced this pull request Apr 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants