Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Baseline] Apply Baseline to iceberg-data #156 #198

Merged
merged 7 commits into from Jun 11, 2019

Conversation

Projects
None yet
4 participants
@rdsr
Copy link
Contributor

commented May 28, 2019

No description provided.

@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran

case DATE:
// this will include negative values (dates before 1970-01-01)
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS);

This comment has been minimized.

Copy link
@rdsr

rdsr May 28, 2019

Author Contributor

If we do not need the negative days here, we can slightly simplify the exp from (random.nextBoolean() ? 1 : -1) * random.nextInt(ABOUT_380_YEARS_IN_DAYS) to random.nextInt(ABOUT_380_YEARS_IN_DAYS)

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

Would it be easier to simply suppress RandomModInteger? I think this what was done in RandomAvroData and that's what I did locally in iceberg-spark.

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 1, 2019

Author Contributor

+1. Lets keep it consistent

@@ -65,8 +65,8 @@ public ScanBuilder caseInsensitive() {
return this;
}

public ScanBuilder select(String... columns) {
this.columns = ImmutableList.copyOf(columns);
public ScanBuilder select(String... selectColumns) {

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

What about selectedColumns? I believe it is frequently used throughout the project.

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 1, 2019

Author Contributor

Will do

@@ -58,8 +58,8 @@ private DataReader(Schema readSchema) {
}

@Override
public void setSchema(Schema fileSchema) {
this.fileSchema = Schema.applyAliases(fileSchema, readSchema);
public void setSchema(Schema schema) {

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

I believe in previous PRs @mccheah frequently used fileSchema -> newFileSchema type of renames to avoid hiding fields in builders. Would it make sense to make it consistent?

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 1, 2019

Author Contributor

Will update

@@ -128,11 +128,13 @@ protected Record reuseOrCreate(Object reuse) {
}

@Override
@SuppressWarnings("checkstyle:hiddenField")

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

I know this rule is a bit controversial but I think it is better to follow it everywhere or ignore it globally. In this particular case, it seems appropriate to follow and, maybe, give another name to the field. It is not really clear the difference between the field and passed object.

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 1, 2019

Author Contributor

You are right. I didn't look into it closely enough. I was thinking that baseline was mistakenly identifying this as a hidden field when non was present

@@ -278,6 +277,10 @@ private GenericParquetReaders() {
}
}

MessageType getType() {

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

I am not sure, but I believe Iceberg doesn't use get in getters.

@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran

case DATE:
// this will include negative values (dates before 1970-01-01)
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS);

This comment has been minimized.

Copy link
@aokolnychyi

aokolnychyi Jun 1, 2019

Contributor

Would it be easier to simply suppress RandomModInteger? I think this what was done in RandomAvroData and that's what I did locally in iceberg-spark.

@rdsr

This comment has been minimized.

Copy link
Contributor Author

commented Jun 1, 2019

Thanks @aokolnychyi for the comments. I've addressed all.

@aokolnychyi

This comment has been minimized.

Copy link
Contributor

commented Jun 1, 2019

Thanks, @rdsr! There are two more places with hiddenField. I am not sure about the best naming there. Otherwise, LGTM.

@mccheah
Copy link
Contributor

left a comment

Looks really close! I'm fine with +1ing this and merging after these comments. @rdblue to sign off when ready.

@@ -104,7 +100,7 @@ public Object getField(String name) {
@Override
public void setField(String name, Object value) {
Integer pos = nameToPos.get(name);
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: " + name);
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: %s", name);

This comment has been minimized.

Copy link
@mccheah

mccheah Jun 8, 2019

Contributor

We can probably use checkNotNull here?

@@ -190,6 +190,7 @@ public D decode(InputStream stream, D reuse) {
* @return true if the buffer is complete, false otherwise (stream ended)
* @throws IOException if there is an error while reading
*/
@SuppressWarnings("checkstyle:InnerAssignment")

This comment has been minimized.

Copy link
@mccheah

mccheah Jun 8, 2019

Contributor

Curious, is there a way we can get around this without suppressing?

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 8, 2019

Author Contributor

I tried it, but there wasn't a good way without changing the familiar idiom of stream reading.

This comment has been minimized.

Copy link
@rdblue

rdblue Jun 8, 2019

Contributor

This suppression looks okay to me.

@@ -111,19 +111,19 @@ public OffsetDateTime read(Decoder decoder, Object reuse) throws IOException {
}

private static class GenericRecordReader extends ValueReaders.StructReader<Record> {
private final StructType struct;
private final StructType record;

This comment has been minimized.

Copy link
@rdblue

rdblue Jun 8, 2019

Contributor

What about structType instead? In the context of a record reader, record sounds like it would not be a type.

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 8, 2019

Author Contributor

Makes sense!

* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.

This comment has been minimized.

Copy link
@rdblue

rdblue Jun 8, 2019

Contributor

+1

@@ -60,7 +79,7 @@ public TestSplitScan(String format) {
}

@Before
public void setup() throws IOException {
public void before() throws IOException {

This comment has been minimized.

Copy link
@rdblue

rdblue Jun 8, 2019

Contributor

What rule required this change? I usually like to have descriptive names for @Before and @After methods. Is the name required to be before? (Granted, the original name, setup wasn't very descriptive either.)

This comment has been minimized.

Copy link
@rdsr

rdsr Jun 8, 2019

Author Contributor

It seems that I can name it anything other than setup/teardown. Below is the error

Task :iceberg-data:checkstyleTest
[ant:checkstyle] [ERROR] /Users/rratti/code/iceberg/data/src/test/java/org/apache/iceberg/TestSplitScan.java:82: Test setup/teardown methods are called before(), beforeClass(), after(), afterClass(), but not setUp, teardown, etc. [RegexpSinglelineJava]

@rdblue

This comment has been minimized.

Copy link
Contributor

commented Jun 8, 2019

I found a couple of minor problems, but overall it looks good. Once the last couple things are fixed, I'll merge it. Thanks for working on this!

@rdblue rdblue merged commit 4691809 into apache:master Jun 11, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.