Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Baseline] Apply Baseline to iceberg-data #156 #198

Merged
merged 7 commits into from Jun 11, 2019

Conversation

rdsr
Copy link
Contributor

@rdsr rdsr commented May 28, 2019

No description provided.

@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran

case DATE:
// this will include negative values (dates before 1970-01-01)
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not need the negative days here, we can slightly simplify the exp from (random.nextBoolean() ? 1 : -1) * random.nextInt(ABOUT_380_YEARS_IN_DAYS) to random.nextInt(ABOUT_380_YEARS_IN_DAYS)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier to simply suppress RandomModInteger? I think this what was done in RandomAvroData and that's what I did locally in iceberg-spark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Lets keep it consistent

@@ -65,8 +65,8 @@ public ScanBuilder caseInsensitive() {
return this;
}

public ScanBuilder select(String... columns) {
this.columns = ImmutableList.copyOf(columns);
public ScanBuilder select(String... selectColumns) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about selectedColumns? I believe it is frequently used throughout the project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

@@ -58,8 +58,8 @@ private DataReader(Schema readSchema) {
}

@Override
public void setSchema(Schema fileSchema) {
this.fileSchema = Schema.applyAliases(fileSchema, readSchema);
public void setSchema(Schema schema) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in previous PRs @mccheah frequently used fileSchema -> newFileSchema type of renames to avoid hiding fields in builders. Would it make sense to make it consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update

@@ -128,11 +128,13 @@ protected Record reuseOrCreate(Object reuse) {
}

@Override
@SuppressWarnings("checkstyle:hiddenField")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this rule is a bit controversial but I think it is better to follow it everywhere or ignore it globally. In this particular case, it seems appropriate to follow and, maybe, give another name to the field. It is not really clear the difference between the field and passed object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I didn't look into it closely enough. I was thinking that baseline was mistakenly identifying this as a hidden field when non was present

@@ -278,6 +277,10 @@ private GenericParquetReaders() {
}
}

MessageType getType() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, but I believe Iceberg doesn't use get in getters.

@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran

case DATE:
// this will include negative values (dates before 1970-01-01)
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier to simply suppress RandomModInteger? I think this what was done in RandomAvroData and that's what I did locally in iceberg-spark.

@rdsr
Copy link
Contributor Author

rdsr commented Jun 1, 2019

Thanks @aokolnychyi for the comments. I've addressed all.

@aokolnychyi
Copy link
Contributor

aokolnychyi commented Jun 1, 2019

Thanks, @rdsr! There are two more places with hiddenField. I am not sure about the best naming there. Otherwise, LGTM.

Copy link
Contributor

@mccheah mccheah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really close! I'm fine with +1ing this and merging after these comments. @rdblue to sign off when ready.

@@ -104,7 +100,7 @@ public Object getField(String name) {
@Override
public void setField(String name, Object value) {
Integer pos = nameToPos.get(name);
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: " + name);
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: %s", name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably use checkNotNull here?

@@ -190,6 +190,7 @@ public D decode(InputStream stream, D reuse) {
* @return true if the buffer is complete, false otherwise (stream ended)
* @throws IOException if there is an error while reading
*/
@SuppressWarnings("checkstyle:InnerAssignment")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, is there a way we can get around this without suppressing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, but there wasn't a good way without changing the familiar idiom of stream reading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suppression looks okay to me.

@@ -111,19 +111,19 @@ public OffsetDateTime read(Decoder decoder, Object reuse) throws IOException {
}

private static class GenericRecordReader extends ValueReaders.StructReader<Record> {
private final StructType struct;
private final StructType record;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about structType instead? In the context of a record reader, record sounds like it would not be a type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense!

* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@@ -60,7 +79,7 @@ public TestSplitScan(String format) {
}

@Before
public void setup() throws IOException {
public void before() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What rule required this change? I usually like to have descriptive names for @Before and @After methods. Is the name required to be before? (Granted, the original name, setup wasn't very descriptive either.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that I can name it anything other than setup/teardown. Below is the error

Task :iceberg-data:checkstyleTest
[ant:checkstyle] [ERROR] /Users/rratti/code/iceberg/data/src/test/java/org/apache/iceberg/TestSplitScan.java:82: Test setup/teardown methods are called before(), beforeClass(), after(), afterClass(), but not setUp, teardown, etc. [RegexpSinglelineJava]

@rdblue
Copy link
Contributor

rdblue commented Jun 8, 2019

I found a couple of minor problems, but overall it looks good. Once the last couple things are fixed, I'll merge it. Thanks for working on this!

@rdblue rdblue merged commit 4691809 into apache:master Jun 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants