New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Baseline] Apply Baseline to iceberg-data #156 #198
Conversation
@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran | |||
|
|||
case DATE: | |||
// this will include negative values (dates before 1970-01-01) | |||
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not need the negative days here, we can slightly simplify the exp from (random.nextBoolean() ? 1 : -1) * random.nextInt(ABOUT_380_YEARS_IN_DAYS)
to random.nextInt(ABOUT_380_YEARS_IN_DAYS)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be easier to simply suppress RandomModInteger
? I think this what was done in RandomAvroData
and that's what I did locally in iceberg-spark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Lets keep it consistent
@@ -65,8 +65,8 @@ public ScanBuilder caseInsensitive() { | |||
return this; | |||
} | |||
|
|||
public ScanBuilder select(String... columns) { | |||
this.columns = ImmutableList.copyOf(columns); | |||
public ScanBuilder select(String... selectColumns) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about selectedColumns
? I believe it is frequently used throughout the project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
@@ -58,8 +58,8 @@ private DataReader(Schema readSchema) { | |||
} | |||
|
|||
@Override | |||
public void setSchema(Schema fileSchema) { | |||
this.fileSchema = Schema.applyAliases(fileSchema, readSchema); | |||
public void setSchema(Schema schema) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe in previous PRs @mccheah frequently used fileSchema
-> newFileSchema
type of renames to avoid hiding fields in builders. Would it make sense to make it consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update
@@ -128,11 +128,13 @@ protected Record reuseOrCreate(Object reuse) { | |||
} | |||
|
|||
@Override | |||
@SuppressWarnings("checkstyle:hiddenField") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this rule is a bit controversial but I think it is better to follow it everywhere or ignore it globally. In this particular case, it seems appropriate to follow and, maybe, give another name to the field. It is not really clear the difference between the field and passed object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I didn't look into it closely enough. I was thinking that baseline was mistakenly identifying this as a hidden field when non was present
@@ -278,6 +277,10 @@ private GenericParquetReaders() { | |||
} | |||
} | |||
|
|||
MessageType getType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure, but I believe Iceberg doesn't use get
in getters.
@@ -229,7 +231,8 @@ private static Object generatePrimitive(Type.PrimitiveType primitive, Random ran | |||
|
|||
case DATE: | |||
// this will include negative values (dates before 1970-01-01) | |||
return EPOCH_DAY.plusDays(random.nextInt() % ABOUT_380_YEARS_IN_DAYS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be easier to simply suppress RandomModInteger
? I think this what was done in RandomAvroData
and that's what I did locally in iceberg-spark.
Thanks @aokolnychyi for the comments. I've addressed all. |
Thanks, @rdsr! There are two more places with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really close! I'm fine with +1ing this and merging after these comments. @rdblue to sign off when ready.
@@ -104,7 +100,7 @@ public Object getField(String name) { | |||
@Override | |||
public void setField(String name, Object value) { | |||
Integer pos = nameToPos.get(name); | |||
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: " + name); | |||
Preconditions.checkArgument(pos != null, "Cannot set unknown field named: %s", name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably use checkNotNull
here?
@@ -190,6 +190,7 @@ public D decode(InputStream stream, D reuse) { | |||
* @return true if the buffer is complete, false otherwise (stream ended) | |||
* @throws IOException if there is an error while reading | |||
*/ | |||
@SuppressWarnings("checkstyle:InnerAssignment") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, is there a way we can get around this without suppressing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it, but there wasn't a good way without changing the familiar idiom of stream reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This suppression looks okay to me.
@@ -111,19 +111,19 @@ public OffsetDateTime read(Decoder decoder, Object reuse) throws IOException { | |||
} | |||
|
|||
private static class GenericRecordReader extends ValueReaders.StructReader<Record> { | |||
private final StructType struct; | |||
private final StructType record; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about structType instead? In the context of a record reader, record sounds like it would not be a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -60,7 +79,7 @@ public TestSplitScan(String format) { | |||
} | |||
|
|||
@Before | |||
public void setup() throws IOException { | |||
public void before() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What rule required this change? I usually like to have descriptive names for @Before
and @After
methods. Is the name required to be before
? (Granted, the original name, setup
wasn't very descriptive either.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that I can name it anything other than setup/teardown. Below is the error
Task :iceberg-data:checkstyleTest
[ant:checkstyle] [ERROR] /Users/rratti/code/iceberg/data/src/test/java/org/apache/iceberg/TestSplitScan.java:82: Test setup/teardown methods are called before(), beforeClass(), after(), afterClass(), but not setUp, teardown, etc. [RegexpSinglelineJava]
I found a couple of minor problems, but overall it looks good. Once the last couple things are fixed, I'll merge it. Thanks for working on this! |
No description provided.