This repository has been archived by the owner on Jan 11, 2021. It is now read-only.

Add type-safe accessors for primitive types in Row #86

Merged
merged 10 commits into from
Apr 21, 2018

Conversation

andygrove
Contributor

No description provided.

@coveralls
coveralls commented Apr 13, 2018

Coverage decreased (-0.009%) to 94.9% when pulling 76d4ab0 on andygrove:row_type_safe_accessors into 1f3b8b9 on sunchao:master.

Collaborator

@sadikovi sadikovi left a comment

Great! Thank you very much!

I left a comment and I have several general suggestions/questions:

  • Would it be useful to add a method is_primitive that returns true if the type is not Group/List/Map, and false otherwise?
  • Would it be possible to add tests to cover all value conversions, e.g. Byte, Short, Int, Long, etc.?
  • Would you mind changing the line width to 90? We normally follow this style rule. Thanks.
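
For illustration, the first suggestion could look roughly like the sketch below. The `Value` enum and its variant list are simplified stand-ins invented for this example, not the crate's actual row type.

```rust
// Hypothetical, simplified stand-in for the crate's row value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
  Null,
  Bool(bool),
  Int(i32),
  Long(i64),
  Str(String),
  Group(Vec<(String, Value)>),
  List(Vec<Value>),
  Map(Vec<(Value, Value)>),
}

impl Value {
  /// Returns false for the nested types (Group, List, Map) and true otherwise.
  pub fn is_primitive(&self) -> bool {
    match *self {
      Value::Group(_) | Value::List(_) | Value::Map(_) => false,
      _ => true,
    }
  }
}
```

Listing the nested variants and defaulting everything else to true means a newly added primitive variant needs no change here.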

pub fn $METHOD(&self) -> Result<$TY, ParquetError> {
match *self {
Row::$VARIANT(v) => Ok(v),
_ => Err(ParquetError::General(format!("Cannot access {} as {}", self.get_type_name(), stringify!($VARIANT))))
Collaborator

I was wondering whether it would be easier to just print self instead of self.get_type_name()? One advantage is that we would not need to maintain the mapping to string values, but I am happy either way.

Contributor Author

If we print self, we would get a message like "Cannot access 1.2 as Bool", which makes debugging harder.

Collaborator

Fair enough, let's keep type names!
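
To make the outcome of this thread concrete, here is a self-contained sketch of the macro-generated accessor with type names in the error message. The `Row` and `ParquetError` definitions are cut-down stand-ins written for this example; only the macro body follows the hunk under review.

```rust
// Cut-down stand-ins for the crate's types, defined here for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum ParquetError {
  General(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Row {
  Bool(bool),
  Int(i32),
  Float(f32),
  Str(String),
}

// Generates one type-safe accessor per primitive variant.
macro_rules! row_primitive_accessor {
  ($METHOD:ident, $VARIANT:ident, $TY:ty) => {
    pub fn $METHOD(&self) -> Result<$TY, ParquetError> {
      match *self {
        Row::$VARIANT(v) => Ok(v),
        _ => Err(ParquetError::General(format!(
          "Cannot access {} as {}",
          self.get_type_name(),
          stringify!($VARIANT)
        ))),
      }
    }
  };
}

impl Row {
  /// Name of the variant, used so errors mention the type rather than the value.
  fn get_type_name(&self) -> &'static str {
    match *self {
      Row::Bool(_) => "Bool",
      Row::Int(_) => "Int",
      Row::Float(_) => "Float",
      Row::Str(_) => "Str",
    }
  }

  row_primitive_accessor!(get_bool, Bool, bool);
  row_primitive_accessor!(get_int, Int, i32);
  row_primitive_accessor!(get_float, Float, f32);
}
```

With this shape, calling `get_bool()` on `Row::Float(1.2)` reports "Cannot access Float as Bool" instead of echoing the stored value.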

@sadikovi
Collaborator

This should fix #85.

@andygrove
Contributor Author

I've added the remaining accessors as requested, and is_primitive.

@andygrove
Contributor Author

Looks like I need to add tests for the error condition for each type to keep code coverage up. I'll get that done either tonight or tomorrow.

@sadikovi
Collaborator

Well, I think it is okay, but if you want to then it is even better! @sunchao what do you think?

@andygrove
Contributor Author

It was actually not much work after all. I think it's great you have code coverage set up, and I'll be learning from you and applying this to DataFusion soon.

@sunchao
Owner

sunchao commented Apr 14, 2018

Well, I think it is okay, but if you want to then it is even better! @sunchao what do you think?

Yes it would be great if we can cover those too but not a must.

macro_rules! row_primitive_accessor {
($METHOD:ident, $VARIANT:ident, $TY:ty) => {
pub fn $METHOD(&self) -> Result<$TY, ParquetError> {
match *self {
Owner

nit: we use 2-space indent.

pub fn $METHOD(&self) -> Result<$TY, ParquetError> {
match *self {
Row::$VARIANT(v) => Ok(v),
_ => Err(ParquetError::General(format!("Cannot access {} as {}", self.get_type_name(), stringify!($VARIANT))))
Owner

nit: maybe we can use general_err! here - a little less verbose
also we limit line width to 90 characters.
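
As a rough sketch of what this nit is asking for: the macro below is a hypothetical stand-in for the crate's general_err! helper, assuming it simply wraps a format string into ParquetError::General. The real definition may differ, and `access_error` is a helper name made up for the example.

```rust
// Stand-in error type, defined here so the example is self-contained.
#[derive(Debug, PartialEq)]
pub enum ParquetError {
  General(String),
}

// Hypothetical stand-in for general_err!; the crate's own macro may differ.
macro_rules! general_err {
  ($fmt:expr) => (ParquetError::General($fmt.to_owned()));
  ($fmt:expr, $($args:expr),*) => (
    ParquetError::General(format!($fmt, $($args),*))
  );
}

/// Builds the accessor error without spelling out format! and the variant,
/// which also keeps the line under 90 characters.
pub fn access_error(actual: &str, requested: &str) -> ParquetError {
  general_err!("Cannot access {} as {}", actual, requested)
}
```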

// Macro to generate type-safe get_xxx methods e.g. get_bool, get_short
macro_rules! row_primitive_accessor {
($METHOD:ident, $VARIANT:ident, $TY:ty) => {
pub fn $METHOD(&self) -> Result<$TY, ParquetError> {
Owner

nit: you can import errors::Result and replace Result<$TY, ParquetError> with Result<$TY>
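
A minimal sketch of the alias this refers to, assuming errors::Result is the usual single-parameter alias over ParquetError. The types below are local stand-ins defined for the example.

```rust
// Stand-in error type for the example.
#[derive(Debug, PartialEq)]
pub enum ParquetError {
  General(String),
}

/// Alias with the error type fixed, so signatures only name the success type.
pub type Result<T> = std::result::Result<T, ParquetError>;

// Stand-in row type with just enough variants to show a mismatch.
#[derive(Debug, PartialEq)]
pub enum Row {
  Bool(bool),
  Float(f32),
}

impl Row {
  // Result<bool> here is shorthand for Result<bool, ParquetError>.
  pub fn get_bool(&self) -> Result<bool> {
    match *self {
      Row::Bool(v) => Ok(v),
      _ => Err(ParquetError::General("Cannot access value as Bool".to_string())),
    }
  }
}
```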

fn test_row_accessors_invalid_message() {
assert_eq!(ParquetError::General("Cannot access Float as Bool".to_string()),
Row::Float(1.2).get_bool().unwrap_err());
assert_eq!(ParquetError::General("Cannot access Float as Str".to_string()), Row::Float(1.2).get_string().unwrap_err());
Owner

nit: line too long.

row_primitive_accessor!(get_timestamp, Timestamp, u64);

/// Type-safe accessor for Str type
pub fn get_string(&self) -> Result<&String, ParquetError> {
Owner

One thing I'm not quite sure about is whether it's better to just panic here, instead of returning Result. It seems pretty rare that the error case would happen, given that one always (I think?) will hold the schema of the row before calling these methods.

Contributor Author

I think panic is pretty extreme. I think the user should decide if they want to panic or not. In non-test code they just need to add a ? at the end of the line, so it doesn't make much difference to code verbosity.
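
The point about `?` can be sketched as follows. The `Field` enum and the `sum_longs` caller are hypothetical names made up for this example; the accessor returns a Result and the caller decides whether to propagate or panic.

```rust
// Stand-in types for the example.
#[derive(Debug, PartialEq)]
pub enum ParquetError {
  General(String),
}

#[derive(Debug, PartialEq)]
pub enum Field {
  Bool(bool),
  Long(i64),
}

impl Field {
  /// Accessor that reports a type mismatch instead of panicking.
  pub fn get_long(&self) -> Result<i64, ParquetError> {
    match *self {
      Field::Long(v) => Ok(v),
      _ => Err(ParquetError::General("Cannot access field as Long".to_string())),
    }
  }
}

/// Non-test caller: a trailing `?` forwards the first mismatch to its own caller.
pub fn sum_longs(fields: &[Field]) -> Result<i64, ParquetError> {
  let mut total = 0;
  for field in fields {
    total += field.get_long()?;
  }
  Ok(total)
}
```

A caller that does prefer to panic can still write `field.get_long().unwrap()`, so returning Result keeps both options open.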

@sunchao
Owner

sunchao commented Apr 20, 2018

Hi @andygrove, sorry for the delay! We just refactored the Row so that adding accessors to it might become easier. Could you take a look at the latest code and see if the changes can be applied to it (there should not be much difference)?

@sadikovi sadikovi mentioned this pull request Apr 21, 2018
@andygrove
Contributor Author

Cool. I am working on it this morning.

@andygrove
Contributor Author

@sunchao This is ready for review (as long as you are OK with the 0.009% decrease in code coverage!)

Collaborator

@sadikovi sadikovi left a comment

Thanks @andygrove. Appreciate you making changes. I left some minor comments, hope you could have a look. Thanks!

@@ -41,6 +42,79 @@ pub struct Row {
fields: Vec<(String, Field)>
}

impl Row {
/// Get then number of fields in this row
Collaborator

Should it be the?

}

impl RowAccessor for Row {

Collaborator

Could you remove empty line?

row_complex_accessor!(get_group, Group, Row);
row_complex_accessor!(get_list, ListInternal, List);
row_complex_accessor!(get_map, MapInternal, Map);

Collaborator

Could you remove empty line?

@@ -71,6 +145,13 @@ pub struct List {
elements: Vec<Field>
}

impl List {
/// Get then number of fields in this row
Collaborator

Should it be the?

@@ -86,6 +167,13 @@ pub struct Map {
entries: Vec<(Field, Field)>
}

impl Map {
/// Get then number of fields in this row
Collaborator

Same here.

@andygrove
Contributor Author

@sadikovi I pushed the formatting changes.

Have you thought about adopting cargo fmt? Then you can just have the build fail if the formatting is off and this will no longer be a manual process.

@sadikovi
Collaborator

Thanks for making changes. Looks good, we should merge it!

Yes, we discussed that at the time. We needed to apply only a subset of the rules, but I could not manage to configure it that way, so we decided to review manually and periodically clean up with rustfmt.

Owner

@sunchao sunchao left a comment

Thanks @andygrove ! PR looks good. Just one minor comment.

}

/// Trait for type-safe convenient access to fields within a Row
trait RowAccessor {
Owner

@sunchao sunchao Apr 21, 2018

This needs to be public and be exported in mod.rs, otherwise these methods will not be available.
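
A small sketch of why the visibility matters, using a made-up module layout (`record` with a private `api` submodule) and a one-field `Row`. These are assumptions for the example, not the crate's real structure.

```rust
pub mod record {
  // Private implementation module, standing in for a file behind mod.rs.
  mod api {
    pub struct Row {
      flag: bool,
    }

    impl Row {
      pub fn new(flag: bool) -> Row {
        Row { flag }
      }
    }

    /// The trait must be `pub`; trait methods are only callable where the
    /// trait itself can be named and brought into scope.
    pub trait RowAccessor {
      fn get_bool(&self) -> bool;
    }

    impl RowAccessor for Row {
      fn get_bool(&self) -> bool {
        self.flag
      }
    }
  }

  // Re-export from the parent module so callers outside `record` can reach both.
  pub use self::api::{Row, RowAccessor};
}
```

Without the `pub use`, code outside `record` could not name `RowAccessor`, so `row.get_bool()` would not resolve there.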

@andygrove
Contributor Author

@sunchao Good catch! Fixed.

@sunchao sunchao merged commit 4f99ae1 into sunchao:master Apr 21, 2018
@sunchao
Owner

sunchao commented Apr 21, 2018

Thanks @andygrove . Merged.

@andygrove andygrove deleted the row_type_safe_accessors branch April 22, 2018 00:02