Skip to content

ARROW-5901: [Rust] Add equals to json arrays.#4940

Closed
liurenjie1024 wants to merge 4 commits intoapache:masterfrom
liurenjie1024:arrow-5901
Closed

ARROW-5901: [Rust] Add equals to json arrays.#4940
liurenjie1024 wants to merge 4 commits intoapache:masterfrom
liurenjie1024:arrow-5901

Conversation

@liurenjie1024
Copy link
Copy Markdown
Contributor

Checks whether an arrow array equals to an json array. This is motivated when I'm developing integration tests of parquet arrow reader. I use protobuf to generate both parquet data and json data, read parquet data to arrow, compare it with json data to verify the correct ness.

@liurenjie1024
Copy link
Copy Markdown
Contributor Author

@sunchao @andygrove @nevi-me Please take a look when you get a chance.

@wesm wesm changed the title ARROW-5091: [Rust] Add equals to json arrays. ARROW-5901: [Rust] Add equals to json arrays. Jul 25, 2019
Copy link
Copy Markdown
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @liurenjie1024 , I have a few minor comments and questions

}
}

impl ArrowNativeType for i64 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it fine that we implement this on ArrowNativeType (for i{32|64} which can also be dates and times)? Should we support temporal arrays? I know with time, we don't really have a JSON format, but we could have JSON timestamps that we might benefit from comparing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean ArrowPrimitiveType? In fact I don't think so, because temporal arrays already benefits from this because their native type is i32|i64.

Copy link
Copy Markdown
Contributor

@nevi-me nevi-me Jul 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what I was considering was how we'd compare dates in JSON, as they're normally in the string format of 2019-07-29T05:07:07.771Z. I think it's not an issue though, so you may disregard my comment :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The challenging part is that primitive array's value method returns T::Native. I think we can still compare temporal type if we serialize date time type to time stamp in json.

@codecov-io
Copy link
Copy Markdown

Codecov Report

Merging #4940 into master will decrease coverage by 5.03%.
The diff coverage is 63.05%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #4940       +/-   ##
===========================================
- Coverage   87.46%   82.43%    -5.04%     
===========================================
  Files         994       86      -908     
  Lines      140562    25068   -115494     
  Branches     1418        0     -1418     
===========================================
- Hits       122943    20664   -102279     
+ Misses      17257     4404    -12853     
+ Partials      362        0      -362
Impacted Files Coverage Δ
rust/arrow/src/array/array.rs 92.63% <100%> (+0.06%) ⬆️
rust/arrow/src/array/equal.rs 79.68% <69.84%> (-3.26%) ⬇️
rust/arrow/src/datatypes.rs 74.67% <9.09%> (-3.96%) ⬇️
python/pyarrow/ipc.pxi
cpp/src/arrow/csv/chunker-test.cc
cpp/src/parquet/column_page.h
cpp/src/parquet/bloom_filter-test.cc
cpp/src/arrow/array/builder_decimal.cc
r/src/symbols.cpp
cpp/src/arrow/io/test-common.h
... and 902 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f4fcf56...cf34907. Read the comment docs.

@liurenjie1024
Copy link
Copy Markdown
Contributor Author

@wesm I noticed that you changed the title, but I can't see difference. Could you tell me the different so that I will not make the mistake again?

@nevi-me
Copy link
Copy Markdown
Contributor

nevi-me commented Jul 30, 2019

@liurenjie1024 he changed the JIRA number, from 5091 to 5901

Copy link
Copy Markdown
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nevi-me
Copy link
Copy Markdown
Contributor

nevi-me commented Jul 30, 2019

@sunchao are you merging it, or should I?

@sunchao
Copy link
Copy Markdown
Member

sunchao commented Jul 30, 2019

@nevi-me go ahead to merge it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants