ARROW-45: [Python] Add unnest/flatten function for List types #2757

Closed
kszucs wants to merge 4 commits into apache:master from kszucs:ARROW-45

kszucs (Member) commented Oct 14, 2018

No description provided.

codecov-io commented:

Codecov Report

Merging #2757 into master will increase coverage by 0.94%.
The diff coverage is 97.22%.

@@            Coverage Diff             @@
##           master    #2757      +/-   ##
==========================================
+ Coverage   87.65%   88.59%   +0.94%     
==========================================
  Files         403      342      -61     
  Lines       61483    57728    -3755     
==========================================
- Hits        53891    51145    -2746     
+ Misses       7520     6583     -937     
+ Partials       72        0      -72

| Impacted Files | Coverage Δ |
|---|---|
| python/pyarrow/tests/test_array.py | 99.07% <100%> (+0.01%) ⬆️ |
| cpp/src/arrow/array-test.cc | 100% <100%> (ø) ⬆️ |
| cpp/src/arrow/array.h | 100% <100%> (ø) ⬆️ |
| python/pyarrow/array.pxi | 68.81% <66.66%> (-0.02%) ⬇️ |
| rust/src/record_batch.rs | |
| go/arrow/datatype_nested.go | |
| rust/src/util/bit_util.rs | |
| go/arrow/math/uint64_amd64.go | |
| go/arrow/internal/testing/tools/bool.go | |
| go/arrow/internal/bitutil/bitutil.go | |

... and 55 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54634dd...3aabaf7. Read the comment docs.

wesm (Member) left a comment:

I'm not sure the C++ changes are needed. In my March 14 comment (https://issues.apache.org/jira/browse/ARROW-45?focusedCommentId=16398663&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16398663), what I intended (but, as I recall, was too overwhelmed with the 0.9.0 release to comment on further) was a flatten kernel for use in computational pipelines. I created ARROW-3520 about this.

std::shared_ptr<Array> values() const;

/// \brief Unnests the Array by one level
std::shared_ptr<Array> Flatten() const { return values(); }

Member:

I'm not sure what the purpose of adding this redundant API is. What we really need is a flatten kernel, although having "flatten" available in Python is useful. I suggest removing this.

kszucs (Member, Author):

Ok, will remove.
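
For readers following the thread, here is a rough sketch of why `Flatten()` was called redundant with `values()`. It assumes a simplified model in which a list array is just an offsets list plus a flat child values list; `ToyListArray` and its methods are hypothetical names made up for illustration, not the pyarrow or Arrow C++ API, and nulls and sliced arrays are ignored.

```python
class ToyListArray:
    """Hypothetical stand-in for an Arrow list array (NOT the pyarrow API).

    offsets[i]:offsets[i + 1] delimits the i-th list inside `values`.
    """

    def __init__(self, offsets, values):
        self.offsets = offsets
        self.values = values

    @classmethod
    def from_pylist(cls, lists):
        # Concatenate all child lists and record where each one ends.
        offsets, values = [0], []
        for item in lists:
            values.extend(item)
            offsets.append(len(values))
        return cls(offsets, values)

    def to_pylist(self):
        # Rebuild the nested view from offsets and the flat values.
        pairs = zip(self.offsets, self.offsets[1:])
        return [self.values[start:stop] for start, stop in pairs]

    def flatten(self):
        # Unnesting by one level is just the child values in this model,
        # mirroring the `Flatten() { return values(); }` line in the diff.
        return self.values


arr = ToyListArray.from_pylist([[1, 2], [], [3]])
print(arr.offsets)    # [0, 2, 2, 3]
print(arr.flatten())  # [1, 2, 3]

# Only one level is removed: inner lists survive as elements.
nested = ToyListArray.from_pylist([[[1], [2, 3]], [[4]]])
print(nested.flatten())  # [[1], [2, 3], [4]]
```

In this toy model `flatten` adds nothing beyond reading `values`, which matches the objection raised in the review; a separate flatten kernel, as proposed in ARROW-3520, would be a different piece of work.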


// checks
auto flattened = result_->Flatten();
ASSERT_TRUE(flattened->Equals(expected));

Member:

I'm not sure this test is needed.

kszucs (Member, Author) commented Oct 17, 2018

+1, merging
