Improve rbind() behavior on iterators #2621

Closed
oleksiyskononenko opened this issue Sep 15, 2020 · 0 comments · Fixed by #2622
Assignees
Labels
bug Any bugs / errors in datatable; however for severe bugs use [segfault] label


oleksiyskononenko commented Sep 15, 2020

Currently, rbind() behaves quite strangely when its argument is an iterator. For example, suppose we have a file test.csv with the following content:

c1, c2, c3
1, 2, 3

If we read it several times with iread() and then try to rbind the results (a typical use case for iread()), this is what happens:

>>> from datatable import dt
>>> it = dt.iread(["test.csv", "test.csv"])
>>> DT = dt.rbind(it)
ValueError: Cannot rbind frame with 3 columns to a frame with 0 columns without parameter force=True

The error message is not very helpful: it does not mention that the argument was an iterator. In some cases (which I cannot reproduce right now) the message even looks like this:

ValueError: Cannot rbind frame with 1152921504606846976 columns to a frame with 0 columns without parameter force=True

We should either support frame iterators directly in rbind() (the preferred option), or raise an error that explicitly says they are not supported.
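Until one of these is in place, a possible workaround is to expand the iterator on the Python side before calling rbind(). The sketch below is only an illustration of that idea, not the fix from the linked pull request: `expand_frame_args` and `rbind_all` are hypothetical helper names, and the rbind function is passed in as a parameter so the helpers do not depend on datatable themselves. It assumes that a Frame is not itself a one-shot iterator (no `__next__`), so frames are left untouched.

```python
from collections.abc import Iterator


def expand_frame_args(*args):
    """Flatten lists, tuples and one-shot iterators into a flat list.

    Hypothetical helper: anything that is not a list, tuple or iterator
    (for example, a Frame) is kept as a single item.
    """
    flat = []
    for arg in args:
        if isinstance(arg, (list, tuple, Iterator)):
            # Recurse so that nested containers/iterators are expanded too.
            flat.extend(expand_frame_args(*arg))
        else:
            flat.append(arg)
    return flat


def rbind_all(rbind, *args, **kwargs):
    """Call `rbind` with all iterators in `args` already materialized."""
    return rbind(*expand_frame_args(*args), **kwargs)


# Intended usage with datatable (assumption: dt.rbind accepts frames
# as positional arguments):
#
#   from datatable import dt
#   DT = rbind_all(dt.rbind, dt.iread(["test.csv", "test.csv"]))
```

The same effect can probably be achieved inline by unpacking the iterator, i.e. `dt.rbind(*it)`, since the frames are then passed as ordinary positional arguments.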

@oleksiyskononenko oleksiyskononenko added the improve Improvement of an existing functionality label Sep 15, 2020
@st-pasha st-pasha added the bug Any bugs / errors in datatable; however for severe bugs use [segfault] label label Sep 15, 2020
@st-pasha st-pasha self-assigned this Sep 15, 2020
@st-pasha st-pasha removed the improve Improvement of an existing functionality label Sep 15, 2020
@st-pasha st-pasha modified the milestone: Release 0.11.0 Sep 18, 2020