extract fails on empty data.frame #313

Closed
antoine-sachet opened this Issue Jun 20, 2017 · 3 comments

antoine-sachet commented Jun 20, 2017

Reproducible example using tidyr 0.6.3.9000:

df <- data.frame(col1=character(0))
tidyr::extract(df, col1, into="col2", regex="(whatever)")

Expected behaviour: should return an empty data.frame with a new empty column "col2".

Actual behaviour: fails with error:
Error in names(l) <- enc2utf8(into) : 'names' attribute [1] must be the same length as the vector [0]

Why it should be fixed: an otherwise valid pipeline fails whenever the result of a filter happens to be empty. Code that calls extract programmatically then needs boilerplate to guard against empty data.

@hadley hadley added the reprex label Jun 23, 2017

olsgaard commented Sep 27, 2017

I am having the same problem. Reproducible example using reprex:

df <- data.frame(col1 = c("q123", "abc"))
tidyr::extract(df, col1, into = "col2", regex = "[0-9]+", remove = FALSE)
#> Error in names(l) <- enc2utf8(into): 'names' attribute [1] must be the same length as the vector [0]

When I choose "rerun with Debug", the error is in extract_.data.frame, where l is a List of 0


I am also running tidyr 0.6.3, in a corporate setting, so updating is non-trivial. Does 0.7.1 fix the issue or are we interfacing with tidyr in the wrong way?

@hadley hadley added bug and removed reprex labels Nov 15, 2017

Member

hadley commented Nov 16, 2017

More minimal reprex:

df <- data.frame(x = c("q123", "abc"))
extract(df, x, "y", regex = ".")

I think the problem is that there's no grouping defined in the regular expression and the error message doesn't help you discover that.
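
To illustrate the point about grouping, here is a minimal sketch that assumes the fix on the caller's side is simply to wrap the part of the pattern to capture in parentheses, one group per name in into. The pattern "([0-9]+)" is chosen for illustration only and is not taken from the eventual patch:

```r
library(tidyr)

df <- data.frame(x = c("q123", "abc"), stringsAsFactors = FALSE)

# `into` names one column, so the regex defines exactly one capture group.
out <- extract(df, x, into = "y", regex = "([0-9]+)")
out$y
```

Rows where the group does not match (here "abc") should come back as NA in y rather than raising an error.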

@hadley hadley closed this in df8443a Nov 16, 2017

Member

hadley commented Nov 16, 2017

OK, there were actually two problems here. It looks like the motivating issue is a small stringi bug, so I'll file an issue over there.
