Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return row names from vec_names() #920

Merged
merged 1 commit into from
Mar 17, 2020
Merged

Conversation

lionel-
Copy link
Member

@lionel- lionel- commented Mar 16, 2020

Step towards #689, and general support of row names.

vec_names() returns row names, but only if they are of type character. Numeric row names are not treated as identifiers.

@DavisVaughan
Copy link
Member

@lionel- I believe we have discussed this, but I want to be sure.

vec_set_names() should continue to never set data frame row names, correct?

@lionel-
Copy link
Member Author

lionel- commented Mar 16, 2020

hmm I think it should set names, with an additional unique-names check or repair.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern is that rownames are much stricter about uniqueness than vector names. Is this going to cause downstream problems?

df <- data.frame(x = 1:2)
rownames(df) <- c("x", "x")
#> Warning: non-unique value when setting 'row.names': 'x'
#> Error in `.rowNamesDF<-`(x, value = value): duplicate 'row.names' are not allowed
 
# Avoid the check, but still have problems:
attr(df, "row.names") <- c("x", "x")
df
#> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : duplicate row.names: x

x <- 1:2
names(x) <- c("x", "x")

Created on 2020-03-16 by the reprex package (v0.3.0)

NEWS.md Outdated Show resolved Hide resolved
@lionel-
Copy link
Member Author

lionel- commented Mar 17, 2020

These are internal tools, so I don't expect them to be causing problems in themselves.

In the last version of vctrs we implemented row names support in vec-rbind, with user-visible consequences. The default way to handle uniqueness is to just repair the names verbosely, which avoids causing failures, but alerts users to potential issues of row identification.

@hadley
Copy link
Member

hadley commented Mar 17, 2020

My primary concern is that we're now conflating two base R ideas (row names + vector names) that have slightly different semantics. I don't think it should block this PR, but I have a vague worry that it's going to create more work down the road.

@lionel-
Copy link
Member Author

lionel- commented Mar 17, 2020

You're right that we should be careful and reconsider this direction if it proves too much work.

@lionel- lionel- merged commit 78278a3 into r-lib:master Mar 17, 2020
@lionel- lionel- deleted the fix-vec-names branch March 17, 2020 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants