Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type stable [<- for base classes #140

Closed
hadley opened this issue Nov 23, 2018 · 27 comments
Closed

Type stable [<- for base classes #140

hadley opened this issue Nov 23, 2018 · 27 comments
Labels

Comments

@hadley
Copy link
Member

@hadley hadley commented Nov 23, 2018

i.e. ensures that the type of x does not change.

Needed for tidyverse/purrr#579

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Feb 10, 2019

I don't see how to do this for [<-, but we could change vec_slice<-() . Currently:

x <- 2:6
vctrs::vec_slice(x, 2:4) <- "foo"
x
#> [1] "2"   "foo" "foo" "foo" "6"

Created on 2019-02-10 by the reprex package (v0.2.1.9000)

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Feb 27, 2019

I'm not sure casting is the right operation here:

library(vctrs)
x <- 2:6
vctrs::vec_slice(x, 2:4) <- vec_cast("3", vec_type(x))
x
#> [1] 2 3 3 3 6

Created on 2019-02-27 by the reprex package (v0.2.1.9000)

I'd say we still throw an error here. We'd also throw an error if vec_type2(x, value) return anything different than vec_type(x). (Subset assignment never changes the data type.?) Do we need a new vec_coerce() ?

@hadley
Copy link
Member Author

@hadley hadley commented Mar 6, 2019

Alternatively we just do vec_assert(). We could always relax it later if that turns out to be too painful.

@hadley
Copy link
Member Author

@hadley hadley commented Mar 6, 2019

No, that's going to be too strict — it's going to make it painful to change (e.g.) dates or factors.

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 7, 2019

Distinguishing coercion and casting we have:

vec_coerce <- function(x, to) {
  vec_cast(x, vec_type2(x, to))
}

x <- 1:6
i <- 2:4

y <- "10"
vec_slice(x, i) <- vec_coerce(y, x)
#> Error: No common type for <character> and <integer>.

y <- 10
vec_slice(x, i) <- vec_coerce(y, x)

This means the type of x might change:

typeof(x)
#> [1] "double"

Which we can't really avoid for practical purposes. But with sane coercion implementations, the result should be in the same genericity domain (here, numeric) or fail, instead of falling back to character as in base R.

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 14, 2019

I don't think [<- should change the underlying type. Going with the cast for now, we can revisit later.

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 14, 2019

I'm working on casting @krlmlr.

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 14, 2019

@hadley suggested a third option: casting value to the type of x, but only if x and value have a common type. Maybe it's safer to just vec_assert() for now, until we figure out the correct way.

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 19, 2019

We also should look at the interplay between vec_slice<-() and [<-(). Currently, the former calls the latter.

What if we made vec_slice<-() very strict (i.e., value must be the same type), and [<-() closer to the behavior of base R (casting to a common subtype)?

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 19, 2019

I'm wondering whether we should call vec_slice() from [. This way we consistently preserve attributes.

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 19, 2019

I.e. I think it's better for vec_slice() / [ and their assign variants to have consistent behaviour.

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 19, 2019

Consistent, yes, but vec_slice() still can be stricter than [? (Basically, because we can't influence the behavior of [ for types we don't control?)

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 19, 2019

I think our types should have strict [ just like tibble implements a stricter interface than data.frame.

@lionel-

This comment has been hidden.

@krlmlr

This comment has been hidden.

@krlmlr

This comment has been hidden.

@lionel-

This comment has been hidden.

@krlmlr

This comment has been hidden.

@krlmlr

This comment has been hidden.

@lionel-

This comment has been hidden.

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 19, 2019

I think this is a better formulation: vec_slice() and vec_slice<- should be stricter than [ and [<- with base and foreign types, but have consistent behaviour for our own types.

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 22, 2019

Thinking about it, I'm not sure what the call chain should look like.

  • [<-() is a generic that "does the right thing", also for objects with dimension
  • vec_slice<-() currently isn't generic and probably shouldn't be
  • vec_slice<-() currently relies on [<-() for objects with dimension (and also for simple vectors)

If [<-() forwards to vec_slice(), we get endless recursion -- tried that. If we move all the checking logic from vec_slice<-() to [<-(), we lose strictness for base and foreign types. If we extract the checking logic and call it in both vec_slice<-() and [<-(), we do the same work twice for our own types.

Can we somehow call the "bare" version of [<-()? Or do we implement our own C version of the slice assignment that works for one-dimensional vectors?

@lionel-
Copy link
Member

@lionel- lionel- commented Mar 22, 2019

I've been wondering whether the genericity in vctrs could be handled with just vec_restore() + vec_data() (removes attributes except dims and names) and vec_proxy() (may preserve attributes, should return either atomic vectors or records/dataframes, it's a noop for most classes).

This way vec_slice() doesn't need to be generic, it'd just slice the vec_proxy(), producing fresh data, and vec_restore() the class at the end.

Similarly vec_slice<- could operate on the unclassed vec_data() with [<- and vec_restore() would restore attributes. Then there is no issue calling vec_slice<- from [<-.

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 23, 2019

I like that idea, will try (and also mind your open PR).

I have prepared a survey of the current behavior of sliced assignment: http://rpubs.com/krlmlr/vctrs-140-sliced-assign . The code compares dput() output between [<-() and vec_slice<-(), a comment before every comparison shows my assessment. Does this look reasonable? Is anything missing?

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 23, 2019

Unfortunately, vec_data() currently doesn't preserve dimensions (#245).

@krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 24, 2019

We also need to think about x[] <- value (#246).

@lionel-
Copy link
Member

@lionel- lionel- commented May 1, 2019

@krlmlr I've now started work on slice-assign.

lionel- added a commit to lionel-/vctrs that referenced this issue May 2, 2019
lionel- added a commit to lionel-/vctrs that referenced this issue May 7, 2019
@lionel- lionel- closed this in #312 May 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

3 participants