Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems serializing R POSIXlt objects #47

Closed
jefshe opened this issue Aug 23, 2018 · 10 comments
Closed

Problems serializing R POSIXlt objects #47

jefshe opened this issue Aug 23, 2018 · 10 comments

Comments

@jefshe
Copy link
Contributor

jefshe commented Aug 23, 2018

I ran into an issue recently when using rprotobuf to serialize then deserialize a POSIXlt object:

library(RProtoBuf)
obj <- as.POSIXlt(Sys.time())
unserialize_pb(serialize_pb(obj, connection=NULL))

yields the following error:

Error in attributes(xobj) <- attrib[names(attrib) != "class"] : 
  'names' attribute [11] must be the same length as the vector [1]

what's happening is that the POSIXlt object is represented as a list of attributes in R e.g. (hours, min, sec...) but when iterated over using lapply it appears only to be a list of length 1. Only by unclassing the object do we see its true representation:

> obj
[1] "2018-08-23 16:48:29 AEST"
> unclass(obj)
$`sec`
[1] 29.09264

$min
[1] 48

$hour
[1] 16

$mday
[1] 23
...

this somewhat related to #37 which had a similar issue where the list access had special handling for a particular class of object.

I propose that prior to serializing lists in the package we unclass the object beforehand

@eddelbuettel
Copy link
Owner

That may be more a known issue with POSIXlt as well.

You could address it on your side by focusing POSIXct, the compat representation:

R> library(RProtoBuf)
R> obj <- as.POSIXct(Sys.time())
R> unserialize_pb(serialize_pb(obj, connection=NULL))
[1] "2018-08-23 05:06:55.263143 CDT"
R> 

Also:

R> now <- as.POSIXlt(Sys.time())
R> identical(as.POSIXlt(unserialize_pb(serialize_pb(as.POSIXct(now), connection=NULL))), now)
[1] TRUE
R> 

@eddelbuettel
Copy link
Owner

@jefshe I would appreciate some followup on this as the related #48 which I still find to encompassing.

I plan to close this unless I hear from you.

@jefshe
Copy link
Contributor Author

jefshe commented Aug 31, 2018

Sorry for not following up sooner @eddelbuettel , that's fair. My main motivation was encountering this issue before in #37 as well. I doubt these 2 are isolated cases.

I do agree it's a big change. Would a more narrow condition, like checking if unclassing the list changes the value of the list be more appropriate?

I do agree there are workarounds, but it seems odd for this library not to handle base data types

@eddelbuettel
Copy link
Owner

Well, one step at a time. #37 is closed, and if there was an issue, wasn't it with as.raw() which is base R?

As for falling over POSIXlt: how about if you rework your PR to

  • not blindly move away from the existing if/else
  • find in which branch your error occurs
  • test for POSIXlt and if true unclass

That would be a bit more surgical. And should overcome the issue while not changing things for everybody else.

@jefshe
Copy link
Contributor Author

jefshe commented Sep 5, 2018

Okay I'm happy to make that change, I've updated the PR now.

The issue with #37 was that calling lapply on a package_version object had unexpected behaviour similar to this scenario. I take your point however, and have just added a fix for this specific scenario

@eddelbuettel
Copy link
Owner

That looks better, and is more minimal.

@eddelbuettel
Copy link
Owner

eddelbuettel commented Sep 5, 2018

@jeroen Can you take a quick look at #48? Just changing in rexp_list() when a POSIXlt is seen should be fine, address the issue and not have side effects.

@jeroen
Copy link
Collaborator

jeroen commented Sep 5, 2018

I think the original proposal to unconditionally unclass lists to get the expected lapply behavior is fine. The implementation in the protolite package loops over the list elements, ignoring the list class. The intention here is just to serialize all elements of the list, regardless of specific R methods that affect lapply behavior.

obj <- as.POSIXlt(Sys.time())
protolite::unserialize_pb(protolite::serialize_pb(obj, connection=NULL))

@eddelbuettel
Copy link
Owner

I still find it a little broad. But the dispatching from rexp_obj.R and its rexp_obj() in

  msg <- switch(sm,
    "character" = rexp_string(obj),
    "raw" = rexp_raw(obj),
    "double" = rexp_double(obj),
    "complex" = rexp_complex(obj),
    "integer" = rexp_integer(obj),
    "list" = rexp_list(obj),
    "logical" = rexp_logical(obj),
    "NULL" = rexp_null(),
    return(rexp_native(obj))
  );

via rexp_list seems fine as we now check the list elements. I think I'll merge this.

@eddelbuettel
Copy link
Owner

#48 is merged so closing.

Thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants