Methods to convert to/from builtin objects #189

Closed
jcrist opened this issue Sep 18, 2022 · 4 comments · Fixed by #266

Comments

@jcrist
Owner

jcrist commented Sep 18, 2022

msgspec currently contains methods for converting objects to/from bytes using either the JSON or MessagePack protocols. Sometimes it'd be useful to convert to/from "simpler" types (lists, dicts, ...).

Example use cases:

  • Encoding using a third-party protocol library like pyyaml. This can currently be handled on the encoding end by passing in a custom default method, but if the encode call is buried in some wrapper library the user may not have access to it directly and would need to recursively simplify the object before handing it off.
  • Decoding from a third-party protocol like pyyaml into higher-level types, while keeping the type validation. The easiest way to do this right now is to roundtrip the message through a supported protocol, for example:
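from typing import Any, Type, TypeVar
import msgspec
T = TypeVar("T")
def from_builtins(obj: Any, type: Type[T]) -> T:
    msg = msgspec.json.encode(obj)  # serialize the builtin objects to JSON bytes
    return msgspec.json.decode(msg, type=type)  # decode with type validation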

Proposed initial API:

def to_builtins(obj: Any, *, recurse=True) -> Any:
    """Convert obj to simple builtin types (list, dict, ...). If `recurse` is True, this is applied recursively.
    Note that copying only happens when necessary: if a list of integers is passed in, the same list
    will be returned."""
    ...

def from_builtins(obj: Any, type: Type[T]) -> T:
    """Convert obj to type. Note that copying only happens when necessary for conversion."""
    ...
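
To make the "copy only when necessary" behavior concrete, here is a minimal pure-Python sketch. It is not msgspec's implementation: the name `to_builtins_sketch` is hypothetical, dataclasses stand in for structured types, and treating `recurse=False` as "convert only the top-level object" is an assumption about the proposal.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any

_SCALARS = (str, int, float, bool, type(None))


def to_builtins_sketch(obj: Any, *, recurse: bool = True) -> Any:
    """Hypothetical stand-in for the proposed to_builtins (illustration only)."""
    if type(obj) in _SCALARS:
        return obj  # already a builtin scalar, returned as-is
    if is_dataclass(obj) and not isinstance(obj, type):
        # A structured object always becomes a new dict.
        out = {f.name: getattr(obj, f.name) for f in fields(obj)}
        if recurse:
            out = {k: to_builtins_sketch(v) for k, v in out.items()}
        return out
    if type(obj) in (list, dict):
        if not recurse:
            return obj  # assumption: containers are left alone when not recursing
        if type(obj) is list:
            new = [to_builtins_sketch(v) for v in obj]
            unchanged = all(a is b for a, b in zip(new, obj))
        else:
            new = {k: to_builtins_sketch(v) for k, v in obj.items()}
            unchanged = all(new[k] is obj[k] for k in obj)
        # Reuse the input container when no element needed conversion.
        return obj if unchanged else new
    raise TypeError(f"unsupported type: {type(obj).__name__}")


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Segment:
    start: Point
    end: Point


nums = [1, 2, 3]
same = to_builtins_sketch(nums)  # no conversion needed, so no copy
points = to_builtins_sketch([Point(1, 2)])  # a new list of dicts
shallow = to_builtins_sketch(Segment(Point(0, 0), Point(1, 1)), recurse=False)
```

In this sketch `same` is the original list object, while `shallow` is a dict whose values are still `Point` instances.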

I'm not 100% happy with these names, but all other options I could think of I liked less:

  • to_simple/from_simple
  • simplify/convert (I don't like the asymmetry in name here, and convert implies some casting behaviors like float -> int that we won't actually support)
  • cattrs calls these destructure/structure respectively
  • lower/lift (for converting to lower/higher level types)

A few open questions:

  • What types are valid to return from to_builtins? My initial reaction is dict, list, tuple, str, int, float, bool, None (and no subclasses).
  • How do we handle types that encode/decode differently between JSON and msgpack? Currently these would be bytes/bytearray and datetime (we wouldn't support Ext/Raw in this API anyway). Perhaps convert_binary=True and convert_datetime=True flags in to_builtins?
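
The second question comes from a format difference: JSON has no binary or datetime type, so those values have to become strings, while MessagePack has a native binary type. The sketch below shows what the suggested flags might do for a single value. The function `simplify_leaf` is hypothetical, the flag names are taken from the question above, and the base64 and ISO 8601 string forms are assumptions for illustration rather than a statement of msgspec's behavior.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Any


def simplify_leaf(
    obj: Any, *, convert_binary: bool = True, convert_datetime: bool = True
) -> Any:
    """Hypothetical helper: convert one value to a text-friendly builtin."""
    if convert_binary and isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(obj).decode("ascii")  # binary -> base64 text
    if convert_datetime and isinstance(obj, datetime):
        return obj.isoformat()  # datetime -> ISO 8601 text
    return obj  # left untouched for protocols with a native representation


payload = b"\x00\x01binary"
stamp = datetime(2022, 9, 18, 12, 0, tzinfo=timezone.utc)

# With both flags on, the result is safe to hand to a text-only protocol.
json_ready = [simplify_leaf(payload), simplify_leaf(stamp)]
text = json.dumps(json_ready)

# With the flags off, the values pass through unchanged for a binary-aware protocol.
native = [
    simplify_leaf(payload, convert_binary=False),
    simplify_leaf(stamp, convert_datetime=False),
]
```

Keeping the flags separate would let a caller convert datetimes to text while still passing raw bytes through, or the reverse.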

If anyone has thoughts on how to spell these APIs, please let me know below.

@FeeeeK

FeeeeK commented Nov 3, 2022

Maybe just implement .to_dict() as in pydantic?

@jcrist
Owner Author

jcrist commented Jan 10, 2023

The first half of this (to_builtins) was added in #258. Still need to add from_builtins.

@madisonleavo

This proposal looks like it transforms an entire object? Maybe I don't follow the reasoning. I thought to/from_builtins() made sense, but since I don't think I understand the why, maybe it still needs work :)

I'm looking to override behavior on a single type, bytearray. I wonder if running an object through this might still work for transforming a single field? In my application I just want str.decode() instead of base64. Searching for convert_binary=True is what led me here.

"recursively simplify the object"

is really annoying in other JSON libs! Thanks for adding it here.
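
For the single-type case described in this comment, one possible workaround is to preprocess the builtin structure before encoding it. The sketch below uses no msgspec calls; `decode_bytearrays` is a hypothetical helper, and it assumes the bytearray contents are valid text in the given encoding.

```python
from typing import Any


def decode_bytearrays(obj: Any, encoding: str = "utf-8") -> Any:
    """Hypothetical helper: replace every bytearray with its decoded str."""
    if isinstance(obj, bytearray):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {k: decode_bytearrays(v, encoding) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the container with the same type (plain list or tuple).
        return type(obj)(decode_bytearrays(v, encoding) for v in obj)
    return obj  # everything else passes through unchanged


record = {"name": bytearray(b"caf\xc3\xa9"), "tags": [bytearray(b"a"), 1]}
cleaned = decode_bytearrays(record)
```

Unlike the copy-avoiding behavior proposed for to_builtins, this helper always builds new containers, which keeps it short at the cost of extra allocation.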

@TomAugspurger

TomAugspurger commented Jan 21, 2023 via email


4 participants