Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing embed messages from subscriptions #17

Closed
sugyan opened this issue May 22, 2023 · 7 comments
Closed

Deserializing embed messages from subscriptions #17

sugyan opened this issue May 22, 2023 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@sugyan
Copy link
Owner

sugyan commented May 22, 2023

ref: #15 (comment)

@sugyan sugyan self-assigned this May 22, 2023
@sugyan sugyan added the bug Something isn't working label May 22, 2023
@timfpark
Copy link
Contributor

Let me know if there is an outline of what this work entails and if it is a suitable first time issue for newcomer. If so I would be happy to help!

@monaqa
Copy link

monaqa commented Feb 7, 2024

Facing the same problem.
I suspect that the following issue of ciborium crate is related.

enarx/ciborium#71

@sugyan
Copy link
Owner Author

sugyan commented Feb 8, 2024

I'm currently working on resolving this issue and expect to fix it by applying #96.
I believe serde_ipld_dagcbor is more appropriate than ciborium for deserializing data in DAG-CBOR format, including cid, and I have submitted a pull request for one issue there
ipld/serde_ipld_dagcbor#21
I think that once these are resolved, we can deserialize all messages correctly. Please wait a bit until the new version comes out.

@sugyan
Copy link
Owner Author

sugyan commented Feb 8, 2024

I believe that with the #96 and #98 update, records containing embedded images can now be deserialized correctly and the error should no longer occur.

@str4d
Copy link
Contributor

str4d commented Feb 27, 2024

Tested using latest atrium-api (plus #121), and patching in ipld/serde_ipld_dagcbor@ffae329 (which includes ipld/serde_ipld_dagcbor#23), using the following code to deserialize a post containing an embedded image:

let post: app::bsky::feed::post::Record = serde_ipld_dagcbor::from_reader(&data[..])?;

When no additional atrium-api features are enabled, I get the following error:

Error: Msg("data did not match any variant of untagged enum BlobRef")

When I enable the dag-cbor feature of atrium-api, the post parses correctly.

And for completeness, if I enable the dag-cbor feature of atrium-api but don't patch serde_ipld_dagcbor, I get this error:

Error: Msg("Only bytes can be deserialized into a CID")

@sugyan
Copy link
Owner Author

sugyan commented Feb 27, 2024

@str4d Yes, that's right!
The CidLink deserialization uses a slightly odd method because the schema is different for JSON format and for DAG-CBOR format: to accommodate both formats, the deserialize method converts once to Ipld to determine the format. But this seemed to have some overhead.

Benchmark for checking it: https://github.com/sugyan/atrium/blob/main/atrium-api/benches/cid-link.rs
(Looks like we need libipld_core in dev-dependencies...)

So I implemented CidLink for DAG-CBOR as an additional feature, because I thought it is better to use only simple JSON objects if the user knows that the user will only use JSON format.
Sorry for the lack of documentation.

...But now we should probably use types::string::Cid for the link string of cid_link_json as well, and then it would be useless to worry about the overhead anymore, so maybe we can make it the default implementation.

@sugyan
Copy link
Owner Author

sugyan commented Mar 5, 2024

This is resolved at #98

@sugyan sugyan closed this as completed Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants