Schema metadata API #255

Open
freopen opened this issue Jan 15, 2022 · 10 comments

Comments

@freopen
Contributor

freopen commented Jan 15, 2022

This is a follow-up to #246. I was thinking about ways to introduce access to the schema metadata for user code, here are my thoughts. I could try to implement a PR for something mentioned below, but it sounds like a lot of work, not sure if I'll handle that...

The problem

Rust has no reflection, and much of the schema information never makes it into the generated code at all. That information could be used to generate additional traits automatically or to alter runtime behavior. In addition, Cap'n Proto has a whole annotation system that currently seems to be completely ignored.

#247 introduced a way to access the whole CodeGeneratorRequest, which contains all the information about the schema. That solves the problem, but it's pretty hard to use. The goal of this issue is to introduce a usable alternative that attaches the relevant information from CodeGeneratorRequest to the relevant places.

Since my personal goal is to write a proc macro based on the capnp reflection information, I'm going to try to design a proc-macro-friendly solution. Const functions are evaluated at compile time, but there seem to be fundamental obstacles to calling freshly generated const functions during macro expansion: rust-lang/rfcs#2279. So the only option for making the data available is macro attributes: https://doc.rust-lang.org/reference/attributes.html. In the future there could be a separate macro library that generates const functions based on annotations, so the data could be used at runtime too.
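To illustrate the distinction, here is a minimal sketch of what "evaluated at compile time" means for a const function. The name field_count and its value are made up for illustration and are not part of capnp's generated code. The compiler evaluates the call while compiling, but only after macro expansion has finished, so a proc macro that sees this item gets tokens, not the value 3.

```rust
// Hypothetical stand-in for a const fn that generated code might expose.
// The name and the value are assumptions for illustration only.
pub const fn field_count() -> usize {
    3
}

// Evaluated by the compiler, so the result can be used in a type.
pub const FIELD_COUNT: usize = field_count();

// The array length below comes from the const fn call above.
pub fn zeroed_slots() -> [u8; FIELD_COUNT] {
    [0u8; FIELD_COUNT]
}
```

The value is usable in types and other const contexts, but not from inside a macro that is expanding code in the same crate.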

Option 1: generate full metadata

The currently generated code doesn't have actual fields for structs, but each field can be associated with a set of functions in the generated Reader and Builder structs. I suggest adding an annotation to the Reader and Builder structs that contains the corresponding Node from schema.capnp, and an annotation to every field-related function that contains the Field, the Node of the parent struct, and the ordinal number of the field in the struct. Every annotation should also contain a map of id -> Node for all the dependencies and an enum that specifies the annotated object (Reader struct, Builder struct, get_* function, set_* function, has_* function, etc.).

Since the annotations are generated, there is no big incentive to make them readable. It could be just a single path like #[capnp::meta(<byte string literal>)] for every single annotation. The byte string is a Cap'n Proto encoded message called MacroMetadata that contains all the information above. There will be a separate library that converts a TokenStream into a MacroMetadata Reader. Also, every annotation declaration should generate a function that takes a MacroMetadata and returns an Option<Reader> of the annotation type, which is Some if that annotation is attached to that object.
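As a rough sketch of the information Option 1 would carry, the shape could look like the plain Rust types below. This is only an illustration: in the proposal MacroMetadata is a Cap'n Proto message, not a Rust struct, and Node here is a heavily simplified stand-in for the Node in schema.capnp. The names AnnotatedObject, FieldInfo, and the dependency helper are assumptions made up for this example.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-in for `Node` from schema.capnp.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: u64,
    pub display_name: String,
}

/// Which generated item the attribute is attached to (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AnnotatedObject {
    ReaderStruct,
    BuilderStruct,
    Getter,
    Setter,
    HasCheck,
}

/// Field data carried only by attributes on field-related functions.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldInfo {
    pub name: String,
    pub ordinal: u16,
}

/// Everything one attribute would carry under Option 1.
#[derive(Debug, Clone, PartialEq)]
pub struct MacroMetadata {
    pub target: AnnotatedObject,
    /// The struct's own node, or the parent struct's node for a field function.
    pub node: Node,
    pub field: Option<FieldInfo>,
    /// id -> Node for every dependency of `node`.
    pub dependencies: HashMap<u64, Node>,
}

impl MacroMetadata {
    /// Look up a dependency by its schema id.
    pub fn dependency(&self, id: u64) -> Option<&Node> {
        self.dependencies.get(&id)
    }
}
```

A macro receiving such a value could match on target to decide what to generate and use dependency to resolve the types of nested fields.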

Option 2: allow custom annotations that generate only needed metadata

Another solution is based on Cap'n Proto's ability to annotate annotations. That allows for a less invasive and more elegant solution. We could extend rust.capnp with generateAttributes, which could be used to annotate user-defined annotations so that they generate macro attributes and/or derives. The annotation parameters would configure the generated macro: path, format, additional info (field id, type, name, deps, etc.), as well as which structs/functions to annotate. The annotated annotation should also generate code that helps extract useful data from the generated attributes.

With enough flexibility in the generateAttributes parameters, this could allow integration with third-party macros: the right parameters would generate exactly the macro that a third-party crate expects. On the other hand, there could be virtually no parameters, in which case it's similar to Option 1, but only for annotated nodes instead of for everything.

@dwrensha
Member

I still keep thinking: "I wonder how far we could get if capnproto-rust had a dynamic reflection API".
In #246 you wrote:

Runtime implementation seem to require going through each field's annotations every time we are extracting key from value, sounds much worse than compile-time variant.

But if your procedural macros had access to the dynamic reflection API, that seems like it could maybe be good enough -- all of the annotation lookups would happen during macro expansion. I suppose getting that to work might require splitting things into multiple crates -- one with the capnp generated code, and then one that calls proc macros on it. But that doesn't sound like a fundamental obstacle to me.

It could be just a single path like #[capnp::meta(<byte string literal>)]

Is there any limit to the size of byte strings that appear inside of an attribute like this?

@freopen
Contributor Author

freopen commented Jan 15, 2022

I'm not sure what you mean by dynamic reflection. The link you provided says that "Dynamic Reflection API" refers to the interfaces in dynamic.h and schema.h. The first seems to be all about working with schemas known only at runtime, and the second seems to be just a convenience wrapper around the Node object. Generated code doesn't seem to provide any access to the corresponding schema objects. In general, I'm not sure whether what I propose is even possible in the C++ version.

Re. literal size limit: I can't find anything about that. I doubt there is one, considering that macros are parsed by the same tokenizer as the rest of the code, and considering that macros like include_str are used inside other macros. It should be easy to verify before starting.

@dwrensha
Member

dwrensha commented Jan 15, 2022

Generated code doesn't seem to provide any access to the corresponding schema objects.

Do you mean in capnproto-c++ or in capnproto-rust? For examples of how to access the schema objects in c++, see https://github.com/capnproto/capnproto/blob/master/c%2B%2B/src/capnp/schema-loader-test.c%2B%2B .

In particular, there is a method Schema::from():
https://github.com/capnproto/capnproto/blob/0c80c300f07d88bc31e9fc7b7e888ed20c33babf/c%2B%2B/src/capnp/schema.h#L83-L85

template <typename T>
  static inline SchemaType<T> from() { return SchemaType<T>::template fromImpl<T>(); }
  // Get the Schema for a particular compiled-in type.

It should be possible to do something similar in Rust.
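One way to picture a Rust analogue of Schema::from() is a trait implemented by each generated type, with a generic function that looks the schema up by type. The sketch below is only an assumption about what such an interface could look like: HasSchema, Schema, schema_from, the Point type, and the id value are all invented for illustration and are not capnproto-rust's API.

```rust
/// Hypothetical, heavily simplified schema description.
#[derive(Debug, PartialEq, Eq)]
pub struct Schema {
    pub id: u64,
    pub display_name: &'static str,
    pub field_names: &'static [&'static str],
}

/// Hypothetical trait that generated code would implement for each type.
pub trait HasSchema {
    fn schema() -> &'static Schema;
}

/// Stand-in for a generated type; the id below is made up.
pub struct Point;

static POINT_SCHEMA: Schema = Schema {
    id: 0x1,
    display_name: "example.capnp:Point",
    field_names: &["x", "y"],
};

impl HasSchema for Point {
    fn schema() -> &'static Schema {
        &POINT_SCHEMA
    }
}

/// Counterpart of the C++ `Schema::from<T>()`: get the schema of a compiled-in type.
pub fn schema_from<T: HasSchema>() -> &'static Schema {
    T::schema()
}
```

With this shape, schema_from::<Point>() returns the compiled-in schema without needing an instance of the type.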

@dwrensha
Member

On the Rust side we still need to start actually including the schema::Node data in generated code and we need to design a nice interface for accessing it. But it should be doable.

@aikalant

aikalant commented Mar 1, 2022

I'm not sure how related this is, but I have been working internally on a proc macro inspired by last year's PR #157. At the moment it's able to generate all the necessary set/get/init calls to the code-generated builders and readers, and it currently supports most if not all capnp schema data types (including generic types, nested lists, groups, and unions). It does require you to write the Rust structs/enums first.

Looks something like this:
[image]
and works like this:

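  let input = ExampleStruct::<NestedStruct, NestedStruct>::default();

  // The builder must be mutable: init_root() borrows it mutably.
  let mut builder = TypedBuilder::<
    example_struct::Owned<nested_struct::Owned, nested_struct::Owned>,
  >::new_default();

  // Write the plain Rust struct into the message.
  input.write(builder.init_root());

  let reader = builder.get_root_as_reader().unwrap();

  // Read it back and check that the round trip preserves the value.
  let output = ExampleStruct::<NestedStruct, NestedStruct>::read(reader);

  assert!(input == output);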

I was thinking about making a pull request in the next couple weeks once I finish refining it ... would that be helpful/relevant?

@xcthulhu

xcthulhu commented Mar 6, 2022

@aikalant I have been trying to revive capnp-conv from PR #157. We are investigating switching to capnp from our current binary format at work.

Happy to collaborate if you are interested!

@gagbo

gagbo commented Aug 14, 2022

Were you able to find time to make progress on this? I don't really know where to start picking up the work already done on capnp-conv so I can try a PoC and see how capnp goes.

@aikalant

@gagbo

I just published it here: https://github.com/aikalant/capnp_conv - I still want to write some more/better tests and add a little bit of polish on error messages before I make a PR here though.

@ajosecueto

@aikalant any news?

@dwrensha
Member

On the Rust side we still need to start actually including the schema::Node data in generated code and we need to design a nice interface for accessing it. But it should be doable.

As of last month, it is possible to get schema::Node data from the generated code:
https://dwrensha.github.io/capnproto-rust/2023/05/08/run-time-reflection.html
