Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of Reader get_named #455

Closed
quartox opened this issue Nov 27, 2023 · 4 comments
Closed

Improve performance of Reader get_named #455

quartox opened this issue Nov 27, 2023 · 4 comments

Comments

@quartox
Copy link
Contributor

quartox commented Nov 27, 2023

I was doing some performance testing of code that serializes a large number of capnp messages. The biggest bottleneck is capnp::dynamic_struct::Reader.get_named because it does an O(N) lookup of the field names, converts them to text and compares them to the input.

I was wondering if you would be open to a contribution that improves the performance of this method.

Naively I was thinking of solving this with a HashMap that populates the names of fields as keys and fields as values. This could be populated at runtime after an O(N) lookup.

@dwrensha
Copy link
Member

Thanks for the report.

We have a TODO in the code about this:

// TODO: members_by_name, allowing fast field lookup by name.

capnproto-c++ has implemented this via binary search:
https://github.com/capnproto/capnproto/blob/537fe97d6ab1962e92e15b8a16f60439a63b90f5/c%2B%2B/src/capnp/raw-schema.h#L184-L186

https://github.com/capnproto/capnproto/blob/537fe97d6ab1962e92e15b8a16f60439a63b90f5/c%2B%2B/src/capnp/schema.c%2B%2B#L458-L481

@quartox
Copy link
Contributor Author

quartox commented Jan 3, 2024

I started adding a hashmap in a fork but I cannot re-generate the code because the previously generated code does not compile with the new field:

error[E0063]: missing field `members_by_name` in initializer of `RawStructSchema`
     --> capnp/src/schema_capnp.rs:14437:21
      |
14437 |                     crate::introspect::RawStructSchema {
      |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ missing `members_by_name`

I tried removing the old generated file but that led to other errors for files that depend on that file. Any advice on how to add a new field and re-generate the generated files?

I also went down the path of a hashmap since the C++ code also had a todo suggesting this. We lose the Copy trait on RawStructSchema and I do not know the implications of that. This seems like a viable path but willing to redo doing binary search instead.

@dwrensha
Copy link
Member

dwrensha commented Jan 3, 2024

I tried removing the old generated file but that led to other errors for files that depend on that file. Any advice on how to add a new field and re-generate the generated files?

You'll probably need to hand-edit the existing checked-in schema_capnp.rs.

I also went down the path of a hashmap since the C++ code also had a todo suggesting this. We lose the Copy trait on RawStructSchema and I do not know the implications of that.

My impression is that HashMap wouldn't work because we need to construct static or const instances of RawStructSchema. I don't think HashMap has support for that.

@dwrensha
Copy link
Member

Closing because #469 has landed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants