reflection_table std::string messagepack writing is completely broken #1858
Comments
We use strings in reflection tables in cctbx.xfel.merge. This is mostly done in memory, but when we write the data back to disc we currently use an old as_pickle call. Oddly enough, it's on my list to swap that out for an as_file call to start using msgpack, but I haven't tested it yet.
I think it's relatively easy to fix: just use a custom packer, as Shoeboxes do. It's good to have an example of it being used; it means it's worth fixing rather than throwing out.
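For illustration, the core of such a custom packer could look like the sketch below: write a count, then each string as a length followed by its characters, so the reader never depends on the in-memory layout of std::string. The helper names (append_u32, read_u32, pack_strings, unpack_strings) are hypothetical and this is not the msgpack-c adaptor API or the Shoebox packer; it assumes host byte order and columns that fit in 32-bit lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: append a 32-bit unsigned value in host byte order.
void append_u32(std::vector<char>& buf, std::uint32_t v) {
  char bytes[sizeof v];
  std::memcpy(bytes, &v, sizeof v);
  buf.insert(buf.end(), bytes, bytes + sizeof v);
}

// Hypothetical helper: read a 32-bit unsigned value and advance pos.
std::uint32_t read_u32(const std::vector<char>& buf, std::size_t& pos) {
  std::uint32_t v = 0;
  if (buf.size() - pos < sizeof v) throw std::runtime_error("truncated buffer");
  std::memcpy(&v, buf.data() + pos, sizeof v);
  pos += sizeof v;
  return v;
}

// Serialise a string column as: count, then (length, characters) per string.
std::vector<char> pack_strings(const std::vector<std::string>& column) {
  std::vector<char> buf;
  append_u32(buf, static_cast<std::uint32_t>(column.size()));
  for (const std::string& s : column) {
    append_u32(buf, static_cast<std::uint32_t>(s.size()));
    buf.insert(buf.end(), s.begin(), s.end());
  }
  return buf;
}

// Inverse of pack_strings; throws if the buffer is shorter than it claims.
std::vector<std::string> unpack_strings(const std::vector<char>& buf) {
  std::size_t pos = 0;
  const std::uint32_t n = read_u32(buf, pos);
  std::vector<std::string> column;
  for (std::uint32_t i = 0; i < n; ++i) {
    const std::uint32_t len = read_u32(buf, pos);
    if (len > buf.size() - pos) throw std::runtime_error("truncated string");
    column.emplace_back(buf.data() + pos, len);
    pos += len;
  }
  return column;
}
```

Calling unpack_strings(pack_strings(column)) should give back the original column, including empty strings and strings containing NUL bytes, because every length is stored explicitly.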
Some re-discussion of this issue happened in cctbx/dxtbx#465.
Working on dials#1858: this could never have worked. Now trying to make something which does work. This does not work. This is at least some stubs / breadcrumbs on how it could work. Turns out to be hard, though.
@phyy-nx that is probably a better solution in the long run; I was wondering why you would keep strings in a reflection table. With that in mind, I am tempted to simply remove support for that (in the messagepack file, at least) rather than having the appearance of support. A "fix" for this is possible but annoying.
😱 In trying to remove this I find that the packing of strings is part of how the general packing works, which also means that if we touch this it is likely to break everything. Now working out if I can simply unlink...
This issue has been automatically marked as stale because it has not had recent activity. The label will be removed automatically if any activity occurs. Thank you for your contributions.
Closes dials/dials#1858.
Background: exp_id is a column of experiment identifiers, originally developed in parallel with the experiment_identifier map in DIALS. Now that the experiment_identifier map is standard in DIALS, exp_id is redundant, so this commit removes it.
Note this commit makes heavy use of two convenience methods in DIALS:
- concat: takes two or more reflection tables and concatenates them, renumbering the 'id' column and taking care of the experiment_identifier map.
- reset_ids: resets the 'id' column, accounting for any gaps. Useful after using select to remove reflections from a table.
We can now use message_pack instead of pickle when saving refl tables.
It supports writing of std::string arrays:
dials/array_family/reflection_table_msgpack_adapter.h
Lines 665 to 666 in e129e6c
but does so in a broken way. It uses the same "ref/versa" writing as all the standard objects:
dials/array_family/reflection_table_msgpack_adapter.h
Lines 171 to 173 in e129e6c
This means that it writes to the messagepack file an area of memory of n_items * sizeof(std::string) bytes, starting from &const_ref<std::string>[0], which is not how std::strings work. E.g. this will definitely contain the first string (but not read it), and if you are lucky it will contain some of the subsequent strings, but basically you are reading random memory.
I don't know if this functionality was ever used for writing strings in the old .pickle format, but it's broken here and we've obviously not used it. Are there any examples of us wanting/needing to write string columns in the reflection tables?
If not, my suggestion is that we just drop the versa<std::string> support.
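As a rough, standalone illustration of the failure mode described above (not the adapter code itself), the sketch below copies n_items * sizeof(std::string) bytes from the start of a string array and checks whether the characters of a long string actually appear in those bytes. The helper names raw_object_bytes and contains_text are hypothetical, and std::vector stands in for the const_ref/versa containers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the object representation of a string column,
// i.e. n_items * sizeof(std::string) bytes starting at &column[0].
// This mimics a raw block write; it does not follow any heap pointers.
std::vector<char> raw_object_bytes(const std::vector<std::string>& column) {
  std::vector<char> out(column.size() * sizeof(std::string));
  if (!column.empty()) {
    std::memcpy(out.data(), static_cast<const void*>(column.data()),
                out.size());
  }
  return out;
}

// True if the characters of needle appear contiguously in haystack.
bool contains_text(const std::vector<char>& haystack,
                   const std::string& needle) {
  return std::search(haystack.begin(), haystack.end(), needle.begin(),
                     needle.end()) != haystack.end();
}
```

For a column of long strings, contains_text(raw_object_bytes(column), column[0]) is expected to be false: the copied block holds only the std::string bookkeeping (typically a pointer, a size and a small inline buffer), while the characters live elsewhere on the heap. Very short strings may happen to show up because of small-string optimisation, which is implementation-specific and not something a file format should rely on.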