Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add isElementAllocated/isValueAllocated/isPayloadAllocated functions to SeqAccess/MapAccess/VariantAccess interface #127

Closed
polykernel opened this issue Aug 4, 2023 · 5 comments · Fixed by #132
Assignees
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@polykernel
Copy link
Contributor

polykernel commented Aug 4, 2023

Problem

There are cases where it is desirable to know whether the last element deserialized by nextElementSeed, the last value deserialized by nextValueSeed or the last variant deserialized by payloadSeed is allocated on the heap. For instance, a data structure may apply some transformation to an element to be inserted such that the original value is unneeded, or a data structure may create defensive copies of inserted values making the original values unreachable.

I observed this use case while working working on (de)serialization support for std.IndexedSet as part of #120 which translate values inserted into it to indices, making the values themselves ephemeral.

Proposal

Add the functions isElementAllocated, isValueAllocated, isPayloadAllocated functions to SeqAccess, MapAccess and VariantAccess interface respectively. This allows checking the storage location of a value deserialized from access methods (e.g., nextKey) and makes it possible to conditionally free the value only if it is allocated on the heap. Using these functions permit deserialization without an allocator when no allocation would be performed, following the principle of least privilege. For example,

while (try seq.nextElement(allocator, T)) |elem| {
  defer if (seq.isElementAllocated(T)) {
    free(allocator.?, Deserializer, elem);
  }
  ...
}
while (try map.nextKey(allocator, K)) |k| {
  ...
  const v = try map.nextValue(allocator, V);
  defer if (map.isValueAllocated(V)) {
    free(allocator.?, Deserializer, v);
  }
  ...
}

Alternatives

Leave the status quo as is, the described problem can be remedied by unconditionally freeing the last deserialized element/values. For example,

while (try seq.nextElement(allocator, T)) |elem| {
  defer free(allocator.?, Deserializer, elem);
  ...
}
while (try map.nextKey(allocator, K)) |k| {
  ...
  const v = try map.nextValue(allocator, V);
  defer free(allocator.?, Deserializer, v);
  ...
}

This works because types which are not allocated do not implement a free function however it has the consequence of always requiring an allocator for deserialization even if when no allocation would be performed

Additional Context

No response

@polykernel polykernel added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Aug 4, 2023
@ibokuri
Copy link
Contributor

ibokuri commented Aug 4, 2023

Can you update the proposal so that it includes adding an isPayloadAllocated to the VariantAccess interface? That way, everything will be consistent across the access interfaces and would give unions the same benefits that you're describing for sequences and maps.

Not to mention, the consistency would allow me to finally define (in a sane manner) Getty's allocation story during deserialization:

  • Deserialized pointer values returned by an access method (e.g., nextKey) will be freed upon failure and, where applicable1, upon success, if the value's corresponding isXAllocated method (e.g., isKeyAllocated) returns true.
  • Deserialized pointer values not returned by an access method will always be freed upon failure and, where applicable1, upon success.

  1. Generally speaking, "where applicable" means whenever a deserialized value is not part of the final value returned to the end user. For instance, if a heap-allocated string is visited to produce an integer, the string isn't part of the final, returned integer and so we'd free it even upon success.

@ibokuri ibokuri added the accepted This proposal is planned. label Aug 4, 2023
@polykernel polykernel changed the title Add isElementAllocated/isValueAllocated functions to SeqAccess/MapAccess interface Add isElementAllocated/isValueAllocated/isPayloadAllocated functions to SeqAccess/MapAccess/VariantAccess interface Aug 5, 2023
@polykernel
Copy link
Contributor Author

Sorry for the late response, my availability have been sporadic for the last few days. I have updated the proposal to include isPayloadAllocated for the VariantAccess interface. As for the definition of Getty's allocation flow, I think it might be worthwhile to also define the responsibility of the free function clearly. It is my understanding that free is intended only for freeing a value in after it has been completely deserialized, thus if a value is incompletely deserialized (i.e. an error has occurred during its deserialization), custom logic may be needed to free the incomplete value. It turns out that for a vast number of types, the logic for both cases is exactly the same but there are some notably exceptions such as arrays and other fixed-size data structures.

@ibokuri
Copy link
Contributor

ibokuri commented Aug 5, 2023

Sorry for the late response, my availability have been sporadic for the last few days.

No worries!

I think it might be worthwhile to also define the responsibility of the free function clearly. It is my understanding that free is intended only for freeing a value in after it has been completely deserialized [...]

Yep. I had thought I added it to the doc comment for free but apparently not. I'll make a todo for that. I also need to update that doc comment anyways since the allocation model described in it is outdated at this point...

[...] if a value is incompletely deserialized (i.e. an error has occurred during its deserialization), custom logic may be needed to free the incomplete value. It turns out that for a vast number of types, the logic for both cases is exactly the same but there are some notably exceptions such as arrays and other fixed-size data structures.

Right. I think the visitors that Getty define handle incomplete values properly with custom logic. But custom visitors written by users should know that they should use custom logic as well in those cases, so a doc comment would be nice for them.

@ibokuri
Copy link
Contributor

ibokuri commented Aug 5, 2023

I'll assign this to you for now since I think you had some work started already. Let me know if you can't or don't want to do it. I don't mind implementing it 👍

@ibokuri ibokuri added this to the 0.5.0 milestone Aug 5, 2023
@polykernel
Copy link
Contributor Author

Yeah sure, I just need to patch up my changes a bit and add some documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants