Current plan for pointers? #347

SebastianStehle · 2023-11-13T15:34:21Z

Hi,

I could not attend the last meeting and I am not sure about tomorrow. What is the current plan for pointers in the ffi interface? We have discussed a few solutions, but I am not sure if there was any conclusion.

Horusiath · 2023-11-14T05:47:54Z

Hi @SebastianStehle - re. node pointers: I'll be working on fixing the issue after the next release (v0.17). There's already quite a lot on a plate and I don't want to prolong the release in order to include the fix in there.

Since we want to avoid forcing users to manually free node references at the client side, I'll start by going with your proposal. Namely: create a map within document store, which provides int→Branch addressing. All shared collections will be wrappers around that int number. Destroying a block will also erase the entry from that map.

While it introduces a small indirection, it will let us keep node size equal to pointer size and won't require manual alloc/free in the FFI.

Horusiath · 2023-11-14T06:24:45Z

That being said this feature can be messy on the receiving end. Assuming BranchPtr representations as struct BranchPtr(u32) we need to somehow cover situations where it refers to a node that's already dead. Depending on the strategy it may be either unsafe or very invasive into a current state of an API.

1. Unsafe variant - checks and panics

With this we can basically leave the API as it is - since all methods require transaction param, we can use it to materialize node ids into pointers. If this cannot be done - ID refers to a node that no longer exists - we panic. If user suspects a panic we can expose Transaction::is_alive<T>(&self, node_ref: &T) -> bool method and push the requirement of checking and reacting to value existence onto the users.

2. Safe variant - refs

In this approach we modify the API to not return shared collections directly but rather return node ID, that can be dereferenced into actual shared collection in a transaction scope:

impl MapRef {
  pub fn get<'txn>(&self, txn: &'txn mut TransactionMut, key: &str) -> Option<Ref<ValueRef<'txn>>;
}

pub struct Ref<V> {
  id: NodeId,
  ...
}

impl<V> Ref<V> {
  // Dereference current Ref into actual type. Underlyin Ref's value is no longer alive return None.
  pub fn deref<'txn, T: ReadTxn>(&self, txn: &'txn T) -> Option<&'txn V>;
}

With this we can always guarantee that user does an aliveness check whenever they try to execute methods on it. Cons here is that this approach is extremely invasive in terms of API changes.

SebastianStehle · 2023-11-14T08:19:32Z

If we panic, when a node does not exist anymore, where is the difference to the current state? We should just have proper error codes.

Horusiath · 2023-11-14T09:11:36Z

If we're about to change the API, I'd prefer to go all way through and implement 2nd approach - it's way more descriptive and reduces number of redundant checks:

let a = ymap.get(&mut txn, "a")
    .unwrap()  // one check if the ymap is alive and may return an error
    .unwrap(); // one check if the value exists
    
let b = ymap.get(&mut txn, "b")
    .unwrap()  // at this point we already know that ymap is alive but we need to unwrap result anyway
    .unwrap(); // one check if the value exists

I'm not conflating ymap liveness error with value absence as these are different concepts in Rust.

With the second approach, what we have is:

let ymap: &MapRef = map_ref.deref(&mut txn)
    .unwrap(); // check if ymap is alive
let a = ymap.get("a") // bonus: we can now include txn inside of MapRef itself
    .unwrap(); // check if value exists
let b = ymap.get("b").unwrap();

While I consider 2nd option more tempting and wouldn't mind implementing it there one problem: it's basically a complete overhaul of the API, which is burden for all existing users and maintainers of host language bindings.

SebastianStehle · 2023-11-14T09:19:29Z

For me it is really difficult when we mix the implementation details with the interface all the time. First I am not a rust developer and cannot really give valuable input and second because I think we should not drive the interface by the implementation if possible.

The whole point of the indirection is to prevent non-catchable process faults. If we just panic, we gain absolutely nothing. I think the current API needs a better approach for error handling anyway, so sooner or later you have to break things (or introduce a v2 or second project).

Horusiath · 2023-11-14T10:45:03Z

The whole point of the indirection is to prevent non-catchable process faults.

What I meant here was:

// this code works as it used to - if ymap is destroyed, it will panic
// I'm calling it because I'm sure that it's alive
let a = ymap.get(&mut txn, "a");

// if I'm not sure that my collection is alive, I do a check first
if txn.is_alive(&ymap) {
  let a = ymap.get(&mut txn, "a");
}

This way it works without breaking existing code. We can also expose txn.is_alive to C FFI - this way we can do that check ie. in C# as part of member access and throw ie. ObjectDisposedException if check fails, instead of panicking.

SebastianStehle · 2023-11-14T12:06:28Z

Yes, that would work. If you ever thing about a ffi V2, I think error codes might be a good thing. There are other cases like transactions where an error (instead of null pointer) would be great.

SebastianStehle · 2023-11-25T09:05:23Z

Here are a few methods, where I am not sure about this new method:

yarray_len (we can just a transaction anyway, so no problem).
ybranch_write_transaction
ybranch_read_transaction
ymap_observe
ymap_unobserve
ytext_observe
yobserve_deep
yunobserve_deep
ytype_kind
yinput_yXXX

There are a lot of methods where we are dealing with pointers that have no transaction. I have not included methods with a well defined scope like ytext_event_delta

Horusiath mentioned this issue Nov 16, 2023

added methods to check if shared collection references has been deleted #351

Merged

SebastianStehle mentioned this issue Nov 25, 2023

Integrate new method ytransaction_alive y-crdt/ydotnet#63

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Current plan for pointers? #347

Current plan for pointers? #347

SebastianStehle commented Nov 13, 2023

Horusiath commented Nov 14, 2023 •

edited

Horusiath commented Nov 14, 2023 •

edited

SebastianStehle commented Nov 14, 2023

Horusiath commented Nov 14, 2023

SebastianStehle commented Nov 14, 2023

Horusiath commented Nov 14, 2023 •

edited

SebastianStehle commented Nov 14, 2023

SebastianStehle commented Nov 25, 2023

Current plan for pointers? #347

Current plan for pointers? #347

Comments

SebastianStehle commented Nov 13, 2023

Horusiath commented Nov 14, 2023 • edited

Horusiath commented Nov 14, 2023 • edited

1. Unsafe variant - checks and panics

2. Safe variant - refs

SebastianStehle commented Nov 14, 2023

Horusiath commented Nov 14, 2023

SebastianStehle commented Nov 14, 2023

Horusiath commented Nov 14, 2023 • edited

SebastianStehle commented Nov 14, 2023

SebastianStehle commented Nov 25, 2023

Horusiath commented Nov 14, 2023 •

edited

Horusiath commented Nov 14, 2023 •

edited

Horusiath commented Nov 14, 2023 •

edited