Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current plan for pointers? #347

Open
SebastianStehle opened this issue Nov 13, 2023 · 8 comments
Open

Current plan for pointers? #347

SebastianStehle opened this issue Nov 13, 2023 · 8 comments

Comments

@SebastianStehle
Copy link

Hi,

I could not attend the last meeting and I am not sure about tomorrow. What is the current plan for pointers in the ffi interface? We have discussed a few solutions, but I am not sure if there was any conclusion.

@Horusiath
Copy link
Collaborator

Horusiath commented Nov 14, 2023

Hi @SebastianStehle - re. node pointers: I'll be working on fixing the issue after the next release (v0.17). There's already quite a lot on a plate and I don't want to prolong the release in order to include the fix in there.

Since we want to avoid forcing users to manually free node references at the client side, I'll start by going with your proposal. Namely: create a map within document store, which provides int→Branch addressing. All shared collections will be wrappers around that int number. Destroying a block will also erase the entry from that map.

While it introduces a small indirection, it will let us keep node size equal to pointer size and won't require manual alloc/free in the FFI.

@Horusiath
Copy link
Collaborator

Horusiath commented Nov 14, 2023

That being said this feature can be messy on the receiving end. Assuming BranchPtr representations as struct BranchPtr(u32) we need to somehow cover situations where it refers to a node that's already dead. Depending on the strategy it may be either unsafe or very invasive into a current state of an API.

1. Unsafe variant - checks and panics

With this we can basically leave the API as it is - since all methods require transaction param, we can use it to materialize node ids into pointers. If this cannot be done - ID refers to a node that no longer exists - we panic. If user suspects a panic we can expose Transaction::is_alive<T>(&self, node_ref: &T) -> bool method and push the requirement of checking and reacting to value existence onto the users.

2. Safe variant - refs

In this approach we modify the API to not return shared collections directly but rather return node ID, that can be dereferenced into actual shared collection in a transaction scope:

impl MapRef {
  pub fn get<'txn>(&self, txn: &'txn mut TransactionMut, key: &str) -> Option<Ref<ValueRef<'txn>>;
}

pub struct Ref<V> {
  id: NodeId,
  ...
}

impl<V> Ref<V> {
  // Dereference current Ref into actual type. Underlyin Ref's value is no longer alive return None.
  pub fn deref<'txn, T: ReadTxn>(&self, txn: &'txn T) -> Option<&'txn V>;
}

With this we can always guarantee that user does an aliveness check whenever they try to execute methods on it. Cons here is that this approach is extremely invasive in terms of API changes.

@SebastianStehle
Copy link
Author

If we panic, when a node does not exist anymore, where is the difference to the current state? We should just have proper error codes.

@Horusiath
Copy link
Collaborator

If we're about to change the API, I'd prefer to go all way through and implement 2nd approach - it's way more descriptive and reduces number of redundant checks:

let a = ymap.get(&mut txn, "a")
    .unwrap()  // one check if the ymap is alive and may return an error
    .unwrap(); // one check if the value exists
    
let b = ymap.get(&mut txn, "b")
    .unwrap()  // at this point we already know that ymap is alive but we need to unwrap result anyway
    .unwrap(); // one check if the value exists

I'm not conflating ymap liveness error with value absence as these are different concepts in Rust.

With the second approach, what we have is:

let ymap: &MapRef = map_ref.deref(&mut txn)
    .unwrap(); // check if ymap is alive
let a = ymap.get("a") // bonus: we can now include txn inside of MapRef itself
    .unwrap(); // check if value exists
let b = ymap.get("b").unwrap();

While I consider 2nd option more tempting and wouldn't mind implementing it there one problem: it's basically a complete overhaul of the API, which is burden for all existing users and maintainers of host language bindings.

@SebastianStehle
Copy link
Author

For me it is really difficult when we mix the implementation details with the interface all the time. First I am not a rust developer and cannot really give valuable input and second because I think we should not drive the interface by the implementation if possible.

The whole point of the indirection is to prevent non-catchable process faults. If we just panic, we gain absolutely nothing. I think the current API needs a better approach for error handling anyway, so sooner or later you have to break things (or introduce a v2 or second project).

@Horusiath
Copy link
Collaborator

Horusiath commented Nov 14, 2023

The whole point of the indirection is to prevent non-catchable process faults.

What I meant here was:

// this code works as it used to - if ymap is destroyed, it will panic
// I'm calling it because I'm sure that it's alive
let a = ymap.get(&mut txn, "a");

// if I'm not sure that my collection is alive, I do a check first
if txn.is_alive(&ymap) {
  let a = ymap.get(&mut txn, "a");
}

This way it works without breaking existing code. We can also expose txn.is_alive to C FFI - this way we can do that check ie. in C# as part of member access and throw ie. ObjectDisposedException if check fails, instead of panicking.

@SebastianStehle
Copy link
Author

Yes, that would work. If you ever thing about a ffi V2, I think error codes might be a good thing. There are other cases like transactions where an error (instead of null pointer) would be great.

@SebastianStehle
Copy link
Author

Here are a few methods, where I am not sure about this new method:

  • yarray_len (we can just a transaction anyway, so no problem).
  • ybranch_write_transaction
  • ybranch_read_transaction
  • ymap_observe
  • ymap_unobserve
  • ytext_observe
  • yobserve_deep
  • yunobserve_deep
  • ytype_kind
  • yinput_yXXX

There are a lot of methods where we are dealing with pointers that have no transaction. I have not included methods with a well defined scope like ytext_event_delta

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants