Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maybe a design issue of collections::Vector, also in UnorderedSet and UnorderedMap #703

Closed
riversyang opened this issue Jan 20, 2022 · 4 comments

Comments

@riversyang
Copy link

riversyang commented Jan 20, 2022

In recent development practices of NEAR smart contract, I found a case that will cause an unfixable wrong state of collections::Vector.

Actually, I found this issue in using UnorderedSet in a contract. Here is the case:

I used an UnorderedSet in a contract to store some unique account ids. In a contract function, I used insert function of UnorderedSet to insert an account id to the set. When I call this function through near-cli, I didn't provide extra gas for it and the function failed because it contained many business processing steps which will cost more gas than the default pre-paid gas (30T). Then I found the problem. When I view the UnorderedSet by function to_vec, I got nothing, and the len of it remains 0. But if I insert the same account id which I used in the failed function call, the set still remains empty. And if use contains function to check the account id, it returns true!

So I looked into the implementation of UnorderedSet, I found that the problem is in Vector.

The Vector is defined as:

pub struct Vector<T> {
    len: u64,
    prefix: Vec<u8>,
    #[borsh_skip]
    el: PhantomData<T>,
}

And the function push_raw which actually performs the storage writing is implemented as:

    /// Appends a serialized element to the back of the collection.
    pub fn push_raw(&mut self, raw_element: &[u8]) {
        let lookup_key = self.index_to_lookup_key(self.len);
        self.len += 1;
        env::storage_write(&lookup_key, raw_element);
    }

So, in the case I mentioned above, the element that I inserted in the failed function call had already been written to storage, but as the function call failed, the len of Vector was not updated to the contract state. And the function contains of UnorderedSet is to check the storage key directly, so it returns true. But the function to_vec will use len of internal Vector to get elements from storage, so it will return empty, as the len is still 0.

Obviously, the UnorderedMap has the same issue because it is also using Vector to store its keys and values.

In current design, if the case happens, we can not fix it easily, as there is no way to manually change the len of Vector (of course, I agreed it should like this). But the problem is, the case is not very rare in contracts which including complicated processing logic, it happens in all cases that we change the state of Vector and failed to update its len to storage (in the cases that a contract function call failed by asserts or gas limit after the len of Vector is changed).

I think this is a design issue in sdk level, and it should be noticed by the developers and community. As the wrong state of Vector, also in UnorderedSet and UnorderedMap, is very hard to understand and fix, especially in those contracts which are designed to implement complicated business logic.

Please advice your thoughts.

@austinabell
Copy link
Contributor

Hi!

The first thing I'm curious about is how this could be triggered because the state changes of failed method executions are rolled back. There are ways to trigger this inconsistency, but it involves having multiple copies of the same collection with the same prefix being used within the contract, which we cannot restrict in any way from the SDK. This can also happen if you are loading the collection outside of state and do not store the structure back (which would update this length metadata).

A note is that with the new implementations of the collections, this pattern of misuse is harder to encounter with the updated API. The updates I'm talking about are the structures under near_sdk::store that are enabled with the unstable feature.

Maybe if you provide a repro of this issue we can diagnose where the issue is? From looking at a glance, seems similar to #560

@riversyang
Copy link
Author

riversyang commented Jan 21, 2022

Thanks for your reply. @austinabell

I'm sorry I didn't create the issue right after I found it, I'm very busy those days.

I recalled the case and I think it is just like you said, that the inconsistency is caused by a copy of the collection struct, like I reproduced in the simple contract.

And I also recalled that I had faced the problem like #560 too, a few months ago, when I performed a storage migration for octopus-appchain-registry.

I did not read the new implementation for now, I don't know wether the case shown in the simple contract can be restricted on SDK level. If it is not, I think the developers should be warned for the case, as it's easy to happen but hard to fix, especially in those complicated contracts. Because from my understanding, we should not put all necessary data in the contract struct, it will cost much more gas to deserialize/serialize the struct before/after each function call of the contract. We may use LazyOption to wrap the actual data struct to reduce the gas consumption in each contract function call, like my simple sample. This pattern increases the possibility of the inconsistency while using Vector, UnorderedSet and UnorderedMap in complicated contracts.

@austinabell
Copy link
Contributor

Thanks for your reply. @austinabell

I'm sorry I didn't create the issue right after I found it, I'm very busy those days.

I recalled the case and I think it is just like you said, that the inconsistency is caused by a copy of the collection struct, like I reproduced in the simple contract.

And I also recalled that I had faced the problem like #560 too, a few months ago, when I performed a storage migration for octopus-appchain-registry.

I did not read the new implementation for now, I don't know wether the case shown in the simple contract can be restricted on SDK level. If it is not, I think the developers should be warned for the case, as it's easy to happen but hard to fix, especially in those complicated contracts. Because from my understanding, we should not put all necessary data in the contract struct, it will cost much more gas to deserialize/serialize the struct before/after each function call of the contract. We may use LazyOption to wrap the actual data struct to reduce the gas consumption in each contract function call, like my simple sample. This pattern increases the possibility of the inconsistency while using Vector, UnorderedSet and UnorderedMap in complicated contracts.

So first, LazyOption should really only be used when the amount of data being deserialized is large, and in this simple case, it is not. I assume you are just doing this to indicate the inconsistency, but I figured I'd mention this.

As for this pattern, yes it is unfortunately the case with the previous implementations that the value pulled from LazyOption does not know how it should be stored back to state so it must be done intentionally with the set or equivalents from the other data structures. I'm sorry to say that this is intended for this API and seems like documenting this fact can definitely be improved for these versions.

I took the liberty of switching the collections in your example to show what the new API looks like and to see if you think there is a way to expose this pattern in some way here (austinabell/near-contract-sample1@b67640e). With the new API, it is much harder to hit this inconsistency because it would have to be done very intentionally to deserialize multiple of the same collection.

@riversyang
Copy link
Author

Yes, my simple sample repo is just to show the use case I mentioned. It's no need to use LazyOption in this contract obviously.

And for the example you made, I think the new API is clear and clean enough to me. Well done! Looking forward to use SDK 4.0.0 in further development.

I think the issue can be closed. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants