Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update rust bindings to use runtime field count #363

Closed

Conversation

jtraglia
Copy link
Member

This PR updates the Rust bindings to work with runtime field counts.

@asn-d6
Copy link
Contributor

asn-d6 commented Sep 18, 2023

Would be nice to get a review by @pawanjay176 et al. here

Copy link
Contributor

@pawanjay176 pawanjay176 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, minor nits

bindings/rust/benches/kzg_benches.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
bindings/rust/src/bindings/mod.rs Outdated Show resolved Hide resolved
jtraglia and others added 2 commits September 18, 2023 16:17
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
@jtraglia
Copy link
Member Author

Thanks @pawanjay176. All should be fixed now.

Copy link
Contributor

@pawanjay176 pawanjay176 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks so much for this.
Might need to run cargo fmt --all once before merging

@jtraglia
Copy link
Member Author

Might need to run cargo fmt --all once before merging

Just ran, no changes. So all good I think.

Copy link
Contributor

@asn-d6 asn-d6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

@jtraglia
Copy link
Member Author

Hmm. I'm not sure of the best way to fix this serde issue. How can we pass KzgSettings to that function?

@DaniPopes any ideas?

@Rjected
Copy link
Contributor

Rjected commented Sep 20, 2023

Hmm. I'm not sure of the best way to fix this serde issue. How can we pass KzgSettings to that function?

@DaniPopes any ideas?

I don't think it's possible to do this, serde functions are just supposed to do the serialization / deserialization, not validation. Since the size is no longer known based solely on the Blob type at compile time, it might be better to have a separate function that does not check the length, taking in only the bytes, and provide a validation method that checks the length w.r.t the KzgSettings

@jtraglia
Copy link
Member Author

@Rjected good idea! I think that's a decent (and simple) approach. Since the interface functions check blob lengths, it should be safe to use. The last commit implements this change. @asn-d6, @pawanjay176: thoughts?

@@ -70,6 +42,10 @@ pub struct KZGProof {
bytes: [u8; BYTES_PER_PROOF],
}

pub struct Blob {
bytes: Vec<u8>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a concern about these Blob types defined here and also in all the other bindings PRs.

The problem is that we need to manually length-check them everytime they are used. This manual length check was not done in the past because a Blob was always a blob, whereas now with the variable length, it's not necessary that a byte vector is a blob.

I feel like this use of the Blob type when you don't actually know that it's a blob (until you validate it with a length check) is a dangerous pattern. Maybe in these PRs we have performed all the length checks required, but I think it sets us in a bad state for future changes because a dev could reasonably think that an input Blob is a blob and forget to do the length check.

This also is an anti-pattern in our code base where we have established the (IMO good) Bytes32/Bytes48 convention for pre-validation objects. Because now this new Blob type is pre-validated and it's still called Blob.

I think The Right Thing would have been to introduce a new Bytes type that can be turned into a Blob if it's valid (i.e. its size is right). Same way that a Bytes32 can be turned into a KZGCommitment if it's valid.

(moved this from TG here so that it doesn't get lost)

@jtraglia
Copy link
Member Author

Based on @asn-d6's comment, I've made some more changes to this PR. Notably, I've replaced Blob with Vec<u8>. Before, it would be somewhat misleading if the blob is an invalid size. I think this is somewhat cleaner and is closer to the core C implementation, but if this causes unintended side effects we can implement it differently. @pawanjay176, I would appreciate another review by you.

@pawanjay176
Copy link
Contributor

@asn-d6 Agree that doing the length check repeatedly isn't the best.
However, I do not like passing around Vecs in the kzg functions like suggested by @jtraglia . It takes us even further away in terms of type safety.
I added some changes on top of your rust bindings refactor here https://github.com/ethereum/c-kzg-4844/compare/runtime-fields...pawanjay176:c-kzg-4844:rust-runtime-field-count?expand=1

Basically my changes make it such that we export 2 separate modules from the rust bindings: kzg_mainnet and kzg_minimal. Each module contains a Kzg struct along with all interface functions as methods on that struct. It also defines FIELD_ELEMENTS_PER_BLOB, BYTES_PER_BLOB and contains a Blob and KzgSettings struct in the module namespace.
Blob and KzgSettings are instantiations of BlobGeneric and KzgSettingsGeneric which are generic over values of FIELD_ELEMENTS_PER_BLOB const.

By doing this, for each preset module:

  1. Blob is a fixed length array like before
  2. KzgSettings is only constructed if the trusted setup params correspond to the preset
  3. We don't need to do any length checks like now because we enforce that Blob and KzgSettings passed to the methods are consistent with each other.

I haven't fully cleaned up my commits (still needs some work in terms of error handling), but just trying to get feedback on the idea.

@jtraglia
Copy link
Member Author

jtraglia commented Oct 3, 2023

Hey @pawanjay176 thanks. I like this idea. Wasn't a fan of this in Lighthouse, but it makes sense here.

I had two thoughts while reviewing those changes:

  • We should check that FIELD_ELEMENTS_PER_BLOB matches s->field_elements_per_blob in the two load_trusted_setup functions. Just to ensure the client loaded the appropriate trusted setup. When blob length checks are removed, this wouldn't be caught anywhere.
  • Just a nit, but I'd like to rename kzg_mainnet and kzg_minimal to mainnet and minimal. I think c_kzg::kzg_mainnet is too repetitive.

You are welcome to push your changes to this PR if you can. If not, we'll give you access.

@Rjected
Copy link
Contributor

Rjected commented Oct 3, 2023

I have some concerns with both the current PR and making Blob etc generic:

  • The suggestion by @asn-d6 about not having length checks everywhere is a good one - there should be two types, a validated and unvalidated blob type. Validation in this case means referencing the KzgSettings and ensuring the blob has the correct length
  • Using Vec<u8> everywhere is not the right way to implement this change, because Vec<u8> is not a validated type.
  • The correct way to implement this change would be to implement a KzgSettings::validate(unvalidated_blob: UnvalidatedBlob) -> Blob.
  • serde types should be implemented on the unvalidated types, or can be entirely omitted if there is no new "unvalidated type", i.e. all conversions are Vec<u8> -> Blob.

Using a generic causes a very poor UX/DX, for example in reth this would require us to change our blob type to the following, and handle each variant separately everywhere:

pub enum Blob {
    Mainnet(c_kzg::mainnet::Kzg::Blob),
    Minimal(c_kzg::minimal::Kzg::Blob),
}

Without using an enum, the only other way to consume the Blob<T> type would be to bubble the T generic up to all higher level types, which is worse IMO.
Generic KzgSettings is also more complicated to consume, compared to the current KzgSettings.

To properly implement the unvalidated / validated type pattern, the validated Blob type can be the following:

pub struct Blob(bytes::Bytes);

The unvalidated type can be anything, either a new type that wraps Bytes / Vec<u8> / &[u8], or there could be no "unvalidated" type. The validation method just needs to accept some bytes type.

As long as all constructors for Blob are methods on KzgSettings which validate the length, for example the above validate method, this type both avoids generics and is safer than Vec<u8>. Methods that take Blob as input can then assume that the length has been validated and do not need to perform an additional length check.

I also do not think the following type of API misuse is a large concern:

  • using a minimal-spec KzgSettings to construct a Blob, and then passing it to a mainnet KzgSettings. The user has to explicitly do this, and documentation can easily describe this failure mode.

@pawanjay176
Copy link
Contributor

@Rjected Thanks a lot for the feedback.
Just to be clear, I vastly preferred the version of this PR at 8168014 over what I suggested in my previous comment.

I was a bit concerned about the API misuse that you described:

using a minimal-spec KzgSettings to construct a Blob, and then passing it to a mainnet KzgSettings

However, agree with your point now. I overestimated the requirement of providing fewer checks at runtime at the expense of cleaner end user api.
I am in favour of the solution you described where we assume that the methods that take Blob as input have the length checks done during construction of the Blob. We just document the failure case explicitly like you said.

@realbigsean
Copy link

I've extended @pawanjay176's proposal to include types that would fit the bill for this use case

PR jtraglia#5

@jtraglia
Copy link
Member Author

jtraglia commented Oct 4, 2023

I don't have a strong opinion on the direction of the Rust bindings. But if we end up going this route, I think it would be best to create a new PR in ethereum/c-kzg-4844 and close this PR. A fresh start to reduce potential confusion.

@realbigsean
Copy link

realbigsean commented Oct 5, 2023

Ok so I talked with @pawanjay176 more about this. I think what we're looking for is more flexibility in the interface between client codebase and rust bindings. And more strictness/safety between rust bindings and C-code. So we thought about the following pattern (edging closer to #363 (comment) just obfuscating some of it from the end user, and making the public interface maximally flexible):

  • all public methods exist on KzgSettings and accept &[u8] for blobs
  • all methods interacting with C code accept ValidatedBlob<'a>(&'a [u8]) and are private
  • ValidatedBlob is private
  • Only KzgSettings can generate the verified type fn validate_blob<'a>(bytes: &'a [u8]) -> ValidatedBlob<'a> (again private)
  • KzgSettings is generic over BYTES_PER_BLOB

Basically we wouldn't support creating an intermediate ValidatedBlob by the end user. But would require it for any C code interaction.

Reasons for this from LH perspective:

  • We don't need access to an intermediate ValidatedBlob type, from our POV this can be a c-kzg detail
  • We already have blob types with compile time length guaruntees, these are the types we use with SSZ serialization/deserialization. So long as we have a KzgSettings type per-spec that has all kzg methods, we can associate with this and don't need generic blobs in the library. (all our types are already generic over mainnet vs minimal spec)
  • if we're able to pass in &[u8] and ValidatedBlob<'a> simply wraps this &[u8] we can avoid having to copy/clone from our lighthouse types into c-kzg types (in single-blob verification, don't think it's avoidable in batch verifiction because the C code expects essentially &[[u8]]). Our types are already fixed sized and Arc'd so we'd prefer not to pass ownership and not copy/clone

Questions for Reth (@Rjected ):

  • Is having multiple KzgSettings types useful for you? if not we can move this into lighthouse as a wrapper
  • do you have a usecase for an intermediate ValidatedBlobs type? If yes, we can also just make this struct and all related methods public, but the lifetime in ValidatedBlob<'a> might be a PITA

A more fleshed out example:


#[derive(Debug, Clone, PartialEq)]

struct ValidatedBlob<'a> {
    bytes: &'a [u8],
}

#[derive(Debug)]
struct KzgSettings<const BYTES_PER_BLOB: usize> {
    kzg_settings: KZGSettings,
}



pub type MainnetKzgSettings = KzgSettings<BYTES_PER_BLOB_MAINNET>;
pub type MinimalKzgSettings = KzgSettings<BYTES_PER_BLOB_MINIMAL>;



impl<const BYTES_PER_BLOB: usize> KzgSettings<BYTES_PER_BLOB> {


    pub fn verify_kzg_proof(&self, bytes: &[u8], …) -> Result<bool, Error> { 
         
        let blob = Self::blob_to_bytes(bytes)?;
         
        self.verify_kzg_proof_inner(bytes)
    
    }
    

    fn verify_kzg_proof_inner<‘a>(&self, blob: VerifiedBlob<‘a >) -> Result<bool, Error> {

     
    <actually interface with C code>        
    }
    

    fn validate_blob<'a>(bytes: &'a[u8]) -> Result<ValidatedBlob<'a>, Error> {
        if bytes.len() != BYTES_PER_BLOB {
            return Err(Error::MismatchLength(format!(
                "Blob length invalid {}",
                bytes.len()
            )));
        }

        Ok(ValidatedBlob {
            bytes,
        })
    }

}


@pawanjay176 is going to work on an implementation tomorrow

@jtraglia jtraglia closed this Oct 9, 2023
@jtraglia jtraglia deleted the rust-runtime-field-count branch August 1, 2024 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants