Partial interpolation #61

Closed
wants to merge 42 commits

Conversation


@psivesely commented Apr 1, 2018

Status

Having shown an earlier version of this branch to @romac, I'm now putting this up for public review. It is still a work in progress, but the main components are in place.

Description

This PR implements most of a complete interface for incrementally computing a secret using barycentric Lagrange interpolation. In cases where not all shares are available at once, it is still possible to start interpolating the secret polynomial, such that when threshold shares are finally present, the final computation time (i.e., the time between receiving the threshold-th share and the secret being recovered) is much smaller. This will be especially noticeable when the secret is very long, or the threshold is very high. (E.g., in Sunder, shares are entered one at a time and this could be useful.)

It is still missing its highest-level, public-facing interface that would accept &[String]s (shares in string form), parse them, and return Recovery objects. Recovery objects store and update the state of the partially recovered secret.

Under the hood, Recovery objects store:

  1. Barycentric weights and diffs computed to evaluate the polynomial.
  2. The ids processed so far.
  3. The threshold.
  4. The secret length (slen).
  5. Optionally, the root_hash of the Merkle tree (i.e., the public key the shares were signed with).
  6. Optionally, the secret.

1-3 are needed to correctly compute the secret and to know when it's ready. Since there would be a lot of redundant data among the BarycentricWeights structs if we were to store the ids and threshold in each of them, we depend on Recovery to store and update the ids, and to provide new points to update each BarycentricWeights. When threshold shares have been processed, the secret, which is initialized to None, is automatically computed (becoming Some(Vec<u8>)), and can then be retrieved with get_secret.

2-5 are needed for validation of the shares and verification of the signatures. If verify_signatures is true, Recovery::new() will set root_hash to the root hash of the initial share(s) used to create the Recovery object, and this root_hash value is automatically used during subsequent Recovery::update calls to ensure consistency among the signatures. Likewise, during updates share validation "picks up where it left off" by passing the already-processed ids, along with the new shares to be validated, to the new function validate_additional_signed_shares, in order to prevent duplicate shares from being processed.
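
For illustration, here is roughly the shape I have in mind; this is only a sketch, and the field names and types are approximate rather than the exact definitions in the branch:

```rust
// Sketch only -- approximate shape, not the actual definitions in this branch.

/// Incremental interpolation state for one byte of the secret: the
/// barycentric weights and the diffs used to update them.
struct BarycentricWeights {
    weights: Vec<u8>, // stand-ins for Gf256 values
    diffs: Vec<u8>,
}

struct Recovery {
    /// One partial interpolation per byte of the secret (1).
    partials: Vec<BarycentricWeights>,
    /// Share ids (x-coordinates) processed so far (2).
    ids: Vec<u8>,
    /// (3)
    threshold: u8,
    /// Secret length in bytes (4).
    slen: usize,
    /// Merkle root the shares were signed with, if verify_signatures is set (5).
    root_hash: Option<Vec<u8>>,
    /// Computed automatically once threshold shares have been processed (6).
    secret: Option<Vec<u8>>,
}
```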

Thoughts/ ideas:

I'm not sure if it is necessary to create {to,from}_string methods for Recovery in order to make a functional interface with regard to the node library. If it is not necessary, I assume this means the node interface can act on a Recovery object in memory, without it having to be in a printable or JS-intelligible/interoperable form. In this case, one could simply wait until threshold shares have been interpolated, and then call get_secret() on the Recovery object, which will return a Result<Vec<u8>> (just as Recovery::recover_secret does, which, though changed under the hood, provides the same interface that SSS::recover_secret used to).

Whether or not such methods are necessary, they may be useful. The idea is that you could partially interpolate a secret from fewer than threshold shares, storing a result that is a fraction of the size of the combined shares (and maybe more importantly/ conveniently, a single piece of data), and then later interpolate more shares to get the final result. To do this we'd need some way to serialize a Recovery for long-term storage (string, or binary for that matter).

As a use case, imagine you want to recover a secret using Sunder, and you expect to get shares in person over the course of some time. A useful feature would be the ability to save a serialized Recovery object to disk, so that you don't have to save each share to disk and then enter them all at once when there are enough. Good UI would keep you updated on how many shares you've interpolated so far and how many more are needed to fully recover the secret, using the shares_interpolated() and shares_needed() methods. Especially when recovering multiple secrets at a time in an incremental effort, a good UI built around this functionality could save you a lot of organizational effort and help prevent mistakes.
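
To make the intended flow concrete, here is the sort of thing I picture a caller doing; the method names follow the description above, but the exact signatures are my guesses rather than the actual API in this branch:

```rust
// Hypothetical usage sketch -- signatures are approximations, not the real API.
fn recover_in_batches(batches: &[Vec<Share>]) -> Result<Vec<u8>> {
    let (first, rest) = batches.split_first().expect("at least one batch");
    let mut recovery = Recovery::new(first, /* verify_signatures */ true)?;

    for batch in rest {
        if recovery.shares_needed() == 0 {
            break; // threshold reached; the secret has already been computed
        }
        recovery.update(batch)?;
        println!(
            "{} shares interpolated, {} more needed",
            recovery.shares_interpolated(),
            recovery.shares_needed()
        );
    }

    recovery.get_secret() // Ok(Vec<u8>) once threshold shares have been processed
}
```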

TODO

  • Improve documentation and code comments.
  • Add a lot more tests.
  • Add a few more benchmarks.
  • Create highest-level (user) interface that takes &[String].
    • Protobuf/ {to,from}_string for Recovery?
  • Catch and modify any InconsistentSignature errors to correct the ids argument when S::verify_signature is called in validate::validate_additional_signed_shares before re-raising (or whatever the Rustaceans call this) with match and whatever else from the error_chain crate (see this commit message).

psivesely and others added 17 commits March 19, 2018 21:22
Implements barycentric Lagrange interpolation. Uses algorithm (3.1) from the
paper "Polynomial Interpolation: Langrange vs Newton" by Wilhelm Werner to find
the barycentric weights, and then evaluates at `Gf256::zero()` using the second
or "true" form of the barycentric interpolation formula.

I also earlier implemented a variant of this algorithm, Algorithm 2, from "A new
efficient algorithm for polynomial interpolation," which uses fewer total
operations than Werner's version. However, because it uses a lot more
multiplications or divisions (depending on how you choose to write it), it runs
slower, given the relative cost of subtraction/ addition (equal) versus
multiplication, and especially division, in the Gf256 module.

The new algorithm takes n^2 / 2 divisions and n^2 subtractions to calculate the
barycentric weights, and another n divisions, n multiplications, and 2n
additions to evaluate the polynomial*. The old algorithm runs in n^2 - n
divisions, n^2 multiplications, and n^2 subtractions. Without knowing the exact
running time of each of these operations we can't say for sure, but I think a
good guess would be that the new algorithm trends toward about 1/3 of the
running time of the old one as n -> infinity. It's also easy to see
theoretically that for small n the original Lagrange algorithm is faster. This
is backed up by benchmarks, which showed that for n >= 5 the new algorithm is
faster. This is more or less what we should expect given the running times of
these algorithms in n.

To ensure we always run the faster algorithm, I've kept both versions and only
use the new one when 5 or more points are given.

Previously the tests in the lagrange module were allowed to pass nodes to the
interpolation algorithms with x = 0. Genuine shares will not be evaluated at x =
0, since then they would just be the secret, so:

1. Now nodes in tests start at x = 1 like `scheme::secret_share` deals them out.
2. I have added assert statements to reinforce this fact and guard against
   division by 0 panics.

This meant getting rid of the `evaluate_at_works` test, but
`interpolate_evaluate_at_0_eq_evaluate_at` provides a similar test.

Further work will include the use of barycentric weights in the `interpolate`
function.

A couple more interesting things to note about barycentric weights:

* Barycentric weights can be partially computed if less than threshold
  shares are present. When additional shares come in, computation can resume
  with no penalty to the total runtime.
* They can be determined totally independently from the y values of our points,
  and the x value we want to evaluate for. We only need to know the x values of
  our interpolation points.
While this is a slight regression in performance in the case
where k < 5, in absolute terms it is small enough to be negligible.
Horner's method is an algorithm for evaluating polynomials, which consists of
transforming the monomial form into a computationally efficient nested form. It
is pretty easy to understand:
https://en.wikipedia.org/wiki/Horner%27s_method#Description_of_the_algorithm
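
As a sketch of the idea (my illustration, not the code from this commit; `F` stands in for the library's `Gf256` element type):

```rust
use std::ops::{Add, Mul};

/// Evaluate a_0 + a_1*x + ... + a_n*x^n as ((a_n*x + a_{n-1})*x + ...)*x + a_0,
/// i.e. one multiplication and one addition per coefficient.
fn horner<F>(coeffs: &[F], x: F) -> F
where
    F: Copy + Add<Output = F> + Mul<Output = F>,
{
    let mut rev = coeffs.iter().rev().copied();
    let highest = rev.next().expect("at least one coefficient");
    rev.fold(highest, |acc, c| acc * x + c)
}
```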

This implementation has resulted in a noticeable secret share generation speedup
as the RustySecrets benchmarks show, especially when calculating larger
polynomials:

Before:
test sss::generate_1kb_10_25 ... bench: 3,104,391 ns/iter (+/- 113,824)
test sss::generate_1kb_3_5 ... bench: 951,807 ns/iter (+/- 41,067)

After:
test sss::generate_1kb_10_25        ... bench:   2,071,655 ns/iter (+/- 46,445)
test sss::generate_1kb_3_5          ... bench:     869,875 ns/iter (+/- 40,246)
RustySecrets makes minimal use of the rand library. It only initializes
the `ChaChaRng` with a seed and `OsRng` in the standard way, and then calls
their `fill_bytes` methods, which are provided by the same trait and whose
signature has not changed. I have confirmed by looking at the code changes
that there have been no changes to the relevant interfaces this library uses.
Since id is a `u8` it will never be greater than 255.
It's possible that two different points have the same data.

To give a concrete example consider the secret polynomial `x^2 + x + s`, where
`s` is the secret byte. Plugging in 214 and 215 (both elements of the cyclic
subgroup of order 2) for `x` will give the same result, `1 + s`.
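
Independent of the subgroup framing, here is a quick way to see the collision (my addition, not in the original commit message): 215 = 214 + 1, and in a field of characteristic 2 the cross term of a square vanishes, so for f(x) = x^2 + x + s:

```latex
f(x + 1) = (x + 1)^2 + (x + 1) + s
         = x^2 + 1 + x + 1 + s
         = x^2 + x + s
         = f(x)
```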

More broadly, for any polynomial `b*x^t + b*x^(t-1) + ... + x + s`, where `t` is
the order of at least one subgroup of GF(256), for all subgroups of order `t`,
all elements of that subgroup, when chosen for `x`, will produce the same
result.

There are certainly other types of polynomials that have "share collisions."
This type was just easy to find because it exploits the nature of finite fields.
Ensures that threshold > 2 during the parsing process, since we ensure the same
during the splitting process.
Since the validation already confirms `shares` is not empty, `k_sets` will never
match 0.
The arguments were provided in the wrong order.
* Pass a ref to `Vec<Shares>` instead of recreating and moving the object
  through several functions.
* Return `slen`/ `data_len`, since we'll be using it anyway in `recover_secrets`
I think that using hashmaps and hash sets was overkill and made the code much
longer and more complicated than it needed to be.

The new code also produces more useful error messages that will hopefully help
users identify which share(s) are causing the inconsistency.
The best place to catch share problems is immediately during parsing from
`&str`. However, because `validate_shares` takes any type that implements the
`IsShare` trait, and there's nothing about that trait that guarantees that the
share id, threshold, and secret length will be valid, I thought it best to leave
those three tests in `validate_shares` as a defensive coding practice.
This should be useful when validating very large sets of shares. Wouldn't want
to print out up to 254 shares.
* Update rustfmt compliance

Looks like rustfmt has made some improvements recently, so wanted to bring the
code up to date.

* Add rustfmt to nightly item in Travis matrix

* Use Travis Cargo cache

* Allow fast_finish in Travis

Items that match the `allow_failures` predicate (right now, just Rust nightly),
will still finish, but Travis won't wait for them to report a result if the
other builds have already finished.

* Run kcov in a separate matrix build in Travis

* Rework allowed_failures logic

We don't want rustfmt to match `allow_failures` just because it needs to use
nightly, while we do want nightly to match `allow_failures`. Env vars provide a
solution.

* Add --all switch to rustfmt Travis

* Test building docs in Travis

* Use exact Ubuntu dependencies listed for kcov

Some of the dependencies we were installing were not listed on
https://github.com/SimonKagstrom/kcov/blob/master/INSTALL.md, and we were
missing one dependency that was listed there. When `sudo: true` Travis uses
Ubuntu Trusty.

* No need to build before running kcov

kcov builds its own test executables.

* Generate `Cargo.lock` w/ `cargo update` before running kcov

As noted in aeb3906 it is not necessary to
build the project before running kcov, but kcov does require a `Cargo.lock`
file, which can be generated with `cargo update`.
This refactor makes the code a lot clearer, and separates barycentric
interpolation into parts that can be reused, such as in the partial
interpolation functionality I intend to implement.
In this PR:

* Introduces `PartialSecret` struct and associated methods for interpolating and
  evaluating polynomials incrementally (or all-at-once for that matter).
  * Implements strict input validation for all public functions. With private
    ones we can reason about their inputs.
  * Uses this struct behind-the-scenes with `interpolate_at`.

Problems to be addressed later:

* There should be a higher level interface in sss.
* Error handling right now is mostly for example. Probably we should create some
  new `ErrorKinds`. I just used the most analogous ones as placeholders.
  Validation is comprehensive, I believe, which is good, but it should be DRYed
  out.
  * Numeric overflow is possible when we cast some `len()` to `u8` in order to
    satisfy the function signatures of certain `ErrorKinds`. This is a general
    bug that I will make a separate PR for.

Future work:

* It is possible to pre-compute all barycentric weights for a given secret
  after receiving the first share(s) if `shares_count` is equal to `threshold`,
  but `Share`s don't include a `shares_count` field (presumably because this is
  unnecessary information for reconstruction, and in the case of a share being
  compromised would provide the bad actor with more information).
* Use barycentric Lagrange interpolation to find coefficients (incrementally and
  all at once).
Changes the `differences` field name to `diffs`, adds/ improves some
documentation, makes sure `update` fails if we've already evaluated sufficient
points to compute the secret, and adds the `shares_needed` convenience method.
This way we can reuse the computational work we've done if for some reason we
want to evaluate the same set of interpolated points at a value other than
`Gf256::zero()`.

As noted, a slight sacrifice to efficiency was made when implementing this
function, in order to reduce the `PartialSecret` size, and increase
precomputation in the standard case of evaluating at `Gf256::zero()`.
Makes `secret` and `threshold` fields public for easy access. Refines
`shares_needed` and adds `shares_evaluated` convenience functions. Refines
example error handling*.

* Note these are still just temporary values to illustrate what type of
validation we will be doing.
These should be considered exemplary at this point, but I wanted to start to
flesh out a higher-level way to interact with the `PartialSecret` struct.
Besides more conceptual changes in terms of how to make this interface more
user-friendly, I think the validation needs to be DRYed out, and the error
handling refined.

In particular, all the functions that follow `begin_partial_secret_recovery` are
basically analogues to methods, and it feels like it would be nicer to call them
as such instead of as functions.

Mostly, I'm unsure of how the repository maintainers would like such an
interface to work, so I only took my best jab at fleshing this out.
* Create new `NoMoreSharesNeeded` `ErrorKind` to be used when a `PartialSecret`
  already holds a complete secret.
* Replaced large `if else` validation blocks with re-usable methods and
  functions.
* Created `update_diffs` function to DRY out code shared between `new` and
  `update` methods.
* Introduces the `IncrementalRecovery` struct, which creates and updates many
  `PartialSecret`s from `Share`s, essentially providing a higher-level
  interface.
* New validation functions were created to handle the case where not all shares
  arrive at once. Only the necessary metadata for validating further shares (the
  threshold, the secret length, the IDs that have been verified so far, and
  optionally the root hash that signed the shares validated so far) is stored by
  the `IncrementalRecovery` struct.
* Most of the changes were recommended by clippy. A few very small changes were
  my own initiative.
* The most common change was using references instead of moving values when the
  value is not consumed by the function body.
* Using `&Vec<_>` instead of `&[_]` requires one more reference and cannot be
  used with non-Vec-based slices.
* Added clippy linter directives to the `Add` and `Subtract` implementations for
  `gf256`. The functions are fine, but use weird binary operators because
  XOR, add, and subtract are all the same in GF(256). I had to add these to stop
  the linter from erroring out.
Since `interpolate_at` had been reduced to a 2-line convenience function used in
a single place, I thought it better to simply move those two lines to that
single place.
@psivesely

Rebased on master, so diff is much nicer to look at now.

Temporarily, error messages from `verify_signatures` will suffer, but this will
be rectified soon using `error_chain`.
Work still to be done: decide how to refactor the validation module once more so
that we don't have to separately check that we have threshold shares (see TODOs
in diff).
I think this refactor finally streamlines the `validation` module (at least for
what it needs to do now). Gets rid of those TODOs and creates two "types" of
validation function for incremental and all-at-once secret recovery.
`threshold` and `ids` are no longer stored in `PartialSecret`, reducing the size
of a `Recovery` object by up to 33% (as the number of shares interpolated
grows).

Optimizing for speed (since we will call methods on `PartialSecrets` much more
often than functions in `validate`), the `ids` field of `Recovery` is now of
type `Vec<Gf256>`.

`PartialSecret` no longer holds all its own state and relies on that state being
managed from the outside. While this is not ideal for using `PartialSecret`
independently of `Recovery`, I feel comfortable tying the two structs together. I have
added additional assertions to `PartialSecret` to catch errors.
Having crippled `PartialSecret` sufficiently from its original form, I realized
it was best to carry this process out as far as possible, and make this even
more explicit. Storing a `secret: Option<u8>` didn't make sense when it no
longer stored `ids` and `thresholds` and depended on outside forces providing
that information so that it could know when to compute the secret. The
`compute_secret` method, now `evaluate_at_zero`, its own function in the
`lagrange` module, also seemed out of place.

Now `threshold` doesn't need to be passed to `BarycentricWeights`, and the
type is more accurately described by its name. An outside force is still
responsible for managing the `ids`, and when `threshold` shares have been
interpolated, it will need to call `evaluate_at_zero` with the
`BarycentricWeights` and `ids` (`Recovery` now does this).

The last change to `Recovery` was to make it hold its own `secret` (it
automatically computes this secret as soon as threshold shares have been
interpolated via calls to `new` and `update`). Thus `get_secret` will be faster
because we're just unwrapping a `Some(Vec<u8>)` and rewrapping it in `Ok`,
instead of needing to construct that `Vec<u8>` by iterating over `Option<u8>`s
we need to unwrap and `collect`.
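
To summarize the division of labor, the update path I have in mind looks roughly like this; every name and signature here is my approximation, not the code in the branch:

```rust
// Rough sketch of Recovery::update -- names and signatures are approximate.
impl Recovery {
    fn update(&mut self, shares: &[Share]) -> Result<()> {
        // Validation "picks up where it left off": the ids already processed
        // (and the stored root hash, if any) are passed along, so duplicates
        // and inconsistent signatures are rejected.
        validate_additional_signed_shares(shares, &self.ids, self.root_hash.as_ref())?;

        for share in shares {
            let x = Gf256::from_byte(share.id);
            self.ids.push(x);
            // Each byte of the secret is its own polynomial; feed the new
            // point into the corresponding BarycentricWeights.
            for (weights, &y) in self.partials.iter_mut().zip(&share.data) {
                weights.update(x, Gf256::from_byte(y), &self.ids);
            }
        }

        // Once threshold shares have been interpolated, Recovery computes and
        // stores the secret itself.
        if self.secret.is_none() && self.ids.len() >= self.threshold as usize {
            self.secret = Some(
                self.partials
                    .iter()
                    .map(|w| evaluate_at_zero(w, &self.ids).to_byte())
                    .collect(),
            );
        }
        Ok(())
    }
}
```
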
@psivesely

Not planning on making more changes except documentation/ comments/ tests/ benchmarks, and maybe a very small diff regarding error handling in one function. I would say it's stable for the purposes of reviewing now and getting feedback. Sorry for the churn, I didn't think I would refactor this PR another 5 times after opening it.

@psivesely

Closing because I don't think this PR means much with respect to the intended use cases of this library. The speedup won't matter for the low-degree polynomials this library expects its users to be working with, and anyway FFT should be used for high-degree polynomials.

@psivesely closed this Jul 18, 2019