Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code generation improvements #1 #53

Closed
5 of 10 tasks
beurdouche opened this issue Aug 15, 2017 · 15 comments
Closed
5 of 10 tasks

Code generation improvements #1 #53

beurdouche opened this issue Aug 15, 2017 · 15 comments

Comments

@beurdouche
Copy link
Member

beurdouche commented Aug 15, 2017

During the successive code-review phases for the NSS production code generation we did find some good examples of potential improvements for the code generation. Some of those may be a lot of work on the F* side or the Kremlin side, but potentially some might be easy fixes.

Here is an attempt to list places where code generation could get better for production at Mozilla and Microsoft in the future.
All examples are taken in order from the Curve25519 code in the production-nss branch.

  • Generation of const keywords based on modified clauses in post-conditions

  • Substitution of intermediate variables (dev_prettify)

  {
    FStar_UInt128_t uu____429 = input[0];
    uint64_t uu____428 = FStar_Int_Cast_Full_uint128_to_uint64(uu____429);
    output[0] = uu____428;
  }
  • Important number of underscores in intermediate variables
  {
    FStar_UInt128_t uu____871 = output[0];
    uint64_t uu____874 = input[0];
    FStar_UInt128_t
    uu____870 = FStar_UInt128_add_mod(uu____871, FStar_UInt128_mul_wide(uu____874, s));
    output[0] = uu____870;
  }
  • Breaking line rules (won't fix, this can be solved by an external tool)
    uint64_t
    r0 =
      FStar_Int_Cast_Full_uint128_to_uint64(tctr)
      & ((uint64_t )1 << (uint32_t )51) - (uint64_t )1;

This could probably be something like this if too long:

uint64_t r0 = FStar_Int_Cast_Full_uint128_to_uint64(tctr) &
         ((uint64_t )1 << (uint32_t )51) - (uint64_t )1
  • Scoping of the if and for statements
    if (ctr > (uint32_t )0)
      Hacl_Bignum_Fmul_shift_reduce(input);

This is obviously correct, it might make people more comfortable to add { } though

  • Breaking line rules 2 (aligning the arguments too)
  FStar_UInt128_t
  mask =
    FStar_UInt128_sub(FStar_UInt128_shift_left(FStar_Int_Cast_Full_uint64_to_uint128((uint64_t )1),
        (uint32_t )51),
      FStar_Int_Cast_Full_uint64_to_uint128((uint64_t )1));

Might be transformed, if too long, to:

  FStar_UInt128_t mask = FStar_UInt128_sub(
      FStar_UInt128_shift_left(FStar_Int_Cast_Full_uint64_to_uint128((uint64_t )1), (uint32_t )51),
      FStar_Int_Cast_Full_uint64_to_uint128((uint64_t )1));
  • Removing the trailing whitespaces from the if branch
  • Negating the branching condition when the if block is empty
  if (ctr == (uint32_t )0)
  {
    
  }
  else
  {
    Hacl_EC_Point_swap_conditional_step(a, b, swap1, ctr);
    uint32_t i = ctr - (uint32_t )1;
    Hacl_EC_Point_swap_conditional_(a, b, swap1, i);
  }
  • Erasing unused arguments
  • This particular return of a 0 casted by a void * was non-obviously legal
void *Curve25519_op_String_Access(FStar_Monotonic_HyperStack_mem h, uint8_t *b)
{
  return (void *)(uint8_t )0;
}
@s-zanella
Copy link
Contributor

I wonder if it wouldn't be better to let external tools handle these transformations instead of having kremlin enforce any particular code style.

Some syntactic changes can be done with a code formatting tool after extraction; e.g. clang-format even comes with a predefined Mozilla format that claims to comply with Mozilla’s style guide.
Some semantic transformations may be doable using e.g. Coccinelle.

@franziskuskiefer
Copy link
Member

We run clang-format on the output to get the style we want for NSS. I don't see anything in Benjamin's list that could be solved with clang-format or similar tools.
Any tool that touches the actual code (clang-format only touches white space) would undo the guarantees we want from verified code.

@s-zanella
Copy link
Contributor

s-zanella commented Aug 16, 2017

I think 3 bullets in Benjamin's list could be solved with clang-format, namely

  • Breaking line rules
  • Breaking line rules 2 (aligning the arguments too)
  • Removing the trailing whitespaces from the if branch

Some other items in the list are purely syntactic but do more than inserting and deleting whitespace.
Some could be done in tools like Coccinelle, e.g. Remove unused variable.
Implementing some of these directly in Kremlin would unarguably improve the quality of extracted code (e.g. picking better variable names).

Only the two first items in the list interfere with semantics and are harder to prove sound:

  • Generation of const keywords based on modified clauses in post-conditions
  • Substitution of intermediate variables

I agree that using any tool that transforms the extracted code voids the guarantees of verification. However, the soundness of Kremlin is currently backed by pen-and-paper proofs that would need to be extended if additional transformations are implemented, so which tool does these transformations wouldn't be that different initially.

@msprotz
Copy link
Contributor

msprotz commented Aug 16, 2017

This particular return of a 0 casted by a void * was non-obviously legal

why?

Erasing unused arguments

example? I thought I removed a lot of them... do you have unused arguments at types other than unit?

Might be transformed, if too long, to:

this seems to break the 80-column rule I have by default...

Generation of const keywords based on modified clauses in post-conditions

Substitution of intermediate variables

Important number of underscores in intermediate variables

These three require substantial, fundamental changes from F* all the way down to C. While I agree that we want them, it's unlikely to happen anytime soon.

@martinthomson
Copy link

I see a number of places with

    i = i + (uint32_t )1;

I'm fairly sure that ++i would be fine.

Regarding substitution, I see several places where you have a temporary that is created, assigned and then assigned to another variable, or passed to a function. As long as there isn't an implicit conversion being avoided by using the explicit declaration of type (i.e., source, destination and intermediate variable are the same type), a post-processing step could be used to inline the variable. The uu_______ variables in the example might not even need to appear then.

@msprotz
Copy link
Contributor

msprotz commented Aug 17, 2017

Yes. The intermediary variables stem from the fact that, at the F* level, operations such as b[i] + b[i] are effectful and operate on the heap. F* does not distinguish between read effect and write effect, and treats both equally. Since neither C or OCaml guarantee evaluation order of arguments, the operands are hoisted to intermediary variable declarations (the uu___ variables), to ensure sequencing of the effectful operations.

KreMLin can, naturally, try to eliminate some of these temporaries via peepholes optimizations, but the problem stems from the fact that F* should have a Read effect and a Write effect, and that sequencing should only operate on the latter...

The ++i seems neater and easy to implement.

@franziskuskiefer
Copy link
Member

franziskuskiefer commented Aug 18, 2017

  • Don't generate no-op code like

(void)(buf + (uint32_t)5);

@beurdouche
Copy link
Member Author

beurdouche commented Sep 2, 2017

  • Try being more agressive with dead code elimination after unrolling

The following is the Coverity warning from the NSS CI:

Hacl_Bignum_Fmul_mul_shift_reduce_()
226             uint32_t ctr = (uint32_t)5 - (uint32_t)4 - (uint32_t)1;
227             uint32_t i1 = ctr;
228             uint32_t j = (uint32_t)4 - i1;
229             uint64_t input2i = input21[j];
230             Hacl_Bignum_Fproduct_sum_scalar_multiplication_(output, input, input2i);
231             if (ctr > (uint32_t)0)
>>>     CID 1417203:  Control flow issues  (DEADCODE)
>>>     Execution cannot reach this statement: "Hacl_Bignum_Fmul_shift_redu...".
232                 Hacl_Bignum_Fmul_shift_reduce(input);
233         }
234     }
235
236     inline static void
237     Hacl_Bignum_Fmul_fmul_(uint64_t *output, uint64_t *input, uint64_t *input21)```

@s-zanella
Copy link
Contributor

[ ] Try being more agressive with dead code elimination after unrolling

I'm not sure this is something that KreMLin is well suited to do. Rather, dead code should be eliminated from the F* source.

@karthikbhargavan
Copy link
Contributor

Folks, can you take a look at https://github.com/mitls/hacl-star/tree/dev_prettify/snapshots/hacl-c
I believe all the intermediate variables are gone, and so is the no-op code.
This required some small changes to kremlin and to the hacl-star source code.
Do take a look and recommend other changes.

Generating "const" will be harder, and requires more thought.

Passing F* comments through to C seems doable and is next on my plate for this branch.

@martinthomson
Copy link

It is certainly better. I still see intermediate values being copied from arguments in most functions. This isn't fatal (no more underscores), but it is a little unnecessary.

I see this in curve25519:

  if (i == (uint32_t )0)
  {
    
  }
  else
  {
    uint32_t i1 = i... 

@msprotz
Copy link
Contributor

msprotz commented Sep 8, 2017

Pushed a fix for that very issue on Karthik's c_prettify branch

@msprotz
Copy link
Contributor

msprotz commented Sep 8, 2017

Pushed a new option (-fcurly-braces) to always wrap blocks with curly braces. No more naked one-liner then-branches, if you turn the option on (off by default).

@msprotz
Copy link
Contributor

msprotz commented Sep 22, 2017

Karthik fixed (void)(buf + n). I added a mechanism to forward comments through.

@beurdouche beurdouche changed the title Code generation potential improvements for C extraction Code generation improvements #1 Nov 3, 2017
@beurdouche
Copy link
Member Author

Most of the improvements have been handled one way or the other... Thanks.
I will close this issue open individual issues if needbe.
The most important improvement left is regarding the const keyword

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants