Skip to content

Latest commit

 

History

History
561 lines (398 loc) · 20.3 KB

3d.rst

File metadata and controls

561 lines (398 loc) · 20.3 KB

3d: Dependent Data Descriptions for Verified Validation

Note

The documentation from this page covers the EverParse/3d releases v2021.03.05 or later. Documentations for earlier versions may be found in the Git history.

EverParse/3d lets you write a data format description and automatically produces efficient data validation code formally verified for memory-safety (absence of buffer overruns), functional correctness (validation is correct wrt. your data format) and security (absence of double memory fetches.) A gentle overview of formal verification is available in the documentation of EverCrypt; EverParse provides similar formal guarantees except secret-independence.

Summary

  1. Write your data format description in the 3d language, as MyFile1.3d, MyFile2.3d.
  2. Give MyFile1.3d and MyFile2.3d to EverParse:
    • with the Windows binary package: everparse.cmd MyFile1.3d MyFile2.3d
    • with the Linux binary package: everparse.sh MyFile1.3d MyFile2.3d
    • with the source tree: bin/3d.exe --batch MyFile1.3d MyFile2.3d

Then, EverParse/3d produces:

  • MyFile1.fsti, MyFile1.fst: the verified F* specification and Low* implementation of parsers and validators for your data format from MyFile1.3d, and similarly for MyFile2.3d
  • MyFile1.c, MyFile1.h: the verified C code generated from the verified F* and Low* files, and similarly for MyFile2.3d
  • MyFile1Wrapper.c, MyFile1Wrapper.h: an entrypoint API to call into the verified C code, and similarly for MyFile2.3d
  • EverParse.h, EverParseEndianness.h: various EverParse helpers for the implementation of verified validators, common to MyFile1.3d and MyFile2.3d
  1. Insert #include "MyFile1Wrapper.h" into any of your C compilation units that need to call a validator from MyFile1.3d, and similarly for MyFile2.3d.

Formal guarantees

For each message type in the source data formats, the generated C code provides a _validator, a procedure to determine if a given input array of bytes contains a valid representation of a message, with the following properties:

  • Memory safety: The validator is memory safe, meaning it never accesses memory out of bounds, accesses memory at an incompatible type, uses memory after it has been freed, etc.
  • Arithmetic safety: The validator has no integer overflows or underflows.
  • Double-fetch freedom: The validator reads every byte in the input array at most once, defending against certain kinds of time-of-check/time-of-use bugs.
  • Write footprint: The validator does not modify the input array, and only modifies memory as described by user-controlled imperative actions that are explicit in the specification.
  • No heap allocation: The validator does not allocate any memory on the heap.
  • Functionally correct: The validator accepts all input arrays that contain valid representations of the message according to its specification, rejecting all others.

The 3d Dependent Data Description language

3d-lang

Download and install

We provide binary (amd64 only) packages for Windows and Linux.

  1. Download the binary package corresponding to your platform, from the latest EverParse/3d release.
  2. Extract it (Windows: unzip; Linux: tar xzf) anywhere you like. No need to configure it anyhow.

Then, everything is contained in a everparse directory, whose actual entrypoint executable is everparse.cmd (for Windows) or everparse.sh (for Linux.)

Binary packages already contain all dependencies required by EverParse, such as F*, KReMLin and z3, so no need to install anything beforehand.

The Windows binary package contains a binary snapshot of clang-format to pretty-print the produced .c and .h files. On Linux, you may want to install clang-format (more info at the LLVM download page), but this is optional.

The Windows binary package is fully native, in the sense that it does not require any POSIX emulation layer such as Cygwin or WSL.

Run

On Windows: everparse.cmd [options] input_files

On Linux: everparse.sh [options] input_files

Default mode

Options are treated from left to right:

  • --odir output_directory_name

    Tell EverParse to write all files to the directory output_directory_name. This directory must already exist (EverParse will not create it.) The default is . (current directory)

  • --batch

    Verify the produced F* .fst and .fsti files and generate C .c and .h files. This is toggled by default.

  • --no_batch

    Do not verify the produced F* .fst and .fsti files or generate C .c and .h files.

  • --clang_format

    Call clang-format on extracted .c/.h files. This option automatically toggles --batch.

    This is toggled by default on Windows, since the Windows binary package contains a copy of clang-format. On Linux, this is toggled by default only if clang-format is reachable through the PATH environment variable.

  • --no_clang_format

    Do not call clang-format on extracted .c/.h files.

  • --clang_format_executable

    Provide the path to clang-format if not reachable through PATH (or to override the one provided in the Windows binary package.) This option automatically toggles --clang_format and --batch.

  • --cleanup

    With --batch, remove all produced intermediate files other than .c and .h` files (thus:.fst,.fsti,.checked,.krml``)

  • --no_cleanup

    Keep all produced intermediate files. This is toggled by default.

Options are treated from left to right, with higher priority on right-most options. Thus, any --no_clang_format will override any occurrence of --clang_format or --clang_format_executable to its left, and vice-versa.

You can provide one or more input_files to EverParse. They must all bear the .3d file name extension, and their names must start with a capital letter. Then, EverParse.h and EverParseEndianness.h will be shared across .c and .h files produced for all the .3d files that you provided.

Alternate mode: generating a Makefile

By default, with --batch, given a .3d specification, EverParse produces .c and .h files by first producing .fsti and .fst files files containing verified F* specifications and implementations of validators, and then calling F* and KReMLin on each such file to verify and extract them to C. EverParse will issue all such calls to F* and KReMLin sequentially.

Alternatively, with everparse.cmd --makefile gmake Test.3d (on Windows; use ./everparse.sh on Linux), instead of producing .c and .h files, EverParse will take Test.3d and all of the modules it depends on, and produce a GNU Makefile containing all recipes and rules to produce .c and .h files and all intermediate files with proper dependencies so that the user can leverage GNU Make parallelism for any rules that do not depend on one another. Other build systems are supported with variations of the --makefile option, as stated below.

The produced Makefile will make use of three variables:

  • EVERPARSE_CMD: the command to run EverParse. This variable must contain the path and name of the EverParse executable, and any options (--clang_format, etc.) useful to produce .
  • EVERPARSE_INPUT_DIR: the path to the input files.

Note

Currently, all .3d files must be located in the same input directory.

  • EVERPARSE_OUTPUT_DIR: the path to the output files. EverParse will append the --odir option appropriately.
  • CC: the path and name of the C compiler that will be used to compile the produced C files.
  • CFLAGS: options to provide to the C compiler to compile the produced C files.

The produced Makefile, by default called EverParse.Makefile, is meant to be included in the main Makefile of your project. On Linux, such a typical main Makefile for use with GNU Make may look like:

Sample.EverParse.Makefile

A fully commented version of that main Makefile is available in the EverParse repository.

EverParse supports the following command-line options relevant to this mode:

  • --makefile gmake: produce a Makefile for use with GNU Make.
  • --makefile nmake: produce a Makefile for use with Microsoft NMAKE.
  • --makefile_name my_file_name: the name of the produced Makefile shall be my_file_name instead of the default EverParse.Makefile.
  • --skip_o_rules: do not output rules that would produce .o files by compiling .c files.

Alternate mode: hash checking

To speed up some scenarios where the user maintains a snapshot of the generated .c and .h files to skip unnecessary verification time when .3d files are not changed, EverParse provides an optional mechanism to record the hash of .3d, .c and .h files and to compare them to skip reverification and regeneration. This mechanism is disabled by default, and has to be explicitly enabled using the --check_hashes option as described below. WARNING: this mode will weaken verification guarantees to the probability of hash collisions!

If enabled, then EverParse checks the hashes of the .3d and maybe .c and .h files (depending on the variant, as described below) and fails if hashes do not match; no verification or regeneration is performed at all.

Hashes are computed and checked using the EverCrypt verified cryptographic library, which is included in EverParse binary packages.

Options relevant to this mode:

  • --save_hashes: save the hashes of the contents of source .3d, .c and .h files into a .json file. This option can be added in default mode when generating .c and .h files.
  • --check_hashes strong : checks the hashes of the contents of source .3d, .c and .h files, by reading the recorded hash from the .json file created by --save_hashes.
  • --check_hashes weak : checks the hashes of the contents of source .3d files only, by reading the recorded hash from the .json file created by --save_hashes, leaving the option for the user to manually change the generated .c and .h files (WARNING: this will void all verification guarantees!)
  • --check_hashes inplace: same as --check_hashes weak, but uses the hash recorded in the copyright header of the generated .c and .h files instead of the .json file. The hash is introduced by using the EVERPARSEHASHES keyword in the corresponding .copyright.txt used to prepend generated .c and .h files with copyright headers; see Copyright headers for more details.
  • --check_inplace_hash MyFile.3d=MyFile.h: same as --check_hashes inplace, but instead manually specifies the pair of the source .3d file and the already-generated .h (or .c) file in which to check the hash. This option can be specified several times to provide several such pairs to check.

Alternate mode: help and version

Some options make EverParse do nothing other than show a help or version message:

  • --version

    Print the version of EverParse/3d and exit (do not process any further options to the right, nor any files at all.)

  • --help

    Show the usage message and the status of some options given so far, and exit (do not process any further options to the right, nor any files at all.)

Optional: From the sources

You may want to use the sources instead of the binary package in any of the following cases:

  • You are on a non-amd64 platform (however, there is no guarantee that the sources will compile under any such non-amd64 platform.)
  • You are not on Windows or Linux (however, there is no guarantee that the sources will compile under any such non-Windows-Linux system.)
  • You want to use your own copy of F*, KReMLin or z3.
  • You want to contribute to EverParse.

To this end, you must first install dependencies, then compile the sources.

Install dependencies

EverParse, F* and KReMLin are all components of the Everest project. Thus, the best way to install dependencies is to use the Everest script, as described below following the website of the Everest project, which EverParse is a member of.

Windows

Building EverParse on Windows requires Cygwin. Instead, you could also use the Windows Subsystem for Linux (WSL) and act as if you were on Linux (see below), but this has not been officially tested by anyone in the Everest project; you will obtain Linux binaries instead.

  1. Install Cygwin.
  2. Clone the Everest repository from GitHub.
  3. From inside the origin directory of the clone, run ./everest check and follow the instructions on how to install compilation dependencies (OCaml, etc.) ./everest check will automatically install all such compilation dependencies, but you may need to re-run it several times after refreshing the environment.

Linux

  1. Install opam, a package manager for the OCaml ecosystem.

    Note

    You need to install opam 2.x, so beware of older distributions such as Ubuntu 18.04 packaging opam 1.x by default. The opam web page gives detailed installation instructions to install opam 2.x.

  2. Run opam init and follow the instructions. This will install OCaml.

    This step will modify your configuration scripts to add the path to OCaml and its libraries to the PATH environment variable every time you login, so you may need to evaluate those configuration scripts again or log out and back in for such changes to take effect.

  3. Clone the Everest repository from GitHub.
  4. From inside the origin directory of the clone, run ./everest opam and follow the instructions. This step will install the relevant opam packages on which EverParse depends.

Compile the sources

There are two possible ways.

  • If your platform is not supported (or if binary releases do not work for some reason) and if you just want to rebuild EverParse, then you can build a binary package.
  • If you want to contribute to the EverParse sources, then you need to build in the source tree.

Build a binary package

If you just want to rebuild EverParse without contributing to its sources, you can build a binary package:

  1. Clone the EverParse repository .
  2. In the root directory of that EverParse clone, run one of the following commands:
    • make everparse

      That command will build ready-to-use binaries into the everparse/ subdirectory. That directory will behave as if you had downloaded and extracted a binary package archive from the releases page.

      This process is fully automatic. In particular, it automatically downloads a binary release of z3, and downloads and builds F*, KReMLin and EverCrypt.

    • make package

      That command will build a binary package, everparse-<version>-<platform>.zip on Windows, .tar.gz on Linux, by producing the everparse/ subdirectory as with make everparse and compressing it into an archive.

    • make package-noversion

      That command will build a binary package, as with make package, but only the name of the archive will change: everparse.zip on Windows, everparse.tar.gz on Linux.

In all cases, the produced package offers everparse.bat on Windows, everparse.sh on Linux, which you can use as directed elsewhere in this manual.

Build in the source tree

If you want to contribute to the sources of EverParse, you need to rebuild in the source tree. To do so, you first need to setup a development environment for Everest (steps 1 to 6 below). Then you can fetch and build EverParse sources:

  1. In the root directory of the Everest clone, run ./everest z3 and follow the instructions to install z3 on your system.

    This step will modify your configuration scripts to add the path to z3 to the PATH environment variable every time you login, so you may need to evaluate those configuration scripts again or log out and back in for such changes to take effect.

  2. Run ./everest pull to fetch and pull the latest versions of F*, KReMLin and EverParse.
  3. Run ./everest -j 1 FStar make kremlin make hacl-star to build F* and KReMLin. The -j option introduces a parallelism factor. You can also speed up the build by skipping F*, KReMLin and EverCrypt library proofs by setting the OTHERFLAGS environment variable to "--admit_smt_queries true".

    Note

    If, at this point, you immediately get an error of the form "menhir not found", then it means that the path to opam packages is not properly set up in your environment. To do so, you need to run eval $(opam env) (as instructed during opam init or ./everest opam), or log out and back in.

  4. Run make -C hacl-star/dist/gcc-compatible install-ocaml to build and install EverCrypt (needed for hash checking with --check_hashes, etc.)

    Note

    In fact, this step builds and installs the hacl-star-raw and hacl-star OCaml packages. If, at this point, you get an error of the form "hacl-star-raw already installed" or "hacl-star already installed", it means that either package is already present because of an earlier compilation, so you can remove it, using ocamlfind remove hacl-star-raw and/or ocamlfind remove hacl-star, and try again.

  5. Set the FSTAR_HOME environment variable to the FStar subdirectory of your Everest clone, which contains a clone of the latest F*.
  6. Set the KREMLIN_HOME environment variable to the kremlin subdirectory of your Everest clone, which contains a clone of the latest KReMLin.

    Note

    If you already have your own copy of F* or KReMLin, and if you already know how to build them, then you can skip steps 1 to 5 and set the environment variables accordingly.)

  7. Everest contains a clone of the EverParse sources in the quackyducky subdirectory. You can work from there. Alternatively, you can clone it yourself anywhere else.
  8. Set the QD_HOME environment variable to your EverParse clone as you chose it.
  9. Then, once you are all set up in your EverParse clone, you can build EverParse by make. Then, the EverParse/3d executable will be located at bin/3d.exe.

Note

You cannot use everparse.sh or everparse.cmd from the source tree. You need to use bin/3d.exe instead.

bin/3d.exe has the same options as everparse.sh or everparse.cmd, with the following caveat: no options are toggled by default (not even --batch or --clang_format) and you need to provide all options by hand.

Licenses

EverParse is Copyright 2018, 2019, 2020, 2021 Microsoft Corporation. All rights reserved. EverParse is free software, licensed by Microsoft Corporation under the Apache License 2.0.

Additionally, the licenses of all dependencies included in the EverParse binary packages (F*, KReMLin, z3, EverCrypt, clang-format) are all stored in the licenses/ subdirectory of the EverParse binary package once extracted.