Permalink
Commits on Jun 27, 2018
  1. Fix mem types (#847)

    DIJamner authored and gitoleg committed Jun 27, 2018
    * Fix store typing comparison
    
    * Fix bug in eval-time type inference
Commits on Jun 25, 2018
Commits on Jun 19, 2018
  1. Remove special handling of jumps/cpuexns during linearization. (#844)

    ethan42 authored and ivg committed Jun 19, 2018
    This is necessary for correctly handling control flow in BIL
    containing the above patterns. One example explaining why this
    is necessary is the following:
    
    ```
    if false {
      CPUExn 10
    }
    x := 5
    ```
    
    The above would be translated to:
    
    ```
        goto %takeoff
    landing:
        goto %next
    takeoff:
        when false cpuexn N return %landing
    next:
         x := 5
    ```
    
    Note that the `takeoff` basic block is missing the
    possible control flow transition to the `next` block.
    
    By removing these special cases jmps and cpuexn are handled as
    normal statements within an `if` block.
Commits on Jun 8, 2018
  1. adds traps for memory and division operations (#840)

    ivg committed Jun 8, 2018
    * adds traps for memory and division operations
    
    Division by zero as well as faulty memory operations may terminate the
    Primus Machine. Previously we were just terminating it with a specific
    exception. However, this exceptions should actually be represented by
    an CPU/ABI-specific interrupt or trap. The proposed implementation
    provides a mechanism that allows:
    
    - trap the exception and translate it to a machine specific
      interrupt;
    
    - ignore or fix it;
    
    - catch it on a low-level.
    
    The mechanism is based on the same idea as in the Linker that used the
    primus_unresolved_handler to trap unresolved names. For each trap we
    provide a corresponding observation, that could be used to install the
    trap. The trap itself, is a special name, that could be linked (either
    on a permanent basis, or from the observation). If the trap hanlder is
    assigned, then it is invoked. Concrete behavior depends on a
    particular trap, e.g.,
    
    - for linker trap - the hanlder is invoked instead of the missing code;
    - for the division by zero - if the handler returns
      then the result is undefined;
    - for memory fault - the trap should fix the problem or terminate.
    
    We also introduce the Pagefault exception, to represent memory
    traps. We keep segfault as a non-maskable (non-preventable)
    exception.
    
    In addition, we have provided several new operations in the Linker
    interface:
    - unlink: for code unlinking, useful for removing traps
    - lookup: useful for restoring other's traps
    
    As a showcase, we also reimplemented some parts of the promiscuous
    mode. Now we use the pagefault trap to prevent segmentation
    faults. Also the fixing is more efficient as instead of mapping one
    byte pages, we are mapping pages with size of up to 4k.
    
    Besides others, this commit will also provide a fix for #839.
    
    * fixes a typo in the pagefault handler function name
    
    handler
    handler
    handler
    ![handler](http://www.addletters.com/pictures/bart-simpson-generator/bart-simpson-generator.php?line=hanlder)
Commits on Jun 4, 2018
  1. Several primus fixes (#841)

    ivg committed Jun 4, 2018
    * fixes the order of the trace events
    
    previously we were leaving a function after the first block (and all
    other operations attached to the function leaving event were performed
    after the entry block)
    
    * fixes the kill operation in the Machine implementation
    
    The kill operation is a hard beast, as it is very hard to suicide,
    i.e., to kill the current machine. It is also hard to routinely kill
    other machines in the scheduler, as usually during the rescheduling we
    are in the context of a machine that we would like to kill, and
    switching to another machine will take control from us.
    
    Previous implementation of kill was just wrong, it was killing the
    state without switching the continuations. In case of the self-killing
    it basically was switching the state while leaving the machine in the
    same continuation.
    
    The proposed implementation, delays the machine killing if it is the
    current machine, and performs it on a next fork.
    
    Note, an easier to use (and to implement) primitive, would be [die x]
    that will kill the current machine and tranfer the control to the
    machine [x]. However, we can't add new primitives to the Primus
    Machine interface without bumping the version. So let's keep it for
    the future Primus 2.0
    
    * few documentation updates (totally irrelevant to Primus)
    
    sorry for putting it here, just don't want to create a separate PR for
    such a minor stuff
    
    * adds a couple of new functions to the posix interface
    
    * prevents memory leaking by killing finished machines
    
    Previously, we were waiting until the end and didn't kill any
    machines.
    
    The proposed code will kill finished machines in the greedy
    scheduler. This saves tons of gigabytes of memory.
    
    * fixes a bug in the mark-visited plugin, adds progress reports
    
    The progress report will show the percentage of visited basic blocks,
    basically the code coverage.
    
    Also, previously marker was broken as it was marking all terms.
    
    * removes the memory corruption check
    
    memory corruption check was verifying that a freed pointer was
    allocated. In real life it has an incredible amount of false
    positives.
    
    * fixes a bug in the primus's warn-unused analysis
    
    also extends it a bit.
    
    * adds machine numbers to the incident backtraces
    
    in general it is very useful to see where the backtrace changes the
    machine id, in case of the greedy scheduler and promiscuous mode those
    are exactly the points where infeasibility could be introduced.
    
    * changes the order in which machines a run by the run plugin
    
    When all-subroutines are selected, then we're running them in the
    topological order, starting with those functions that are marked as
    entry points (usually the _start function in case of an executable).
    
    Also added an option to run each machine in isolation (i.e., in a
    separate computation), might be useful for someone.
Commits on May 31, 2018
  1. added powerpc tests (#838)

    gitoleg authored and ivg committed May 31, 2018
Commits on May 23, 2018
  1. updated testsuite (#837)

    gitoleg committed May 23, 2018
    pcmpistr is not skipped anymore in veri tests
  2. Fix packed x86 (#836)

    gitoleg authored and ivg committed May 23, 2018
    * fixed pcmpeq, pcmpgt, pmin, pmax ..
    
    * added tests to oasis
    
    * fixed PCMP*STR* instructions
    
    so bap can produce bil for them again
    
    * refactored pcmpstr
    
    * edited tests for pcmp
    
    * added pmaxsb, pmaxsd insns
    
    * updated tests
Commits on May 9, 2018
  1. fixes bug in reconstructor (#834)

    gitoleg authored and ivg committed May 9, 2018
    This PR fixes a BUG in reconstructor, where we actually ignored
    lots of calls
Commits on May 7, 2018
  1. Adding support for CDQ, CDQE, CQO, CWD, CWDE, CBW opcodes (#833)

    ethan42 authored and gitoleg committed May 7, 2018
    * Adding a first implementation for CDQ.
    
    * Run ocp-indent.
Commits on May 3, 2018
  1. Fix function names in docstrings (#832)

    DIJamner authored and ivg committed May 3, 2018
    Fixes the function names in two documentation strings that incorrectly refer to another function.
Commits on Apr 30, 2018
  1. a few fixes to primus et al (#831)

    ivg committed Apr 30, 2018
    * renames eval_cond to eval-cond
    
    All our observations, plugings, options, passes attributes, and other
    textual identifiers use lisp-style dash separated notation, so
    `eval_cond` looks very inconsistent.
    
    * makes parameter evaluation visible
    
    So far we were bypassing the interpreter when Lisp parameters were
    evaluated. However, we would like to see them as first-class
    computations.
    
    In particular it helps us to track which values are used undefined.
    
    * adds a generic taint-attached observation
    
    this observation occurs every time a taint is attached to a value.
    
    * fixes the compare macro in Primus Lisp
    
    not that the macro was wrong, but it was not doing what was
    expected. The old version had type `'a -> 'a -> 'a`, i.e., when
    it is applied to a byte it will return a byte, and so on. This made
    using this macro a little bit hard, as we need to apply casting as
    usually we need just int, so our new type is `'a -> 'a -> int`.
    
    * propagates branch computations in the promiscuous mode
    
    This is actually a major breakthrough, as this issue was preventing
    taint analysis to work in promiscuous mode as it was killing all
    taints on the branch conditions.
    
    The long story - in the promiscuous mode we're flip-floping branch
    conditions to reach as much code as possible. However, since we're
    just assigning `1` to the branch condition we loose all taints
    associated with the existing condition (or any other information that
    other analyses can add).
    
    The solution is to derive the new value from the old one. I'm using
    `1 \/ c` if we need to assign `1` and `0 /\ c` if we need to assign
    0.
    
    * changes the machine spawning algorithm in the run pass
    
    The previous implementation was spawing machines by unhierarchical
    depth-first order and was essentially creating 2^n resumptions.
    The new aproach uses BFS, is linear, and, more important, is easier to
    reason about.
    
    * fixes a typo in the sign macro
Commits on Apr 18, 2018
Commits on Apr 12, 2018
  1. Add an `eval_cond` observation to allow observing non-taken condition…

    ethan42 authored and ivg committed Apr 12, 2018
    …als (#823)
    
    * Adding an `eval_cond` observation to allow observing non-taken conditionals.
    
    The `jumping` observation allows for observing the taken conditionals
    as well as the jump destination. This extra observation allows
    observing all conditions that were evaluated up until the taken branch
    (including the false ones). Conditions in branches _after_ the taken
    branch will not be observed.
    
    * Reflect new observation in Primus Lisp.
  2. Rename Solution init value to default. Add Solution.derive (#824)

    DIJamner authored and ivg committed Apr 12, 2018
    * Rename Solution init value to default. Add Solution.derive
    
    * Reset counters on derivation of a new solution
  3. intoduces a dependency between callsites and abi passes (#822)

    ivg committed Apr 12, 2018
    Disclaimer
    ----------
    
    I was really reluctant since the very beginning to introduce this
    dependency, since we shouldn't depend on implementations but on
    abstractions, but so far it looks like the easy solution, that, I
    hope, won't last too long.
    
    The Problem
    -----------
    
    `bap ./exe --taint-reg=malloc_result` may or may not work, depending
    on the order of pass invocations. The root of the problem is that the
    taint plugin depends on the callsites plugin, that requires the
    argument information to be present in the subroutine
    definitions. Since, the `abi` pass is an autorun pass it is aplied
    before all other passes, including the `callsites` pass. However,
    since for the better user experience, (haha) we decided to mark
    the `taint` pass as an autorun pass, it is also run in the autorun
    phase along all of its dependencies. So, it is quite possible that it
    will be run before the abi pass.
    
    The Solution
    ------------
    
    I've added the dependency from the `abi` pass and altough I'm
    insisting that we shouldn't depend on implementations, but on
    abstractions, we don't have a better solution. Also `abi` pass
    became an abstraction de-facto in bap, as it is actually a pass
    that has its own pipeline of passes. We may devise a better solution
    later, as currently @gitoleg is working on a generic solution with
    services and service providers that will allow us to build right
    dependencies between our analyses.
Commits on Apr 10, 2018
  1. Add an interface for binnable interval trees (#821)

    DIJamner authored and ivg committed Apr 10, 2018
    * Add an interface for binnable interval trees
    
    * Fix comments to be clearer and simplify internal definition of .
    
    * Fix comment for clarity
  2. adds mangling of duplicated subroutines (#813)

    gitoleg authored and ivg committed Apr 10, 2018
    * first approach to mangling duplicated subroutines
    
    Implemented like an iteration over existed subroutines
    
    * refactoring
    
    * work in progress: all checks are on program type
    
    * made the program type private
    
    * changes after review
    
    * reimplemented all with Map
    
    * placed everything in Program module
    
    * refactored
    
    * added a check if nothing should be changed
Commits on Apr 9, 2018
  1. rectifies powerpc lifter (#809)

    gitoleg authored and ivg committed Apr 9, 2018
    * updated model
    
    * started to fix branch insns
    
    * replaced most of wrong << using
    
    * fixed doubleword using in rotate
    
    * typo
    
    * refactored a couple of instructions
  2. enhances and documents recipes (#815)

    ivg committed Apr 9, 2018
    In case if you happen to know already what recipes in bap are, then
    the main chainges are summarized below:
    
    - the recipe is no longer requred to be a zip file, plain files, or
      directories are now accepted
    
    - the recipes interface is now documented and a couple of issues were
      fixed during the process
    
    - filenames are no longer substituted, however there is now a builtin
      parameter `$prefix` that is substituted with the path to the recipe
      root folder.
    
    For the rest, I'm copypasting excerpts from the bap man page, that
    describe the recipe system in detail as well as few motivational
    examples
    
    ```
    RECIPE DESCRIPTION
           A recipe is either a single file or a directory (optionally zipped)
           that contains a parametrized specification of command line parameters
           and support files if necessary.
    
           The purpose of the recipe is to make bap runs reproducible, so that you
           can share one simple file - the recipe, instead of a bunch of scripts
           together with some verbal instructions. Since recipes introduce an
           extra layer of abstraction they actually simplify interaction with bap
           by hiding unnecessary details. Recipes also provide a mechanism for
           building ready solutions and sharing them with users.
    
           To use a recipe, just specify its name using the --recipe command line
           parameter. If a recipe has parameters then they could be specified as
           colon separated list of <key>=<value> pairs. See the --recipe parameter
           for more information. To read the recipe description, including the
           list of parameters and resulting command line, use the --show-recipe
           command line option. To list all installed recipes use the
           --list-recipes option.
    
           The main (and the only necessary) part of a recipe is the recipe
           specification, that is a file that contains a list of recipe items in
           an arbitrary order. Each item is either a command line option, a
           parameter, or a reference to another recipe. All items share the same
           syntax - they are flat s-expressions, i.e., a whitespace separated list
           of strings enclosed in parentheses. The first string in the lists
           denotes the type of the item. E.g., (option run-entry-points malloc
           calloc free).
    
           The option command requires one mandatory parameter, the option name,
           and an arbitrary number of arguments that will be passed to the
           corresponding command line option. If there are more than one argument
           then they will be concatenated with the comman symbol, e.g., (option
           opt a b c d) will be translated to --opt=a,b,c,d. Option arguments may
           contain substitution symbols. A subsitution symbol starts with the
           dollar sign, that is followed by a named (optionally delimited with
           curly braces, to disambiguate it from the rest of the argument). There
           is one built in parameter called prefix, that is substituted with the
           path to the recipe top folder. See the parameter command to learn how
           to introduce parameters.
    
           The parameter command introduces a parameter to the recipe, i.e., a
           variable ingredient that could be changed when the recipe is used. The
           parameter command has 3 arguments, all required. The first argument is
           the parameter name, the second is the default value, that is used if
           the a parameter wasn't set, and the last argument is the parameter
           description. The substitution symbol will be replaced with the default
           value of a parameter, if a value of the parameter wasn't passed through
           the command line. Example,
    
               (parameter depth 128 "maximum depth of analysis")
               (option analysis-depth depth)
    
           If the parameter is not set through the command line, then it will be
           substituted with 128 otherwise it will receive whatever value a user
           has passed. Finally, the extend command is like the include statement
           in the C preprocessor as it includes all the ingredients from another
           recipe. (Make sure that you're not introducing loops!). The command has
           one mandatory argument, the name of the recipe to include.
    
    RECIPE GRAMMAR
               recipe ::= {<recipe-item>}
               recipe-item ::= <option> | <parameter> | <extend>
               option ::= (option <atom> {<atom>})
               parameter ::= (parameter <atom> <atom> <atom>)
               extend ::= (extend <atom>)
    ```
    
    And here comes the description of the `--recipe` command line
    parameter:
    
    ```
           --recipe=VAL
               Load the specified recipe. The  parameter specifies the name of the
               recipe along with an optional colon separated list of arguments.
               The  parameter has the following syntax: <name> : <arg1>=<val1> :
               <arg2>=<val2> : .... The <name> part could be either a path to a
               valid recipe or the name of a recipe, that would be search in the
               recipe search paths. A name may omit .recipe or .scm extensions for
               brevity. The valid representations of a recipe is either the recipe
               file itself (i.e., a file consisting of a list of s-expressions), a
               directory with a valid undefined file, or zip file, that contains a
               valid recipe directory. See the RECIPES section for more
               information. If a recipe is a zip file then it will be unpacked to
               a temporary folder that will be removed after the program
               terminates. Otherwise, all recipe resources will be used as is and
               won't be restored to their previous state, e.g., if the
               input/output files were provided, and they were modified, the
               modifications will persist.
    
    ```
    
    This is how the `--list-recipes` option works:
    
    ```
    $ bap --list-recipe
    uaf                              Performs Use-After-Free Analysis
    fuzz                             Testing Recipe
    derive-inline                    Description was not provided
    direct                           Testing Recipe
    ```
    
    And this is how a description of the recipe is printed:
    
    ```
    $ bap --show-recipe uaf
    DESCRIPTION
    
    Performs Use-After-Free Analysis
    
    Uses the Primus promiscuous mode and fuzzes a binary through the specified
    $entry points with the maximum depth set to $depth RTL instructions. Outputs
    all found Use-After-Free incidents into the `incident` file.
    
    PARAMETERS
    
    - depth - maximum number of instructions per path (defaults to 8192)
    - width - maximum number of visits to the same block (defaults to 128)
    - entry - entry points (defaults to all-subroutines)
    
    COMMAND LINE
    
    --run \
    --run-entry-points=all-subroutines \
    --primus-greedy-scheduler \
    --primus-limit-max-length=8192 \
    --primus-limit-max-visited=128 \
    --primus-promiscuous-mode \
    --primus-lisp-load=posix,memcheck-malloc,limit-malloc \
    --primus-print-obs=incident,incident-location,call-return \
    --primus-print-rules=/tmp/recipe-943c0231-491a-43a5-86de-e6c2c662396f.unzipped/select.scm \
    --primus-print-output=incidents \
    --primus-lisp-channel-redirect=<stdin>:/tmp/recipe-943c0231-491a-43a5-86de-e6c2c662396f.unzipped/stdin,<stdout>:/tmp/recipe-943c0231-491a-43a5-86de-e6c2c662396f.unzipped/stdout \
    --report-progress \
    --log-dir=log
    ```
  3. fixes an incorrect store operation introduced by #806 (#816)

    ivg committed Apr 9, 2018
    * fixes an incorrect store operation introduced by #806
    
    It was possible to store a word of arbitrary width in our byte
    storage. It is fixed now, and I've also updated documentation and
    added few sanity checks.
    
    * fixes the endianess bug in the memory primitives
    
    even for one byte write the endianness matters, when the argument is
    not a byte.
  4. fixes a few bugs in Primus (#818)

    ivg committed Apr 9, 2018
    * skips evaluations of the mem variable in the store operation
    
    since this variable doesn't have any meaningfull value
    
    * fixes the cleanup procedure after term evaluation
    
    in case of abnormal termination the cleanup procedure wasn't called.
    
    * ignores return statements
    
    this is a big change actually, and we will return to it later, to make
    it more robust.
    
    If we won't ignore them, then we will have double returns, the first
    one when a function is returned based on the IR return statement (if
    such is present and is well defined) and the second one, when function
    exec finishes and we finally call the return destination. That's was a
    disaster, and this is the fast solution. I will evolve it later
    
    * adds machine-switch and machine-fork observations
    
    also simplifies swith and fork operators, for some reason we were
    holding a machine identifier in the stored continuation, probably
    this is a reminiscent of the debugging stage, in any case we dont
    need it anymore.
    
    * halts machine after switch ion the greedy scheduler
    
    it shouldn't have any observable effect, though it is better to make
    it explicit.
    
    * disables call bypassing
    
    When we make a call we add a failsafe machine that will resume the
    computation in case if the call doesn't return. We shouldn't use
    this failsafe machine if everything went allright.
    
    * handles empty blocks with branches correctly
    
    * few cosmetic changes
    
    * fixes the parameter order in linked lisp stubs
    
    One of the previous commits [1] in PR #798 was claiming that we do not
    need to preserve the order of the arguments, so we can provide a more
    efficient frame allocation.
    
    Apparently, we do need to preserve the order, at least of the proper
    signaling the `call` and `call-return` messages. This commit reversed
    the order of the arguments, since they were reversed.
    
    [1]: 6d5afcd
    
    * fixes strncpy, strncmp and adds strpbrk summaries
    
    * parametrizes memory allocator with the zero-sentinel
    
    that's a value that is returned when malloc is called with the zero
    argument.
    
    * implements a buffer overflow check
    
    This check is a part of the memcheck check suite and it verifies that
    functions from the string.h API are called correctly. The correctness
    property is the following, if the pointer to the begining of a string
    that is passed to the function, belongs to some heap region, then the
    pointer to the end of this string must belong to the same heap region
  5. hardens IDA plugin (#817)

    ivg committed Apr 9, 2018
    To run IDA we cd to a newly created temporary folder. If IDA fails,
    then we end up there. This not only breaks the rest of BAP, but
    produces extremely confusing error messages, such as `bap ./exe` will
    produce `Sys error: ./exe no such file or directory`.
Commits on Apr 4, 2018
  1. improves macho loader (#799)

    gitoleg authored and ivg committed Apr 4, 2018
    * improves macho loader
    
    Previously, only little endian binaries were expected
    in our macho loader, so this caused unexpected behaviour
    with big endian binaries. This PR fix it.
    
    Also, this PR fixes problem with entry point for old macho
    binaries: they just don't contain so called 'LC_MAIN'
    command, and entry point as part of binary structure is absent.
    But we can workaround this problem by emiting elntry point as
    an address of a first executable section.
    
    * updated resolving of entry-point for macho
    
    * just stylistic chacnges
    
    * refactoring
    
    * better definition of relocatable files
    
    * updated entry search with help of numeric_limits
    
    * adds swapping of relocation_info structure
    
    * wrote a whole swap for relocation_info
    
    * added some useful comments for future
    
    * fix typo
    
    * bug fixed in indirect symbol address
    
    * refactoring
    
    * some debug code
    
    * more efforts on refactoring
    
    * more efforts on refactoring
    
    * solved endianess problems
    
    * fix comments
    
    * changes after review
    
    * fixed endianess issues
    
    * refactoring + updated testsuite
Commits on Mar 30, 2018
Commits on Mar 29, 2018
  1. fixes calls search in reconstructor (#807)

    gitoleg authored and ivg committed Mar 29, 2018
    fixes #801
    
    This PR fixes a problem with a disassembling of pure code:
    as there are not any functions in there, bap just doesn't
    output anything.
    
    So, what we do here is not about only those files, but
    about reconstruction at all: we consider all blocks in
    reconstructor that don't have input edges as functions
    starts. And it make sence, e.g. in case of libraries:
    there is no guarantee that every function will be
    called by some other function from this library.
Commits on Mar 28, 2018
  1. rectifies project construction (#803)

    gitoleg committed Mar 28, 2018
    * adds concat and concat_merge to future
    
    also started to edit project
    
    * rectifies project creation
    
    This PR moves some preliminary parts of
    project creation to the module `Project` itself.
    It adds some simplicity to a project creation.
    Previously, we were creating project somehow
    like following:
    ```
    let input = Project.Input.file "exe";;
    let Ok proj = Project.create ?rooter:(Rooter.Factory.find "internal") input;;
    ```
    and now we can do it much more simplier:
    ```
    let input = Project.Input.file "/bin/ls" ;;
    let pr = Project.create input ;;
    ```
    
    * rephrases the docstirngs in the future library
    
    * fixes a typo in docstrings
  2. changes the Primus memory representation to preserve value identity (#…

    ivg committed Mar 28, 2018
    …806)
    
    Historically, our implementation of the memory storage was a mapping
    from bitvectors to bitvectors, thus any store operation was forgetful
    as the identity of a value was discarded (where a value in Primus is a
    tuple of data and unique identifier). For a long time we were
    considering this as a feature, articulating that `x` and `load
    a (store a x)` are two different values, even though, they are usually
    structuraly equal. However, it was also observed, that in many cases
    during the analysis we would like to preserve this identities, thus we
    were forced to maintain our own mappings between words and value
    identities in this or that way.
    
    The proposed solution should satisfy both camps, those who believe
    that `x` and `load a (store a x)` are in general different, and those
    who say, that in practice they are equal as in Leibniz equality. The
    memory component will now store values instead of bitvectors, so it
    won't loose any information. At the same time components may define
    semantics of the load/store operations as they wish, e.g., to
    implement a mapping of control registers to memory, and other
    non-trivial memory representations and mechanisms.
    
    By default, however, the interpreter will ensure that `load a (store a
    x)` and `x` has the same idenity if `x` is a byte. If `x` is a word of
    an arbitrary length then it will be projected into several bytes with
    the extract operation during the store operation, and assembled back
    with the concat operation durintg the load operation.
  3. removes duplicated sections (#804)

    gitoleg authored and ivg committed Mar 28, 2018
    Sections were printed twice in disassembly output
    when using bap with pure binary code. This PR
    removes this behavior:
    ```
    $ bap tmp_test -dasm --source-type x86-code --read-symbols-from start
    
    Disassembly of section bap.user
    
    0: <_start>
    0:
    0: 66 31 c0                                       xorw %ax, %ax
    3: 66 40                                          incw %ax
    5: 66 89 c1                                       movw %ax, %cx
    ```
Commits on Mar 27, 2018
Commits on Mar 23, 2018
  1. fixes travis tests (#800)

    gitoleg authored and ivg committed Mar 23, 2018
    * fixes travis tests
    
    This PR fixes `bap-veri` installation on travis for `bap` tests,
    i.e. don't install if already installed.
    
    * updated qerry for installes pkg
    
    * refactoring
    
    * just try to find a mistake
    
    * test this
    
    * updated
    
    * add more output
    
    * remove bap-veri after build
    
    * clean up
    
    * remove bap-veri manually
    
    * clean up again
    
    * updated
    
    * just few edition for beauty
Commits on Mar 19, 2018
  1. fixed minor bugs (#733)

    gitoleg authored and ivg committed Mar 19, 2018
    * restored forgotten parentheses in x86 shift insns
    
    * updated testsuite
  2. Primus Lisp enhancements (#798)

    ivg committed Mar 19, 2018
    * publishes the interrupt operation
    
    * provides jumping observation and resolve_xxx functions
    
    The jumping observation is made just before a jump is taken and has
    two parameters: jump condition and jump destination. This is a much
    more precise and useful observation than the `{enter,leave}-term`
    one, as it doesn't loose the origin of values that define the
    jump (i.e., we can track (taint) both the condition and the
    destination).
    
    We also added three functions to the Linker interface, mostly for
    convenience. These functions resolve code names into addresses,
    symbolic names, and tid.
    
    * rectifies the primus-lisp plugin
    
    So far the plugin loads some stuff during the link time, some after
    the configuration is available, and the rest during the analysis phase
    when it is invoked as a pass. It may even be invoked  twice.
    
    Now it is rectified. Everything that doesn't require an access to the
    project data structure is loaded automatically just after the
    configuration phase has finished. The rest (e.g., linking with the
    program) is done during the analysis.
    
    * adds an optional fill parameter to the allocate-memory primitive
    
    if the parameter is not set, then the allocated memory chunk is not
    filled (i.e., filled with random values), if it is set, then we fill
    the chunk with the specified value.
    
    The motivation, is to make calloc more efficient.
    
    * adds the parameter definition abstract representation
    
    * adds parameters to the program data structure
    
    and applies the reindexer to all runtime definitions.
    
    * prints methods and parameters when printing a program
    
    previously only definitions were printed
    
    * adds defparameters form to the parser specification
    
    * modifies defconstant grammar to fit with C. Lisp and defparameter
    
    To be in line with Common Lisp we've decided to put docstring and
    declarations after the value of defparameter and defconstant. This
    indeed looks more natural.
    
    * more efficient Lisp stack and a bug fix in signal handlers
    
    1. Provides an efficient stack that now is equipped with a map data
    structure that allows us effeciently query for locally bound
    variables.
    
    In Primus Lisp variables live in two scope:
     1. the Primus global such as CPU registers, parameters, etc.
     2. the Primus Lisp local scope that is populated by lexically scoped
        local variables such as function parameters and let bound
        variables.
    
    Any Lisp operation on a variable involves checking whether this is a
    local variable or global variable. Since we're using lexical scoping
    we should be able to distinguish between them in compile time. And
    that's what we will eventually do in the future. But so far it
    requires to many changes to the interpreter. Previously, we were using
    an assoc list, that was traversed every time we read or set a
    variable. The new implementation uses a balanced tree that maps
    variables to the total number of their occurences on the stack. That
    makes the check to be logarithmic in the number of unique variable
    names on the stack.
    
    Few other small optimizations:
    
    a) faster function frame costructor, that is not monadic (as it
    shouldn't be), and that is not preserving the order of arguments (so
    that we can append them in the reversed order, using fold left), and
    that counts the total number of arguments, so that we don't call
    List.length when we're poping the frame off.
    
    b) a faster frame pushing mechanism - instead of just appending
    arguments, we are prepending them in the reverse order. This is not
    only faster, but since the frame is in the reverse order already, it
    preserves the existing order (so no changes in semantics)
    
    2. Fixes an nasty bug in the signal dispatcher. When a signal is
    dispatched to several methods each method ignores effects from other
    methods as it resets the state as it was before the dispatcher was
    entered.
    
    * keeps primitives during linking and moves init_env to loader
    
    1. we do not want primitives, that were registered by other components
    to be dropped when a new program is linked. In fact, primitives is a
    complete different beast than other definitions, as they are defined
    by OCaml modules (i.e., plugins) rather than by the Lisp code, so
    probably they shouldn't be a part of the program data structure at
    all. Though having them in the program makes things much easier.
    
    Besides, when we drop primitives during the program linking time, we
    are discarding all primitives that were registered prior to the call
    to program_link. So far, just by accident, the link_program function
    was called before any other components (since it was registered last -
    after all other lisp components). This is so fragile, that I consider
    this is a bug. Nasty one.
    
    2. The `init_env` function in the Primus.Lisp module was adding
    special variables of the form `%<id>` and `@<id>` that hold addresses
    of the correspondingly named program terms. It was using the local
    variables stack. Neither should it be using the local stack for
    storing this variables (as they are globals by their nature), nor this
    should be done during the linking time of the program. This function
    is now moved to the loader (that is the component that's responsible
    for setting up the environemnt).
    
    * adds the runtime support for parameters
    
    parameters are lazy, and are not evaluated unless are used, and not
    set before.
    
    * parametrize malloc allocator
    
    This implementation adds the following features, that are useful
    during the analysis:
    
    - the upper limit to the maximum size of the allocated memory chunk
    - the upper limit to the maximum size of the malloc heap
    - malloc guards with optional coloring
    - efficient calloc that doesn't
      a) take memory
      b) take time
    
    * publishes Primus Lisp eval functions
    
    There are two entities in Primus Lisp that could be evaluated -
    functions and methods. Both are now runable directly from OCaml via
    the `eval_fun` and `eval_method`.
    
    * adds Primus Lisp autodoc generator
    
    so far we can generate only index, but still not that bad.
    
    * adds the documentation option that renders primus lisp documentation
    
    * normalizes the generated documentation
    
    - merges different definitions of the same entity
    - removes quotations
    - removes duplicating whitespaces
    
    * adds documentation to various primus lisp definitions
    
    * fixes the name initialization in the loader
    
    an address can have multiple addresses, but a name shall have only
    one, not vice verse
    
    * drops the primus-lisp dependency
Commits on Mar 16, 2018
  1. Bil test integration (#737)

    gitoleg authored and ivg committed Mar 16, 2018
    * just added a new stage to travis
    
    * Update .run_travis_tests.sh
    
    * updated installation of bap-veri
    
    * reduces jobs for test purpose
    
    * rolled back
    
    * adds install of veri deps
    
    * adds core install
    
    * added bap-veri install in install script
    
    * fixed stages names