Branch: master
Commits on Aug 22, 2010
Commits on Jun 5, 2010
  1. Implement more precise time keeping in the time loop code.

    Avoid accumulating the error arising from the difference
    between the internal and external simulation steps.
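    A minimal sketch of one way to avoid this kind of drift, written in Python rather than the project's Lisp. The function name and the rounding policy are assumptions for illustration, not the committed code: the target time of each external step is recomputed from an integer counter instead of being accumulated.

```python
def output_schedule(dt_internal, dt_external, n_outputs):
    """Return (internal_step_count, simulated_time) for each external step.

    Hypothetical helper: the target time of output k is recomputed as
    k * dt_external, so the rounding error made at one output does not
    carry over into the next one.
    """
    schedule = []
    for k in range(1, n_outputs + 1):
        target = k * dt_external  # recomputed, never accumulated
        # Whole number of internal steps that should have run by `target`.
        steps_due = round(target / dt_internal)
        schedule.append((steps_due, steps_due * dt_internal))
    return schedule
```

    With dt_internal = 0.3 and dt_external = 1.0, a loop that always runs 3 internal steps per output reaches only 2.7 after three outputs, while the schedule above runs 3, 7 and 10 steps and ends at 3.0.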
Commits on Oct 20, 2009
Commits on Sep 23, 2009
  1. Use 1D CUDA textures wherever possible.

    Support accessing a subset of a row of a 2D array as a
    texture, and use it to simplify 2D references with
    a constant second index. This required increasing the
    pitch alignment requirement for arrays.
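    Why the pitch matters can be sketched as follows (Python, hypothetical names; the 256-byte alignment in the usage note is an assumed example value, not one taken from the commit). If every row has to be usable as the start of a 1D texture, the byte offset of each row must be a multiple of the required alignment, which holds when the pitch itself is rounded up to that alignment.

```python
def aligned_pitch(width_elems, elem_size, alignment):
    """Round the row size in bytes up to a multiple of `alignment`."""
    row_bytes = width_elems * elem_size
    return (row_bytes + alignment - 1) // alignment * alignment


def row_offset(row, pitch):
    """Byte offset of the first element of `row` in a pitched 2D array."""
    return row * pitch
```

    For example, aligned_pitch(100, 4, 256) gives 512, so every row_offset(row, 512) is a multiple of 256.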
Commits on Sep 20, 2009
  1. Fix the iteration order uncertainty in split-by-cse.

    The order of iteration in this loop is significant, because
    it affects the assignment of canonic expression ids, and thus
    the sorting order of expressions. In order to stabilize it,
    convert the key list to an ordered set.
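    The effect of a stable iteration order on id assignment can be illustrated with a small Python sketch. The helper names are hypothetical, and first-seen order is used here as the ordering; the actual code may order its set by a comparison function instead.

```python
def ordered_unique(keys):
    """Deduplicate keys while keeping their first-seen order."""
    return list(dict.fromkeys(keys))


def assign_ids(keys):
    """Assign consecutive ids in a deterministic order (hypothetical helper)."""
    return {key: i for i, key in enumerate(ordered_unique(keys))}
```

    Because the ids depend only on the order of the input list, two runs over the same expressions produce the same ids and therefore the same sorting order.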
  2. Make MADD-aware loop splicing depend on :treeify-madd.

    It turns out that it is not always faster, so make it controllable.
  3. Nuke the software access realignment code.

    It is useless with modern cards, and doesn't work anyway.
  4. Support using fast division in CUDA mode.

    The CUDA division instruction produces 0 as the result of x/y
    when abs(y) > 8.5e+37. To overcome this, normal division
    is implemented with a range check and a branch. This adds
    support for using hardware division where applicable.
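    The trade-off can be emulated in Python as a rough sketch. Everything here is an illustration under assumptions: fast_div only mimics the behaviour described above, the operands are assumed to stay within single-precision range, and the scaling trick in the slow path is one possible fix, not necessarily the one the generated code uses.

```python
FAST_DIV_LIMIT = 8.5e37  # threshold quoted in the commit message


def fast_div(x, y):
    """Emulate the hardware division: the result is 0 for very large |y|."""
    if abs(y) > FAST_DIV_LIMIT:
        return 0.0
    return x / y


def safe_div(x, y):
    """Normal division: a range check plus a branch around the fast path."""
    if abs(y) > FAST_DIV_LIMIT:
        # Assumed workaround: scale both operands so the divisor is back
        # inside the range where the fast instruction is accurate.
        return fast_div(x * 0.25, y * 0.25)
    return fast_div(x, y)


def divide(x, y, divisor_known_small=False):
    """Use the fast path only when the caller vouches for the divisor."""
    return fast_div(x, y) if divisor_known_small else safe_div(x, y)
```

    The fast path skips the comparison and the branch, which is why it is worth exposing for divisors that are known to stay below the limit.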
  5. Assist the use of the MADD instruction in some cases.

    Implement MADD-aware treeify of sums when requested via
    :cuda-flags (:treeify-madd t); also force MADD in two
    special cases of ordered inner loop code.
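    As a rough illustration of what a MADD-aware treeify of a sum could look like (Python, hypothetical representation: terms are strings or ("*", a, b) tuples, and "madd" is an invented node tag), the products are chained so that each one feeds a multiply-add with the running accumulator:

```python
def is_product(term):
    """True for a two-operand product node such as ("*", "a", "b")."""
    return isinstance(term, tuple) and len(term) == 3 and term[0] == "*"


def treeify_madd(terms):
    """Turn a flat sum into a chain of ("madd", a, b, acc) nodes."""
    if not terms:
        raise ValueError("empty sum")
    products = [t for t in terms if is_product(t)]
    others = [t for t in terms if not is_product(t)]
    if others:
        # Plain addends seed the accumulator.
        acc = others[0] if len(others) == 1 else ("+",) + tuple(others)
    else:
        # No plain addend: the first product itself is the seed.
        acc, products = products[0], products[1:]
    for _, a, b in products:
        acc = ("madd", a, b, acc)  # a * b + acc
    return acc
```

    For the sum a*b + c + d*e this yields ("madd", "d", "e", ("madd", "a", "b", "c")), i.e. two multiply-adds and no separate addition.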
  6. Get rid of the _fdiv variable hack.

    After implementing split-by-level and split-by-cse for
    flattened trees, it is possible to make the main CSE
    pass work on them as well. This will ensure automatic
    extraction of common divisors, so the hack is no longer
    necessary.
Commits on Sep 19, 2009
  1. Rewrite count-subexprs in a cleaner way.

    This makes it a bit more usable as well.
  2. Implement splitting of flat trees by common subsets.

    Add code to determine that e.g. (+ a b c d) and
    (+ a c d) have a common subpart, and split it off
    so that it can be further exploited by the common
    subexpression elimination pass.
    
    Works by doing an O(N^2) intersection of all similar
    terms, and then extracting subsets in a certain order
    of precedence.
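    The pairwise intersection step can be sketched in Python as follows. This is a simplification under stated assumptions: sums are modelled as frozensets of terms (so repeated terms are not represented), only the single largest common subset is extracted, and "largest first" stands in for the unspecified order of precedence.

```python
from itertools import combinations


def split_common_subset(sums):
    """Extract the largest subset shared by any two flat sums.

    `sums` is a list of frozensets of terms. Returns (common, rewritten),
    where every sum containing `common` has it replaced by a single
    ("+", ...) node that a later CSE pass can pick up.
    """
    best = frozenset()
    for left, right in combinations(sums, 2):  # O(N^2) pairs
        common = left & right
        if len(common) >= 2 and len(common) > len(best):
            best = common
    if not best:
        return None, list(sums)
    shared_node = ("+",) + tuple(sorted(best, key=repr))
    rewritten = []
    for terms in sums:
        if best <= terms:
            rewritten.append((terms - best) | {shared_node})
        else:
            rewritten.append(terms)
    return best, rewritten
```

    Applied to the two sums from the message, (+ a b c d) and (+ a c d), the shared node is ("+", "a", "c", "d") and the first sum becomes that node plus b.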
Commits on Sep 18, 2009
  1. Include flattening in optimize-ifsign.

    Optimizing ifsign may cause some additional expressions to
    cancel out or reduce to 0, so include the relevant pass.
    
    This uncovered a bug in the implementation of :fallback-to.
  2. Adjust diagnostic information output.

    Improve readability of flatten simplification diagnostics
    using trivialize-refs, and reduce verbosity of ifsign.
Commits on Sep 17, 2009
  1. Allow simplifying sign-dependent conditions.

    Add an interface for specifying hints about the sign
    of variable values, and use this information to simplify
    expressions involving a new (ifsign...) macro, namely,
    remove factors that actually don't affect the sign.
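    The factor removal can be illustrated with a short Python sketch. The hint format ("positive" / "negative" strings keyed by variable name) and the function name are assumptions; the real hint interface and the exact semantics of (ifsign...) are not shown in this log.

```python
def simplify_sign_factors(factors, sign_hints):
    """Drop factors of a product whose sign is already known.

    Returns (flip, remaining): `remaining` are the factors that still
    decide the sign, and `flip` is True when an odd number of factors
    known to be negative was removed.
    """
    flip = False
    remaining = []
    for factor in factors:
        hint = sign_hints.get(factor)
        if hint == "positive":
            continue  # cannot change the sign of the product
        if hint == "negative":
            flip = not flip  # inverts the sign, but is otherwise irrelevant
            continue
        remaining.append(factor)
    return flip, remaining
```

    For a product x * h * m * v with h hinted positive and m hinted negative, only x and v remain, with the sign inverted once.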
  2. Keep the original texture symbol quoted inside texture-ref.

    This allows tracking the reference properly.
Commits on Sep 13, 2009
  1. Ensure that the names passed to the driver API are base-string.

    The FFI wrappers throw an exception if that's not the case.
  2. Support adding pre-execute hooks, which can cancel execute.

    This is useful for delayed initialization, or for hijacking
    the execute script interface (e.g. for dumps).
    
    The hooks are allowed to be symbols in order to accommodate
    global function redefinition.
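    A minimal Python sketch of the hook mechanism, with assumed conventions: a hook cancels execution by returning False, and a string name looked up in a dictionary at call time plays the role of a Lisp symbol. None of these names come from the library.

```python
named_functions = {}    # stand-in for the global function namespace
pre_execute_hooks = []  # entries are callables or names (strings)


def run_hooks(request):
    """Run all hooks in order; stop as soon as one cancels."""
    for hook in pre_execute_hooks:
        # Names are resolved on every call, so a redefinition is picked up.
        fn = named_functions[hook] if isinstance(hook, str) else hook
        if fn(request) is False:
            return False
    return True


def execute(request, action):
    """Run `action` unless a pre-execute hook cancels it."""
    if not run_hooks(request):
        return None
    return action(request)
```

    Registering the name rather than the function object means that replacing named_functions["guard"] later changes the behaviour of the already installed hook, which is the point of allowing symbols.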
Commits on Sep 12, 2009
  1. Add time iteration and checkpoint control code.

    This code is not limited to only one application, so
    it should be located in the library part.
Commits on Sep 5, 2009
  1. Rename loop-indexes to do-indexes.

    The 'loop' prefix confuses common-lisp-indent-function.
  2. Refactor the formula parser.

    Wrap the lexer in a defcontext and reformat the code
    in Emacs. Also add the following enhancements:
    
    - Use && and || for boolean operations.
    - Add a boolean not operator: !
    - Forbid non-whitespace between ; and the following
      newline to avoid confusion with lisp comments.
    - Support using #| ... |# comments inside formulas.
    - Support reverting to the lisp syntax via $(...)
    - Support backquote antiquotations via $,...
Commits on Sep 4, 2009
  1. Rewrite a few more transformations to use canonic trees.

    This allows removing the cached-simplifier hack. As a side
    effect, it uncovered a bug in the simplify pass.
Commits on Sep 3, 2009
  1. Add support for a new (code...) command to form compilers.

    It expands to a sequence of (text) and (recurse) calls.
    The code becomes less verbose, but somewhat more obscure.
  2. Fully reindent all code in Emacs.

    Some custom macro indentation rules are added to formula.el.
    Apparently, &rest is broken in nested lists, so some of them
    use long sequences of fixed numbers to work around it.
Commits on Aug 30, 2009
  1. Ditch Standard-CL, except for a few functions & use Alexandria.

    The latter seems to be more organized.
  2. Optimize the tree before code motion in CUDA.

    It actually decreases the register pressure in
    the largest kernel. Todo: make code motion use
    the flattened canonic tree.
  3. Add a custom formula indentation mode for Emacs.

    The standard common lisp indentation engine cannot properly
    handle code that uses the reader extension for infix formula
    input. This commit adds a minor mode that overrides the
    standard behavior for text within curly braces.
Commits on Aug 16, 2009
  1. Refactor the form compiler framework.

    Since specifying the standard &allow-other-keys keyword
    disables errors about unknown keys, there is no need to
    reinvent the wheel. Make the form compilers use the normal
    key parameters.
  2. Convert type annotation code to def-rewrite-pass.

    Add support for additional mandatory parameters to
    def-rewrite-pass, and use it to refactor annotate-types.
    Also fix map-rewrite-structure and make it handle
    (symbol-macrolet), (safety-check) and (temporary).
Commits on Aug 11, 2009
  1. Implement a generic tree rewriting interface.

    Group tree rewriting engine implementations in one file,
    and add a convenient macro front-end for them. Use it
    where applicable. Reindent modified code in Emacs.
Commits on Aug 9, 2009
  1. Implement splitting of multi-arity expressions by loop level.

    If a significant number of terms belong to an outer loop
    level, group them into one term (significant number being
    3, or 2 non-numeric).
    
    Additionally, convert the level computation from lists
    to FSet, implement a new macro for processing pipeline
    definition, and use it to include splitting into optimize-tree.
    
    Finally, fix correct-loop-levels: it must invalidate the
    level cache, or the destructive changes may be ignored.
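    The grouping rule from the first paragraph of this message can be sketched in Python. The representation is hypothetical: level_of is an assumed callback returning the loop level of a term (lower means further out), and the threshold is read as "at least 3 outer terms, or at least 2 non-numeric outer terms".

```python
def split_by_level(terms, level_of, inner_level):
    """Group the outer-level terms of a flat sum into one sub-sum.

    Terms whose level is below `inner_level` do not depend on the inner
    loop, so a grouped sub-sum of them can be computed outside of it.
    """
    outer = [t for t in terms if level_of(t) < inner_level]
    inner = [t for t in terms if level_of(t) >= inner_level]
    non_numeric = [t for t in outer if not isinstance(t, (int, float))]
    significant = len(outer) >= 3 or len(non_numeric) >= 2
    if inner and significant:
        return [("+",) + tuple(outer)] + inner
    return list(terms)  # not worth splitting
```

    With a and b at level 0 and x at level 1, the sum a + x + b becomes (a + b) + x, while a + x + 2 is left unchanged because only one non-numeric outer term is present.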
Commits on Aug 8, 2009
  1. Reorganize flattening to work on canonical expressions.

    As a positive side effect, add automatic removal of
    redundant items, e.g. (- a a) or (* a (/ a)).
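    Why a canonical form makes such cancellation automatic can be shown with a small Python sketch (hypothetical representation: a sum is a list of (sign, term) pairs; this is not the library's data structure). Once equal terms are merged into one entry with a signed count, a count of zero simply drops out.

```python
from collections import Counter


def flatten_sum(signed_terms):
    """Merge equal terms of a flat sum and drop those that cancel.

    `signed_terms` is an iterable of (sign, term) with sign +1 or -1.
    Returns a sorted list of (count, term) with non-zero counts.
    """
    counts = Counter()
    for sign, term in signed_terms:
        counts[term] += sign
    return [(n, term) for term, n in sorted(counts.items()) if n != 0]
```

    The same bookkeeping applies to products if the counts are read as exponents, which covers the (* a (/ a)) case.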
  2. Use a structure for the part of the range that must be shared.

    The refactoring code destructively modifies some of the index
    range parameters, so they must remain shared throughout the
    expression restructuring passes. When the ranging spec is a
    list, ensuring this is difficult.
    
    This changes the mutable part of the range specification into
    a structure, providing macros to easily create and match such
    nodes. Code throughout the library is switched to the new format.
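    The sharing problem can be illustrated in Python with hypothetical names (RangeLimits, make_range and rebuild are invented for this sketch). A pass that rebuilds a node keeps pointing at the same mutable limits object, so a later destructive update is visible through every copy of the node.

```python
from dataclasses import dataclass


@dataclass
class RangeLimits:
    """Mutable part of a range specification, shared between node copies."""
    lo: int
    hi: int


def make_range(var, limits):
    """Build an immutable range node that refers to shared limits."""
    return ("range", var, limits)


def rebuild(node):
    """A restructuring pass: creates a new node, keeps the same limits."""
    tag, var, limits = node
    return (tag, var, limits)
```

    With the limits stored as plain list elements, a pass that copies the list would silently detach the copy from later modifications; holding them in one structure avoids that.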