Commits on Aug 22, 2010
Commits on Jun 5, 2010
  1. Implement more precise time keeping in the time loop code.

    Avoid accumulating the error arising from the difference
    between the internal and external simulation steps.
    committed Jun 5, 2010
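The drift this commit avoids can be illustrated with a small sketch. It is not the project's code (which is Common Lisp); the function names and step sizes are assumptions chosen for illustration. The idea shown is to derive the current time from an integer step counter instead of repeatedly adding the step size.

```python
def time_by_accumulation(dt, steps):
    """Naive approach: add dt on every step, so rounding errors pile up."""
    t = 0.0
    for _ in range(steps):
        t += dt
    return t


def time_by_counter(dt, steps, t0=0.0):
    """Derive time from the integer step count: one rounding per query."""
    return t0 + steps * dt


def internal_steps_until(t_external, dt, steps_done):
    """Number of internal steps still needed to reach an external checkpoint.

    The target is expressed as a step index, so the mismatch of one
    external interval is not carried over into the next one.
    """
    target_step = round(t_external / dt)
    return max(0, target_step - steps_done)
```

With a step of 0.1 and a million steps, the accumulated value visibly drifts away from 100000, while the counter-based value stays within rounding of it.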
Commits on Oct 20, 2009
Commits on Sep 23, 2009
  1. Use 1D CUDA textures wherever possible.

    Support accessing a subset of a row of a 2D array as a
    texture, and use it to simplify 2D references with
    a constant second index. This required increasing the
    pitch alignment requirement for arrays.
    committed Sep 23, 2009
Commits on Sep 20, 2009
  1. Fix the iteration order uncertainty in split-by-cse.

    The order of iteration in this loop is significant, because
    it affects the assignment of canonic expression ids, and thus
    the sorting order of expressions. In order to stabilize it,
    convert the key list to an ordered set.
    committed Sep 20, 2009
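A minimal sketch of why an ordered set stabilizes id assignment. This is a Python illustration, not the project's Lisp code; `assign_ids` and the sample expressions are hypothetical. A `dict` is used as a stand-in for an ordered set because it removes duplicates while keeping insertion order.

```python
def assign_ids(expressions):
    """Assign consecutive ids in first-seen order.

    dict.fromkeys() removes duplicates while preserving insertion order,
    so the ids do not depend on hash-table iteration order.
    """
    ordered_unique = dict.fromkeys(expressions)
    return {expr: idx for idx, expr in enumerate(ordered_unique)}


exprs = ["(* a b)", "(+ c d)", "(* a b)", "(/ e f)"]
ids = assign_ids(exprs)
```

Because the ids follow the input order, any later sort keyed on them gives the same result on every run.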
  2. Make MADD-aware loop splicing depend on :treeify-madd.

    It turns out that it is not always faster, so make it controllable.
    committed Sep 20, 2009
  3. Nuke the software access realignment code.

    It is useless with modern cards, and doesn't work anyway.
    committed Sep 20, 2009
  4. Support using fast division in CUDA mode.

    The CUDA division instruction produces 0 as the result of x/y
    where abs(y) > 8.5e+37. To overcome this, normal division
    is implemented with a range check and a branch. This adds
    support for using hardware division where applicable.
    committed Sep 20, 2009
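The range check described above can be modeled roughly as follows. This is a toy Python model, not CUDA or the project's generated code: `fast_div`, `checked_div`, and the exact limit of 2**126 (about 8.5e+37) are assumptions. The source only says that a range check and a branch are used; scaling both operands by 0.25 is one common way to implement the slow path.

```python
# Assumed model of the fast hardware divide described above: the result is
# 0 whenever abs(y) exceeds roughly 8.5e+37 (taken here as 2**126).
FAST_DIV_LIMIT = 2.0 ** 126


def fast_div(x, y):
    """Toy model of the fast divide: correct only while abs(y) is in range."""
    if abs(y) > FAST_DIV_LIMIT:
        return 0.0
    return x / y


def checked_div(x, y):
    """Normal division built from fast_div with a range check and a branch."""
    if abs(y) > FAST_DIV_LIMIT:
        # Scale both operands by a power of two: the quotient is unchanged
        # and a single-precision divisor moves back into the valid range.
        x *= 0.25
        y *= 0.25
    return fast_div(x, y)
```

When the divisor is known to stay well below the limit, the check is pure overhead, which is the case where using the hardware divide directly pays off.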
  5. Assist the use of the MADD instruction in some cases.

    Implement MADD-aware treeify of sums when requested via
    :cuda-flags (:treeify-madd t); also force MADD in two
    special cases of ordered inner loop code.
    committed Sep 20, 2009
  6. Get rid of the _fdiv variable hack.

    After implementing split-by-level and split-by-cse for
    flattened trees, it is possible to make the main CSE
    pass work on them as well. This will ensure automatic
    extraction of common divisors, so the hack is no longer
    necessary.
    committed Sep 20, 2009
Commits on Sep 19, 2009
  1. Rewrite count-subexprs in a cleaner way.

    This makes it a bit more usable as well.
    committed Sep 19, 2009
  2. Implement splitting of flat trees by common subsets.

    Add code to determine that e.g. (+ a b c d) and
    (+ a c d) have a common subpart, and split it off
    so that it can be further exploited by the common
    subexpression elimination pass.
    
    Works by doing an O(N^2) intersection of all similar
    terms, and then extracting subsets in a certain order
    of precedence.
    committed Sep 19, 2009
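The pairwise intersection step can be sketched as follows. This is a simplified Python illustration, not the project's implementation: `split_common_subsets` is a hypothetical name, terms are assumed to be unique within each sum, and only the single largest shared subset is extracted (the commit describes extracting several subsets in an order of precedence, which is not reproduced here).

```python
from itertools import combinations


def split_common_subsets(sums, min_size=2):
    """Factor the largest shared subset of terms out of flat sums.

    `sums` is a list of term lists, each standing for (+ t1 t2 ...).
    Returns (shared, rewritten), where `shared` is the extracted tuple of
    terms (or None) and sums containing it reference it as one nested term.
    """
    best = frozenset()
    # O(N^2) pairwise intersection of all sums.
    for left, right in combinations(sums, 2):
        common = frozenset(left) & frozenset(right)
        if len(common) >= min_size and len(common) > len(best):
            best = common
    if not best:
        return None, [list(s) for s in sums]

    shared = tuple(sorted(best))  # sorted for a deterministic result
    rewritten = []
    for terms in sums:
        if best <= set(terms):
            rest = [t for t in terms if t not in best]
            rewritten.append([("+",) + shared] + rest)
        else:
            rewritten.append(list(terms))
    return shared, rewritten
```

For the example from the commit message, `(+ a b c d)` and `(+ a c d)` both end up referring to the same nested `(+ a c d)` term, which a later CSE pass can then compute once.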
Commits on Sep 18, 2009
  1. Include flattening in optimize-ifsign.

    Optimizing ifsign may cause some additional expressions to
    cancel out or reduce to 0, so include the relevant pass.
    
    This uncovered a bug in the implementation of :fallback-to.
    committed Sep 18, 2009
  2. Adjust diagnostic information output.

    Improve readability of flatten simplification diagnostics
    using trivialize-refs, and reduce verbosity of ifsign.
    committed Sep 18, 2009
Commits on Sep 17, 2009
  1. Allow simplifying sign-dependent conditions.

    Add an interface for specifying hints about the sign
    of variable values, and use this information to simplify
    expressions involving a new (ifsign...) macro, namely,
    remove factors that actually don't affect the sign.
    committed Sep 17, 2009
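A rough sketch of the factor-removal idea, assuming the sign test is applied to a product of named factors. The actual hint interface and the semantics of `(ifsign...)` are not shown in this log, so `simplify_sign_factors` and the hint format below are hypothetical.

```python
def simplify_sign_factors(factors, sign_hints):
    """Drop factors whose sign is known, tracking the overall sign flip.

    `factors` are the names multiplied inside a sign test; `sign_hints` maps
    a name to +1 (always positive) or -1 (always negative). Returns
    (flip, remaining): the sign of product(factors) equals the sign of
    flip * product(remaining).
    """
    flip = 1
    remaining = []
    for name in factors:
        hint = sign_hints.get(name)
        if hint in (1, -1):
            flip *= hint  # known sign: contributes only a possible flip
        else:
            remaining.append(name)  # unknown sign: must stay in the test
    return flip, remaining
```

For instance, if `dt` and `rho` are hinted as positive, a sign test on `dt * v * rho` reduces to a sign test on `v` alone.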
  2. Keep the original texture symbol quoted inside texture-ref.

    This allows tracking the reference properly.
    committed Sep 17, 2009
Commits on Sep 13, 2009
  1. Ensure that the names passed to the driver API are base-string.

    The FFI wrappers throw an exception if that's not the case.
    committed Sep 13, 2009
  2. Support adding pre-execute hooks, which can cancel execute.

    This is useful for some delayed initialization, or execute
    script interface hijacking (e.g. for dumps).
    
    The hooks are allowed to be symbols in order to accommodate
    global function redefinition.
    committed Sep 13, 2009
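The hook mechanism might look roughly like the sketch below. This is a Python analogy, not the library's API: `add_pre_execute_hook`, `execute`, and the `named_functions` table (a stand-in for the global function namespace that Lisp symbols refer to) are all hypothetical. The convention that a hook cancels execution by returning `False` is also an assumption.

```python
named_functions = {}  # stand-in for the global function namespace
_pre_execute_hooks = []  # entries are callables or names in named_functions


def add_pre_execute_hook(hook):
    _pre_execute_hooks.append(hook)


def _resolve(hook):
    # A name is looked up at call time, so redefining the named function
    # later takes effect without re-registering the hook.
    return named_functions[hook] if isinstance(hook, str) else hook


def execute(action):
    """Run `action` unless some pre-execute hook returns False."""
    for hook in list(_pre_execute_hooks):
        if _resolve(hook)() is False:
            return None  # a hook cancelled the execution
    return action()
```

Registering a hook by name rather than by function object is what lets a redefinition during development take effect immediately.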
Commits on Sep 12, 2009
  1. Add time iteration and checkpoint control code.

    This code is not limited to only one application, so
    it should be located in the library part.
    committed Sep 12, 2009
Commits on Sep 5, 2009
  1. Rename loop-indexes to do-indexes.

    The 'loop' prefix confuses common-lisp-indent-function.
    committed Sep 5, 2009
  2. Refactor the formula parser.

    Wrap the lexer in a defcontext and reformat the code
    in Emacs. Also add the following enhancements:
    
    - Use && and || for boolean operations.
    - Add a boolean not operator: !
    - Forbid non-whitespace between ; and the following
      newline to avoid confusion with lisp comments.
    - Support using #| ... |# comments inside formulas.
    - Support reverting to the lisp syntax via $(...)
    - Support backquote antiquotations via $,...
    committed Sep 5, 2009
Commits on Sep 4, 2009
  1. Rewrite a few more transformations to use canonic trees.

    This allows removing the cached-simplifier hack. As a side
    effect, it uncovered a bug in the simplify pass.
    committed Sep 4, 2009
Commits on Sep 3, 2009
  1. Add support for a new (code...) command to form compilers.

    It expands to a sequence of (text) and (recurse) calls.
    The code becomes less verbose, but somewhat more obscure.
    committed Sep 3, 2009
  2. Fully reindent all code in Emacs.

    Some custom macro indentation rules are added to formula.el.
    Apparently, &rest is broken in nested lists, so some of them
    use long sequences of fixed numbers to work around it.
    committed Sep 3, 2009
Commits on Aug 30, 2009
  1. Ditch Standard-CL, except for a few functions & use Alexandria.

    The latter seems to be more organized.
    committed Aug 30, 2009
  2. Optimize the tree before code motion in CUDA.

    It actually decreases the register pressure in
    the largest kernel. Todo: make code motion use
    the flattened canonic tree.
    committed Aug 30, 2009
  3. Add a custom formula indentation mode for Emacs.

    The standard common lisp indentation engine cannot properly
    handle code that uses the reader extension for infix formula
    input. This commit adds a minor mode that overrides the
    standard behavior for text within curly braces.
    committed Aug 30, 2009
Commits on Aug 16, 2009
  1. Refactor the form compiler framework.

    Since specifying the standard &allow-other-keys keyword
    disables errors about unknown keys, there is no need to
    reinvent the wheel. Make the form compilers use the normal
    key parameters.
    committed Aug 16, 2009
  2. Convert type annotation code to def-rewrite-pass.

    Add support for additional mandatory parameters to
    def-rewrite-pass, and use it to refactor annotate-types.
    Also fix map-rewrite-structure and make it handle
    (symbol-macrolet), (safety-check) and (temporary).
    committed Aug 16, 2009
Commits on Aug 11, 2009
  1. Implement a generic tree rewriting interface.

    Group tree rewriting engine implementations in one file,
    and add a convenient macro front-end for them. Use it
    where applicable. Reindent modified code in emacs.
    committed Aug 11, 2009
Commits on Aug 9, 2009
  1. Implement splitting of multi-arity expressions by loop level.

    If a significant number of terms belong to an outer loop
    level, group them into one term (significant number being
    3, or 2 non-numeric).
    
    Additionally, convert the level computation from lists
    to FSet, implement a new macro for processing pipeline
    definition, and use it to include splitting into optimize-tree.
    
    Finally, fix correct-loop-levels: it must invalidate the
    level cache, or the destructive changes may be ignored.
    committed Aug 9, 2009
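The grouping rule (3 terms, or 2 non-numeric ones) can be sketched as follows. This is a simplified Python illustration that only distinguishes "outer" from "inner" relative to one loop depth; `split_by_level` and the `level_of` callback are hypothetical and do not reflect the FSet-based level computation mentioned above.

```python
def split_by_level(terms, inner_level, level_of):
    """Group terms of a flat sum that belong to an outer loop level.

    `level_of(term)` returns the loop depth a term depends on. Terms with a
    level below `inner_level` can be hoisted; they are grouped into one
    nested sum when there are at least 3 of them, or at least 2 that are
    not numeric constants.
    """
    outer = [t for t in terms if level_of(t) < inner_level]
    inner = [t for t in terms if level_of(t) >= inner_level]
    non_numeric = [t for t in outer if not isinstance(t, (int, float))]
    if len(outer) >= 3 or len(non_numeric) >= 2:
        return [("+",) + tuple(outer)] + inner
    return list(terms)
```

Once the outer-level terms form a single subexpression, it can be evaluated outside the inner loop instead of on every iteration.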
Commits on Aug 8, 2009
  1. Reorganize flattening to work on canonical expressions.

    As a positive side effect, add automatic removal of
    redundant items, i.e. (- a a) or (* a (/ a)).
    committed Aug 8, 2009
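The cancellation of redundant items such as `(- a a)` can be illustrated on a flattened sum. This is a hypothetical Python sketch, not the project's canonical-expression code: `cancel_flat_sum` and the `(sign, name)` representation are assumptions. The same idea applies to products such as `(* a (/ a))` if exponents are counted instead of signs.

```python
from collections import Counter


def cancel_flat_sum(terms):
    """Cancel opposite terms in a flattened sum.

    `terms` is a list of (sign, name) pairs, e.g. (- a a) is written as
    [(+1, "a"), (-1, "a")]. Returns the surviving pairs in first-seen order.
    """
    net = Counter()
    for sign, name in terms:
        net[name] += sign
    result = []
    for name, count in net.items():
        if count != 0:
            sign = 1 if count > 0 else -1
            result.extend([(sign, name)] * abs(count))
    return result
```

Working on a flat, canonical form is what makes this cheap: matching terms are found by key lookup rather than by searching a nested tree.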
  2. Use a structure for the part of the range that must be shared.

    The refactoring code destructively modifies some of the index
    range parameters, so they must remain shared throughout the
    expression restructuring passes. When the ranging spec is a
    list, ensuring this is difficult.
    
    This changes the mutable part of the range specification into
    a structure, providing macros to easily create and match such
    nodes. Code throughout the library is switched to the new format.
    committed Aug 8, 2009