Commits on Aug 22, 2010
Commits on Jun 5, 2010
  1. Implement more precise time keeping in the time loop code.

    angavrilov committed Jun 5, 2010
    Avoid accumulating the error arising from the difference
    between the internal and external simulation steps.
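A minimal sketch of one way to avoid this kind of accumulated error, written in Python rather than the project's Lisp. It assumes the external step is not an exact multiple of the internal step. The function and parameter names are hypothetical and are not taken from the repository.

```python
def plan_internal_steps(internal_dt, external_dt, n_external):
    """Return how many internal steps to run for each external step.

    The count is derived from the absolute target time of every external
    step, so the rounding error of one interval is absorbed by the next
    one instead of adding up.
    """
    counts = []
    done = 0  # internal steps scheduled so far
    for k in range(1, n_external + 1):
        target_time = k * external_dt             # absolute, not a running sum
        total = round(target_time / internal_dt)  # steps needed to reach it
        counts.append(total - done)
        done = total
    return counts


def naive_internal_steps(internal_dt, external_dt, n_external):
    """Round each interval independently; the error grows with every step."""
    return [round(external_dt / internal_dt)] * n_external
```

With internal_dt = 0.3 and external_dt = 1.0, the naive version schedules 9 internal steps for three external steps, while the tracked version schedules 10.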
Commits on Oct 20, 2009
Commits on Sep 23, 2009
  1. Use 1D CUDA textures wherever possible.

    angavrilov committed Sep 23, 2009
    Support accessing a subset of a row of a 2D array as a
    texture, and use it to simplify 2D references with
    a constant second index. This required increasing the
    pitch alignment requirement for arrays.
Commits on Sep 20, 2009
  1. Fix the iteration order uncertainty in split-by-cse.

    angavrilov committed Sep 20, 2009
    The order of iteration in this loop is significant, because
    it affects the assignment of canonic expression ids, and thus
    the sorting order of expressions. In order to stabilize it,
    convert the key list to an ordered set.
  2. Make MADD-aware loop splicing depend on :treeify-madd.

    angavrilov committed Sep 20, 2009
    It turns out that it is not always faster, so make it controllable.
  3. Nuke the software access realignment code.

    angavrilov committed Sep 20, 2009
    It is useless with modern cards, and doesn't work anyway.
  4. Support using fast division in CUDA mode.

    angavrilov committed Sep 20, 2009
    The CUDA division instruction produces 0 as the result of x/y
    when abs(y) > 8.5e+37. To overcome this, normal division
    is implemented with a range check and a branch. This adds
    support for using hardware division where applicable.
  5. Assist the use of the MADD instruction in some cases.

    angavrilov committed Sep 20, 2009
    Implement MADD-aware treeify of sums when requested via
    :cuda-flags (:treeify-madd t); also force MADD in two
    special cases of ordered inner loop code.
  6. Get rid of the _fdiv variable hack.

    angavrilov committed Sep 20, 2009
    After implementing split-by-level and split-by-cse for
    flattened trees, it is possible to make the main CSE
    pass work on them as well. This will ensure automatic
    extraction of common divisors, so the hack is no longer needed.
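The range check described in the "Support using fast division" commit above can be illustrated with a small Python emulation. This is only a sketch: fast_div merely imitates the hardware limitation quoted in the commit message, the scaling factor is an assumption, and none of the names come from the repository.

```python
FAST_DIV_LIMIT = 8.5e37  # threshold quoted in the commit message


def fast_div(x, y):
    """Emulate the fast instruction: the result is 0 when abs(y) is too large."""
    if abs(y) > FAST_DIV_LIMIT:
        return 0.0
    return x / y


def checked_div(x, y):
    """'Normal' division: a range check and a branch around the fast path."""
    if abs(y) > FAST_DIV_LIMIT:
        # Assumption: operands fit in single precision (abs(y) < 3.4e38), so
        # scaling both by 0.25 brings the divisor back into the supported
        # range without changing the quotient.
        return fast_div(x * 0.25, y * 0.25)
    return fast_div(x, y)
```

When the divisor is known to stay within range, the check can be skipped and fast_div used directly, which is the case the commit adds support for.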
Commits on Sep 19, 2009
  1. Rewrite count-subexprs in a cleaner way.

    angavrilov committed Sep 19, 2009
    This makes it a bit more usable as well.
  2. Implement splitting of flat trees by common subsets.

    angavrilov committed Sep 19, 2009
    Add code to determine that e.g. (+ a b c d) and
    (+ a c d) have a common subpart, and split it off
    so that it can be further exploited by the common
    subexpression elimination pass.
    Works by doing an O(N^2) intersection of all similar
    terms, and then extracting subsets in a certain order
    of precedence.
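The common-subset splitting can be sketched roughly as follows, in Python rather than the project's Lisp. Sums are modelled as tuples of term names with an implied + head; the precedence rule (largest subset first) and all names are assumptions made for illustration, not the repository's actual ordering.

```python
from itertools import combinations


def common_subsets(sums):
    """Pairwise (O(N^2)) intersection of the term sets of all sums."""
    found = set()
    for left, right in combinations(sums, 2):
        shared = frozenset(left) & frozenset(right)
        if len(shared) >= 2:  # a single shared term is not worth splitting
            found.add(shared)
    # Assumed precedence: larger subsets first, ties broken by term names.
    return sorted(found, key=lambda s: (-len(s), sorted(s)))


def split_off(sums, shared, name):
    """Replace the shared terms in every sum that contains them by `name`."""
    result = []
    for terms in sums:
        if shared <= set(terms):
            rest = tuple(t for t in terms if t not in shared)
            result.append((name,) + rest)
        else:
            result.append(terms)
    return result
```

For (+ a b c d) and (+ a c d), the shared part is {a, c, d}; after splitting, both sums refer to one temporary that a later CSE pass can compute once.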
Commits on Sep 18, 2009
  1. Include flattening in optimize-ifsign.

    angavrilov committed Sep 18, 2009
    Optimizing ifsign may cause some additional expressions to
    cancel out or reduce to 0, so include the relevant pass.
    This uncovered a bug in the implementation of :fallback-to
  2. Adjust diagnostic information output.

    angavrilov committed Sep 18, 2009
    Improve readability of flatten simplification diagnostics
    using trivialize-refs, and reduce verbosity of ifsign.
Commits on Sep 17, 2009
  1. Allow simplifying sign-dependent conditions.

    angavrilov committed Sep 17, 2009
    Add an interface for specifying hints about the sign
    of variable values, and use this information to simplify
    expressions involving a new (ifsign...) macro, namely,
    remove factors that actually don't affect the sign.
  2. Keep the original texture symbol quoted inside texture-ref.

    angavrilov committed Sep 17, 2009
    This allows tracking the reference properly.
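As an illustration of the sign-hint simplification in the first commit above, here is a minimal Python sketch. It assumes the tested expression is a product of named factors and that hints map a name to "+" or "-"; the handling of negative hints and every name here are hypothetical and not taken from the repository.

```python
def sign_relevant(factors, hints):
    """Drop factors whose sign is known, since they cannot change the outcome.

    Returns (kept_factors, negate), where `negate` tells whether the sign of
    the remaining product has to be flipped.
    """
    kept = []
    negate = False
    for factor in factors:
        sign = hints.get(factor)
        if sign == "+":
            continue             # a positive factor never changes the sign
        if sign == "-":
            negate = not negate  # a negative factor flips it
            continue
        kept.append(factor)      # unknown sign: must stay in the test
    return kept, negate
```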
Commits on Sep 13, 2009
  1. Ensure that the names passed to the driver API are base-string.

    angavrilov committed Sep 13, 2009
    The FFI wrappers throw an exception if that's not the case.
  2. Support adding pre-execute hooks, which can cancel execute.

    angavrilov committed Sep 13, 2009
    This is useful for some delayed initialization, or for
    hijacking the execute script interface (e.g. for dumps).
    The hooks are allowed to be symbols in order to accommodate
    global function redefinition.
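A rough Python analogue of such cancellable hooks is sketched below. The registry dictionary stands in for Lisp symbols, so a hook stored by name picks up later redefinitions. The convention that a hook returning False cancels execution, and all names, are assumptions for illustration only.

```python
registry = {}           # hypothetical name -> function table (stand-in for symbols)
pre_execute_hooks = []  # entries are callables or names looked up in `registry`


def run_execute(action):
    """Run all pre-execute hooks; skip `action` if any of them cancels."""
    for hook in pre_execute_hooks:
        # Names are resolved at call time, so redefinitions take effect.
        fn = registry[hook] if isinstance(hook, str) else hook
        if fn() is False:  # assumed convention: False cancels the execute
            return None
    return action()
```

Storing the name rather than the function object is what lets a redefined hook take effect without re-registering it.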
Commits on Sep 12, 2009
  1. Add time iteration and checkpoint control code.

    angavrilov committed Sep 12, 2009
    This code is not limited to only one application, so
    it should be located in the library part.
Commits on Sep 5, 2009
  1. Rename loop-indexes to do-indexes.

    angavrilov committed Sep 5, 2009
    The 'loop' prefix confuses common-lisp-indent-function.
  2. Refactor the formula parser.

    angavrilov committed Sep 5, 2009
    Wrap the lexer in a defcontext and reformat the code
    in Emacs. Also add the following enhancements:
    - Use && and || for boolean operations.
    - Add a boolean not operator: !
    - Forbid non-whitespace between ; and the following
      newline to avoid confusion with lisp comments.
    - Support using #| ... |# comments inside formulas.
    - Support reverting to the lisp syntax via $(...)
    - Support backquote antiquotations via $,...
Commits on Sep 4, 2009
  1. Rewrite a few more transformations to use canonic trees.

    angavrilov committed Sep 4, 2009
    This allows removing the cached-simplifier hack. As a side
    effect, it uncovered a bug in the simplify pass.
Commits on Sep 3, 2009
  1. Add support for a new (code...) command to form compilers.

    angavrilov committed Sep 3, 2009
    It expands to a sequence of (text) and (recurse) calls.
    The code becomes less verbose, but somewhat more obscure.
  2. Fully reindent all code in Emacs.

    angavrilov committed Sep 3, 2009
    Some custom macro indentation rules are added to formula.el.
    Apparently, &rest is broken in nested lists, so some of them
    use long sequences of fixed numbers to work around it.
Commits on Aug 30, 2009
  1. Ditch Standard-CL, except for a few functions & use Alexandria.

    angavrilov committed Aug 30, 2009
    The latter seems to be more organized.
  2. Optimize the tree before code motion in CUDA.

    angavrilov committed Aug 30, 2009
    It actually decreases the register pressure in
    the largest kernel. Todo: make code motion use
    the flattened canonic tree.
  3. Add a custom formula indentation mode for Emacs.

    angavrilov committed Aug 30, 2009
    The standard common lisp indentation engine cannot properly
    handle code that uses the reader extension for infix formula
    input. This commit adds a minor mode that overrides the
    standard behavior for text within curly braces.
Commits on Aug 16, 2009
  1. Refactor the form compiler framework.

    angavrilov committed Aug 16, 2009
    Since specifying the standard &allow-other-keys keyword
    disables errors about unknown keys, there is no need to
    reinvent the wheel. Make the form compilers use the normal
    key parameters.
  2. Convert type annotation code to def-rewrite-pass.

    angavrilov committed Aug 16, 2009
    Add support for additional mandatory parameters to
    def-rewrite-pass, and use it to refactor annotate-types.
    Also fix map-rewrite-structure and make it handle
    (symbol-macrolet), (safety-check) and (temporary).
Commits on Aug 11, 2009
  1. Implement a generic tree rewriting interface.

    angavrilov committed Aug 11, 2009
    Group tree rewriting engine implementations in one file,
    and add a convenient macro front-end for them. Use it
    where applicable. Reindent modified code in Emacs.
Commits on Aug 9, 2009
  1. Implement splitting of multi-arity expressions by loop level.

    angavrilov committed Aug 9, 2009
    If a significant number of terms belong to an outer loop
    level, group them into one term (significant number being
    3, or 2 non-numeric).
    Additionally, convert the level computation from lists
    to FSet, implement a new macro for processing pipeline
    definition, and use it to include splitting into optimize-tree.
    Finally, fix correct-loop-levels: it must invalidate the
    level cache, or the destructive changes may be ignored.
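The grouping rule can be sketched as follows, in Python rather than the project's Lisp. The sketch assumes each term carries a loop level, that a smaller level means an outer loop, and that the threshold reads as "at least 3 outer terms, or at least 2 non-numeric ones"; these readings and all names are assumptions.

```python
def split_by_level(terms, inner_level):
    """Group outer-loop terms of a flat expression into one sub-term.

    `terms` is a list of (value, level) pairs; a value is a number or a name.
    """
    outer = [v for v, level in terms if level < inner_level]
    inner = [v for v, level in terms if level >= inner_level]
    non_numeric = [v for v in outer if not isinstance(v, (int, float))]
    if len(outer) >= 3 or len(non_numeric) >= 2:
        # The tuple is the new term, computable once in the outer loop.
        return [tuple(outer)] + inner
    return [v for v, _ in terms]  # too few outer terms: leave unchanged
```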
Commits on Aug 8, 2009
  1. Reorganize flattening to work on canonical expressions.

    angavrilov committed Aug 8, 2009
    As a positive side effect, add automatic removal of
    redundant items, e.g. (- a a) or (* a (/ a)).
  2. Use a structure for the part of the range that must be shared.

    angavrilov committed Aug 8, 2009
    The refactoring code destructively modifies some of the index
    range parameters, so they must remain shared throughout the
    expression restructuring passes. When the ranging spec is a
    list, ensuring this is difficult.
    This changes the mutable part of the range specification into
    a structure, providing macros to easily create and match such
    nodes. Code throughout the library is switched to the new format.
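The removal of redundant items mentioned in the "Reorganize flattening" commit above can be illustrated with a small Python sketch. It assumes a flattened sum is a list of (sign, name) pairs; the representation and names are hypothetical. The same idea applies to products if the sign is read as an exponent.

```python
def cancel_terms(signed_terms):
    """Add up the signs per term and drop terms whose net count is zero."""
    net = {}
    for sign, name in signed_terms:
        net[name] = net.get(name, 0) + sign
    # dict preserves insertion order, so surviving terms keep their order.
    return [(count, name) for name, count in net.items() if count != 0]
```

For example, the flattened form of (+ b (- a a)) reduces to just b.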