Switch branches/tags
Nothing to show
Commits on Aug 4, 2012
  1. Move catstream and *_onto behind-the-scenes.

    dubiousjim committed Aug 4, 2012
    The programmer shouldn't have to deliberate about whether to use map or
    map_onto, and so on. Let's make them compile-time optimizations instead,
    on the same model as Pure's compile time optimizations for ``xs!!(i..j)``.
    The patterns have to be visible to the compiler. So these will be
        map f xs + base;
        cat [map f xs, more, stillmore];
    but these won't:
        ys + base when ys = map fs xs end;
        plus (map f xs) base when plus = (+) end;
    Also, we make the optimizations only available when --enable=list-opt.
    Changes implemented in this commit:
    * Move reverse_onto into __std__; it's not intended to be called directly
      outside a few core stdlibs. Instead users should just use
      ``reverse xs + base``, it will automatically be optimized. The
      optimization handles the detection of string bases properly.
    * Move map_onto and revmap_onto into __std__; these also are not
      intended to be called directly, but will automatically be invoked
      as optimizations for ``map f xs + base`` and so on.
    * Don't expose `zip*_onto`, but also invoke these as compile-time
    * Eliminate catmap_onto. The semantics of this was especially hard to
      remember. Was it equivalent to ``catmap f xs + base``, so that
      `cat` is not applied to `base`? Or to: ``cat (map_onto f base xs)``,
      so that `cat` *is* applied to base? In fact the reverse-based and
      skip-ahead implementations I had committed earlier diverged in that
    * Move catstream into __std__. This will automatically be invoked
      when the user says ``stream xs + ...`` or ``cat [stream xs, ...]``
      or ``cat(map stream xs + ...)`` and so on.
      To get this optimization, operands have to be literally wrapped with
      ``stream`` in the source, not just be streams. But the semantics
      should be the same even if the slower, existing handling of
      stream concatenation in the definitions of `+` and `cat` is used.
    The implementations of some of these optimizations are interleaved.
    For the macro optimizations, as far as I can see there
    is no feasible way to avoid declaring ``stream`` a nonfix. But note again
    that we only do this when --enable=list-opt.
    Note also that with regard to the stream handling, we make some
    assumptions about the semantics of list and stream operations. For
    example, we assume that (+) is associative with respect to those
    arguments and (+[]) is a right identity. These are of course true for
    actual lists and streams. But we further assume them to be true for
    expressions like ``stream xs`` and ``map stream xs``, even when
    those expressions don't reduce further.
    So for example, we count: ``cat(map stream xs) + cat ys`` as equivalent
    to ``cat(map stream xs + ys)``, for any ``xs``. We also count
    ``(stream xs + stream ys) + zs`` as equivalent to
    ``stream xs + (stream ys + zs)``.
    We *don't* assume that (+) is associative or (+[]) is an identity for
    arbitrary arguments.
    The assumptions made here seem reasonable ones, especially given the
    elegant way they enable us to move these optimizations behind the
    scenes. But we should carefully review them, and keep them in mind
    in case they introduce problems elsewhere.
    (Fewer assumptions would be needed if (+) were infixr rather than
    infixl, or if we could implement the optimizations making (+) a
    --quoteargs macro. I tried the latter, but in order to recurse on the
    operands while keeping a (+) applied to them, yet suppressing its
    macro definitions, the only way I can see is to use a const
    proxy for (+), and consts can't also be infix operators. So the result
    is that when one says ``:show my_code`` in the interpreter, everything
    would always look like:
        (+) x y
    instead of:
        x + y
    I don't know if that implementation strategy would in all other respects
    be as good or better than the one pursued here. I'm just reporting what
    persuaded me not to pursue it further.)
  2. Optimize list _::list.

    dubiousjim committed Aug 2, 2012
    Perhaps there's some subtle reason I'm missing why the list needed to
    be accumulated and reversed, but on the face of it, traversing
    then returning the original suffices.
  3. Special-case implementation of ==? for arrays and heaps.

    dubiousjim committed Aug 1, 2012
    To be used when `==` is not defined for all the elements. This is more
    efficient than recursing once for `==`, failing, then recursing again
    (in C) for `===`. More importantly, it gives better results when `==` is
    defined for some but not all of the elements.
    To prevent the prelude's general case rule from shadowing these
    defintions, though, we need to add special-case checks for these
    types to prelude.pure. Since we only have a small number of collections
    in the stdlib, this seems tolerable.
    If we're going to end up wanting special-case implementations of ==? for
    more and more collections though, it might become less tolerable. In
    that case, we may want to have a convention where, say, a module
    declares the result of ``tuple2 x y`` as belonging to the type
    ``egalable`` (a better name would be nice) if the module knows how to
    provide an implementation of ==? for that x and y. Then the prelude's
    general case ==? would just apply when ``tuple2 x y`` is not egalable.
    (I call it a "convention", but one might arguably characterize it
    as "magic".)
    The dynamic extensibility of Pure's types would make this work
    correctly. Until module X was loaded, its special-case rules for ==?
    would be ignored, and the general rule in the prelude would apply
    (if operands of the appropriate type were even constructable before
    module X was loaded). After module X was loaded, it would take over
    for the operands it wants.
    But I'm not sure how much of a performance hit there would be: all those
    function calls to check operand types every time ==? is applied. *Plus*
    building then discarding all those tuple2s.
    But let's cross that bridge if and when we have to. With the scheme presented
    here, we do have more operand type-checking, but we at least avoid
    building lots of temporary tuple2s.
Commits on Jul 31, 2012
  1. Update test/prelude*.log.

    dubiousjim committed Jul 31, 2012
  2. Tuple toys.

    dubiousjim committed Jul 31, 2012
  3. Add weak-{key,value} dicts and weak sets.

    dubiousjim committed Jul 31, 2012
    Also update Makefile.
  4. Permit chaining additional sentries on refs.

    dubiousjim committed Jul 30, 2012
    This scheme permits one to do things like this:
        gasp name::string x = case get_sentry x of
            unref next = sentry (unref (next . announcer)) x;
            get_sentry _ = sentry announcer x;
            _ = throw "sentry conflict"
        end with
            announcer _ = printf "%s is dying\n" name;
    This makes any object announce when it's dying, even if
    the object is a refcell. (Was useful for testing the weakref
  5. Low-level weakref system.

    dubiousjim committed Jul 30, 2012
  6. Add test092, test093.

    dubiousjim committed Jul 31, 2012
    These check the stream and [m]fold/[m]any/[m]all functions for dicts and
    sets. The first does so with avltrees, the second with trees23 (these
    generate different log output).
  7. Optimizations to set relations and operators.

    dubiousjim committed Jul 31, 2012
    Some operations used more traversals than necessary, and the
    intermediate coercions for mixed arguments could be avoided in all but a
    few cases.
    Note that we're making some assumptions here that may not always be
    true. (These assumptions were already being made before these
    optimizations.) Speciically, just because m1 is an ordered set, and so
    '<' is defined on its elements, and m2 is an ordered set, and so '<' is
    defined on its elements, it doesn't follow that '<' will be defined on
    elements taken pairwise from m1 and m2. Yet we assume that it will.
    If the user of this library wants to deal with such cases, she'll have
    to do the appropriate coercions herself. Perhaps this should be
    explained in the library documentation?
    I have inspected these optimizations carefully, and the existing tests
    all pass. But it would be good for other eyes to also review them
  8. Add union to set.

    dubiousjim committed Jul 31, 2012
  9. Add map to array.

    dubiousjim committed Jul 31, 2012
  10. Add folds, all/any, to baltrees, set, dict.

    dubiousjim committed Jul 31, 2012
    The simple fold methods in the baltree backends take an additional
    preprocessing function to apply to each value before applying the fold
    function to it. This lets us reuse some of the backend functions for
    several different high-level collections. Note that it doesn't suffice
    just to compose the preprocessing function with the fold function, when
    dealing with foldl1 and foldr1.
    The mhfold functions are most complex, and need to be special-cased.
    This is what we expose in the high-level API:
        In set.pure:
         * foldl, foldl1, foldr, foldr1, any, all -- bags yield their values
           one at a time
         * mfoldl, mfoldl1, mfoldr, mfoldr1, many, mall -- sets and bags
           both yield lists of values
        In dict.pure:
         * foldl, foldl1, foldr, foldr1, any, all -- mdicts yield their k=>v
           one at a time
         * mfoldl, mfoldl1, mfoldr, mfoldr1, many, mall -- all dicts yield
  11. Add all/any to array.

    dubiousjim committed Jul 31, 2012
  12. Add rotate to lists and array.

    dubiousjim committed Jul 31, 2012
    In an earlier commit (not on this branch) I had implemented for lists
    only what is here `rotate 1 xs`. That is equivalent to
    `last xs:init xs`, but uses only a single traversal.
    Since I wanted to supply a general rotate for arrays, it seemed easies
    to just give the list and array operations the same interface. We still
    follow an optimized path for lists when n == 1.
  13. Add multiplicity to set.

    dubiousjim committed Jul 31, 2012
    Requires mhcount in baltrees, for HBags.
  14. Deploy a pruned version of trees23.

    dubiousjim committed Jul 31, 2012
    Integrate trees23 into stdlib. Also roll out related changes to testing
    system: test015, test090.
    If you run Pure with --enable=trees23, then trees23 will be used (under
    the namespace "baltree") as the balanced tree. If you then also do
    "using avltrees" (no reason why you should), that module will use the
    namespace "avl".
    If you run Pure with --disable=trees23 (currently the default), then
    avltrees will be used for the implementation and *it* will occupy
    namespace "baltree". If you then also do "using trees23" (no reason why
    you should), that module will use the namespace "trees23".
  15. Refine/expand avltrees and dict.

    dubiousjim committed Jul 31, 2012
    * mdeletea uses same code as deletek
    * hdeletekv doesn't need to traverse past first matching key
      (but mhdeletekv does)
    * Rename: m[h]getk --> m[h]getka
              m[h]deletek --> m[h]deleteko
              m[h]deletekv --> m[h]deletekvo
    * Provide missing methods: m[h]getk{o,n}
    * Factor out method to "replace first x satisfying p with y,
      else add y to end"; in other places use general list functions.
    * Add grob* and delete_other to dict. (These are intended to be just
      placeholder names.)
  16. Cosmetic changes to avltrees.

    dubiousjim committed Jul 31, 2012