v1.2.24

@lisaong lisaong released this 10 Mar 07:49
· 5 commits to main since this release

What's Changed

  • Merged PR 3150: Change high precision fp to not perform contraction.
    [Mason Remy]

    Also changes the value library FMA to use the math dialect FmaOp, and
    vectorizes it to the vector dialect FMAOp.
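
To illustrate why contraction matters for high precision fp, here is a minimal sketch in plain Python (not Accera code). It assumes IEEE-754 doubles and emulates a fused multiply-add with exact rational arithmetic; the helper names mul_add_unfused and mul_add_fused are hypothetical and exist only for this example. Contracting a multiply and an add into one FMA rounds once instead of twice, so the two forms can produce different results:

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    # Two roundings: the product is rounded to a double, then the sum is rounded.
    return a * b + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    # Emulates an FMA: compute a*b + c exactly, then round once to a double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product needs more bits than a double holds.
a = b = 1.0 + 2.0**-27
c = -(1.0 + 2.0**-26)

unfused = mul_add_unfused(a, b, c)  # the 2**-54 term is lost when a*b is rounded
fused = mul_add_fused(a, b, c)      # the 2**-54 term survives the single rounding

print(unfused, fused)
```

With contraction disabled, a compiler keeps the two-rounding form, so results match what the source expression specifies.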

  • Merged PR 3147: Fix vector cast with same bitwidth. [Mason Remy]

    accv.cast vector<16xi8> to vector<16xui8>
    was erroneously lowered to
    cast vector<16xi8> to ui8

  • Merged PR 3149: Improve 1-D horizontal sum reductions for 8xf32 and
    8xi32. [Mason Remy]

  • Merged PR 3148: Adds Package level FP precision override. [Kern Handa]

  • Merged PR 3144: Removes fp precision as an option for Package.build.
    [Kern Handa]

    The fp-contract option used in accc.py was overriding the recently added function-level fp precision specification. Since each function now has an equivalent default, the option no longer needs to be passed to llc and opt at build time.

  • Merged PR 3143: Add dsl test for profiling op. [Denny Sun]

    1. Add a profiling enable flag to Package.build().
    2. Add a DSL test.

  • Merged PR 3022: Assert the arg order in debug mode. [Denny Sun]

    Dimension arg should precede array arg in the arg list for debug mode.

  • Merged PR 3137: expose profiling function to DSL. [Denny Sun]

  • Merged PR 3142: [Release] Tie accera-llvm versioning to LLVM version.
    [Lisa Ong]

    This change introduces a new versioning schema for accera-llvm that follows LLVM's versioning, while allowing for Accera versioned forks:

    <llvm_major>.<llvm_minor>.<llvm_micro><accera_micro> = (N+).(N+).(N+)(N{2})

    This overloads the micro version field due to constraints on Python versioning: https://peps.python.org/pep-0440/

    Examples:

    • Current LLVM fork is 14.0.6-2: accera_llvm.14.0.602, which means LLVM 14.0.6 + accera fork v2
    • If/when upgrading to LLVM 15.0.7: accera_llvm.15.0.700
    • Then when we rev the Accera fork to LLVM 15.0.7-1: accera_llvm.15.0.701

    Limitations:

    • The scheme assumes that the Accera fork version never needs more than 2 digits.

    Alternatives:

    • Omit the 0 delimiters, if we think Accera forks are unlikely to rev micro versions beyond a single digit. Note that Accera forks may rev more often if we don't update LLVM.
    • Use a dev version, e.g. accera_llvm.14.0.6.dev4. The downside is that this looks unofficial: devN is intended for developmental releases rather than official PyPI releases. That said, the whole Accera project is developmental :)

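
The accera-llvm version schema from PR 3142 can be sketched as a pair of helpers. This is a minimal illustration, not code from the Accera repository: the function names encode_accera_llvm_version and decode_accera_llvm_version are hypothetical, and the only rule assumed is the one stated above, namely that the last two digits of the micro field hold the Accera fork version.

```python
import re

# <llvm_major>.<llvm_minor>.<llvm_micro><accera_micro> = (N+).(N+).(N+)(N{2})
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(\d{2})$")


def encode_accera_llvm_version(llvm_major, llvm_minor, llvm_micro, accera_micro):
    """Build the version string; the fork version is zero-padded to 2 digits."""
    if not 0 <= accera_micro <= 99:
        raise ValueError("Accera fork version must fit in 2 digits")
    return f"{llvm_major}.{llvm_minor}.{llvm_micro}{accera_micro:02d}"


def decode_accera_llvm_version(version):
    """Split a version string into (llvm_major, llvm_minor, llvm_micro, accera_micro)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not an accera-llvm style version: {version!r}")
    return tuple(int(group) for group in match.groups())


# The three examples from the release note.
current = encode_accera_llvm_version(14, 0, 6, 2)
upgraded = encode_accera_llvm_version(15, 0, 7, 0)
revved = encode_accera_llvm_version(15, 0, 7, 1)
print(current, upgraded, revved)
print(decode_accera_llvm_version(current))
```

Because the fork version always occupies exactly two digits, decoding is unambiguous even when the LLVM micro version itself has more than one digit.
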
  • Merged PR 3139: Allows setting precision of fp ops per function. [Kern
    Handa]

  • Merged PR 3140: Fix bug with reinterpret casts of unrealized
    conversion casts. [Mason Remy]

    This happens when we do a heap alloc followed by a reinterpret cast, but
    it can come up in other scenarios too.

  • Merged PR 3135: [nfc] Add XeonE5 benchmark machine to targets, bump
    hatlib dependency. [Lisa Ong]

    Cache sizes and cache line sizes are best guesses based on: https://en.wikichip.org/wiki/intel/xeon_e5/e5-2673_v4

Full Changelog: v1.2.23...v1.2.24