- Seattle, WA
Xbox 360 Emulator Research Project
google/tracing-framework Public archive
Web Tracing Framework libraries and extensions.
A retargetable MLIR-based machine learning compiler and runtime toolkit.
An advanced WebGL debugging toolkit
google/xrtl Public archive
Cross-platform Real-Time Rendering Library
1,458 contributions in the last year
Contributed to openxla/iree, benvanik/iree-template-runtime-cmake, benvanik/obs-ml-filter and 6 other repositories
Created a pull request in openxla/iree that received 3 comments
Adding compilation reentrancy tests and new HAL pipeline phases.
--compile-to= phases are supported:
executable-sources: run just past interface materialization where
hal.executable ops with target confi…
+262 −45 • 3 comments
Opened 14 other pull requests in 1 repository
- Documenting and updating test for NULL thread priority override.
- Reworking CUDA channel creation and plumbing group/ID.
- Upload initial_data when returning existing cached buffers.
- Finally moving VM type registration to iree_vm_instance_t.
- Adding a local executable plugin mechanism.
- Adding FatELF support to the embedded ELF loader.
- Adding better diagnostics on input shape mismatch.
- Updating architecture diagram with "plugins" in a few places.
LLVMCPUin files and flags.
- Moving LLVMCPU asm listing to the dump-executable-intermediates flag.
- Making iree-benchmark-trace work with stateful traces.
- Adding state functionality to iree-run-trace and improving ergonomics.
- Adding min/max VM ops and VM buffer allocation alignment.
- Starting support for HAL dispatch specialization.
Reviewed 38 pull requests in 2 repositories
25 pull requests
- Add smoke tests to test inline hal compilation flow
- Fix errors in pad consumer fusion with integrate PR #12653.
- Add compiler CAPI for output to memory and mapping.
- Add C APIs sufficient for tunneling LLVM command line handling into derived tools.
- IREE dispatch profiler for functional verification and performance profiling
- Fix the generated cmake files due to PR conflicts
- Update file paths in generate_exports.py and run.
- Ukernel tools: port to C and generalize
- ukernel/arch/arm_64: simplify build, allow non-GCC-compatible toolchains
- Simplify ARM logic and detect Win/ARM64.
- Updating CPU feature flag naming for consistency.
- Do not fuse producer-consumer if they have no common outer parallel loops
- [hal][cts] Add more tests for drivers device creation APIs
- Use iree_string_view_atoi_int32 in CUDA driver module flag parsing.
- Add math.tan expansion to polynomial approximation
- Be consistent about the use of IREE Bazel macros
- [flow] Add ops and passes for dynamicize static shapes
- Print meaningful error when too much shared memory is allocated on CUDA
- Handle complex in primitive type lowering
- [WIP] Add --iree-llvmcpu-linker-flags
- [LLVMGPU] Optimize shared memory allocation size
- [flow] Fix dispatch.region parser operand order
- Move compiler APIs to final locations.
- [vulkan] Add flag to enable robust buffer access
- Centralize CPU features, fix iree_cpu_lookup_data_by_key, add a…
1 pull request
Created an issue in openxla/iree that received 3 comments
WebGPU/tint SPIR-V lowering relying on detensorizing.
While testing #12503 I had to disable the linalg detensorize pass. When I did I noticed some unique compilation failures in tint that were exposed.…