Add TensorForge [Tracking PR] #1102
base: master
Codecov Report
Attention: Patch coverage is
@@            Coverage Diff             @@
##           master    #1102      +/-   ##
==========================================
- Coverage   14.33%   13.79%   -0.55%
==========================================
  Files         253      268      +15
  Lines       14303    15011     +708
==========================================
+ Hits         2051     2071      +20
- Misses      12252    12940     +688
a21fae0 to 22cc55d
We add TensorForge as a replacement for, and amalgamation of, Yateto, gemmforge and chainforge. As a result, we can handle tensors (and thus fused simulations etc.), and we no longer need any extra Python packages for GPU support.
A secondary goal (maybe part of a future PR) is to remove the explicit SYCL code (i.e. DR) and move everything into the code generation (including plasticity etc.). This should hopefully give simpler code, since we can merge the CPU and GPU paths, and faster code, by merging kernels and reducing latencies. It should also reduce the build complexity on Nvidia and AMD systems, except if we want to use SYCL kernels or SYCL as the runtime in the background.
A tertiary goal (maybe part of a future PR) is to get rid of the scratchpads (by kernel merging), so that we can play around with scheduling much more.
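To illustrate what kernel merging means for the scratchpads, here is a minimal plain-Python sketch. It is not TensorForge, Yateto or SeisSol code; the function names (`matmul`, `two_kernels`, `fused_kernel`) and the two-GEMM chain are assumptions chosen only for illustration. The unfused variant materialises the full intermediate product in a scratch buffer between two kernels, while the fused variant keeps only a local temporary inside a single kernel.

```python
def matmul(a, b):
    """Naive dense matrix product on nested lists (stand-in for one GEMM kernel)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def two_kernels(a, b, c):
    """Unfused: kernel 1 writes the whole intermediate to a scratchpad, kernel 2 reads it back."""
    scratch = matmul(a, b)      # hypothetical scratchpad holding all of A*B
    return matmul(scratch, c)   # second kernel launch consumes the scratchpad


def fused_kernel(a, b, c):
    """Fused: one pass per output row; the intermediate lives in a local temporary only."""
    out = []
    for row in a:
        # tmp plays the role of registers/shared memory inside a single merged kernel
        tmp = [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        out.append(
            [sum(tmp[k] * c[k][j] for k in range(len(c))) for j in range(len(c[0]))]
        )
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[0, 1], [1, 0]]
    c = [[2, 0], [0, 2]]
    # Both variants compute (A*B)*C; only the storage of the intermediate differs.
    print(two_kernels(a, b, c) == fused_kernel(a, b, c))
```

Because the fused variant never needs a globally visible intermediate buffer, the order in which rows (or, on a GPU, elements or blocks) are processed is not tied to a scratchpad lifetime, which is the scheduling freedom mentioned above.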
We also begin to add some basic support for `viscoelastic2` and `poroelastic` on GPUs; note that both kernels still don't work at the moment.

Still WIP, but already with some progress (currently, the codegen is still a bit broken):
- `viscoelastic2` and `poroelastic` kernels

(parts of this PR may be spun out)