Transfers between devices need to be easier to handle.
The current approach of syncing from one backend to another is not sufficient and does not scale as more backends are added.
There are two cases to consider, each with its own fallback:
1. Inter-framework
   Fallback: Framework A -> Native -> Framework B
2. Inter-device (if the framework does not handle it itself, e.g. CUDA, afaik)
   Fallback: Framework A/Device A -> Native -> Framework A/Device B
Note that the transfer matrix is supposedly symmetrical, but the transfer functions in mirrored cells are not identical. A read is not a write, after all.
Note that this scales very quickly: if a fallback path becomes a bottleneck, a specialized function can be registered for that pair. If not, and host memory is sufficient, the fallback through Native is used by default.
Note that the entries to and from Native are always populated.
Note that a single big framework matrix may be best suited, with an inter-device matrix within each framework if necessary.
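The matrix-with-fallback idea described above could be sketched roughly as follows. This is a minimal illustration, not the existing implementation: `Location`, `TransferMatrix`, `Route`, `TransferFn`, and `copy_bytes` are all hypothetical names, and plain byte buffers stand in for real device memory.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a framework/device pair.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Location {
    Native,
    Cuda(usize),
    OpenCl(usize),
}

/// Hypothetical transfer signature: copy `src` into `dst`.
pub type TransferFn = fn(src: &[u8], dst: &mut Vec<u8>);

/// Which route a transfer ended up taking.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    Direct,
    ViaNative,
}

/// Sparse transfer matrix keyed by (source, destination).
/// (A, B) and (B, A) are separate cells, since read is not write.
#[derive(Default)]
pub struct TransferMatrix {
    entries: HashMap<(Location, Location), TransferFn>,
}

impl TransferMatrix {
    /// Register a specialized transfer for one cell of the matrix.
    pub fn register(&mut self, src: Location, dst: Location, f: TransferFn) {
        self.entries.insert((src, dst), f);
    }

    /// Use the direct cell if present, otherwise stage through Native.
    /// Returns `None` if neither route is available.
    pub fn transfer(
        &self,
        src: Location,
        dst: Location,
        data: &[u8],
        out: &mut Vec<u8>,
    ) -> Option<Route> {
        if let Some(f) = self.entries.get(&(src, dst)) {
            f(data, out);
            return Some(Route::Direct);
        }
        // Fallback: src -> Native -> dst, staged in host memory.
        let to_native = self.entries.get(&(src, Location::Native))?;
        let from_native = self.entries.get(&(Location::Native, dst))?;
        let mut host = Vec::new();
        to_native(data, &mut host);
        from_native(&host, out);
        Some(Route::ViaNative)
    }
}

/// Stand-in for a real device copy (assumption: a plain byte copy).
pub fn copy_bytes(src: &[u8], dst: &mut Vec<u8>) {
    dst.clear();
    dst.extend_from_slice(src);
}
```

With only the `Cuda(0) -> Native` and `Native -> OpenCl(0)` cells registered, a `Cuda(0) -> OpenCl(0)` transfer takes the `ViaNative` route. Registering a direct function for that cell later switches it to `Direct` without touching any calling code.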
Backends may define transfers asymmetrically; for example, CUDA may know how to transfer to and from the Native backend, while Native may know nothing about CUDA at all. So if the first attempt fails, we swap the order and try again.
Removing that would require moving the logic into the Sync implementations, which could increase complexity. That is a disadvantage, but handing the responsibility to the frameworks would make adding other frameworks less of a hassle, since the core codebase would not need to be aware of individual framework pairs (e.g., transferring from CUDA to OpenCL).
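The "try one order, then the other" behaviour can be sketched as below. This is only an illustration under assumed names: the `Backend` trait, `knows_transfer_with`, `Handler`, and `pick_handler` are hypothetical and do not mirror the actual Sync API.

```rust
/// Hypothetical framework tag.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Framework {
    Native,
    Cuda,
    OpenCl,
}

/// Hypothetical backend interface: each backend only declares
/// which other frameworks it knows how to exchange data with.
pub trait Backend {
    fn framework(&self) -> Framework;
    fn knows_transfer_with(&self, other: Framework) -> bool;
}

pub struct NativeBackend;
pub struct CudaBackend;

impl Backend for NativeBackend {
    fn framework(&self) -> Framework {
        Framework::Native
    }
    // Native knows nothing about the other frameworks.
    fn knows_transfer_with(&self, other: Framework) -> bool {
        other == Framework::Native
    }
}

impl Backend for CudaBackend {
    fn framework(&self) -> Framework {
        Framework::Cuda
    }
    // CUDA knows how to move data to and from Native.
    fn knows_transfer_with(&self, other: Framework) -> bool {
        matches!(other, Framework::Native | Framework::Cuda)
    }
}

/// Which of the two backends ends up performing the transfer.
#[derive(Debug, PartialEq, Eq)]
pub enum Handler {
    First,
    Second,
}

/// Ask the first backend; if it cannot help, swap the order and ask the second.
pub fn pick_handler(a: &dyn Backend, b: &dyn Backend) -> Option<Handler> {
    if a.knows_transfer_with(b.framework()) {
        Some(Handler::First)
    } else if b.knows_transfer_with(a.framework()) {
        Some(Handler::Second)
    } else {
        None
    }
}
```

Here `pick_handler(&NativeBackend, &CudaBackend)` resolves to the second backend, because only CUDA knows about the pair.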
This may be a case of over-engineering, though. Transferring from framework-x to framework-y is rarely, if ever, done.