The plan is for FD2.2 to use a ring buffer. With the ring buffer, the binaries can be packed in DRAM and written with one linear write.
However, with multiple kernel groups, we'd still pay the DRAM latency for each kernel group, which will dominate for typical kernel sizes.
To address this both now and post ring buffer, we could:
Create a packed read command that reads from multiple DRAM locations.
Either modify the existing packed write command to handle larger amounts of data (currently limited to 1 page per write) or create a packed_write_large command for larger transfers.
Today, do a packed read followed by a packed_write for all binaries. Post ring buffer, we could pack the binaries for each kernel group in DRAM. Single kernel groups would then do a read and a linear write, while multiple kernel groups would do a packed read and a packed write (with fewer total transfers).
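To make the latency argument concrete, here is a rough host-side sketch in Python. It is not the dispatcher's actual command format: `SubRead`, `packed_read`, and `estimated_read_time_ns` are hypothetical names, DRAM is modeled as a plain byte string, and the timing model assumes a packed read overlaps its sub-reads well enough to pay the DRAM latency only once. That overlap is an assumption, not a measured result.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubRead:
    """One region to fetch (hypothetical descriptor, not a real command field)."""
    addr: int    # byte offset of a kernel-group binary in the modeled DRAM
    length: int  # size of that binary in bytes


def packed_read(dram: bytes, sub_reads: List[SubRead]) -> Tuple[bytes, List[int]]:
    """Gather several DRAM regions into one contiguous buffer.

    Returns the gathered bytes and the start offset of each region in them.
    """
    out = bytearray()
    offsets = []
    for sr in sub_reads:
        if sr.addr < 0 or sr.length < 0 or sr.addr + sr.length > len(dram):
            raise ValueError(f"sub-read out of range: {sr}")
        offsets.append(len(out))
        out += dram[sr.addr:sr.addr + sr.length]
    return bytes(out), offsets


def estimated_read_time_ns(sizes: List[int], latency_ns: float,
                           bytes_per_ns: float, packed: bool) -> float:
    """Toy model: one latency per separate read, or a single latency if packed."""
    if not sizes:
        return 0.0
    transfer_ns = sum(sizes) / bytes_per_ns
    round_trips = 1 if packed else len(sizes)
    return round_trips * latency_ns + transfer_ns
```

With placeholder numbers (three 1000-byte binaries, 500 ns latency, 10 bytes/ns), the model gives 1800 ns for separate reads versus 800 ns for a packed read. The values are illustrative only; the point is that the per-group latency term shrinks while the transfer term stays the same.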
For context, FD2 currently doesn't pack kernel binaries.