Paper title:

Newton: A DRAM-maker’s Accelerator-in-Memory (AiM) Architecture for Machine Learning

Publication:

MICRO’2020

Problem to solve:

Memory-bound models in machine learning require unprecedented high-bandwidth connection between compute and memory.
Previous PIM (processing-in memory) and PNM (processing-near memory) approaches advocate full processor cores which do not conform to PIM’s severe area and power constraints.
For digital PIM, compute area and power are constrained to avoid excessive DRAM density loss and thermal challenges.

Major contributions:

Proposed Newton, which places a minimal compute of only multiply-accumulate units and buffers in the DRAM which avoids the full-core area and power overheads of previous work.
Employed a DRAM-like interface for the host to issue commands to the PIM compute.
Introduced optimizations to prevent the PIM-host interface from becoming a bottleneck.
Employed an unusually wide interleaved layout for the matrix to capture output vector reuse with reasonable buffering.

Provide feedback

Saved searches