This page contains a survey of Processing-In-Memory (PIM) and Near-Data-Processing (NDP) papers. To distinguish PIM from NDP (from a technology perspective), we assume that a PIM architecture either performs analog computation using the memory array itself, or integrates digital computing logic and memory components on the same die; whereas an NDP architecture implements computing logic and memory components on separate dies. Under this categorization, recent 3D-stacking-based designs belong to the NDP category.
From an architecture perspective, although some hardware uses memory technology to implement computation, it still serves as an accelerator for the host (for example, attached to PCIe as a slave device). These designs assume a physical address space separate from the host processor's, and kernel execution is similar to that of a GPU (data copy --> kernel launch --> finish computation --> data copy back). In contrast, some designs, though categorized as "NDP" in our survey, are truly "processing-in-memory" from an architectural standpoint. For example, "HMC + logic layer" can be used both as a memory device (read and written by the host) and as a computation device (via computation offloading). Also, some designs with relatively large on-chip managed memory (for example, a GPU's scratchpad memory, or DianNao's eDRAM) should instead be categorized as "memory-rich processors": these memories are local to the processor and have no computing capability, so we do not include such papers in our survey.
We only include circuit-, architecture-, and system-level research (the list is expected to grow as we add more new and older papers).
We collect all related papers (including some that are not a perfect match) in the PIM / NDP domain; they are arranged in chronological order on the following page:
- The list of papers (continuously updated) (1989 - 2020):
The following image shows the trend in PIM / NDP publication count, together with the bandwidth trends for commodity DRAM, GDDR, and HBM. As bandwidth growth has slowed in recent years, more and more researchers are exploring PIM / NDP technology to tackle the memory wall.
The outline of the survey:
Application Scenario Marker
- General Purpose
- Machine Learning / Neural Network
- Graph Processing
- Bioinformatics
- Data Analytics
- Associative Computing
- Automata Computing
- Data Manipulation
- Security
- Others
[IEEE Transactions on Computers 1970][A Logic-in-Memory Computer]
Arch: small processing elements are combined with small amounts of RAM to provide a distributed array of memories that perform computation
[IEEE Database 1981][The NON-VON Database Machine: An Overview]
Arch: small processing elements are combined with small amounts of RAM to provide a distributed array of memories that perform computation
[WoNDP 2013][A Processing-in-Memory Taxonomy and a Case for Studying Fixed-function PIM]
[Micro 2014][Near-Data Processing: Insights from a MICRO-46 Workshop]
[MemSys 2016][Data-Centric Computing Frontiers: A Survey On Processing-In-Memory]
[IEEE Solid-State Circuits Magazine 2016][Making the Case for Feature-Rich Memory Systems: The March Toward Specialized Systems]
[Advances in Computers 2017][Simple Operations in Memory to Reduce Data Movement]
[Nature Electronics 2018][The future of electronics based on memristive systems]
[arXiv 2018][Neuro-memristive circuits for edge computing: A review]
[GLVLSI 2020][Security Challenges of Processing-In-Memory Systems]
[GLVLSI 2020][A Review of In-Memory Computing Architectures for Machine Learning Applications]
[GLVLSI 2020][Modeling and Benchmarking Computing-in-Memory for Design Space Exploration]