# Snehasish Kumar

## Research Interests

- Scalable compiler-directed workload analysis
- Hardware-software co-design for specialized architectures
- Core micro-architecture with a focus on the cache memory hierarchy

## Education

05/13 - 11/16 **PhD in Computing Science**, Simon Fraser University, British Columbia, Canada, 4.0/4.0.

Senior Supervisor: Dr. Arrvindh Shriraman

My research is directed at facilitating energy-efficient computation via specialization. I have adopted a two-pronged approach. First, a top-down approach uses LLVM-based program analysis to identify program regions amenable to specialization. Second, a bottom-up approach evaluates architectural specialization to enable the efficient offload of accelerated program regions.

The former *workload-first* approach uses program analysis to analyse and reconstruct program regions to aid the design and evaluation of specialized accelerators. A study of twenty-nine workloads revealed significant merit in path-granularity analysis for specialization (IISWC'16). Further analysis of instruction dependency chains in frequently executed paths revealed opportunities for specialized macro-instructions (MICRO'16). An insight into the nature of frequently occurring paths led to the development of a new program abstraction for accelerators (submitted HPCA'17). Robust alias analysis at the path granularity also enabled low-overhead memory access interfaces for accelerators (submitted ASPLOS'17). I am also leading an ongoing effort to transparently generate application binaries with specialized regions offloaded to a tightly coupled FPGA substrate.

For the latter *architecture-first* approach, I designed and evaluated a hardware accelerator for software data structures. Data structure accesses and the computation on them are offloaded to an array of processing elements that is tightly coupled to the last-level cache (ICS'15). I also evaluated a specialized coherence protocol for fixed-function accelerators (ISCA'15), which improves performance and reduces energy consumption by mitigating redundant data movement.

Publications: IISWC'16, MICRO'16, ICS'16, ISCA'15, ICS'15

01/11 - 04/13 **MSc in Computing Science**, Simon Fraser University, British Columbia, Canada, 3.8/4.0.

Senior Supervisor: Dr. Arrvindh Shriraman

Designed and evaluated a variable granularity cache memory hierarchy. The system adaptively varies the cache line size to avoid fetching data that the application does not use. Workloads benefited from increased effective cache space: overall cache miss rates improved and dynamic energy consumption was reduced. The proposed architecture was modeled using the Ruby memory system simulator and evaluated on twenty-two workloads drawn from popular benchmark suites. Subsequent work evaluated a variable granularity cache coherence protocol.

Publications: ISCA'13, MICRO'12

08/06 - 04/10 **B. Tech in Computer Engineering**, Biju Patnaik University of Technology, Orissa, India, 8.3/10.0.

Supervisor: Dr. Satyananda Champati Rai

Designed and implemented a genetic algorithm to address the problem of channel allocation in cellular networks. The algorithm computes a pseudo-optimal borrowing scheme amongst neighbouring cells. The implementation used variable separation to reduce the search space. The approach improved over the state of the art and consistently computed near-optimal solutions.

## Publications

2016 ChainSaw: Creating Von-Neumann Accelerators with Fused Instruction Chains,

Amirali Sharifian, <u>Snehasish Kumar</u>, Apala Guha, and Arrvindh Shriraman, In *Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture*, MICRO 2016.

SPEC-AX: Extracting Accelerator Benchmarks from Microprocessor Benchmarks,

<u>Snehasish Kumar</u>, Nick Sumner, and Arrvindh Shriraman, In *Workload Characterization (IISWC), 2016 IEEE International Symposium on*, IISWC 2016.

Peruse and Profit: Estimating the Accelerability of Loops,

<u>Snehasish Kumar</u>, Vijayalakshmi Srinivasan, Amirali Sharifian, Nick Sumner, and Arrvindh Shriraman, In *Proceedings of the 30th ACM International Conference on Supercomputing*, ICS 2016.

2015 Fusion: Design Tradeoffs in Coherence Hierarchies for Accelerators,

<u>Snehasish Kumar</u>, Arrvindh Shriraman, and Naveen Vedula, In *Proceedings of the 42nd Annual International Symposium on Computer Architecture*, ISCA 2015.

DASX: Hardware Accelerator for Software Data Structures,

<u>Snehasish Kumar</u>, Naveen Vedula, Arrvindh Shriraman, and Vijayalakshmi Srinivasan, In *Proceedings of the 29th ACM International Conference on Supercomputing*, ICS 2015.

2013 Protozoa: Adaptive Granularity Cache Coherence,

Hongzhou Zhao, Arrvindh Shriraman, <u>Snehasish Kumar</u>, and Sandhya Dwarkadas, In *Proceedings of the 40th Annual International Symposium on Computer Architecture*, ISCA 2013.

Architectural Support for a Variable Granularity Cache Memory System,

<u>Snehasish Kumar</u>, MSc Thesis.

2012 Amoeba-Cache: Adaptive Blocks for Eliminating Waste in the Memory Hierarchy,

<u>Snehasish Kumar</u>, Hongzhou Zhao, Arrvindh Shriraman, Eric Matthews, Sandhya Dwarkadas, and Lesley Shannon, In *Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture*, MICRO 2012.

## Workshops, Posters & Presentations

- 01/16 SRC India Design Review, Intel Bangalore: CoolCaches
- 06/15 SFU-ZU workshop on Big Data: Data Structure Accelerators
- 12/13, 08/14 WoNDP'13, PACT'14: SQRL: Hardware Accelerator for Collecting Software Data Structures

## Awards

- 08/16 President's PhD Scholarship, Simon Fraser University
- '16, '14, '12 Graduate Fellowship, Simon Fraser University
- 01/14 Special Graduate Entrance Scholarship, Simon Fraser University

## Projects

- 01/15 Networks: Parallel implementation of the Kou, Markowsky and Berman (1981) algorithm
- 04/14 Natural Language Processing: Optimizing the Bitpar CKY parser
- 12/11 Computational Geometry: Interactive demo for the Linear Cell Complex (CGAL)
- 04/11 Machine Learning: Non-Negative Matrix Factorisation for large datasets

## Professional and Academic Experience

06/13 - 12/13 Research Intern: Systems Technology and Architecture

IBM T.J. Watson Research Center

'11 - '16 Research Assistant: SYNAR Group, Simon Fraser University

## Skills

Languages: C++11, C, Python

Frameworks: LLVM Compiler Infrastructure, Intel PIN

Simulators: Multifacet GEMS (Ruby), MacSim