## **Curriculum Vitae**

## Nachiket Ganesh Kapre

Nanyang Technological University

Assistant Professor

Date of Birth: September 2nd, 1980

Office: 6513 8042

Email: nachiket@ntu.edu.sg

Email: nachiket@imperial.ac.uk

Citizenship: India

## **Research Fields**

Reconfigurable Computing, Spatial Architectures, Parallel Processing, FPGA-based Systems

## **Current Positions**

**Assistant Professor**, School of Computer Engineering, Nanyang Technological University, 2012-present

Honorary Research Fellow, Department of Electrical and Electronic Engineering, Imperial College London, 2012-present

Chief Technology Officer, Plunify Inc, 2014-present

## **Education**

Ph.D. Computer Science, California Institute of Technology

September 2010

Thesis: SPICE<sup>2</sup>: A Spatial Parallel Architecture for Accelerating the SPICE Circuit Simulator

Committee: André DeHon (UPenn), Steven Trimberger (Xilinx), Shuki Bruck (Caltech), Alain Martin (Caltech), Dan Meiron (Caltech)

M.S. Computer Science, California Institute of Technology

June 2006

Thesis: Packet-Switched FPGA-Overlay Networks

M.S. Electrical Engineering, California Institute of Technology

June 2005

**B.E. Electronics and Telecommunication Engineering**, University of Pune, India

August 2002

Thesis: FPGA-based Testing System for Siemens Railway Signalling Relays

Ranked 1st in a class of 773 students in the Sophomore, Junior, and Senior years

## **Previous Employment**

**Imperial College London** Junior Research Fellow, Oct 2010–Sept 2012

An Imperial College fellowship awarded through a competitive selection process.

Maxeler Inc. Consultant, July 2011–July 2012

Working towards commercialization of certain aspects of previous PhD research.

**University of Pennsylvania** Visiting Graduate Student, Fall 2006–Fall 2010

Working remotely towards a Caltech PhD

Xilinx Inc. Summer Intern, Summer 2005

Developed architectural modeling tools for memory controllers in streaming designs

Koch Lab (Caltech) Research Assistant, Spring 2004

Worked on analysis and engineering of a parallel, streaming saliency-detector using FPGAs

Paxonet Communications Inc. (now Conexant) Design Engineer, 2002–2003

Worked on design and verification of ASIC/FPGA IP cores for optical telecommunication protocols

Siemens Inc. Intern, 2002

Worked on automated testing of mechanical relays

## **Teaching**

Course Co-ordinator for CE4054/ES6154: **Programmable System-on-Chip**, Semester 2 2013, 2014, 2015: Nanyang Technological University

Designing SoC-based applications and hardware for final year undergraduate and MSc students.

Course Co-ordinator for CE4052/ES6152: **Embedded Software Development**, Semester 1 2013 (co-lecturer), 2014, 2015: Nanyang Technological University

Created Android-centric programming curriculum for final year undergraduate and MSc students.

Course Co-ordinator for CE7451: **Research Methods in Computer Science and Computer Engineering**, Semester 1 2013 (new course): Nanyang Technological University

Developed and managed a course on core research topics relevant to PhD and MSc students at NTU.

Course Co-ordinator for ES7501: **Electronic Design Automation**, Semester 1 2013 (new course) : Nanyang Technological University

Developed a course on the underlying algorithms in FPGA CAD flow from design capture down to place-and-route tools.

Tutorials and Labs for CE1005: **Digital Logic**, Semester 2 2013, Semester 1 2014: Nanyang Technological University

Prepared and delivered tutorial materials each week and conducted lab sessions for first-year undergraduate students.

Tutorials and Labs for CE3001: **Advanced Computer Architecture**, Semester 1 2015 : Nanyang Technological University

Prepared and delivered tutorial materials each week and conducted lab sessions for second- and third-year undergraduate students.

Guest Lecturer for DoC: **Custom Computing**, Winter 2011: Imperial College London

Prepared and delivered a guest lecture on "Stream Programming" for an undergraduate class managed by Prof. Wayne Luk.

Guest Lecturer for ISE2: **Computer Architecture**, Fall 2011: Imperial College London

Prepared and delivered two new lectures for a second-year undergraduate class as part of the Integrated Systems Engineering curriculum.

TA for ESE68os2: **Computer Organization**, Spring 2007: University of Pennsylvania

Conducted office hours and helped with grading homework assignments.

TA for CS137: **Electronic Design Automation**, Winter 2006: California Institute of Technology

Prepared and delivered project lectures for PhD/research students.

## **Professional Activities**

Technical Program Co-Chair for IEEE International Conference on Field-Programmable Technology 2015

Program Committee Member for IEEE International Conference on Field-Programmable Custom Computing Machines 2013, 2014, 2015, 2016.

Program Committee Member for IEEE International Conference on Field-Programmable Technology 2011, 2012, 2013, 2014, 2015, 2016.

Program Committee Member for International Conference on Field-Programmable Logic and Applications 2013, 2014, 2015, 2016.

Program Committee Member for ASAP 2015, 2016.

Program Committee Member for RAW workshop 2015, 2016.

Program Committee Member for RECOSOC 2012, 2013.

Program Committee Member for HEART 2012, 2013, 2014, 2015, 2016.

Reviewer for ACM Transactions on Reconfigurable Technology and Systems

Reviewer for Design Automation Conference

Reviewer for Design and Test of Computers

Reviewer for IEEE Transactions on Computers

Reviewer for IEEE Symposium on Circuits and Systems

Professional Member of the IEEE and ACM

## **Publications**

**Journal Articles**

### "Optimizing Soft Vector Processing in FPGA-based Embedded Systems"

Nachiket Kapre

ACM Transactions on Reconfigurable Technology and Systems, Upcoming

### "A Case for Embedded FPGA-based SoCs in Energy-Efficient Acceleration of Graph Problems"

Pradeep Moorthy, Nachiket Kapre

Supercomputing Frontiers and Innovations (Special Best Papers Issue), Upcoming

## "Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs" (11 pages)

Abid Rafique, George Constantinides, Nachiket Kapre

IEEE Transactions on Parallel and Distributed Systems Volume 26 Issue 1 Page 24-34, January 2015

## "SPICE<sup>2</sup> - Spatial Processors Interconnected for Concurrent Execution for Accelerating the SPICE Circuit Simulator using an FPGA" (14 pages)

Nachiket Kapre, and André DeHon

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Special Issue on Parallel CAD) Volume 31 Issue 1 Page 9-22 January 2012

## "An NoC Traffic Compiler for efficient FPGA implementation of Sparse Graph-Oriented Workloads" (12 pages)

Nachiket Kapre, and André DeHon

International Journal of Reconfigurable Computing Volume 2011 Article ID 745147 (Open-Access article)

## "Spatial Hardware Implementation for Sparse Graph Algorithms in GraphStep" (20 pages)

Michael deLorimier, Nachiket Kapre, Nikil Mehta and André DeHon

ACM Transactions on Autonomous and Adaptive Systems: Spatial Computing Special Issue Volume 6 Issue 3 Page 17:1-17:20 September 2011

## "Pipelined Saturated Accumulation" (11 pages)

Karl Papadantonakis, Nachiket Kapre, Stephanie Chan, and André DeHon

IEEE Transactions on Computers Volume 58 Issue 2 Page 208-219 February 2009

### Conference Articles

## "Marathon: Statically-Scheduled Conflict-Free Routing on FPGA Overlay NoCs" (8 pages)

Nachiket Kapre

International Symposium on Field Programmable Custom Computing Machines, May 2016

## "Improving Classification Accuracy of a Machine Learning approach for FPGA Timing Closure" (4 pages)

Que Yanghua, Nachiket Kapre, Harnhua Ng, Kirvy Teo

International Symposium on Field Programmable Custom Computing Machines, May 2016

## "GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of FPGA Datapaths" (8 pages)

Nachiket Kapre, Ye Deheng

International Symposium on Field-Programmable Gate Arrays, Feb 2016

### "Reliable Timing Closure for FPGA Designs through Machine Learning" (4 pages)

Que Yanghua, Chinnakkannu Adaikkal Raj, Harnhua Ng, Kirvy Teo and Nachiket Kapre

International Symposium on Field-Programmable Gate Arrays, Feb 2016

## "Hoplite: Building Austere Overlay NoCs for FPGAs" (6 pages)

Nachiket Kapre, Jan Gray

International Conference on Field Programmable Logic and Applications, Sep 2015

Michal Servit Best Paper Award

## "Limits of FPGA Acceleration of 3D Green's Function Computation for Geophysical Applications" (6 pages)

Nachiket Kapre, Selvakumar Jayakrishnan, Parjanya Gupta, Sagar Masuti, Sylvain Barbot

International Conference on Field Programmable Logic and Applications, Sep 2015

## "Custom FPGA-based Soft-Processors for Sparse Graph Acceleration" (8 pages)

Nachiket Kapre

International Conference on Application-Specific Systems, Architectures and Processors, July 2015

## "Zedwulf: Power-Performance Tradeoffs of a 32-node Zynq SoC cluster" (8 pages)

Pradeep Moorthy, Nachiket Kapre

International Symposium on Field Programmable Custom Computing Machines, May 2015

## "Driving Timing Convergence of FPGA Designs through Machine Learning and Cloud Computing" (8 pages)

Nachiket Kapre, Bibin Chandrashekharan, Harnhua Ng, Kirvy Teo

International Symposium on Field Programmable Custom Computing Machines, May 2015

## "Energy-Efficient Acceleration of OpenCV Saliency Computation using Soft Vector Processors" (8 pages)

Gopalakrishna Hegde, Nachiket Kapre

International Symposium on Field Programmable Custom Computing Machines, May 2015

## "InTime: A Machine Learning Approach for Efficient Selection of FPGA CAD Tool Parameters" (4 pages)

Nachiket Kapre, Harnhua Ng, Kirvy Teo and Jaco Naude

International Symposium on Field-Programmable Gate Arrays, Page 23-26, February 2015

### "On Data Forwarding in Deeply Pipelined Soft Processor" (8 pages)

Hui Yan Cheah, Suhaib A. Fahmy and Nachiket Kapre

International Symposium on Field-Programmable Gate Arrays, Page 181-189, February 2015

## "Relax-Miracle: GPU Parallelization of Semi-Analytic Fourier-Domain solvers for Earthquake Modeling" (8 pages)

Sagar Masuti, Sylvain Barbot, and Nachiket Kapre

International Conference on High Performance Computing December 2014

## "Comparing Soft and Hard Vector Processing in FPGA-based Embedded Systems" (6 pages)

Soh Jun Jie, and Nachiket Kapre

International Conference on Field Programmable Logic and Applications, Page 1-7, September 2014 (Best Paper Nominee)

## "Fanout Decomposition Dataflow Optimizations for FPGA-based Sparse LU Factorization" (4 pages)

Siddhartha, and Nachiket Kapre

International Conference on Field-Programmable Technology, Page 252-255, December 2014

### "Analysis and Optimization of a Deeply Pipelined FPGA Soft Processor" (4 pages)

Hui Yan Cheah, Suhaib A. Fahmy and Nachiket Kapre

International Conference on Field-Programmable Technology, Page 235-238, December 2014

## "Heterogeneous Dataflow Architectures for FPGA-based Sparse LU Factorization" (4 pages)

Siddhartha, and Nachiket Kapre

International Conference on Field Programmable Logic and Applications, Page 1-4, September 2014

## "Breaking Sequential Dependencies in FPGA-based Sparse LU Factorization" (4 pages)

Siddhartha and Nachiket Kapre

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 60-63, May 2014

## "MixFX-SCORE: Heterogeneous Fixed-Point Compilation of Dataflow Computations" (4 pages)

Deheng Ye and Nachiket Kapre

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 206-209, May 2014

## "Timing Fault Detection in FPGA-based Circuits" (4 pages)

Edward Stott, Joshua M. Levine, Peter Y. K. Cheung, and Nachiket Kapre

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 96-99, May 2014

## "System-Level FPGA Device Driver with High-Level Synthesis Support" (8 pages)

Vipin Kizhepatt, Shreejit Shanker, Dulitha Gunasekara, Suhaib A Fahmy, Nachiket Kapre

International Conference on Field Programmable Technology, Page 128-135, December 2013

## "Exploiting Input Parameter Uncertainty for Reducing Datapath Precision of SPICE Device Models" (8 pages)

Nachiket Kapre

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 189-197, April 2013

## "Application Composition and Communication Optimization of Iterative Solvers using FPGAs" (8 pages)

Abid Rafique, Nachiket Kapre and George Constantinides

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 153-160, April 2013 (HiPEAC Paper Award)

### "Enhancing Performance of Tall-Skinny QR factorization" (8 pages)

Abid Rafique, Nachiket Kapre, and George Constantinides

International Conference on Field Programmable Logic and Applications, Page 443-450, August 2012

## "A High Throughput FPGA-based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem" (12 pages)

Abid Rafique, Nachiket Kapre, and George Constantinides

International Symposium on Applied Reconfigurable Computing, Page 239-250, March 2012

## "FX-SCORE: A Framework for Fixed-Point Compilation of SPICE Device Models using Gappa++" (8 pages)

Helene Martorell, and Nachiket Kapre

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 77-84, April 2012

## "VLIW-SCORE: Beyond C for Sequential Control of SPICE FPGA Acceleration" (9 pages)

Nachiket Kapre, and André DeHon

International Conference on Field Programmable Technology, Page 1-9, December 2011 (Best Paper Award)

## "Parallelizing Sparse Matrix-Solve for SPICE Circuit Simulation using FPGAs" (9 pages)

Nachiket Kapre, and André DeHon

International Conference on Field Programmable Technology, Page 190-198, December 2009

## "Performance Comparison of Single-Precision SPICE Model-Evaluation on FPGA, GPU, Cell, and Multi-Core Processors" (8 pages)

Nachiket Kapre, and André DeHon

International Conference on Field Programmable Logic and Applications, Page 65-72, September 2009

## "Accelerating SPICE Model-Evaluation using FPGAs" (8 pages)

Nachiket Kapre, and André DeHon

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 37-44, April 2009

## "Optimistic Parallelization of Floating-Point Accumulation" (9 pages)

Nachiket Kapre, and André DeHon

IEEE Symposium on Computer Arithmetic, Page 205-216, June 2007

## "Packet-Switched vs. Time-Multiplexed FPGA Overlay Networks" (10 pages)

Nachiket Kapre, Nikil Mehta, Michael deLorimier, Raphael Rubin, Henry Barnor, Michael Wilson, Michael Wrighton, and André DeHon

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 205-216, April 2006 (FCCM20 25-most Influential Papers Award)

### "GraphStep: A System Architecture for Sparse Graph Algorithms" (9 pages)

Michael deLorimier, Nachiket Kapre, Nikil Mehta, Dominic Rizzo, Ian Eslick, Raphael Rubin, Tomas Uribe, Thomas Knight Jr., and André DeHon

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 143-151, April 2006

### "Pipelined Saturated Accumulation" (11 pages)

Karl Papadantonakis, Nachiket Kapre, Stephanie Chan, and André DeHon

International Conference on Field-Programmable Technology, Page 19-26, December 2005

### "Design Patterns for Reconfigurable Computing" (11 pages)

André DeHon, Joshua Adams, Michael deLorimier, Nachiket Kapre, Yuki Matsuda, Helia Naeimi, Michael Vanier, and Michael Wrighton

IEEE Symposium on Field-Programmable Custom Computing Machines, Page 13-23, April 2004

## Workshop Articles

## "Limits of Statically Scheduled Token Dataflow Processing" (8 pages)

Nachiket Kapre, and Siddhartha

4th International Workshop on Data-Flow Execution Models for Extreme Scale Computing (co-located with PACT 2014), Page 1-8, August 2014

## "SPICE<sup>2</sup> - A Spatial Parallel Architecture for Accelerating the SPICE Circuit Simulator" (6 pages)

Nachiket Kapre, and André DeHon

First Workshop on Intersections between Computer Architecture and Reconfigurable Logic, co-located with MICRO, December 2010

## "An NoC Traffic Compiler for efficient FPGA Implementation of Parallel Graph Applications" (8 pages)

Nachiket Kapre, and André DeHon

Reconfigurable Communication-centric Systems-on-Chip May 2010

#### Posters

#### "Evaluating Embedded FPGA Accelerators for Deep Learning Applications"

Gopalakrishna Hegde, Siddhartha, Nachiappan Ramasamy, Vamsi Buddha, <u>Nachiket Kapre</u>

International Conference on Field-Programmable Custom Computing Machines May 2016

## "Communication Optimization for the 16-core Epiphany Floating-Point Processor Array"

Siddhartha, <u>Nachiket Kapre</u>

International Conference on Field-Programmable Custom Computing Machines May 2016

## "Machine-Learning driven Auto-Tuning of High-Level Synthesis for FPGAs"

Li Ting, Harri Sapto Wijaya, <u>Nachiket Kapre</u>

International Symposium on Field-Programmable Gate Arrays Feb 2016

## "Sparse Graph Processing with Soft Processors"

Nachiket Kapre

International Conference on Field-Programmable Custom Computing Machines May 2015

## "FPGA Acceleration of Irregular Iterative Computations using Criticality-Aware Dataflow Optimizations"

Siddhartha, and Nachiket Kapre

International Symposium on Field-Programmable Gate Arrays February 2015

### "Measuring Timing Errors in FPGA-based Circuits"

Joshua M. Levine, Edward Stott, and Nachiket Kapre

IEEE Workshop on Silicon Errors in Logic - System Effects April 2014

## **Book Chapters**

## "Programming FPGA Applications in VHDL" (129-153)

Nachiket Kapre, and André DeHon

From *Reconfigurable Computing: The Theory and Practice of FPGA-based Computation* Published by Morgan Kaufmann, Copyright 2008, ISBN-13: 978-0-12-370522-8 By Scott Hauck and André DeHon

### "Accelerating the SPICE Circuit Simulator using an FPGA - A Case Study" (389-427)

Nachiket Kapre, and André DeHon

From *High Performance Computing using FPGAs* Published by Springer New York, Copyright 2013, ISBN-13: 978-1-4614-1790-3 By Wim Vanderbauwhede and Khaled Benkrid

## Magazine Articles

Nachiket Kapre, Dirk Walther, Christof Koch, and André DeHon "Saliency on a chip: a digital approach with an FPGA." *The Neuromorphic Engineer*, Volume 1, Issue 2, Autumn 2004

## **Grants**

AcRF Tier1 Grant (PI): S\$100K for Financial Year 2015-2016.

MIT-SMART Innovation Grant (Co-PI): S\$50K for Financial Year 2015-2017.

Delta Electronics Grant (Co-PI): S\$100K for Financial Year 2015-2017.

**CELT Edex Excellence in Education** (PI): S\$37K for Financial Year 2015.

**CELT Edex Excellence in Education** (PI): S\$40K for Financial Year 2014.

**Relax FPGA Acceleration – Earth Observatory of Singapore** (Collaborator, PI: Sylvain Barbot, EOS Singapore), 2013-2014

CELT Edex Excellence in Education (PI): S\$25K for Financial Year 2013.

AcRF Tier1 Grant (PI): S\$150K for Financial Year 2013-2014.

NTU College of Engineering Seed Grant (PI): S\$50K for Financial Year 2012.

NTU Startup Grant (PI): S\$100K for 3 years 2012–2015.

Compute-Oriented FPGA Device Architectures and Tools – NSERC, Canada (Collaborator, PI: Prof. Guy Lemieux, UBC Vancouver), 2010-2013

## **Selected Talks**

"A Case for Embedded FPGA-based SoCs for Energy-Efficient Acceleration of Graph Problems"

Supercomputing Frontiers 2015, A\*Star, Singapore, March 2015

"Assessment Engineering for High-Impact, Programming-Centric Courses in Science and Technology" and "Continuous Automated Assessment for Project-Centric Learning" at the *Good to Great* teaching workshops in 2014 and 2015, NTU Singapore.

### "FX-SCORE Precision Analysis", and "Libraries for NVIDIA GPUs"

Invited Lecturer PAPAA Summer School 2014, University of Hong Kong, July 2014

### "Rebooting SCORE for the Next Generation"

Invited Talk at ASPLOS 2012 as part of CCPC Workshop, London, March 2012, and Guest Lecture for Custom Computing CO108 (Imperial College), March 2012

"Parallelizing SPICE using FPGAs" Invited speaker at the ASC 2012 Summer School, Imperial College London.

## "Accelerating SPICE using FPGAs: A Retrospective and Vision for the Future"

Research Seminars at *University of Southampton*, Oxford, University of Glasgow, University of York, National University of Singapore, Mahanakorn University of Technology (2011-2013).

## "Spatial SPICE Mapping and Lessons" (work-in-progress)

CASCADIA Workshop, *University of British Columbia (UBC)*, Vancouver Canada, August 2010,

*IBM, Inc.*, Austin USA, August 2009,

*Indian Institute of Science*, Bengaluru India, March 2010,

*Xilinx, Inc.*, San Jose USA, February 2009.

### "Exploiting Application Structure in On-Chip Network Design"

University of Gent, Gent, Belgium and TU Munich, Munich, Germany, July-August 2007.

## **Patents**

André DeHon and Nachiket Kapre

"Method and a circuit using an associative calculator for calculating a sequence of non-associative operations"

US 2007/0234128, Under review, Applied in January 2007

## **Advising and Supervision**

#### **Current PhD Students:**

Magzhan Ikram (NTU+A\*Star) started August 2015.

Siddhartha (NTU) started January 2013.

Ye Deheng (NTU) started August 2012.

Cheah Hui Yan (NTU) started August 2011.

Andrew Bean (Imperial College) started September 2011.

#### **Current MSc/UG Students:**

Chethan Kumar Basavaraju, Nachiappan Ramasamy, Gourav Modi, Joseph George, Jacob Ginu, Manoj Venkat (NTU), MSc 2015-16

#### **Past Students:**

Que Yanghua (NTU), Bachelor's 2015

Adaikkal Raj, Selva Kumar Jayakrishnan, Jianrong, Swetha Venugopal, Kiran Ganapathi, Kunal Gokhale (NTU), MSc 2014-15

Kanchan Kaur (NTU), MSc 2013-14

Pradeep Moorthy, Han Jianglei (NTU), Bachelor's 2014

Soh Jun Jie (NTU), Bachelor's 2013-14

Lim Hui Hui (NTU), Bachelor's 2013

Abid Rafique (Imperial College), PhD 2013.

Siddhartha and Ramitha (Imperial College), BEng 2012.

Dulitha Gunasekara (Imperial College), MEng 2012.

Helene Martorell, Emmanouil Spanakis, Fang Zhou, Wei Lizhong (Imperial College), MSc 2011.

Coryan Wilson-Shah (Cambridge/Imperial College), UROP 2011.

Cody Huang (UC Davis/Imperial College), Exchange Student 2011.

## **Awards and Honors**

**Ranked 1st** in Second, Third and Fourth years of undergraduate engineering BEng degree at College of Engineering, Pune, India (1998-2002) in a class of 773 students

**Teaching and Research Assistantships** at California Institute of Technology and University of Pennsylvania (2003-2010)

**Young Researchers Meeting, Providence, USA 2010** invitation by Dept. of Science and Technology, India and the Indo-US Science and Technology Forum

Imperial College Junior Research Fellowship (2010-2012) awarded through a competitive recruitment process

**Imperial College Honorary Research Fellowship (2012-2015)** by invitation of the Circuits and Systems Group

**Best Paper Award**, Final publication of PhD research, International Conference on Field-Programmable Technology 2011

FCCM20 25-most Influential Papers Award at FCCM 2013, One of the earliest papers exploring NoC designs on FPGA substrates, IEEE Symposium on Field-Programmable Custom Computing Machines 2006

**HiPEAC Paper Award, 2013**, For the FCCM 2013 paper 'Application Composition and Communication Optimization of Iterative Solvers using FPGAs'

**Best Paper Award**, Hoplite NoC Paper, International Conference on Field-Programmable Logic and Applications 2015