# SOMESH SINGH

Ph.D. Candidate, Dept. of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai - 600 036, India

Email: somesh.singh1992@gmail.com, ssomesh@cse.iitm.ac.in

Homepage: https://ssomesh.github.io

GitHub: https://github.com/ssomesh/

Phone: +91 9791264972

## RESEARCH INTERESTS

High-Performance Computing; Parallel Computing; Graph Analytics.

#### Area of Research

My dissertation research focuses on accelerating large-scale (irregular) graph processing on graphics processing units (GPUs). I approach this problem by designing approximate computing techniques that improve the execution performance of parallel graph analytics by trading off computational accuracy.

## Graduate Courses

Mathematical Concepts for Computer Science, Advanced Data Structures and Algorithms, Computer Architecture, High-Performance Parallel Computing, Program Analysis, Modern Compilers, Indexing and Searching in Large Datasets, Probability and Computing, Pattern Recognition and Machine Learning, Digital Design Verification, CAD for VLSI Systems.

## Programming Languages

• Fluent: C/C++, CUDA, OpenMP

• Familiar: OpenCL, Python, MATLAB, LLVM

## Publications

- Somesh Singh and Rupesh Nasre, "Scalable and Performant Graph Processing on GPUs using Approximate Computing", *IEEE Transactions on Multi-Scale Computing Systems (TMSCS)*, vol. 4, no. 3, pp. 190–203, 2018. https://doi.org/10.1109/TMSCS.2018.2795543. [Citation count: 3]
- Somesh Singh and Rupesh Nasre, "Optimizing Graph Processing on GPUs using Approximate Computing: Poster", 24th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2019), pp. 395–396. https://doi.org/10.1145/3293883.3295736.
- R. De Maria, J. Andersson, V.K.B. Olsen, L. Field, M. Giovannozzi, P.D. Hermes, N. Høimyr, S. Kostoglou, G. Iadarola, E. McIntosh, A. Mereghetti, J. Molson, D. Pellegrini, T. Persson, M. Schwinzerl, E.H. Maclean, K.N. Sjobak, I. Zacharov and S. Singh, "SixTrack Project: Status, Runtime Environment and New Developments", 13th International Computational Accelerator Physics Conference (ICAP 2018), pp. 172–178. https://doi.org/10.18429/JACOW-ICAP2018-TUPAF02.
- R. De Maria, J. Andersson, V.K.B. Olsen, L. Field, M. Giovannozzi, P.D. Hermes, N. Høimyr, S. Kostoglou, G. Iadarola, E. McIntosh, A. Mereghetti, J. Molson, D. Pellegrini, T. Persson, M. Schwinzerl, E.H. Maclean, K.N. Sjobak, I. Zacharov and S. Singh, "SixTrack V and runtime environment", *International Journal of Modern Physics A (IJMPA)*, vol. 34, no. 36, 1942035, 2019. https://doi.org/10.1142/S0217751X19420351. (Invited paper)

## WORKS UNDER SUBMISSION

- Approximate computing techniques targeting GPU-specific aspects for efficient graph processing on SIMT architectures.
- Faster estimation of top-k betweenness centrality vertices in a graph on heterogeneous architectures using approximate computing.

## ACCOMPLISHMENTS AND AWARDS

- Google Summer of Code 2018 participant with CERN-HSF.
  - Developed a standalone optimized parallel implementation of (a part of) SixTrackLib, a particle-tracking library.
  - The work contributed to the IJMPA 2019 and ICAP 2018 papers.
  - Major challenge: writing library code that is performance-portable across multicore CPUs and manycore GPUs.
  - Technologies involved: C/C++, OpenCL 1.2
- Google Summer of Code 2017 participant with CERN-HSF.
  - Developed SALLOC, an arena-based memory allocator for SIMT architectures, with support for the *vector* container, in CUDA.
  - The arena supports allocation of multiple vectors; vector container on the arena supports push\_back(), pop\_back() and getIndex() operations.
  - Major challenges: designing a suitable data structure for the arena that is amenable to parallelization of vector operations on GPU; deciding the APIs to be exposed to the user.
- Secured 4th place in HiPC 2016 Student Parallel Programming Challenge (Intel Xeon-Phi track) (Team of 2).
  - Implemented an efficient scheme for labeling connected clusters in a 3-dimensional grid using the Union-Find data structure. All points in a cluster were to be assigned the same label.
  - Major challenges: choosing or designing a data structure that supports set membership; designing an algorithm for the task, with reduced computational complexity.
  - Technologies involved: C++, OpenMP.
- Secured 4th place in HiPC 2015 Student Parallel Programming Challenge (Intel Xeon-Phi track) (Team of 3).
  - Implemented an efficient parallel version of the KMeans++ algorithm for assigning membership to each data point in a high-dimensional unlabeled data set, to maximize the Dunn index.
  - Major challenge: choosing or designing a clustering algorithm with low computational complexity and sufficient data parallelism.
  - Technologies involved: C++, OpenMP.
- Organized CUDA Workshop during Exebit 2018 at the Indian Institute of Technology Madras.
- Awarded ACM SIGPLAN PAC grant for attending PPoPP 2019.
- Awarded the STAR TA award for contributions as a Teaching Assistant to the course "GPU Programming" for the period July - November 2017.

## SERVICES

- Member of the Artifact Evaluation Committee for ECOOP 2020.
- Member of the Artifact Evaluation Committee for PPoPP 2018.
- Reviewer for INAE Letters in 2018.
- Reviewer for IEEE Embedded Systems Letters in 2017.

## PROJECTS AND INTERNSHIPS

#### Course Projects

• Supergraph Containment Search (Team of 2).

October - November 2016

- Implemented an efficient supergraph containment search technique using the *filtering and verification* framework, in C++.
- Optimized the online processing time required for finding the (small) graphs, in the database, that are present in the (large) query graph. Our team **won** the contest for minimizing the querying time over 200 query graphs for a database containing 70K graphs.
- Major challenges: indexing the database graphs; selecting graph features that help minimize the number of database graphs to be searched in the query graph during the verification phase.

• Five-stage RISC pipeline.

October - November 2015

- Implemented a five-stage pipeline for a RISC processor with operand forwarding, using Bluespec.

• Domain Specific Language For Circuit Design (Team of 2).

March - April 2015

- Implemented an internal DSL, in Python, that allows specifying a Boolean expression in Disjunctive Normal Form (DNF) and supports generating a netlist, comprising AND, OR, and NOT logic gates, for the minimal form of the Boolean expression.
- Major challenges: deciding the APIs to be exposed to the user; choosing a data structure for storing the Boolean function that is amenable to the Quine-McCluskey algorithm used for minimizing it.

#### Other Projects

• Object Tracking in Video Using Parallel Computing

March - April 2014

- Implemented a sum of absolute differences (SAD) based parallel block-matching algorithm, in CUDA, for tracking an object of interest in a video.

• Online Gaming

April 20

- Designed an interactive single-player online game, using HTML5 and JavaScript, that can be played in a web browser.

#### Internships

• RTOS based Embedded Software Design and Verification of Serial Communication

June - July 2013

- Intern at Larsen and Toubro SIPL, Bengaluru, India.
- Implemented a device driver for an external UART device for the VxWorks RTOS; established communication between the host PC and the target PowerPC board using a serial protocol.
• Autonomous Mobile Robots: A Study

May - June 2012

- Intern at the Indian Institute of Technology Delhi, India.
- Programmed a mobile robot (iRobot) to move autonomously in an unstructured environment, with a Kinect sensor for visual feedback, using the Player/Stage software.

## MENTORING

• Mentor for a master's project

2017-18

- Objective of the project: Fast estimation of top-k betweenness centrality vertices in a graph, aided by approximate computing.
• Mentored two undergraduate students. They worked on:

- Graph-based Image Segmentation (December 2016)
  - Modeled an image as a weighted graph and performed image segmentation using various graph algorithmic techniques on the underlying graph.
- Image Segmentation and Object Tracking on GPU (May - June 2015)
  - Implemented a parallel seed-based region growing algorithm for image segmentation in CUDA.

## EDUCATION

Doctor of Philosophy (Ph.D.)

July 2014 - August 2020 (expected)

Indian Institute of Technology Madras

Adviser: Dr. Rupesh Nasre

• CGPA: 7.79 of 10

Bachelor of Technology in Computer Science and Engineering

July 2010 - May 2014

National Institute of Technology Uttarakhand

• CGPA: 8.68 of 10