# Jagadish B. Kotra

# Researcher, Member of Technical Staff, AMD Research, Austin, Texas

Website: https://jbk5155.github.io/ Email: jagadishkotra@gmail.com Mobile: **(813)-468-8220**

#### **RESEARCH INTERESTS**

Computer Architecture, Operating Systems, Hardware-OS co-design, Heterogeneous CPU/GPU Systems.

## **EDUCATION**

Doctor of Philosophy, The Pennsylvania State University

2010 - 2017

Advisor: Dr. Mahmut T. Kandemir.

PhD Dissertation Topic: Hardware-Software co-design for optimizing memory-hierarchy in large and many-core systems.

**Bachelor of Technology**, Acharya Nagarjuna University. Major: Electronics and Communications Engineering

2002 - 2006

#### **PUBLICATIONS**

- "Improving the Utilization of Micro-operations cache in x86 Processors". **Jagadish Kotra**, John Kalamatianos [MICRO 2020]
- "DSM: A Case for Hardware-Assisted Merging of DRAM Rows with Same Content". Armin Vakil, Mahmut Kandemir, Jagadish Kotra
   [SIGMETRICS 2020]
- "Centaur: A Novel Architecture for Reliable, Low-Wear, High-Density 3D NAND Storage". Chun-yi Liu, Jagadish Kotra, Myoungsoo Jung, Mahmut Kandemir. [SIGMETRICS 2020]
- "PreFAM: Understanding the Impact of Prefetching in Fabric-Attached Memory Architectures". Vamsee R.K, Jagadish Kotra, Clayton H., Hammond S.D, Amro Awad. [MemSys 2020]
- "Optimization of Inter-Cache Traffic Entanglement in Tagless Caches with Tiling Opportunities". S.R. Swamy, Sumitha G., Hariram G., Jagadish Kotra, Madhu M., Jack S., Mahmut K., Vijay N. [CASES 2020]
- "CASH: Compiler Assisted Hardware Design for improving DRAM energy efficiency in CPU-based Inference Systems". Anup Sarma, Huaipan Jiang, Ashutosh Pattnaik, Jagadish Kotra, Mahmut Kandemir, Chita Das.
   [MemSys 2019]
- "SOML Read: Rethinking the read operation granularity of 3D NAND SSDs". Chun-yi Liu, Jagadish Kotra, Myoungsoo Jung, Mahmut Kandemir, Chita R. Das.
   [ASPLOS 2019]
- "CHAMELEON: A co-design based dynamically reconfigurable heterogeneous memory system". Jagadish Kotra, Haibo Zhang, Alaa R. Alameldeen, Chris Wilkerson, Mahmut Kandemir. [MICRO 2018]
- "MDACache: Caching for Multi-Dimensional-Access Memories". Sumitha George, Minli Liao, Huaipan Jiang, Jagadish Kotra, Mahmut Kandemir, John Sampson, Vijaykrishnan Narayanan. [MICRO 2018]
- "PEN: A Design of Partial-Erase for 3D NAND-based High Capacity SSDs". Chun-Yi Liu, **Jagadish Kotra**, Myoungsoo Jung, Mahmut Kandemir. [FAST 2018]
- "Enhancing Computation-to-Core Assignment with Physical Location Information". Orhan Kislal, Jagadish Kotra, Xulong Tang, Mahmut Kandemir, Myoungsoo Jung, Mustafa Karakoy.
   [PLDI 2018]
- "A Learning-guided Hierarchical Approach for Biomedical Image Segmentation". Huaipan Jiang, Anup Sarma, Jihyun Ryoo, **Jagadish Kotra**, Meena A., Chita Das, Mahmut Kandemir. [SOCC 2018]
- "Hardware-software co-design to mitigate DRAM refresh overheads: A case for DRAM refresh-aware process scheduling". Jagadish Kotra, Narges S., Zeshan Chishti, Mahmut Kandemir. [ASPLOS 2017]

- "Congestion Aware Memory Management on NUMA platforms: A VMware ESXi case study". Jagadish Kotra, Seongbeom Kim, Kamesh Madduri, Mahmut Kandemir. [IISWC 2017]
- "Quantifying the Potential Benefits of Near-Data Computing in Manycores". **Jagadish Kotra**, Diana Guttman, Nachiappan C, Mahmut Kandemir, Chita Das. [MASCOTS 2017]
- "Location-Aware Computation Mapping for Manycores". Orhan Kislal, Jagadish Kotra, Xulong Tang, Mahmut Kandemir, Myoungsoo Jung, Mustafa Karakoy.
   [PACT 2017]
- "Re-NUCA: A practical NUCA architecture for Re-RAM based last-level caches". **Jagadish Kotra**, Mohammed Arjomand, Diana Guttman, Mahmut Kandemir, Chita Das. [IPDPS 2016]
- "Improving Bank-Level Parallelism for Irregular Applications". Xulong Tang, Mahmut Kandemir, Praveen Yedlapalli, Jagadish Kotra. [MICRO 2016] (Best paper award nominee)

- "Cache-Aware Approximate Computing for Decision Tree Learning". Orhan Kislal, Mahmut Kandemir, Jagadish Kotra. [IPDPS-Parlearning 2016]
- "Thermal-aware Application Scheduling on Device-heterogeneous Embedded Architectures". Karthik Swaminathan, Jagadish Kotra, Mahmut Kandemir and Vijaykrishnan Narayanan. [VLSID 2015]
- "Network Footprint Reduction through Data Access and Computation Placement in NoC-Based Many cores". Jun Liu, Jagadish Kotra, Wei Ding, Mahmut Kandemir. [DAC 2015]
- "Phase Detection with Hidden Markov Models for DVFS on Many-Core Processors". Joshua Booth,
   Jagadish Kotra, Hui Zhao, Mahmut Kandemir, Padma Raghavan. [ICDCS 2015]
- "Meeting Midway: Improving DRAM Performance and Off-Chip Latencies with Memory-Side Prefetching". Praveen Yedlapalli, Jagadish Kotra, Emre Kultursay, Mahmut Kandemir, Anand Sivasubramaniam, Chita R. Das.

## **CONFERENCE TALKS**

- "Congestion-aware Memory Management: A VMWare ESXi case study". Seattle, USA. IISWC-2017.
- "Hardware-Software co-design: A case for refresh-aware process scheduling". Xi'an, China. ASPLOS-2017.
- "Quantifying the potential benefits of Near-data Computing in Manycores". Banff, Alberta, Canada. MASCOTS-2017.
- "Re-NUCA: A NUCA architecture for Re-RAM based last-level caches". Chicago, USA. IPDPS-2016.
- "Cache-Aware Approximate Computing for Decision Tree Learning". Chicago, USA. IPDPS-2016.
- "Network Footprint Reduction through Data Access and Computation Placement in NoC-Based Many cores". San Francisco, USA. DAC-2015.

## **U.S. PATENTS**

- "A hardware-assisted DRAM row merging mechanism for energy-efficiency". Jagadish Kotra. (About to be filed in USPTO, AMD)
- "A method and apparatus for reducing average latency of long latency load instructions". **Jagadish Kotra**, John Kalamatianos. (About to be filed in USPTO, AMD).
- "A case for atomics arbitration". Sergey Blagodurov, John Alsop, **Jagadish Kotra**, Marko Scrbak. (About to be filed in USPTO, AMD)
- "GPU Reach Optimizations". Jagadish Kotra, Michael Lebeane (About to be filed in USPTO, AMD)
- "Method and Apparatus for Speculative Data Promotion from the Cache to the Physical Register File". Jagadish Kotra, John Kalamatianos. (About to be filed in USPTO, AMD)
- 170499-US-NP. "A method and apparatus for Optimizing Micro-op Cache". **Jagadish Kotra**, John Kalamatianos (filed in USPTO, on behalf of AMD).
- "Hardware-software collaborative address mapping scheme for efficient processing-in-memory systems". Mahzabeen I., Shaizeen A., Nuwan J., **Jagadish Kotra**. (About to be filed in USPTO, AMD)

- 180128-US-NP. "A Method and Apparatus for temperature-gradient aware data-placement for 3D stacked DRAMs in GPUs". Jagadish Kotra, Karthik R., J. Greathouse. (filed in USPTO, on behalf of AMD).
- 180399-US-NP. "Method and apparatus for improving the utilization of micro-op caches via compaction". Jagadish Kotra, John Kalamatianos. (filed in USPTO, on behalf of AMD)
- 180243-US-NP. "A Method for a Generative Adversarial Network Resource Scheduler". Sergey Blagodurov, Abhinav Vishnu, Thaleia Dimitra Doudali, Jagadish Kotra. (filed in USPTO, on behalf of AMD)
- "Methods for Configuring Span of Control Under Varying Temperature". Tony Gutierrez, Yasuko Eckert, Sergey Blagodurov, Jagadish Kotra. (About to be filed in USPTO, AMD)
- "Micro-operations cache Allocation Filter". **Jagadish Kotra**, Marko Scrbak, Mahzabeen Islam, John Kalamatianos. (filed in USPTO, on behalf of AMD)
- "Mechanisms for Temporal Link Encoding". Onur Kayiran, Steve Raasch, Sergey Blagodurov, Jagadish Kotra. (filed in USPTO, on behalf of AMD)
- US 2017/0371777 A1. "Memory Congestion Aware NUMA Management". Jagadish Kotra, S. Kim, Fei Guo. (filed in USPTO, on behalf of VMware). (https://patents.google.com/patent/US20170371777A1/en)
- US 2018/0088853 A1. "H/W-S/W co-design for heterogeneous memory management". Jagadish Kotra, Alaa A., Chris Wilkerson, Jaewoong S. (https://patents.google.com/patent/US20180088853A1/en), Intel.
- US8627230B2. "Intelligent Command Prediction". **Jagadish Kotra**, Anuja Deedwaniya, Shayne Grant, et al. Granted-2014. (https://patents.google.com/patent/US8627230), IBM.

#### **TEACHING EXPERIENCE**

- Served as a Teaching Assistant for undergraduate Operating Systems (Fall 2011), graduate Operating Systems (Spring 2011), and a beginners' programming languages course (Fall 2010).
- Guest-lectured an undergraduate computer architecture course (Fall 2015).

#### PROFESSIONAL EXPERIENCE

**Researcher, Member of Technical Staff, AMD Research, Austin.** [April 2018 - Present]

- Worked on the Exa-scale Path Forward research program funded by the DoE, on projects involving GPU virtual memory optimizations and CPU critical load prediction.
- Several patent applications were filed with the USPTO by AMD, and several papers are under submission.

**Post-doctoral Researcher, AMD Research, Austin.** [Sept 2017 – March 2018]

- Worked on Exa-scale Path Forward project funded by DoE. Work included characterizing MPI applications for identifying hardware bottlenecks.
- Worked on optimizing CPU front-end features to improve performance and energy efficiency. Paper published in MICRO-2020.

**Research Intern, Intel Labs, Oregon.** Mentors: Alaa R. Alameldeen, Zeshan C. [Jan 2016 – May 2016]

- Worked on a hardware-software co-design for heterogeneous memory management. Patent filed on fast-track in USPTO. Paper accepted in MICRO 2018.
- Worked on a hardware-software co-design to optimize DRAM refresh overheads. Paper accepted in ASPLOS 2017.

**Performance Intern, VMware Performance Team, CA.** Mentor: S. Kim. [June 2015 – Sept 2015]

Worked on congestion-aware memory management in VMware ESXi. Proposed a novel dynamic latency-probing algorithm that detects congestion in a NUMA system. Proposed and evaluated congestion-aware memory allocation and migration techniques in the commercial VMware ESXi hypervisor. This work is currently part of commercial VMware ESXi. Patent filed in USPTO; paper accepted in IISWC 2017.

**Graduate Intern, Intel Micro-server Team, Oregon.** Mentor: Brinda Ganesh. [May 2013 – Aug 2013]

Worked on the System Agent (SA) of a future Intel SoC product. The SA is the common interface to main memory for requests coming from the cores and I/O devices. I implemented out-of-order processing of memory-bound requests at the System Agent to address head-of-queue blocking and better utilize memory resources. In simulation, this out-of-order processing improved micro-server performance by 17%.

**Systems Software Engineer, IBM Software Labs, India.** [June 2006 – July 2010]

Worked as an IBM JVM developer. As part of the JVM team, I developed JVM components such as Java class loading. I also worked on IBM middleware products, including IBM WMQ and Message Broker, on operating systems such as z/OS (mainframes) and various UNIX flavors.

## **HONORS AND AWARDS** (SELECTIVELY LISTED)

- Best Paper Nomination, MICRO-2016.
- Assisted in writing 2 NSF proposals at Penn State.
- IBM Invention Achievement Awards, IBM Labs.
- IBM Thank-you awards, IBM Labs

#### PROFESSIONAL SERVICE

- Student Research Competition (SRC) Chair: CGO-2019.
- Technical Program Committee Member: MASCOTS-2018, ICCD-2019, ICPP-2019, HPCA-2020 (Industry), MICRO-2020.
- External Review Committee Member: ASPLOS-2019/2020, HPCA-2019, ISCA-2020.
- Fundraising committee Member: MICRO-2019. Web-chair for AIM Workshop 2017.
- Reviewer for the TPDS, TCAD, TACO, TC, and TODAES journals.

## **STUDENTS MENTORED**

**Soheil Khadirsharbiyani**, PhD Student, Penn State. [Jan 2019 - Present]

**Armin Vakil**, PhD Student, Penn State. [March 2019 - Present]

**Minli Liao**, PhD Student, Penn State. [July 2020 – Present]

**Anup Sarma**, PhD Student, Penn State. [2018]