# MICHAEL B. SULLIVAN

August 2025

Architecture Research Group, NVIDIA Corporation, Austin, TX 78717



## **EMPLOYMENT**

| 2015-             | <b>NVIDIA Corporation</b>, Santa Clara, CA & Austin, TX<br>Senior Research Scientist, Architecture Research Group (ARG)                         |
|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|
| 2010-2015         | University of Texas, Austin, TX<br>Research Assistant, Locality Parallelism and Hierarchy Lab (LPH)                                             |
| 2011              | Los Alamos National Lab, Los Alamos, NM<br>Research Assistant, Applied Computer Science (CCS-7)                                                 |
| 2008<br>2007–2008 | George Mason University, Fairfax, VA<br>Research Asst., Lab for the Study and Sim. of Human Mvmt.<br>Research Assistant, Neural Engineering Lab |
| 2007              | <b>Argonne National Lab</b>, Argonne, IL<br>Research Assistant, Mathematics and Computer Science (MCS)                                          |
| 2006              | <b>University of California at Irvine</b>, Irvine, CA<br>Research Assistant, Nanotechnology Lab                                                 |

## **EDUCATION**

|          | <b>Cockrell School of Engineering</b>, University of Texas at Austin  |
|----------|-----------------------------------------------------------------------|
| MAY 2015 | Ph.D. in Computer Engineering                                         |
| MAY 2011 | M.S.E. in Computer Engineering                                        |
|          |                                                                       |
|          | Volgenau School of Engineering, George Mason University               |
| JAN 2009 | M.S. in Computer Science                                              |
| MAY 2008 | B.S. in Computer Engineering, summa cum laude                         |
|          |                                                                       |
|          | College of Science, George Mason University                           |
| MAY 2008 | B.A. in Mathematical Sciences, summa cum laude                        |
|          |                                                                       |

## **PUBLICATIONS**

- Park, S., Namkoong, H., Choi, B., Sullivan, M. B., Kim, J. "CacheCraft: Enhancing GPU Performance under Memory Protection through Reconstructed Caching" *Proceedings of the International Symposium on Microarchitecture (MICRO)*, 2024.
- Kim, D., Lee, J., Jung, W., Sullivan, M. B., Kim, J. "Unity ECC: Unified Memory Protection Against Bit and Chip Errors" *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2023.
- Sullivan, M. B., Ziad, M. T. I., Jaleel, A., Keckler, S. W. "Implicit Memory Tagging: No-Overhead Memory Safety Using Alias-Free Tagged ECC" *Proceedings of the International Symposium on Computer Architecture (ISCA)*, 2023.
- Li, G., Hari, S. K. S., Sullivan, M. B., Tsai, T., Pattabiraman, K., Emer, J., Keckler, S. W. "Understanding Error Propagation in Deep-Learning Neural Networks Accelerators and Applications." *IEEE Top Picks in Test and Reliability (TPTR)*, presented at the *IEEE International Test Conference (ITC)*, 2023.
- Sullivan, M. B., Saxena, N., O'Connor, M., Lee, D., Racunas, P., Hukerikar, S., Tsai, T., Hari, S. K. S., Keckler, S. W. "Through the Data Fog: Clarity on GPU DRAM Soft Errors." *IEEE Top Picks in Test and Reliability (TPTR)*, presented at the *IEEE International Test Conference (ITC)*, 2023.
- Sullivan, M. B., Saxena, N., O'Connor, M., Lee, D., Racunas, P., Hukerikar, S., Tsai, T., Hari, S. K. S., Keckler, S. W. "Characterizing and Mitigating Soft Errors in GPU DRAM (Top Picks)" *IEEE MICRO Top Picks from the 2021 Computer Architecture Conferences*, 2022.
- Jha, S., Cui, S., Tsai, T., Hari, S. K. S., Sullivan, M. B., Kalbarczyk, Z. T., Keckler, S. W., Iyer, R. K. "Exploiting Temporal Data Diversity for Detecting Safety-Critical Faults in AV Compute Systems" *Proceedings of the International Conference on Dependable Systems and Networks (DSN)*, 2022.
- Song, Y., Park, S., Sullivan, M. B., Kim, J. "SEC-BADEC: An Efficient ECC with No Vacancy for Strong Memory Protection" *IEEE Access*, 2022.
- O'Connor, M., Lee, D., Chatterjee, N., Sullivan, M. B., Keckler, S. W. "Saving PAM4 Bus Energy with SMOREs: Sparse Multi-Level Opportunistic Restricted Encodings" *Proceedings of the International Symposium on High Performance Computer Architecture (HPCA)*, 2022.

- Sullivan, M. B., Saxena, N., O'Connor, M., Lee, D., Racunas, P., Hukerikar, S., Tsai, T., Hari, S. K. S., Keckler, S. W. "Characterizing and Mitigating Soft Errors in GPU DRAM" *Proceedings of the International Symposium on Microarchitecture (MICRO)*, 2021.
- Tsai, T., Hari, S. K. S., Sullivan, M. B., Villa, O., Keckler, S. W. "NVBitFI: Dynamic Fault Injection for GPUs" *Proceedings of the International Conference on Dependable Systems and Networks (DSN)*, 2021.
- Zhao, H., Hari, S. K. S., Tsai, T., Sullivan, M. B., Keckler, S. W., Zhao, J. "Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices" *Proceedings of the International Conference on Dependable Systems and Networks, Workshops (DSN-W)*, 2021.
- Hari, S. K. S., Sullivan, M. B., Tsai, T., Keckler, S. W. "Making Convolutions Resilient via Algorithm-Based Error Detection Techniques" *IEEE Transactions on Dependable and Secure Computing*, 2021.
- Dos Santos, F. F., Brandalero, M., Sullivan, M. B., Rech, R. L., Basso, P. M., Hubner, M., Carro, L., Rech, P. "Reduced-Precision DWC: An Efficient Hardening Strategy for Mixed-Precision Architectures" *IEEE Transactions on Computers*, 2021.
- Mahmoud, A., Hari, S. K. S., Fletcher, C. W., Adve, S. V., Sakr, C., Shanbhag, N. R., Molchanov, P., Sullivan, M. B., Tsai, T., Keckler, S. W. "Optimizing Selective Protection for CNN Resilience" *Proceedings of the International Symposium on Software Reliability Engineering (ISSRE)*, 2021.
- Anwer, A. R., Li, G., Pattabiraman, K., Sullivan, M. B., Tsai, T., Hari, S. K. S. "GPU-Trident: Efficient Modeling of Error Propagation in GPU Programs" *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2020.
- Choukse, E., Sullivan, M. B., O'Connor, M., Erez, M., Pool, J., Nellans, D., Keckler, S. W. "Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs" *Proceedings of the International Symposium on Computer Architecture (ISCA)*, 2020.
- Li, G., Li, Y., Jha, S., Tsai, T., Sullivan, M. B., Hari, S. K. S., Kalbarczyk, Z., Iyer, R. K. "AV-Fuzzer: Finding Safety Violations in Autonomous Driving Systems" *Proceedings of the International Symposium on Software Reliability Engineering (ISSRE)*, 2020.

- Mahmoud, A., Hari, S. K. S., Fletcher, C. W., Adve, S. V., Sakr, C., Shanbhag, N. R., Molchanov, P., Sullivan, M. B., Tsai, T., Keckler, S. W. "HarDNN: Feature Map Vulnerability Evaluation in CNNs" *Proceedings of the Workshop on Secure and Resilient Autonomy (SARA)*, 2020.
- Lee, K., Sullivan, M. B., Hari, S. K. S., Tsai, T., Keckler, S. W., Erez, M. "On the Trend of Resilience for GPU-Dense Systems" *Proceedings of the International Conference on Dependable Systems and Networks, Supplemental Volume (DSN-S)*, 2019.
- Jha, S., Banerjee, S., Tsai, T., Hari, S. K. S., Sullivan, M. B., Kalbarczyk, Z., Keckler, S. W., Iyer, R. K. "ML-Based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection" *Proceedings of the International Conference on Dependable Systems and Networks (DSN)*, 2019.
- Lee, K., Sullivan, M. B., Hari, S. K. S., Tsai, T., Keckler, S. W., Erez, M. "GPU Snapshot: Checkpoint Offloading for GPU-dense Systems" *Proceedings of the International Conference on Supercomputing (ICS)*, 2019.
- Sullivan, M. B., Hari, S. K. S., Zimmer, B., Tsai, T., Keckler, S. W. "SwapCodes: Error Codes for Hardware-Software Cooperative GPU Pipeline Error Detection," *Proceedings of the International Symposium on Microarchitecture (MICRO)*, 2018.
- Abdulrahman, M., Hari, S. K. S., Sullivan, M. B., Tsai, T., Keckler, S. W. "Optimizing Software-Directed Instruction Replication for GPU Error Detection," *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2018.
- Chang, C. K., Lym, S., Kelly, N., Sullivan, M. B., Erez, M. "Evaluating and Accelerating High-Fidelity Error Injection for HPC," *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2018.
- Garg, R., Mohan, A., Sullivan, M. B., Cooperman, G. "CRUM: Checkpoint-Restart Support for CUDA's Unified Memory" *Proceedings of the International Conference on Cluster Computing (CLUSTER)*, 2018.
- Li, G., Hari, S. K. S., Sullivan, M. B., Tsai, T., Pattabiraman, K. "Modeling Soft-Error Propagation in Programs," *Proceedings of the International Conference on Dependable Systems and Networks (DSN)*, 2018.
- Chang, C. K., Lym, S., Kelly, N., Sullivan, M. B., Erez, M. "Hamartia: A Fast and Accurate Error Injection Framework," *Proceedings of the International Conference on Dependable Systems and Networks (DSN)*, 2018.

- Gong, S. L., Kim, J., Lym, S., Sullivan, M. B., David, H., Erez, M. "DUO: Exposing On-chip Redundancy to Rank-Level ECC for High Reliability," *Proceedings of the International Symposium on High Performance Computer Architecture (HPCA)*, 2018.
- Li, G., Hari, S. K. S., Sullivan, M. B., Tsai, T., Pattabiraman, K., Emer, J., Keckler, S. W. "Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications," *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2017.
- Sullivan, M. B., Zimmer, B., Hari, S. K. S., Tsai, T., Keckler, S. W. "An Analytical Model for Hardened Latch Selection and Exploration," *Proceedings of the Workshop on Silicon Errors in Logic–System Effects (SELSE)*, 2016.
- Kim, J., Sullivan, M. B., Choukse, E., Erez, M. "Bit-Plane Compression: Transforming Data for Better Compression in Many-core Architectures," *Proceedings of the International Symposium on Computer Architecture (ISCA)*, 2016.
- Kim, J., Sullivan, M. B., Lym, S., Erez, M. "All Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory," *Proceedings of the International Symposium on Computer Architecture (ISCA)*, 2016.
- Kim, J., Sullivan, M. B., Gong, S. L., Erez, M. "Frugal ECC: Efficient and Versatile Memory Error Protection through Fine-Grained Compression", *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, 2015.
- Kim, J., Sullivan, M. B., Erez, M. "Bamboo ECC: Strong, Safe, and Flexible Codes for Reliable Computer Memory", *Proceedings of the International Symposium on High Performance Computer Architecture (HPCA)*, 2015.
- Rhu, M., Sullivan, M. B., Leng, J., Erez, M. "A Locality-Aware Memory Hierarchy for Energy-Efficient GPU Architectures", *Proceedings of the International Symposium on Microarchitecture (MICRO)*, Davis, CA, December 7, 2013.
- Sullivan, M. B., Swartzlander, E. E. "On Separable Error Detection for Addition", *Proceedings of the Asilomar Conference on Signals and Systems*, Pacific Grove, CA, November 3, 2013.
- Chung, J., Lee, I., Sullivan, M. B., Ryoo, J. H., Kim, D. W., Yoon, D. H., Kaplan, L., Erez, M. "Containment Domains: A Scalable, Efficient, and Flexible Resilience Scheme for Exascale Systems," *Scientific Programming*, Vol. 21, Number 3-4, (January 2013): 197–212.

- Sullivan, M. B., Swartzlander, E. E. "Truncated Logarithmic Approximation," *Proceedings of the International Symposium on Computer Arithmetic (ARITH)*, Austin, TX, April 7, 2013.
- Chung, J., Lee, I., Sullivan, M. B., Ryoo, J. H., Kim, D. W., Yoon, D. H., Kaplan, L., Erez, M. "Containment Domains: A Scalable, Efficient, and Flexible Resilience Scheme for Exascale Systems," *Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)*, Salt Lake City, UT, November 12, 2012.
- Sullivan, M. B., Swartzlander, E. E. "Truncated Error Correction for Flexible Approximate Multiplication," *Proceedings of the Asilomar Conference on Signals and Systems*, Pacific Grove, CA, November 3, 2012.
- Yoon, D. H., Sullivan, M. B., Jeong, M. K., Erez, M. "Towards Proportional Memory Systems", *Intel Technology Journal*, Vol. 17, Issue 1, 2012.
- Willert, J., Kelley, C. T., Knoll, D. A., Dong, H., Ravishankar, M., Sathre, P., Sullivan, M. B., Taitano, W. "Hybrid Deterministic/Monte Carlo Neutronics Using GPU Accelerators," *International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES)*, Guilin, China, October 19, 2012.
- Sullivan, M. B., Swartzlander, E. E. "Long Residue Checking for Adders," *Proceedings of the International Conference on Application-specific Systems, Architectures and Processors (ASAP)*, Delft, Netherlands, July 9, 2012.
- Yoon, D. H., Sullivan, M. B., Jeong, M. K., Erez, M. "The Dynamic Granularity Memory System," *Proceedings of the International Symposium on Computer Architecture (ISCA)*, Portland, OR, June 9, 2012.
- Jeong, M. K., Yoon, D. H., Sunwoo, D., Sullivan, M. B., Lee, I., Erez, M. "Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems," *Proceedings of the International Symposium on High Performance Computer Architecture (HPCA)*, New Orleans, LA, February 25, 2012.
- Sullivan, M. B., Swartzlander, E. E. "Hybrid Residue Generators for Increased Efficiency," *Proceedings of the Asilomar Conference on Signals and Systems*, Pacific Grove, CA, November 3, 2011.
- Powell, M. R., Sullivan, M. B., Vlassiouk, I., Constantin, D., Sundre, O., Martens, C. C., Eisenberg, R. E., and Siwy, Z. S. "Nanoprecipitation-assisted ion current oscillations," *Nature Nanotechnology*, Vol. 3, No. 1 (January 2008): 51–57.

## **PATENTS**

- Sullivan, M. B., Hassan, M. T. B. M., Jaleel, A. "Alias-Free Tagged Error Correcting Codes for Machine Memory Operations" *U.S. Patent* 12,321,230, 2025.
- Tsai, T., Jha, S., Hari, S. K. S., Sullivan, M. B. "Hardware Fault Detection for Feedback Control Systems in Autonomous Machine Applications" *U.S. Patent* 12,054,164, 2024.
- Sullivan, M. B., Saxena, N. R., Keckler, S. W. "Single-Cycle Byte Correcting and Multi-Byte Detecting Error Code" *U.S. Patent* 12,149,259, 2024.
- Hassan, M. T. B. M., Jaleel, A., Stephenson, M., Sullivan, M. B. "Implementing Compiler-Based Memory Safety for a Graphic Processing Unit." *U.S. Patent* 11,836,361, 2023.
- Sullivan, M. B., Pool, J. M., Huang, Y., Tsai, T. K., Hari, S. K. S., Keckler, S. W. "Packed Error Correction Code (ECC) for Compressed Data Protection" *U.S. Patent* 11,522,565, 2022.
- Sullivan, M. B., Hari, S. K. S., Zimmer, B., Tsai, T., Keckler, S. W. "System and Methods for Hardware-Software Cooperative Pipeline Error Detection" *U.S. Patent* 11,409,597, 2022.
- Mills, P., Sullivan, M. B., Saxena, N., Brooks, J. "Techniques for Storing Data to Enhance Recovery and Detection of Data Corruption Errors" *U.S. Patent* 11,474,897, 2022.
- Sullivan, M. B., Hari, S. K. S., Zimmer, B., Tsai, T., Keckler, S. W. "System and Methods for Hardware-Software Cooperative Pipeline Error Detection" *U.S. Patent* 10,621,022, 2020.
- Hari, S. K. S., Sullivan, M. B., Tsai, T., Keckler, S. W., Mahmoud, A. "Optimizing Software-Directed Instruction Replication for GPU Error Detection" *U.S. Patent* 10,817,289, 2020.

## **TECHNICAL REPORTS & SELF-PUBLISHED PAPERS**

- Hari, S. K. S., Rech, P., Tsai, T., Stephenson, M., Zulfiqar, A., Sullivan, M. B., Shirvani, P., Racunas, P., Emer, J., Keckler, S. W. "Estimating Silent Data Corruption Rates Using a Two-Level Model" *arXiv preprint arXiv:2005.01445*, 2020.

- Mahmoud, A., Hari, S. K. S., Fletcher, C. W., Adve, S. V., Sakr, C., Shanbhag, N., Molchanov, P., Sullivan, M. B., Tsai, T., Keckler, S. W. "HarDNN: Feature Map Vulnerability Evaluation in CNNs" *arXiv preprint arXiv:2002.09786*, 2020.
- Jha, S., Tsai, T., Hari, S. K. S., Sullivan, M. B., Kalbarczyk, Z., Keckler, S. W., Iyer, R. K. "Kayotee: A Fault Injection-Based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors" *arXiv preprint arXiv:1907.01024*, 2019.
- Lee, I., Basoglu, M., Sullivan, M. B., Yoon, D. H., Kaplan, L., and Erez, M. "Survey of Error and Fault Detection Mechanisms," Technical Report TR-LPH-2011–002, LPH Group, Department of Electrical and Computer Engineering, The University of Texas at Austin, April, 2011.
- Sullivan, M. B., Yoon, D. H., and Erez, M. "Containment Domains: A Full-System Approach to Computational Resiliency". Technical Report TR-LPH-2011–001, LPH Group, Department of Electrical and Computer Engineering, The University of Texas at Austin, January, 2011.

## **AWARDS, FELLOWSHIPS, AND RESEARCH GRANTS**

- IEEE Top Pick in Test and Reliability, "Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications."
- IEEE Top Pick in Test and Reliability, "Characterizing and Mitigating Soft Errors in GPU DRAM."
- IEEE Micro Top Pick, "Characterizing and Mitigating Soft Errors in GPU DRAM."
- Best Paper Award, ISSRE, "AV-Fuzzer: Finding Safety Violations in Autonomous Driving Systems."
- 2019 Chosen for Best Paper, SELSE, "On the Trend of Resilience for GPU-Dense Systems."
- Best Paper Runner-Up, DSN, "Modeling Soft-Error Propagation in Programs."
- Best Paper Finalist, HPCA, "Bamboo ECC: Strong, Safe, and Flexible Codes for Reliable Computer Memory."
- 2010–2013 Temple Foundation MCD Fellowship
- Best Paper Finalist, SC, "Containment Domains: A Scalable, Efficient, and Flexible Resilience Scheme for Exascale Systems."
- 2008–2010 National Defense Science and Engineering Graduate Fellowship
- 2009 Graduate Dean Prestigious Fellowship Supplement
- NSF Graduate Research Fellowship Program Honorable Mention
- 2004–2008 George Mason University Scholar
- 2006–2008 Northern Virginia Technology Council Bannister Scholarship
- 2005–2008 AFCEA-NOVA Scholarship
- 2007 GMU Undergraduate Faculty-Student Research Apprenticeship Grant
- 2007 DoE Undergraduate Laboratory Internship Program
- 2007 NSF-REU Chemistry Leadership Group Travel Award
- NSF Research Experience for Undergraduates Program
- 2003 National Merit Scholarship Finalist

## **PROFESSIONAL SERVICE**

| 2016-2024 | 3× External Reviewer, Symposium on Microarchitecture (MICRO)                  |
|-----------|-------------------------------------------------------------------------------|
| 2018-2024 | 2× Reviewer, Computer Architecture Letters (CAL)                              |
| 2021-2023 | 3× Program Committee, Symposium on Microarchitecture (MICRO)                  |
| 2023      | Reviewer, Transactions on Computer Aided Design (TCAD)                        |
| 2016-2023 | 4× Program Committee, SELSE Workshop                                          |
| 2022      | Program Committee, International Symposium on Computer Architecture (ISCA)    |
| 2022      | Reviewer, Transactions on Nuclear Science                                     |
| 2021-2022 | 2× Program Committee, SuperCheck Workshop                                     |
| 2021      | Program Committee, Design Automation Conference (DAC)                         |
| 2019      | Program Chair, SELSE Workshop                                                 |
| 2016-2018 | 2× Reviewer, Transactions on Computers                                        |
| 2018      | Publicity Chair, SELSE Workshop                                               |
| 2018      | Reviewer, Transactions on Sustainable Computing                               |
| 2017      | Reviewer, IEEE MICRO                                                          |
| 2016      | External Reviewer, High Performance Computer Architecture (HPCA)              |
| 2015-2016 | 2× External Reviewer, International Symposium on Computer Architecture (ISCA) |
| 2014      | External Reviewer, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) |

## **POSTER SESSIONS**

- Sullivan, M. B., Swartzlander, E. E. "Long Residue Checking for Adders," Presented at the TexasWISE Workshop on VLSI, Round Top, TX, March 8, 2013.
- Sullivan, M. B., Swartzlander, E. E. "Hybrid Residue Generators for Increased Efficiency," Presented at the 45th Asilomar Conference on Signals, Pacific Grove, CA, November 3, 2011.
- Sullivan, M. B., Basoglu, M., Lee, I., Krimer, E., Erez, M. "Echelon: Reliability at the Exascale," Locality, Parallelism, and Hierarchy (LPH) Research Highlight, Austin, Texas, March 3, 2011.
- Sullivan, M. B., Siwy, Z. S., Powell, M. R., and Kalman, E. "Voltage-Gating in Synthetic Nanopores Induced by Cobalt Ions," American Chemical Society, Chicago, Illinois, March 26, 2007. Also presented at Innovations 2007, George Mason University, Fairfax, Virginia, April 25, 2007.

- Sullivan, M. B., Siwy, Z. S., Powell, M. R., and Kalman, E. "Voltage-Gating in Synthetic Nanopores Induced by Cobalt Ions," IM-SURE Symposium, University of California, Irvine, August 2006.

## **TEACHING EXPERIENCE**

| 2013-2015 | University of Texas, Austin, TX<br>Guest Lecturer, High Speed Computer Arithmetic I                                                |
|-----------|------------------------------------------------------------------------------------------------------------------------------------|
| 2003-2004 | <b>Thomas Jefferson School for Science and Technology</b>, Fairfax, Virginia<br>Instructional Assistant, Introduction to Programming |
| 2004      | <b>George Mason University</b>, Fairfax, Virginia<br>Mentor, School of Music                                                        |

## **OTHER WORK EXPERIENCE**

**George Mason University**, Fairfax, Virginia
2005–2007 Computer Lab Manager, University Scholars Program

## **PROFESSIONAL AFFILIATIONS**

- Alpha Chi Honor Society
- Alpha Lambda Delta Honor Society
- American Chemical Society
- Armed Forces Communications & Electronics Association
- Institute of Electrical and Electronics Engineers
- Golden Key International Honor Society