# Wentao Hou

#### PHD STUDENT · UNIVERSITY OF WISCONSIN-MADISON

1210 W. Dayton Street, Madison, WI 53706

taoh@cs.wisc.edu | thoment.github.io | github.com/Thoment | www.linkedin.com/in/wentao-hou-55825a220

Education

## University of Wisconsin-Madison

Madison, United States

PhD in Computer Sciences

Sep 2022 - present

- Advisor: Ming Liu

## Tsinghua University

Beijing, China

BE in Electronic Engineering

Sep 2018 - Jun 2022

Publications

**Wentao Hou\***, Kai Zhong\*, Shulin Zeng, Guohao Dai, Huazhong Yang, Yu Wang, "NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring", 28th Asia and South Pacific Design Automation Conference (ASP-DAC 2023) (Accepted)

Awards and Honors

- 2020: Scholarship for excellence in the academy (top 30%), Tsinghua University
- Dec. 2018: 35th National College Physics Competition (non-physics major group), First Prize, Beijing Institute of Physics

Research Experience

## Tsinghua University - Dept of Electronic Engineering

Beijing, China

Advisor: Prof. Yu Wang

Oct 2021 - Jul 2022

- Proposed a runtime node tailoring algorithm, based on sorting attention coefficients, to accelerate graph attention networks.
- Proposed a hardware-efficient pipelined insertion sort scheme for fast node tailoring.
- Designed an accelerator architecture and dedicated processing units for Graph Attention Convolution.

## University of Virginia - Department of Computer Science

Remote

Advisor: Prof. Samira Khan

Jun 2021 - Oct 2021

- Conducted a performance breakdown of DNN pre-processing on an image dataset.
- Measured the overheads of individual pre-processing steps and looked for bottlenecks in specific scenarios.
- Profiled the overheads of differential privacy in machine learning.

## Tsinghua University - Dept of Electronic Engineering

Beijing, China

Advisor: Prof. Yu Wang

Dec 2020 - Oct 2021

- Worked on a software-hardware co-designed GNN accelerator that optimizes the loading of scattered features.
- Wrote several modules of the GNN and implemented a data path between host, device, and on-chip memory via OpenCL and AXI channels in the Xilinx SDx environment. Measured the bandwidth and latency of reading features at different granularities.
- Found that bandwidth saturates at 1 KB per access for contiguous memory accesses.

## Tsinghua University - Dept of Electronic Engineering

Beijing, China

Advisor: Prof. Yongpan Liu

Jun 2021 - Jul 2021

- Simulated a CNN accelerator with gem5. Simulated sparse acceleration by rounding small weights to zero and skipping all-zero weight groups.
- Simulated a binary neural network on an RRAM array by modifying gem5. Simulated the effects of random noise in RRAM and measured the relation between noise amplitude and accuracy. Designed an algorithm to reuse weights across layers when on-chip memory is sufficient.