## dan.luu@gmail.com

## EXPERIENCE

### Recurse, Sabbatical; New York, NY (Summer 201')

### Microsoft, Engineer; Seattle, WA (2015-201)

- BitFunnel search engine; near order of magnitude throughput/cost improvement (C++)
  - Found an algorithmic simplification, reducing largest and most complicated part of the system to 30 LOC
  - Replaced poorly-understood ML config system with optimal mathematical formula; 2x perf improvement
- SmartNIC; multiple order of magnitude tail latency improvement (System Verilog)
  - Half the latency of Amazon "enhanced networking"

### Google, Engineer; Madison, WI (2013 – 2014)

- TPU (deep learning hardware accelerator); 2nd person on project
  - Order of magnitude performance improvement over GPUs
  - https://www.google.com/patents/WO2016186801A1
  - https://www.google.com/patents/US20160342889

### Recurse, Sabbatical; New York, NY (Spring 2013)

### Centaur Technology (acquired by VIA), Member of Technical Staff; Austin, TX (2005 – 2013)

- Here's one sample six-month project (adding an ARM front
  - Helped reverse engineer the ARMv7 ISA (this was pre-AArch64)
  - Created architectural simulator and got Android running on it
  - Implemented 1/2 of the translator, and wrote associated microcode (Verilog / Templating language)
  - Created test generator that found 90% of the first 1000 bugs on the project ($F_{\pi}$)
- Other projects included adding fault tolerance to a distributed system, post-silicon debug, test tooling, etc.
  - Improved job scheduling system, improving machine utilization from 60% to 92%

### Ultrafast Optics and Fiber Communications Lab, Research Assistant; Lafayette, IN (2003 – 2008)

- Lab work, included speeding up parallel (256 wavelength) polarimeter by 40x (MATLAB and C)

### IBM, Intern; Austin, TX (Summer 2003)

- Semi-formal / constrained random POWER6 completion unit functional verification (VHDL)

### Micron Technology, Intern; Boise, ID (Summer 2009)

- Flash product engineering / characterization. Automated previously manual tasks. (Perl)

### Spatial Systems Research Laboratory, Research Assistant; Madison, WI (200)

- Studied tilings and related combinatorial models, e.g., alternating sign matrices and square ice

## EDUCATION

BS Math & CMPE (Wisconsin, '00-'03), MS EE (Purdue, '03-'05)

## NON-WORK PROJECTS

- Randomized algorithms can beat LRU/pseudo-LRU caches
- A fuzzer written in an hour that found ~20 bugs in Julia: https://github.com/danluu/Fuzz.j
- Web performance benchmarks for slow/flaky connections: https://danluu.com/web-bloat/
- https://github.com/danluu/secvisor-formal-verification
- Combining AFL and QuickCheck for directed fuzzing: https://danluu.com/testing/
- Terminal latency: https://danluu.com/term-latency/
- Sega system on FPGA: https://github.com/danluu/sega-system-for-fpgc
- Keyboard vs. mousing speed: https://danluu.com/keyboard-v-mouse/
- See https://github.com/danluu/ and http://danluu.com for more!

## MISCELLANEOUS

- Work Authorization: U.S. Citizen