XLA Summary

Project link

Detailed reports are available in our XLA-Report repository.

Group Members

| Name | GitHub Account | ID |
| --- | --- | --- |
| 宋小牛 | Jeffery-Song | PB15000301 |
| 王若冰 | MalcolmUnWang | PB15121735 |
| 陈翊辉 | cyh-ustc | PB15111656 |

Introduction

XLA (Accelerated Linear Algebra) (GitHub) is a domain-specific compiler for linear algebra that optimizes TensorFlow computations, improving speed, memory usage, and portability on server and mobile platforms. Users can invoke XLA through either JIT (just-in-time) compilation or AOT (ahead-of-time) compilation.

In this project, we conduct a small study of XLA, covering:

  • What kinds of acceleration XLA can perform
  • How XLA performs them
  • Where JIT and AOT come in
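As context for the JIT path: in the TensorFlow 1.x API (current at the time of this project), XLA JIT compilation can be enabled session-wide through the session configuration. This is a minimal sketch, assuming a TensorFlow build with XLA support; it is a configuration fragment rather than a complete program.

```python
import tensorflow as tf  # TF 1.x API; requires a build with XLA enabled

# Turn on XLA JIT for all compilable ops in this session.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

with tf.Session(config=config) as sess:
    x = tf.placeholder(tf.float32, shape=[2, 2])
    y = tf.matmul(x, x)  # eligible for XLA clustering and fusion
    sess.run(y, feed_dict={x: [[1.0, 0.0], [0.0, 1.0]]})
```

With the global JIT level set, TensorFlow groups compilable ops into XLA clusters and compiles each cluster into fused kernels, rather than executing the ops one by one.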

Summary of Our Work

12.23 Commit

  • Discussed the scope of the project and assigned research tasks.
  • Got started with TensorFlow and learned the background knowledge; we wrote a report on TensorFlow Programming.
  • Installed TensorFlow with GPU support and started experimenting with basic operations.
  • Learned the general concepts and implementation of XLA; we wrote two separate reports, XLA Overview and XLA Just-in-time Compilation.

1.3 Commit

  • Compiled TensorFlow from source, since XLA can only be included when TensorFlow is built from source.
  • Turned on JIT compilation to enable XLA and ran some optimized examples.
  • Did some basic profiling with the CUDA Profiling Tools Interface (CUPTI).
  • Reported the tests and profiling we had done in the repository.

1.13 Commit

  • Built XLA on multiple platforms: a CPU version on PC, a GPU version with CUDA 8.0 on PC (Nvidia GTX 970M), a GPU version with CUDA 9.0 on a high-performance cluster (Nvidia Tesla V100), and AOT compilation targeting x86-64 and ARM.

  • Performed a full benchmark and profiling of JIT compilation on the V100 cluster, including compiler optimization. Link

  • AOT compilation is used mainly on mobile platforms; we researched AOT with an AMD card and its usage. See report.

  • Broadcasting is widely used in XLA to support flexible matrix operations; our research on it can be found in our repository. Link

  • Communicated with team NNVM and team Darkroom. Link

  • Wrote our final report.
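The broadcasting mentioned above can be illustrated with NumPy, whose broadcasting rules XLA's implicit broadcasting closely mirrors: arrays of different shapes are stretched along size-1 dimensions to a common shape before an elementwise operation. A minimal sketch (the array values are illustrative):

```python
import numpy as np

# A (4, 1) column and a (3,) row broadcast to a common (4, 3) shape:
col = np.arange(4).reshape(4, 1)   # shape (4, 1): [[0], [1], [2], [3]]
row = np.arange(3)                 # shape (3,):  [0, 1, 2]

# Each operand is virtually replicated along its size-1 dimension,
# so no explicit tiling is needed.
grid = col * 10 + row              # broadcast result, shape (4, 3)
```

Here `grid[i, j] == i * 10 + j`; broadcasting lets the compiler fuse such operations without materializing the tiled intermediates.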

Reference

TensorFlow Programming

XLA Overview

XLA Just-in-time Compilation

XLA Ahead-of-time Compilation