Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation.

zoucan520/awesome-machine-learning-in-compilers

 
 

Awesome machine learning for compilers and program optimisation

A curated list of awesome research papers, datasets, and tools for applying machine learning techniques to compilers and program optimisation.

Contents

Papers

Survey

Iterative Compilation and Compiler Option Tuning

Instruction-level Optimisation

Auto-tuning and Design Space Exploration

Parallelism Mapping and Task Scheduling

Domain-specific Optimisation

Languages and Compilation

Code Size Reduction

Cost and Performance Models

Learning Program Representation

Enabling ML in Compilers

Memory/Cache Modeling/Analysis

  • Learning Memory Access Patterns (10 pages) - Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan. ICML 2018.

Books

Talks and Tutorials

Software

  • CompilerGym - Reinforcement learning environments for compiler optimizations.
  • CodeBERT - Pre-trained DNN models for programming languages (paper).
  • programl - LLVM and XLA IR program representation for machine learning (paper).
  • NeuroVectorizer - Deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas (paper).
  • TVM - Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators (paper; slides).
  • clgen - Benchmark generator using LSTMs (paper; slides).
  • COBAYN - Compiler autotuning using Bayesian networks (paper).
  • OpenTuner - Framework for building domain-specific multi-objective program autotuners (paper; slides).
  • ONNX-MLIR - Representation and reference lowering of ONNX models in the MLIR compiler infrastructure (paper).

Benchmarks and Datasets

  • The Alberta Workloads for the SPEC CPU® 2017 Benchmark Suite - Additional workloads for the SPEC CPU2017 Benchmark Suite.
  • Project CodeNet - Code samples written in 50+ programming languages, annotated with metadata such as code size, memory footprint, CPU run time, and status (acceptance/error types).
  • CodeXGLUE - A machine learning benchmark dataset for code understanding and generation (paper).
  • AnghaBench - A suite of one million compilable C benchmarks (paper).
  • BHive - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models (paper).
  • cBench - 32 C benchmarks with datasets and driver scripts.
  • PolyBench - Dataset - Multiple datasets for PolyBench (paper).
  • PolyBench - 31 stencil and linear-algebra benchmarks with datasets and driver scripts.
  • PolyBench - Original - 30 stencil and linear-algebra benchmarks with datasets and driver scripts.
  • DeepDataFlow - 469k LLVM-IR files and 8.6B data-flow analysis labels for classification (paper).
  • devmap - 650 OpenCL benchmark features and CPU/GPU classification labels (paper; slides).

Conferences

Journals

How to Contribute

See Contribution Guidelines. TL;DR: send one of the maintainers a pull request.
