🚀 Feature request
Fairseq supports memory-efficient FP16 training, as explained in https://arxiv.org/pdf/1904.01038.pdf. It would be great to have the same option here.
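
For context, below is a minimal sketch of the idea behind memory-efficient FP16 training, not fairseq's actual implementation: parameters, gradients, and optimizer state all stay in FP16, so no FP32 master copy of the weights is kept, and a loss scale protects the FP16 gradients from underflow. The model, batch, and scale-adjustment schedule are placeholders; the memory saving comes from dropping the FP32 master weights and FP32 optimizer state that standard mixed-precision training keeps alongside the FP16 model.

```python
import torch

# Illustrative sketch of memory-efficient FP16 training (not fairseq's code).
# Parameters, gradients, and optimizer state are all FP16 -- no FP32 master
# copy -- and a loss scale guards against gradient underflow.

model = torch.nn.Linear(1024, 1024).cuda().half()               # FP16 parameters
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9)                        # momentum buffer is FP16 too
loss_scale = 2.0 ** 15                                           # start with a large scale

for step in range(100):
    x = torch.randn(32, 1024, device="cuda", dtype=torch.float16)
    target = torch.randn(32, 1024, device="cuda", dtype=torch.float16)

    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), target)
    (loss * loss_scale).backward()                               # scale the loss before backward

    # Unscale gradients in place and check for overflow (inf/NaN).
    overflow = False
    for p in model.parameters():
        if p.grad is not None:
            p.grad.div_(loss_scale)
            if not torch.isfinite(p.grad).all():
                overflow = True

    if overflow:
        loss_scale = max(loss_scale / 2, 1.0)                    # shrink scale and skip the step
    else:
        optimizer.step()
```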
Motivation
Fine-tuning on datasets with longer sequences generally requires high-end GPUs. Memory-efficient FP16 training reduces GPU memory usage, so models can be fine-tuned on smaller GPUs without running into OOM errors.