
Add low-precision to TFNO #172

Merged: 12 commits, Jul 11, 2023

Conversation

crwhite14 (Collaborator) commented:

This pull request adds low-precision options for TFNO. Here is a brief summary.

  1. opt.amp_autocast was previously in the yaml file but was a no-op. This PR gets opt.amp_autocast working (see the training-loop sketch after this list).
  • False (default): run the model in full precision
  • True: turn on torch.amp.autocast, torch's built-in mixed-precision mode. This runs many operations in half precision, with a few notable exceptions: reductions, weight updates (which are unstable in half precision), and complex-valued operations (for which half precision is not implemented).
  2. Add a new parameter to the yaml file, fno_block_precision (see the spectral-convolution sketch after this list).
  • 'full' (default): standard full precision
  • 'half': the FFT, contraction, and inverse FFT run in half precision
  • 'mixed': the FFT runs in full precision, and the contraction and inverse FFT run in half precision
  3. Add a new parameter to the yaml file, stabilizer.
  • None (default): no stabilizer
  • 'tanh': applies a tanh just before the FFT in the FNO block. This typically needs to be set when fno_block_precision='half' to keep the FFT from overflowing.
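As a rough illustration, here is a minimal sketch of how an opt.amp_autocast flag typically plugs into a PyTorch training loop; the names (train_epoch, loader, loss_fn) are illustrative, not the exact ones used in train_navier_stokes.py:

    import torch

    def train_epoch(model, loader, loss_fn, optimizer, amp_autocast=False):
        # GradScaler keeps the weight update itself in full precision
        scaler = torch.cuda.amp.GradScaler(enabled=amp_autocast)
        for x, y in loader:
            x, y = x.cuda(), y.cuda()
            optimizer.zero_grad()
            # autocast runs most ops in float16; reductions and complex-valued
            # ops (e.g. FFTs) stay in float32
            with torch.autocast(device_type='cuda', enabled=amp_autocast):
                out = model(x)
                loss = loss_fn(out, y)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()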
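And a hedged sketch of where fno_block_precision and stabilizer act inside a spectral convolution. This follows the PR description rather than the exact library code; spectral_conv and weight are illustrative names:

    import torch

    def spectral_conv(x, weight, fno_block_precision='full', stabilizer=None):
        if stabilizer == 'tanh':
            # bound activations before the FFT; typically needed when
            # fno_block_precision == 'half' to avoid float16 overflow
            x = torch.tanh(x)
        if fno_block_precision == 'half':
            x = x.half()  # the FFT itself then runs in half precision
        x_ft = torch.fft.rfftn(x, dim=(-2, -1))
        if fno_block_precision == 'mixed':
            # the FFT ran in full precision; cast so the contraction and
            # inverse FFT run in (complex) half precision
            x_ft = x_ft.chalf()
        # contraction with the complex spectral weights
        out_ft = x_ft * weight.to(x_ft.dtype)
        return torch.fft.irfftn(out_ft, s=x.shape[-2:], dim=(-2, -1))

Note that half-precision FFTs in PyTorch require CUDA and power-of-two signal sizes, which is consistent with the speedups below being largest at 64x64 resolution and above.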

Running:

python train_navier_stokes.py --opt.amp_autocast=True --tfno2d.fno_block_precision='half' --tfno2d.stabilizer='tanh'
python train_navier_darcy.py --opt.amp_autocast=True --tfno2d.fno_block_precision='half' --tfno2d.stabilizer='tanh'

Improves runtime and memory usage by up to 30%, depending on the GPU, the resolution of the data (greater speedups at 64x64 resolution or higher), and other hyperparameters such as factorization and rank.

@JeanKossaifi (Member) left a comment:

This looks great, thanks!

    if self.fno_block_precision == 'half':
        x = x.half()
    else:
        x = x.float()
@JeanKossaifi (Member) commented on this diff:

Should we remove this one, or do you think we need to always explicitly cast here @crwhite14 @rtu715?

@crwhite14 (Collaborator, Author) replied:

We can remove the else: x = x.float() branch.
I just did that in this commit: bcc52e7#diff-5ae1e49af12ed16c75135c0043a08575110fd03d4c722a837b60aa0950b31e32L342-L343
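For reference, a sketch of the remaining cast after that commit, assuming only the half branch survives:

    if self.fno_block_precision == 'half':
        x = x.half()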

@JeanKossaifi (Member):
Thanks, great PR @crwhite14 @rtu715, merging!

@JeanKossaifi merged commit 1051112 into neuraloperator:main on Jul 11, 2023
1 check passed
ziqi-ma pushed a commit to ziqi-ma/neuraloperator that referenced this pull request Aug 29, 2023