---

# GTX 1080 Ti: FP32 vs. FP16 test

#### Note: both runs use the same data, but the FP16 run uses double the batch size.

TL;DR:

FP32: bs=105, peak GPU memory (△Peaked in the ipyexperiments report) `10074` MB, wall time 1:45

FP16: bs=210, peak GPU memory (△Peaked in the ipyexperiments report) `10162` MB, wall time 1:37
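To put the two summary lines in proportion, here is a quick back-of-the-envelope calculation that uses only the figures reported in this notebook. The variable names are illustrative. The "peak MB per sample" ratio is a rough indicator only, since the peak also covers weights, gradients, and optimizer state, not just per-sample activations.

```python
# Figures copied from the TL;DR summary of this notebook.
fp32_seconds = 1 * 60 + 45          # wall time 1:45
fp16_seconds = 1 * 60 + 37          # wall time 1:37
fp32_peak_mb, fp16_peak_mb = 10074, 10162
fp32_bs, fp16_bs = 105, 210

# Relative wall-time difference between the two runs.
speedup = fp32_seconds / fp16_seconds
time_saved_pct = 100 * (fp32_seconds - fp16_seconds) / fp32_seconds

# Rough ratio: reported peak divided by batch size (includes fixed overhead).
fp32_mb_per_sample = fp32_peak_mb / fp32_bs
fp16_mb_per_sample = fp16_peak_mb / fp16_bs

print(f"speed-up: {speedup:.2f}x ({time_saved_pct:.1f}% less wall time)")
print(f"peak MB per sample: FP32 {fp32_mb_per_sample:.1f}, FP16 {fp16_mb_per_sample:.1f}")
```

In other words, the FP16 run fits twice the batch size into roughly the same memory, while the wall-time gain on this card is under 10%.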




In [1]:
%matplotlib inline

from fastai import *
from fastai.vision import *
import re
import scipy.ndimage
from ipyexperiments import *
import fastai
fastai.__version__

path    = Path('/home/DATA/data/')
fp32exp = IPyExperimentsPytorch()



*** Experiment started with the Pytorch backend
Device: ID 0, GeForce GTX 1080 Ti (11176 RAM)


*** Current state:
RAM:    Used    Free   Total       Util
CPU:   1,870  92,729  96,560 MB   1.94% 
GPU:     560  10,615  11,176 MB   5.02% 


･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:00.000
･ CPU:          0          0      1,871 MB |
･ GPU:          0          0        560 MB |


In [2]:
path = Path('/home/DATA/data/')
bs=105
data = ImageDataBunch.from_folder(path,
                                  train='backup224',
                                  valid='backup224valid',
                                  size=224, bs=bs,
                                  ).normalize(imagenet_stats)

･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:00.808
･ CPU:          2          2      1,981 MB |
･ GPU:          0          0        560 MB |


In [3]:
learn = create_cnn(data, models.resnet50, metrics=accuracy)

･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:01.791
･ CPU:          0          0      2,093 MB |
･ GPU:        106          0        666 MB |


In [4]:
%%time
learn.fit_one_cycle(5)

epoch,train_loss,valid_loss,accuracy
1,1.064476,0.440659,0.867703
2,0.636661,0.116766,0.968158
3,0.386443,0.045268,0.989007
4,0.244284,0.023008,0.999242
5,0.161967,0.020775,0.999242


CPU times: user 42.6 s, sys: 54.4 s, total: 1min 37s
Wall time: 1min 45s
･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:01:45.978
･ CPU:          0          0      2,136 MB |
･ GPU:        282     10,074        948 MB |


### The kernel is now restarted for the FP16 run.

In [1]:
%matplotlib inline

from fastai import *
from fastai.vision import *
import re
import scipy.ndimage
from ipyexperiments import *
import fastai
fastai.__version__

path    = Path('/home/DATA/data/')
fp16exp = IPyExperimentsPytorch()



*** Experiment started with the Pytorch backend
Device: ID 0, GeForce GTX 1080 Ti (11176 RAM)


*** Current state:
RAM:    Used    Free   Total       Util
CPU:   1,868  92,730  96,560 MB   1.94% 
GPU:     560  10,615  11,176 MB   5.02% 


･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:00.000
･ CPU:          0          0      1,869 MB |
･ GPU:          0          0        560 MB |


In [2]:

path = Path('/home/DATA/data/')
bs=210
data = ImageDataBunch.from_folder(path,
                                  train='backup224',
                                  valid='backup224valid',
                                  size=224, bs=bs,
                                  ).normalize(imagenet_stats)

･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:01.371
･ CPU:          3          2      2,049 MB |
･ GPU:          0          0        560 MB |


In [3]:
learn = create_cnn(data, models.resnet50, metrics=accuracy).to_fp16()

･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:00:01.804
･ CPU:          0          0      2,094 MB |
･ GPU:         82         32        642 MB |


In [4]:
%%time
learn.fit_one_cycle(5)

epoch,train_loss,valid_loss,accuracy
1,1.203099,0.464430,0.860500
2,0.855579,0.214616,0.948825
3,0.598909,0.096289,0.982183
4,0.433335,0.060937,0.987491
5,0.331232,0.056734,0.990144


CPU times: user 36.4 s, sys: 42.6 s, total: 1min 18s
Wall time: 1min 37s
･ RAM:  △Consumed    △Peaked    Used Total | Exec time 0:01:37.449
･ CPU:          0          0      2,137 MB |
･ GPU:        160     10,162        802 MB |
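
For a side-by-side view of the two runs, the final-epoch rows from the training logs above can be compared directly. This is a minimal sketch. The dictionary layout and the `error_rate_pct` helper are illustrative, not part of fastai. Keep in mind that the runs are not like-for-like: with the batch size doubled, the FP16 run performs about half as many optimizer steps per epoch.

```python
# Final-epoch (epoch 5) rows copied from the two training logs above.
runs = {
    "FP32, bs=105": {"train_loss": 0.161967, "valid_loss": 0.020775, "accuracy": 0.999242},
    "FP16, bs=210": {"train_loss": 0.331232, "valid_loss": 0.056734, "accuracy": 0.990144},
}


def error_rate_pct(accuracy):
    """Convert an accuracy in [0, 1] to an error rate in percent."""
    return 100 * (1 - accuracy)


for name, row in runs.items():
    print(
        f"{name}: valid_loss={row['valid_loss']:.6f}, "
        f"error rate={error_rate_pct(row['accuracy']):.3f}%"
    )
```

After 5 epochs the FP32 run ends with the lower validation loss and error rate, so the comparison above is about memory and wall time rather than about matched final accuracy.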
