Memory usage on 16bit calculation. #507

Closed
Alan-Turing-Ko opened this issue Aug 1, 2018 · 5 comments

@Alan-Turing-Ko

I've tested the new 8-bit and 16-bit inference.
I expected memory usage to be about half for 16-bit inference compared to 32-bit, but the actual result exceeded my expectation.
On creating the net, 11MB is used for the 32-bit model, but 16.5MB for the 16-bit model.
I turned off the winograd and sgemm options to get a pure memory comparison.
The Caffe model size is 12.2MB.
What am I doing wrong?
I'd appreciate any suggestions.

@nihui
Member

nihui commented Aug 3, 2018

How do you measure the memory usage?
You should look at the RES value, not the VIRT value, in the top utility.
In theory, the memory usage should be the same for float32 and float16.
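
(For reference, on Linux/Android the RES and VIRT columns of top correspond to the VmRSS and VmSize fields of /proc/self/status. Below is a minimal Java sketch, not part of the original thread, that reads both values from inside the app process so they can be logged next to the heap numbers; the log tag is arbitrary.)

    // Read VmRSS (resident memory, top's RES) and VmSize (virtual memory,
    // top's VIRT) for the current process. /proc/self/status reports both in kB.
    try {
        java.io.BufferedReader reader =
                new java.io.BufferedReader(new java.io.FileReader("/proc/self/status"));
        long vmRssKb = -1, vmSizeKb = -1;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("VmRSS:")) {
                vmRssKb = Long.parseLong(line.replaceAll("[^0-9]", ""));
            } else if (line.startsWith("VmSize:")) {
                vmSizeKb = Long.parseLong(line.replaceAll("[^0-9]", ""));
            }
        }
        reader.close();
        Log.e("Proc mem", "VmRSS(kB)=" + vmRssKb + " VmSize(kB)=" + vmSizeKb);
    } catch (java.io.IOException e) {
        e.printStackTrace();
    }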

@liaocz

liaocz commented Aug 3, 2018

Which platform are you testing on?

@Alan-Turing-Ko
Author

Thanks for your reply.
I tested on an ARM platform.
I use ncnn in an Android app, and I measure memory with the squeezenet example included in the project.

    // Native heap usage before loading the model.
    long initMem = Debug.getNativeHeapAllocatedSize();

    // Read the param file from assets into a byte array.
    {
        InputStream assetsInputStream = getAssets().open("kkkkk.param.bin");
        int available = assetsInputStream.available();
        param = new byte[available];
        int byteCode = assetsInputStream.read(param);
        assetsInputStream.close();
    }
    // Read the model (weights) file from assets into a byte array.
    {
        InputStream assetsInputStream = getAssets().open("kkkkk.bin");
        int available = assetsInputStream.available();
        bin = new byte[available];
        int byteCode = assetsInputStream.read(bin);
        assetsInputStream.close();
    }

    words = new byte[0];
    // Create the net through the JNI wrapper, then take the native heap delta
    // as the net's memory usage.
    boolean binit = squeezencnn.Init(param, bin, words);
    long MemSize = Debug.getNativeHeapAllocatedSize() - initMem;

    Log.e("Net mem: ", String.valueOf(MemSize));

And one more question.
I converted my model to int8 using BUG1989/caffe-int8-convert-tools.
But during inference I get all NaN values after passing through a conv layer.
What could be the problem?

@liaocz

liaocz commented Aug 3, 2018

I haven't seen an int8 implementation for the ARM platform. From reading the code, it seems to convert to float32 for the calculation, but I'm not sure if I'm wrong.

@nihui
Member

nihui commented Dec 4, 2023

ncnn now supports fp16 operations and takes up half the memory compared to fp32.

@nihui nihui closed this as completed Dec 4, 2023