with low_cpu_mem_usage, the max ram usage is still 7~8G to quantize 7B models, which is not expected. The target is <2G