Tool used to reduce memory usage of dataframe object through transforming dtype of each column.

RyanWangZf/mem_usage_reduction

mem_usage_reduction

A tool that reduces the memory usage of a pandas DataFrame by converting each column to a smaller dtype.

User Guidance

import pandas as pd
import reduce_mem_usage
df = pd.read_csv('example.csv')
props, nalist, dtype_dict = reduce_mem_usage.get(df)

Attention

NaN values in a column are filled with that column's minimum minus 1.
nalist lists the columns that contained NaN, and props is the compressed DataFrame.
dtype_dict maps each column of the compressed DataFrame to its dtype; pass it to pd.read_csv('example.csv', dtype=dtype_dict) to load the file with the reduced dtypes directly.
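To make the behaviour described above concrete, here is a minimal sketch of the general technique. It is not the code from reduce_mem_usage; the function name reduce_mem_usage_sketch and its details are assumptions for illustration. It fills NaN with the column minimum minus 1, downcasts numeric columns with pd.to_numeric, and returns the same three values.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def reduce_mem_usage_sketch(df):
    """Hypothetical sketch (not the repository's implementation).

    Returns (props, nalist, dtype_dict), mirroring the names used above.
    """
    props = df.copy()  # work on a copy so the caller's DataFrame is unchanged
    nalist = []
    for col in props.columns:
        # Leave strings, booleans, datetimes, etc. untouched.
        if not is_numeric_dtype(props[col]) or is_bool_dtype(props[col]):
            continue
        if props[col].isna().any():
            nalist.append(col)
            # Fill NaN with (column minimum - 1), a value not otherwise present.
            props[col] = props[col].fillna(props[col].min() - 1)
        values = props[col]
        if (values % 1 == 0).all():
            # Whole numbers only: use the smallest signed integer type that fits.
            props[col] = pd.to_numeric(values, downcast="integer")
        else:
            # Fractional values: downcast to float32 where possible.
            props[col] = pd.to_numeric(values, downcast="float")
    # Store dtype names as strings so the mapping can be written to JSON.
    dtype_dict = {col: str(dtype) for col, dtype in props.dtypes.items()}
    return props, nalist, dtype_dict
```

Because the fill value is one below the column minimum, it cannot collide with a real value, so the filled entries can still be identified later with the help of nalist.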

You can also save dtype_dict and load it as follows:

# Save
import json
with open('dict.json', 'w', encoding='utf-8') as f:
    json.dump(dtype_dict, f)

# Load (json.load parses the file safely; eval is not needed)
import json
with open('dict.json', 'r', encoding='utf-8') as f:
    dtype_dict = json.load(f)
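The following self-contained sketch shows the full round trip: write a dtype mapping to JSON, read it back, and pass it to pd.read_csv. The CSV contents, the column names, and the assumption that dtype_dict holds dtype names as strings (such as 'int8') are illustrative only and not taken from the repository.

```python
import io
import json
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for example.csv and for the dtype_dict returned by the tool.
csv_text = "id,score\n1,0.5\n2,1.5\n3,2.5\n"
dtype_dict = {"id": "int8", "score": "float32"}

# Save the mapping as JSON, then load it back.
path = os.path.join(tempfile.mkdtemp(), "dict.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(dtype_dict, f)
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

# Read the CSV straight into the reduced dtypes.
df = pd.read_csv(io.StringIO(csv_text), dtype=loaded)
print(df.dtypes)
```

One caveat: pd.read_csv raises an error when a column is given a NumPy integer dtype but the file contains missing values in it, so columns listed in nalist may need a float dtype or explicit handling when the original CSV is reloaded.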

Environment

Python 3.6.2 AMD64
pandas (0.20.3)
numpy (1.13.3+mkl)

Additional

No copyright, feel free to use it.
