Commit

feat(util.pandas): Add optimise_df to reduce dataframe size.
aaronmussig committed Apr 15, 2022
1 parent 52bca3b commit 706baff
Showing 3 changed files with 21 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -52,6 +52,7 @@ This has been written with the intention of personal use, but feel free to use/c
util/io
util/accession
util/tree
util/pandas


.. toctree::
6 changes: 6 additions & 0 deletions docs/source/util/pandas.rst
@@ -0,0 +1,6 @@
******
Pandas
******

.. autofunction:: magna.util.pandas.optimise_df

14 changes: 14 additions & 0 deletions magna/util/pandas.py
@@ -0,0 +1,14 @@
import pandas as pd


def optimise_df(df: pd.DataFrame):
    """Optimise a Pandas DataFrame by using the smallest possible data type.

    Args:
        df: The Pandas DataFrame to optimise.
    """

    float_cols = df.select_dtypes('float').columns
    int_cols = df.select_dtypes('integer').columns
    df[float_cols] = df[float_cols].apply(pd.to_numeric, downcast='float')
    df[int_cols] = df[int_cols].apply(pd.to_numeric, downcast='integer')
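
A minimal usage sketch, not part of the commit: it copies `optimise_df` from the diff above so the block runs on its own, then applies it to a small made-up DataFrame (the column names `a`, `b`, `c` and their values are illustrative assumptions). It suggests that the function modifies the DataFrame it is given and returns nothing, and that non-numeric columns are left alone.

```python
import pandas as pd


def optimise_df(df: pd.DataFrame):
    """Copy of the function added in this commit, repeated so the sketch is self-contained."""
    float_cols = df.select_dtypes('float').columns
    int_cols = df.select_dtypes('integer').columns
    df[float_cols] = df[float_cols].apply(pd.to_numeric, downcast='float')
    df[int_cols] = df[int_cols].apply(pd.to_numeric, downcast='integer')


# Hypothetical example data: one integer, one float and one string column.
df = pd.DataFrame({
    'a': [1, 2, 3],        # starts as int64
    'b': [1.5, 2.5, 3.5],  # starts as float64
    'c': ['x', 'y', 'z'],  # not numeric, so it is not touched
})

before = df.memory_usage(deep=True).sum()
optimise_df(df)  # the columns of df are replaced; the call returns None
after = df.memory_usage(deep=True).sum()

print(df.dtypes)  # 'a' becomes int8 and 'b' becomes float32
print(f'{before} bytes before, {after} bytes after')
```

Because the caller's DataFrame is changed directly, passing `df.copy()` is one way to keep the original dtypes available. Downcasting floats to `float32` can lose precision for values that need more than about seven significant digits, so it may be worth checking the result on real data.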
