In [None]:
import numpy as np
import pandas as pd

By default, when `pandas` loads a `.csv` file and encounters floating-point values, it stores them as `float64`.

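Since the data we are given only has a precision of six digits, `float64` is overkill. `float32` reliably preserves six significant decimal digits, so as long as each value has no more than that, converting keeps *all* of the data's precision while halving the memory those columns use!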
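To back up the six-digit claim, here is a minimal sketch that asks NumPy how many decimal digits each float type is guaranteed to preserve, then round-trips the sample value used in this notebook. The variable names (`value`, `round_trip_32`, `round_trip_16`) are just for illustration.

```python
import numpy as np

# Decimal digits each type is guaranteed to preserve
for dtype in (np.float64, np.float32, np.float16):
    print(dtype.__name__, np.finfo(dtype).precision)

value = -0.622475  # sample value with six significant digits

# Store the value in the smaller type, then print it back with six digits
round_trip_32 = float(format(float(np.float32(value)), '.6g'))
round_trip_16 = float(format(float(np.float16(value)), '.6g'))

print('float32 keeps the value:', round_trip_32 == value)
print('float16 keeps the value:', round_trip_16 == value)
```

If the data had more than six significant digits, `float32` would start rounding them, so it's worth checking a few raw values first.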

In [None]:
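# Same value stored at three precisions, with the bytes each one uses
print('float\t\t bytes')
print(np.float64(-0.622475), '\t', np.float64(-0.622475).nbytes)
print(np.float32(-0.622475), '\t', np.float32(-0.622475).nbytes)
# float16 is smaller still, but it visibly rounds the value
print(np.float16(-0.622475), '\t', np.float16(-0.622475).nbytes)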

In [None]:
df = pd.read_csv('../input/jane-street-market-prediction/train.csv')
df.info()

In [None]:
float64_cols = df.select_dtypes(include='float64').columns
mapper = {col_name: np.float32 for col_name in float64_cols}
df = df.astype(mapper)
df.info()

That's roughly half the memory usage (any non-float columns keep their original size), *with the data's six digits of precision intact*!

As a bonus, here it is as a one-liner after loading `train.csv` into `df` using `.read_csv`:
```
df = df.astype({c: np.float32 for c in df.select_dtypes(include='float64').columns})
```
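The one-liner still loads everything as `float64` first and converts afterwards. As a possible alternative, the sketch below passes a `dtype` mapping to `pd.read_csv` so the float columns are stored as `float32` straight away. `read_csv_as_float32` and `sample_rows` are hypothetical names, not part of pandas, and the tiny demo file with its column names is made up to stand in for `train.csv`.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_csv_as_float32(path, sample_rows=100):
    """Hypothetical helper: load a CSV with float columns stored as float32."""
    # Infer dtypes from a small sample instead of the whole file.
    sample = pd.read_csv(path, nrows=sample_rows)
    float_cols = sample.select_dtypes(include='float64').columns
    # Columns not listed in the mapping keep the dtype pandas infers.
    return pd.read_csv(path, dtype={c: np.float32 for c in float_cols})


# Tiny made-up file standing in for train.csv.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.csv')
    with open(demo_path, 'w') as f:
        f.write('ts_id,feature_1,feature_2\n0,-0.622475,1.5\n1,0.125,-2.25\n')
    demo = read_csv_as_float32(demo_path)

print(demo.dtypes)
```

One caveat: a column that looks like integers in the sample but contains floats or missing values further down would not be in the mapping, so it would still come out as `float64`. Running the one-liner afterwards would catch any such stragglers.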