<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="35%" align="right" border="0"><br>

# Python for Quantitative Finance

&copy; Dr. Yves J. Hilpisch | The Python Quants GmbH

http://tpq.io | [training@tpq.io](mailto:training@tpq.io) | [@dyjh](http://twitter.com/dyjh)

# Input-Output Operations

In [None]:
import sys
import numpy as np
import pandas as pd

In [None]:
from numpy.random import default_rng

In [None]:
rng = default_rng()

In [None]:
N = 2_000_000

In [None]:
%time a = rng.standard_normal((N, 5)).round(8)

In [None]:
a.nbytes
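
As a sanity check on the figure returned by `a.nbytes`, the expected size can be worked out by hand: the array holds `N * 5` values of type `float64`, at 8 bytes each. The following is a minimal sketch; the names `itemsize`, `expected_bytes` and `small` are illustrative and not part of the original notebook.

```python
import numpy as np

N = 2_000_000
itemsize = np.dtype('float64').itemsize  # bytes per float64 value
expected_bytes = N * 5 * itemsize  # rows * columns * bytes per value

# check the same rule on a small array instead of allocating the full one again
small = np.zeros((1_000, 5))
assert small.nbytes == 1_000 * 5 * itemsize

print(f'{expected_bytes:,} bytes = {expected_bytes / 1e6:.0f} MB')
```

This in-memory size is a useful baseline when comparing the on-disk sizes of the storage formats that follow.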

In [None]:
%time df = pd.DataFrame(a, columns=list('abcde'))

In [None]:
df.head()

## `CSV` Files

In [None]:
store = '/Users/yves/Temp/data/'  # adjust the store to a local one
# store = ''  # for local storage

In [None]:
%time df.to_csv(store + 'data.csv')

In [None]:
!ls -n $store

In [None]:
!head {store}data.csv

In [None]:
%time df_ = pd.read_csv(store + 'data.csv', index_col=0)
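
The write and read steps above can be checked for consistency on a smaller data set. The sketch below is an illustration only: it assumes a temporary directory instead of `store`, a fixed seed, and the hypothetical names `small`, `back`, `same_shape` and `same_values`. The values are compared with `np.allclose`, since the default CSV float parser is not guaranteed to restore every last bit.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from numpy.random import default_rng

rng = default_rng(100)  # fixed seed (an assumption, for reproducibility)
small = pd.DataFrame(rng.standard_normal((1_000, 5)).round(8),
                     columns=list('abcde'))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'data.csv'
    small.to_csv(path)  # the index is written as the first column
    csv_bytes = path.stat().st_size  # size of the text file on disk
    back = pd.read_csv(path, index_col=0)  # restore the index from column 0

same_shape = back.shape == small.shape
same_values = np.allclose(back.to_numpy(), small.to_numpy())
print(csv_bytes, small.to_numpy().nbytes, same_shape, same_values)
```

Printing the file size next to the in-memory size shows how much space the text representation takes relative to the raw `float64` data.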

## Binary Storage Formats

**HDF5 Standard** (https://www.hdfgroup.org/solutions/hdf5/)

In [None]:
%time df.to_hdf(store + 'data.h5', key='data')

In [None]:
!ls -n $store

In [None]:
%time df_ = pd.read_hdf(store + 'data.h5', 'data')

In [None]:
df_.head()

In [None]:
%timeit df_ = pd.read_hdf(store + 'data.h5', 'data')

**Apache Parquet** (https://parquet.apache.org/) 

In [None]:
%time df.to_parquet(store + 'data.pq')

In [None]:
!ls -n $store

In [None]:
%timeit df_ = pd.read_parquet(store + 'data.pq')

## SQL Databases

In [None]:
import sqlite3 as sq3

In [None]:
con = sq3.connect(store + 'data.sql')

In [None]:
%time df.to_sql('data', con)

In [None]:
!ls -n $store

In [None]:
%time df_ = pd.read_sql('SELECT * FROM data', con)

In [None]:
%time con.execute('SELECT * FROM data WHERE c > 4.0').fetchmany(3)

In [None]:
%time df.query('c > 4.0').head(3)

In [None]:
%time df_ = pd.read_sql('SELECT * FROM data WHERE a < -4.1', con)

In [None]:
df_.head()

In [None]:
%%time
df_ = pd.read_sql('SELECT * FROM data WHERE a < -2.75 AND e > 2.75',
                  con, index_col='index')

In [None]:
df_.head()
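
The SQL filter above has a direct pandas equivalent, and the two can be checked against each other. The following sketch rests on assumptions that are not in the original notebook: an in-memory SQLite database, a smaller data set with a fixed seed, wider thresholds so that enough rows match, and `?` placeholders with `params` instead of literals in the query string.

```python
import sqlite3 as sq3

import numpy as np
import pandas as pd
from numpy.random import default_rng

rng = default_rng(100)  # fixed seed (an assumption, for reproducibility)
small = pd.DataFrame(rng.standard_normal((10_000, 5)).round(8),
                     columns=list('abcde'))

con = sq3.connect(':memory:')  # in-memory database, nothing written to disk
small.to_sql('data', con)  # the unnamed index becomes a column called "index"

lo, hi = -1.0, 1.0  # illustrative thresholds
from_sql = pd.read_sql('SELECT * FROM data WHERE a < ? AND e > ?',
                       con, params=(lo, hi), index_col='index').sort_index()
con.close()

# the same selection expressed with a boolean mask in pandas
from_pandas = small[(small['a'] < lo) & (small['e'] > hi)]

print(len(from_sql), len(from_pandas))
```

Passing the thresholds via `params` keeps the query text constant and leaves value quoting to the database driver.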

In [None]:
con.close()  # release the SQLite file before deleting it
!rm {store}data.*

In [None]:
!ls -n $store
