# File I/O in NumPy

### *Copyright 2021-2022 Dr. George Papagiannakis,  papagian@csd.uoc.gr*
*All Rights Reserved*
### *University of Crete & Foundation for Research & Technology - Hellas (FORTH)*

This notebook is also based on parts of [Lectures on scientific computing with Python](http://github.com/jrjohansson/scientific-python-lectures) by [J.R. Johansson](http://jrjohansson.github.io). 

---

In [1]:
import matplotlib.pyplot as plt
from numpy import *

## File I/O

### Comma-separated values (CSV)

A very common file format for data files is comma-separated values (CSV), or related formats such as TSV (tab-separated values). To read data from such files into NumPy arrays we can use the `numpy.genfromtxt` function. For example:

In [2]:
with open('data/stockholm_td_adj.dat') as f:
    print("\n".join(f.read().splitlines()[:10]))

FileNotFoundError: [Errno 2] No such file or directory: 'data/stockholm_td_adj.dat'

In [3]:
data = genfromtxt('data/stockholm_td_adj.dat')

OSError: data/stockholm_td_adj.dat not found.

In [4]:
data.shape

NameError: name 'data' is not defined

In [5]:
fig, ax = plt.subplots(figsize=(14,4))
ax.plot(data[:,0]+data[:,1]/12.0+data[:,2]/365, data[:,5])
ax.axis('tight')
ax.set_title('temperatures in Stockholm')
ax.set_xlabel('year')
ax.set_ylabel('temperature (C)');

NameError: name 'data' is not defined
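The cells above need the file `data/stockholm_td_adj.dat`, which is not present in this run. As a minimal sketch of how `numpy.genfromtxt` behaves, we can read a small in-memory CSV instead. The sample rows and column names below are made up for illustration and are not taken from the real data file, and the sketch uses an explicit `import numpy as np` rather than the wildcard import used elsewhere in this notebook.

```python
import io

import numpy as np

# Hypothetical sample rows (year, month, day, temperature);
# these values are invented and not taken from the Stockholm file.
csv_text = """year,month,day,temp
1800,1,1,-6.1
1800,1,2,-15.4
1800,1,3,
"""

# delimiter="," splits each line on commas, skip_header=1 skips the
# line with column names, and the empty last field becomes nan.
sample = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(sample.shape)   # rows x columns
print(sample[:, 3])   # temperature column, with nan for the missing value
```

Handling of missing fields is the main reason to prefer `genfromtxt` over the simpler `numpy.loadtxt` for files that may have gaps.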

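Using `numpy.savetxt` we can store a NumPy array in a text file. By default the values are separated by spaces, as the output below shows; pass `delimiter=','` to write true CSV: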

In [6]:
M = random.rand(3,3)

M

array([[0.13578036, 0.95574955, 0.97814186],
       [0.94472815, 0.40963549, 0.47257093],
       [0.958744  , 0.73454476, 0.77018024]])

In [7]:
savetxt("random-matrix.csv", M)

In [8]:
with open('random-matrix.csv') as f:
    print("\n".join(f.read().splitlines()))

1.357803615014674392e-01 9.557495488908374082e-01 9.781418622059638857e-01
9.447281537602874035e-01 4.096354918417117741e-01 4.725709312972151688e-01
9.587439963666842813e-01 7.345447556494900665e-01 7.701802408525866284e-01


In [9]:
savetxt("random-matrix.csv", M, fmt='%.5f') # fmt specifies the format

with open('random-matrix.csv') as f:
    print("\n".join(f.read().splitlines()))

0.13578 0.95575 0.97814
0.94473 0.40964 0.47257
0.95874 0.73454 0.77018
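The files written above are space-separated. A minimal sketch of writing comma-separated output with a header line and reading it back is shown here. The file name `matrix.csv`, the column labels, and the use of a temporary directory are illustrative choices, not part of the original notebook.

```python
import os
import tempfile

import numpy as np

M_small = np.arange(6, dtype=float).reshape(2, 3) / 7.0

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.csv")  # hypothetical file name
    # delimiter="," produces comma-separated values; the header is
    # written as a comment line that starts with "# ".
    np.savetxt(path, M_small, fmt="%.5f", delimiter=",", header="a,b,c")

    with open(path) as f:
        text = f.read()

    # loadtxt skips lines starting with "#" by default.
    M_back = np.loadtxt(path, delimiter=",")

print(text)
# Values agree only up to the 5 decimals kept by fmt.
print(np.allclose(M_small, M_back, atol=1e-5))
```

Because `fmt='%.5f'` rounds the values, a text round trip like this is lossy; the tolerance in `np.allclose` reflects that.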


### NumPy's native file format

The native `.npy` format is useful for storing NumPy arrays and reading them back, since it preserves the dtype and shape exactly. Use the functions `numpy.save` and `numpy.load`:

In [10]:
save("random-matrix.npy", M)

In [11]:
load("random-matrix.npy")

array([[0.13578036, 0.95574955, 0.97814186],
       [0.94472815, 0.40963549, 0.47257093],
       [0.958744  , 0.73454476, 0.77018024]])
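As a further sketch, the round trip through `.npy` can be checked explicitly, and several arrays can be bundled into one `.npz` archive with `numpy.savez`. The array contents, file names, and temporary directory below are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

A = np.arange(12, dtype=np.int16).reshape(3, 4)
B = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    # Single array: dtype and shape survive the round trip.
    npy_path = os.path.join(tmp, "A.npy")  # hypothetical file name
    np.save(npy_path, A)
    A_back = np.load(npy_path)

    # Several arrays in one archive, stored under keyword names.
    npz_path = os.path.join(tmp, "arrays.npz")  # hypothetical file name
    np.savez(npz_path, A=A, B=B)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        B_back = archive["B"]

print(A_back.dtype, A_back.shape)
print(names)
print(np.array_equal(B, B_back))
```

Unlike the text output of `savetxt`, the binary format stores the values exactly, so the comparison uses `np.array_equal` rather than a tolerance.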

## More properties of NumPy arrays

In [12]:
M.itemsize # bytes per element

8

In [13]:
M.nbytes # total number of bytes

72

In [14]:
M.ndim # number of dimensions

2
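These attributes are related: `nbytes` equals the number of elements times `itemsize`. A short sketch, using two hypothetical 3x3 arrays of different dtypes, makes the relationship visible:

```python
import numpy as np

M64 = np.zeros((3, 3))            # default dtype is float64 (8 bytes per element)
M32 = M64.astype(np.float32)      # same shape, 4 bytes per element

for arr in (M64, M32):
    # Total size in bytes is element count times bytes per element.
    assert arr.nbytes == arr.size * arr.itemsize
    print(arr.dtype, arr.itemsize, arr.nbytes, arr.ndim)
```

Changing the dtype halves the memory footprint here while `ndim` stays at 2.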

## Further reading

* [General questions about NumPy](https://www.scipy.org/scipylib/faq.html#id1)
* http://numpy.scipy.org
* http://scipy.org/Tentative_NumPy_Tutorial
* http://scipy.org/NumPy_for_Matlab_Users - A NumPy guide for MATLAB users.

## Versions

In [15]:
%reload_ext version_information

%version_information numpy

Software,Version
Python,3.7.6 64bit [Clang 4.0.1 (tags/RELEASE_401/final)]
IPython,7.13.0
OS,Darwin 19.4.0 x86_64 i386 64bit
numpy,1.18.2
Wed Apr 08 18:16:37 2020 EEST
