# CREATING AND LOADING ARRAYS

In [34]:
import numpy as np
import pandas as pd

## Creating arrays

In [3]:
print('zeros:', np.zeros(5))
print('ones:', np.ones((3, 4)))
print('arange:', np.arange(6))
print('linspace:', np.linspace(0.0, 1.0, 8))
print('random:', np.random.uniform(size=5))
print('custom:', np.array([3, 9, 7]))

zeros: [ 0.  0.  0.  0.  0.]
ones: [[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
arange: [0 1 2 3 4 5]
linspace: [ 0.          0.14285714  0.28571429  0.42857143  0.57142857  0.71428571
  0.85714286  1.        ]
random: [ 0.33835656  0.84049256  0.99711302  0.57964509  0.54292733]
custom: [3 9 7]


Every array has a single, fixed data type (its ``dtype``) shared by all of its elements. You can specify the data type explicitly, or you can let NumPy infer it from the arguments. For example, ``np.ones()`` generates
an array of floating-point numbers by default, whereas ``np.arange()`` returns an array of integers when called with integer arguments. You can specify the data type explicitly as shown here:

In [4]:
np.ones(4, dtype=int)  # the np.int alias was removed in NumPy 1.24; use int or np.int64

array([1, 1, 1, 1])

In [5]:
np.arange(6).astype(np.single)

array([ 0.,  1.,  2.,  3.,  4.,  5.], dtype=float32)
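
To make the difference between inferred and explicit data types concrete, here is a minimal sketch. The variable names are illustrative only, and the exact default integer width depends on the platform, so the example checks only that the type is an integer kind.

```python
import numpy as np

# Data types that NumPy picks on its own
ones_default = np.ones(3)       # floating point by default
range_default = np.arange(3)    # integer, because the argument is an integer
mixed = np.array([1, 2.5])      # mixing int and float upcasts everything to float

print(ones_default.dtype)        # float64
print(range_default.dtype.kind)  # i (some signed integer type)
print(mixed.dtype)               # float64

# Data types requested explicitly
ones_int = np.ones(3, dtype=np.int32)
as_float32 = range_default.astype(np.float32)  # astype returns a converted copy

print(ones_int.dtype)    # int32
print(as_float32.dtype)  # float32
```

Note that ``astype()`` leaves the original array untouched; ``range_default`` keeps its integer type.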

Here are a few references:
* Array creation routines at http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
* NumPy random functions at http://docs.scipy.org/doc/numpy/reference/routines.random.html
* Data type objects at http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
* Data types at http://docs.scipy.org/doc/numpy/user/basics.types.html

## Loading arrays from files

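The ``np.save()`` and ``np.load()`` functions allow you to export and import NumPy arrays to and from binary files in NumPy's own ``.npy`` format. Note that ``np.save()`` appends the ``.npy`` extension when the file name does not already end with it, which is why the array saved as ``array.dat`` below is loaded from ``array.dat.npy``.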

In [6]:
np.save('array.dat', np.array([[4, 5, 7], [9, 1, 3]]), allow_pickle=False)

In [7]:
np.load('array.dat.npy')

array([[4, 5, 7],
       [9, 1, 3]])
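
The round trip above can be checked end to end. The following is a small sketch that writes into a temporary directory (an assumption made here so the example cleans up after itself) and confirms that shape and data type survive the save/load cycle.

```python
import os
import tempfile

import numpy as np

original = np.array([[4, 5, 7], [9, 1, 3]])

with tempfile.TemporaryDirectory() as tmp_dir:
    base_path = os.path.join(tmp_dir, 'array.dat')
    np.save(base_path, original, allow_pickle=False)

    # np.save appended '.npy' because the name did not end with it
    saved_path = base_path + '.npy'
    file_exists = os.path.exists(saved_path)

    restored = np.load(saved_path)

print(file_exists)                          # True
print(restored.shape)                       # (2, 3)
print(np.array_equal(original, restored))   # True
```

Unlike the raw formats shown later in this section, the ``.npy`` file stores the shape and data type alongside the values, so nothing needs to be specified when loading.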

``tobytes()`` and ``np.frombuffer()`` (the older ``tostring()`` and ``fromstring()`` are deprecated for binary data)

In [15]:
np.array([[1, 2], [3, 4], [5, 6]]).tobytes()  # tostring() is a deprecated alias of tobytes()

b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00'

In [13]:
arrayString = b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00'
np.frombuffer(arrayString, dtype=np.int64)  # np.fromstring() is deprecated for binary data

array([1, 2, 3, 4, 5, 6])
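
As the output above shows, the raw bytes carry neither the shape nor the data type of the original array. A minimal sketch of a full round trip, assuming you keep track of both yourself, could look like this:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)

# 6 elements of 8 bytes each
raw = matrix.tobytes()
print(len(raw))  # 48

# The dtype must be supplied again; the result is a flat, read-only array
flat = np.frombuffer(raw, dtype=np.int64)

# The shape must be restored manually
restored = flat.reshape(matrix.shape)
print(np.array_equal(matrix, restored))  # True
```

Because ``np.frombuffer()`` shares memory with the immutable bytes object, the resulting array is read-only; call ``.copy()`` on it if you need to modify the values.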

``tofile()`` and ``fromfile()``

In [27]:
a = np.array([[9.0, 8.0, 4.0], [5.0, 2.0, 1.0]])

In [28]:
a.tofile('array2.txt')

In [29]:
np.fromfile('array2.txt', dtype=np.float64)

array([ 9.,  8.,  4.,  5.,  2.,  1.])
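
Despite the ``.txt`` extension used above, ``tofile()`` writes raw binary data when no separator is given, and the file contains no shape or data type information. The sketch below (file names and the temporary directory are assumptions for illustration) restores the 2D shape and also shows the text mode enabled by the ``sep`` argument.

```python
import os
import tempfile

import numpy as np

a = np.array([[9.0, 8.0, 4.0], [5.0, 2.0, 1.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Binary mode: raw bytes only
    bin_path = os.path.join(tmp_dir, 'array2.bin')
    a.tofile(bin_path)
    flat = np.fromfile(bin_path, dtype=np.float64)
    restored = flat.reshape(a.shape)  # shape has to be supplied by hand

    # Text mode: values written as text separated by `sep`
    txt_path = os.path.join(tmp_dir, 'array2.txt')
    a.tofile(txt_path, sep=' ')
    from_text = np.fromfile(txt_path, dtype=np.float64, sep=' ')

print(flat.shape)                      # (6,)
print(np.array_equal(a, restored))     # True
print(from_text.tolist())              # [9.0, 8.0, 4.0, 5.0, 2.0, 1.0]
```

In both modes the array comes back flattened, so the reader has to know the intended shape in advance.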

``np.loadtxt()``

In [32]:
np.loadtxt('array3.txt', dtype=np.float64)

array([[   9.,   15.,   22.,    7.,   42.,   76.],
       [  45.,   23.,    2.,    9.,    2.,  143.],
       [  35.,    4.,    5.,   22.,   92.,   23.]])
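
The contents of ``array3.txt`` are not shown here. The sketch below assumes a whitespace-separated text file whose values match the output above, and uses an in-memory ``io.StringIO`` object in place of a real file so that it runs on its own. The second part illustrates the ``delimiter`` argument with made-up comma-separated data.

```python
import io

import numpy as np

# Assumed file contents, consistent with the output shown above
text = "9 15 22 7 42 76\n45 23 2 9 2 143\n35 4 5 22 92 23\n"
table = np.loadtxt(io.StringIO(text), dtype=np.float64)
print(table.shape)  # (3, 6)

# Comma-separated data needs an explicit delimiter
csv_text = "9,15,22\n45,23,2\n"
csv_table = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype=np.int64)
print(csv_table.shape)  # (2, 3)
```

Each line of the text becomes one row of the array, so ``np.loadtxt()`` recovers the 2D shape without extra help, provided every row has the same number of values.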

Going from ``pandas`` to ``numpy`` is particularly easy: just use the ``.values`` attribute, available on all DataFrame and Series objects (recent pandas documentation recommends the equivalent ``.to_numpy()`` method instead). More specifically:
* A Series corresponds to a 1D NumPy array.
* A DataFrame corresponds to a 2D NumPy array.
* A Panel corresponds to a 3D NumPy array (we won't cover this pandas structure here; it was removed in pandas 0.25).

In [35]:
df = pd.read_csv('test.csv')
df

| Unnamed: 0 | name | value1 | value2 |
|---|---|---|---|
| 0 | BMK | 437 | 5643 |
| 1 | IUI | 246 | 845 |
| 2 | YEI | 87 | 67 |
| 3 | BEY | 478 | 57 |
| 4 | CMY | 5876 | 77 |


In [36]:
df[['value1', 'value2']].values

array([[ 437, 5643],
       [ 246,  845],
       [  87,   67],
       [ 478,   57],
       [5876,   77]])
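
The correspondence between pandas objects and NumPy arrays can be verified directly. Since ``test.csv`` is not available here, this sketch builds an equivalent DataFrame by hand from the values displayed above, and uses the ``.to_numpy()`` method (available since pandas 0.24) rather than ``.values``.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the contents of test.csv
df = pd.DataFrame({
    'name': ['BMK', 'IUI', 'YEI', 'BEY', 'CMY'],
    'value1': [437, 246, 87, 478, 5876],
    'value2': [5643, 845, 67, 57, 77],
})

series_values = df['value1'].to_numpy()              # Series -> 1D array
frame_values = df[['value1', 'value2']].to_numpy()   # DataFrame -> 2D array
mixed_values = df[['name', 'value1']].to_numpy()     # text and numbers together

print(series_values.ndim, series_values.shape)  # 1 (5,)
print(frame_values.ndim, frame_values.shape)    # 2 (5, 2)
print(mixed_values.shape)                       # (5, 2)
```

When the selected columns have different types, as in ``mixed_values``, the resulting array has to use one common data type for every element, so it is usually best to select only the numeric columns before converting.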

Here are a few references:
* Links between NumPy and pandas data structures at http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframeinteroperability-with-numpy-functions
* Input/output routines at http://docs.scipy.org/doc/numpy/reference/routines.io.html