## Task
Explore structured arrays in NumPy

## Notebook Summary
* Structured arrays are primarily used when interfacing with C, Fortran, etc.; for all other cases, use `pandas`
* Custom dtypes
* Structured arrays
* Record arrays: same as structured arrays, but fields can also be accessed as attributes
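
Record arrays are listed above but not exercised in the cells that follow, so here is a minimal sketch of the attribute-style access. It assumes the same `type`/`x`/`y` layout used later in the notebook; the variable names `dt`, `arr`, and `rec` are illustrative only.

```python
import numpy as np

# Same field layout as the notebook's examples
dt = np.dtype([('type', 'U10'), ('x', 'i4'), ('y', 'i4')])
arr = np.array([('A', 1, 1), ('B', 2, 2)], dtype=dt)

# View the same memory as a record array; no data is copied
rec = arr.view(np.recarray)

xs = rec.x            # attribute access, same values as arr['x']
xs_by_key = rec['x']  # access by field name still works
first_y = rec[0].y    # attribute access on a single record

# Writes through the view are visible in the original array
rec.x[0] = 10
```

Attribute lookup tries regular array attributes first, so a field whose name clashes with one of them is safer to read by name, as in `rec['x']`.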

## References
* *Python for Data Analysis*, Wes McKinney, O'Reilly, 2012
* *Numerical Python*, Robert Johansson, APress, 2015
* *Python Data Science Handbook*, Jake VanderPlas, O'Reilly, 2016


In [4]:
# display the output of every expression in a cell, just like the Python shell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

import platform
print 'python.version = ', platform.python_version()
import IPython
print 'ipython.version =', IPython.version_info

import numpy as np
print 'numpy.version = ', np.__version__


python.version =  2.7.10
ipython.version = (5, 1, 0, '')
numpy.version =  1.11.3


In [14]:
# custom dtypes - 3 ways to specify (same layout; the third gets default field names)

# as a dictionary
t1 = np.dtype({
    'names': ('type', 'x', 'y'),
    'formats': ('U10', 'i4', 'i4')
})
t1

# as list of tuples
t2 = np.dtype([
    ('type', 'U10'),
    ('x', 'i4'),
    ('y', 'i4')
])
t2

# as types alone in a comma-separated string (fields are named f0, f1, f2)
t3 = np.dtype('U10,i4,i4')
t3


dtype([('type', '<U10'), ('x', '<i4'), ('y', '<i4')])

dtype([('type', '<U10'), ('x', '<i4'), ('y', '<i4')])

dtype([('f0', '<U10'), ('f1', '<i4'), ('f2', '<i4')])
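
Since structured arrays matter mostly when matching a C or Fortran record layout, it can help to inspect what a dtype describes. The sketch below only reads attributes of the dtypes defined above; the variable names are illustrative.

```python
import numpy as np

t1 = np.dtype([('type', 'U10'), ('x', 'i4'), ('y', 'i4')])
t3 = np.dtype('U10,i4,i4')

names_t1 = t1.names   # explicit field names
names_t3 = t3.names   # default names assigned by NumPy

# Bytes per record: 10 UCS-4 characters (4 bytes each) plus two 4-byte ints
record_size = t1.itemsize

# Each field maps to (dtype, byte offset within the record)
offset_x = t1.fields['x'][1]
offset_y = t1.fields['y'][1]

# Same memory layout but different field names, so the dtypes compare unequal
same = (t1 == t3)
```

Here `record_size` is 48, with `x` at offset 40 and `y` at offset 44, because these fields are packed without padding by default.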

In [19]:
# structured array

t1 = np.dtype({
    'names': ('type', 'x', 'y'),
    'formats': ('U10', 'i4', 'i4')
})

arr = np.zeros(4, dtype=t1)
arr

arr[0] = ('A', 1, 1)
arr[1] = ('B', 2, 2)
arr[2] = ('C', 3, 3)
arr[3] = ('D', 4, 4)
arr


array([(u'', 0, 0), (u'', 0, 0), (u'', 0, 0), (u'', 0, 0)], 
      dtype=[('type', '<U10'), ('x', '<i4'), ('y', '<i4')])

array([(u'A', 1, 1), (u'B', 2, 2), (u'C', 3, 3), (u'D', 4, 4)], 
      dtype=[('type', '<U10'), ('x', '<i4'), ('y', '<i4')])
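
As an alternative to filling a zeroed array one record at a time, the same array can be built in one step from a list of tuples, or column by column. This is a sketch of those two options, not the only way to populate a structured array; `arr2` is an illustrative name.

```python
import numpy as np

t1 = np.dtype([('type', 'U10'), ('x', 'i4'), ('y', 'i4')])

# Each tuple is one record
arr = np.array([('A', 1, 1), ('B', 2, 2), ('C', 3, 3), ('D', 4, 4)], dtype=t1)

# Whole columns can also be assigned at once by field name
arr2 = np.zeros(4, dtype=t1)
arr2['type'] = ['A', 'B', 'C', 'D']
arr2['x'] = np.arange(1, 5)
arr2['y'] = np.arange(1, 5)
```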

In [35]:
# accessing values in structured arrays

print '----- Access via names'
arr['type']
arr['x']
arr['y']

print '\n----- Access via index'
arr[1]
arr[1]['type']


print '\n----- Filtering'
arr['x'] >= 3
arr[arr['x'] >= 3]['type']

# combine conditions with & (each comparison needs its own parentheses)
(arr['x'] >= 3) & (arr['x'] <= 4)


----- Access via names


array([u'A', u'B', u'C', u'D'], 
      dtype='<U10')

array([1, 2, 3, 4], dtype=int32)

array([1, 2, 3, 4], dtype=int32)


----- Access via index


(u'B', 2, 2)

u'B'


----- Filtering


array([False, False,  True,  True], dtype=bool)

array([u'C', u'D'], 
      dtype='<U10')

array([False, False,  True,  True], dtype=bool)
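
The combined mask above is only displayed, never applied. The sketch below applies such a mask to select whole records, sorts by a field, and updates one field in place. It rebuilds the same four-record array so it runs on its own; the bounds 2 and 3 are chosen only for illustration.

```python
import numpy as np

t1 = np.dtype([('type', 'U10'), ('x', 'i4'), ('y', 'i4')])
arr = np.array([('A', 1, 1), ('B', 2, 2), ('C', 3, 3), ('D', 4, 4)], dtype=t1)

# Boolean mask built from two conditions on the same field
mask = (arr['x'] >= 2) & (arr['x'] <= 3)

selected = arr[mask]                # whole records (a copy)
selected_types = arr[mask]['type']  # one field of the selected records

# Sort records by a field, then reverse for descending order
by_x_desc = np.sort(arr, order='x')[::-1]

# arr['y'] is a view, so masked assignment updates arr in place
arr['y'][mask] = 0
```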