## Structured Data

<i>Structured arrays</i>, often loosely called <i>record arrays</i>, provide efficient storage for compound, heterogeneous data. They are useful when one would like to perform computations on the data while keeping closely related values together.

NumPy allows one to store heterogeneous data in a single array structure. The only requirement is that every value in a given column (a <i>field</i>) of the array must be of the same type.

In [1]:
import numpy as np

#### Create lists to store information about employees

In [2]:
emp_names = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_ids = [1, 2, 3, 4, 5, 6]
emp_scores = [78.2, 57.50, 90, 77, 96.20, 87.37]

#### Define the NumPy array with the names and types for each column
* U16 represents a Unicode string of at most 16 characters
* i4 is short for int32 (i for int, 4 for 4 bytes)
* f8 is shorthand for float64
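
The short type codes listed above can be checked directly against the full NumPy type names. The snippet below is a minimal sketch for illustration only; it is not part of the original notebook cells.

```python
import numpy as np

# Each short code resolves to a full NumPy dtype.
print(np.dtype('i4') == np.int32)    # True
print(np.dtype('f8') == np.float64)  # True

# NumPy stores each Unicode character in 4 bytes,
# so a U16 field occupies 16 * 4 = 64 bytes per record.
print(np.dtype('U16').itemsize)      # 64
```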

In [3]:
emp_data = np.zeros(6, dtype={'names': ('Name', 'ID', 'Score'),
                              'formats': ('U16', 'i4', 'f8')})

In [4]:
emp_data

array([('', 0, 0.), ('', 0, 0.), ('', 0, 0.), ('', 0, 0.), ('', 0, 0.),
       ('', 0, 0.)],
      dtype=[('Name', '<U16'), ('ID', '<i4'), ('Score', '<f8')])
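
The dtype echoed in the output is written as a list of (name, format) tuples, whereas the cell above used a dictionary. As a sketch, the two spellings can be compared as follows; the variable names `dict_dtype` and `list_dtype` are illustrative and do not appear in the original notebook.

```python
import numpy as np

# Dictionary form, as used in the cell above.
dict_dtype = np.dtype({'names': ('Name', 'ID', 'Score'),
                       'formats': ('U16', 'i4', 'f8')})

# List-of-tuples form, as shown in the printed output.
list_dtype = np.dtype([('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')])

print(dict_dtype == list_dtype)  # True
print(list_dtype.names)          # ('Name', 'ID', 'Score')

# One record takes 64 + 4 + 8 bytes when the fields are packed (the default).
print(list_dtype.itemsize)       # 76
```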

#### Set the values for each field in the array

In [5]:
emp_data['Name'] = emp_names
emp_data['ID'] = emp_ids
emp_data['Score'] = emp_scores

emp_data

array([('Danielle', 1, 78.2 ), ('Lorena', 2, 57.5 ), ('Manuel', 3, 90.  ),
       ('Ryan', 4, 77.  ), ('Teresa', 5, 96.2 ), ('Wes', 6, 87.37)],
      dtype=[('Name', '<U16'), ('ID', '<i4'), ('Score', '<f8')])
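
Instead of allocating zeros and then filling each field, the same array can probably be built in one step from a list of row tuples. This is a sketch of an alternative, not the approach the notebook takes; `emp_dtype` and `emp_rows` are names introduced here for illustration.

```python
import numpy as np

emp_names = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_ids = [1, 2, 3, 4, 5, 6]
emp_scores = [78.2, 57.50, 90, 77, 96.20, 87.37]

emp_dtype = [('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')]

# zip() pairs up the three lists into one tuple per employee;
# each tuple becomes one record of the structured array.
emp_rows = np.array(list(zip(emp_names, emp_ids, emp_scores)), dtype=emp_dtype)

print(emp_rows.shape)       # (6,)
print(emp_rows['Name'][0])  # Danielle
```

Note that each record must be a tuple, not a list, for NumPy to treat it as one row of fields.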

In [6]:
emp_data['Name']

array(['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes'],
      dtype='<U16')

In [7]:
emp_data['ID']

array([1, 2, 3, 4, 5, 6], dtype=int32)

In [8]:
emp_data['Score']

array([78.2 , 57.5 , 90.  , 77.  , 96.2 , 87.37])
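
Because each field comes back as an ordinary NumPy array, the usual array computations apply to it. The sketch below rebuilds the employee data and uses one field to look up a value in another; `mean_score` and `top_name` are illustrative names.

```python
import numpy as np

emp_data = np.zeros(6, dtype=[('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')])
emp_data['Name'] = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_data['ID'] = [1, 2, 3, 4, 5, 6]
emp_data['Score'] = [78.2, 57.50, 90, 77, 96.20, 87.37]

# Aggregate over a single numeric field.
mean_score = emp_data['Score'].mean()
print(mean_score)

# Use the position of the highest score to fetch the matching name.
top_name = emp_data['Name'][emp_data['Score'].argmax()]
print(top_name)  # Teresa
```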

#### Access individual records

In [9]:
emp_data[2]

('Manuel', 3, 90.)
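
Strictly speaking, NumPy's record array (`np.recarray`) is a subclass of the structured array that additionally exposes fields as attributes. As a sketch, an existing structured array can be viewed as a record array without copying the data; the name `emp_rec` is introduced here for illustration.

```python
import numpy as np

emp_data = np.zeros(6, dtype=[('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')])
emp_data['Name'] = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_data['ID'] = [1, 2, 3, 4, 5, 6]
emp_data['Score'] = [78.2, 57.50, 90, 77, 96.20, 87.37]

# View the same memory as a record array.
emp_rec = emp_data.view(np.recarray)

# Fields are now reachable as attributes as well as by key.
print(emp_rec.Name.tolist())
print(emp_rec[2].Score)  # 90.0
```

Attribute access is convenient, but it adds some lookup overhead compared with the `emp_data['Score']` form.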

#### Get individual elements

In [10]:
emp_data[-2]['Name']

'Teresa'

In [11]:
emp_data[-2]['Score']

96.2
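
The same indexing can be used to update values. The following sketch changes one element and then replaces a whole record; the new scores are made-up values for illustration only.

```python
import numpy as np

emp_data = np.zeros(6, dtype=[('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')])
emp_data['Name'] = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_data['ID'] = [1, 2, 3, 4, 5, 6]
emp_data['Score'] = [78.2, 57.50, 90, 77, 96.20, 87.37]

# Update a single element: select the field first, then the position.
emp_data['Score'][-2] = 97.5  # hypothetical new score
print(emp_data[-2]['Score'])  # 97.5

# Replace an entire record by assigning a tuple with one value per field.
emp_data[1] = ('Lorena', 2, 61.0)  # hypothetical new score
print(emp_data[1]['Score'])   # 61.0
```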

#### Retrieve data based on conditions

In [12]:
emp_data[emp_data['Score'] > 85]['Name']

array(['Manuel', 'Teresa', 'Wes'], dtype='<U16')

In [13]:
emp_data[emp_data['Score'] < 75]['Name']

array(['Lorena'], dtype='<U16')
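
Conditions can also be combined, and records can be ordered by a field. The sketch below shows one way to do both; `mid_names` and `ranked_names` are illustrative names, and the score thresholds are arbitrary.

```python
import numpy as np

emp_data = np.zeros(6, dtype=[('Name', 'U16'), ('ID', 'i4'), ('Score', 'f8')])
emp_data['Name'] = ['Danielle', 'Lorena', 'Manuel', 'Ryan', 'Teresa', 'Wes']
emp_data['ID'] = [1, 2, 3, 4, 5, 6]
emp_data['Score'] = [78.2, 57.50, 90, 77, 96.20, 87.37]

# Combine two conditions with & (each one needs its own parentheses).
mask = (emp_data['Score'] >= 75) & (emp_data['Score'] < 90)
mid_names = emp_data[mask]['Name'].tolist()
print(mid_names)     # ['Danielle', 'Ryan', 'Wes']

# Sort whole records by the Score field, lowest first.
ranked_names = np.sort(emp_data, order='Score')['Name'].tolist()
print(ranked_names)  # ['Lorena', 'Ryan', 'Danielle', 'Wes', 'Manuel', 'Teresa']
```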