In [1]:
import numpy as np

Create a structured array using a compound data type specification:

In [2]:
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})

data.dtype

dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])

- `'U10'` translates to "Unicode string of maximum length 10"

- `'i4'` translates to "4-byte (i.e., 32-bit) integer"

- `'f8'` translates to "8-byte (i.e., 64-bit) float"
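
As a quick check on these format codes, the sketch below inspects the per-field sizes of the same compound type. The variable name `dt` is introduced here only for illustration. Note that NumPy stores each Unicode character in 4 bytes, so `'U10'` occupies 40 bytes rather than 10.

```python
import numpy as np

# Same compound type as above; `dt` is an illustrative name
dt = np.dtype({'names': ('name', 'age', 'weight'),
               'formats': ('U10', 'i4', 'f8')})

# Size in bytes of each field's element type
print(dt['name'].itemsize)    # 40 (10 characters x 4 bytes each)
print(dt['age'].itemsize)     # 4
print(dt['weight'].itemsize)  # 8

# With the default packed layout, one record is the sum of its fields
print(dt.itemsize)            # 52

# Byte offset of each field within a record
print({name: dt.fields[name][1] for name in dt.names})
```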

In [4]:
data

array([('', 0, 0.), ('', 0, 0.), ('', 0, 0.), ('', 0, 0.)],
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])

In [5]:
name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]

In [6]:
data['name'] = name
data['age'] = age
data['weight'] = weight

data

array([('Alice', 25, 55. ), ('Bob', 45, 85.5), ('Cathy', 37, 68. ),
       ('Doug', 19, 61.5)],
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])

|       | name <br />(max. length 10) | age <br />(4-byte int) | weight<br />(8-byte float) |
| :---: | :-------------------------: | :--------------------: | :------------------------: |
| Alice |           'Alice'           |           25           |            55.0            |
|  Bob  |            'Bob'            |           45           |            85.5            |
| Cathy |           'Cathy'           |           37           |            68.0            |
| Doug  |           'Doug'            |           19           |            61.5            |

In [8]:
# Get all names
data['name']

array(['Alice', 'Bob', 'Cathy', 'Doug'], dtype='<U10')

In [9]:
# Get first row of data
data[0]

('Alice', 25, 55.)

In [10]:
# Get the name from the last row
data[-1]['name']

'Doug'

Use Boolean masking to do some filtering:

In [11]:
# Get names where age is under 30
data[data['age'] < 30]['name']

array(['Alice', 'Doug'], dtype='<U10')
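
Masks on different fields can also be combined. The following is a minimal, self-contained sketch that rebuilds the same `data` array and selects people over 20 who weigh less than 70; the names `mask` and `selected` are introduced here for illustration.

```python
import numpy as np

# Rebuild the example array so this block runs on its own
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})
data['name'] = ['Alice', 'Bob', 'Cathy', 'Doug']
data['age'] = [25, 45, 37, 19]
data['weight'] = [55.0, 85.5, 68.0, 61.5]

# Combine conditions element-wise with & (and) or | (or).
# The parentheses are required because & binds tighter than comparisons.
mask = (data['age'] > 20) & (data['weight'] < 70)
selected = data[mask]

print(selected['name'])  # ['Alice' 'Cathy']
```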

## Create Structured Arrays

### Dictionary method

In [12]:
np.dtype({'names':('name', 'age', 'weight'),
          'formats':('U10', 'i4', 'f8')})

dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])

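Numerical types can be specified using Python types or NumPy `dtype`s instead. Python's `int` maps to NumPy's default integer type, which is platform dependent (64-bit in the output below):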

In [13]:
np.dtype({'names':('name', 'age', 'weight'),
          'formats':((np.str_, 10), int, np.float32)})

dtype([('name', '<U10'), ('age', '<i8'), ('weight', '<f4')])

### Tuples method

In [14]:
np.dtype([('name', 'S10'), ('age', 'i4'), ('weight', 'f8')])

dtype([('name', 'S10'), ('age', '<i4'), ('weight', '<f8')])

If the names of the fields do not matter, just specify the types alone in a **comma-separated** string. NumPy then assigns the default field names `f0`, `f1`, and so on:

In [15]:
np.dtype('S10,i4,f8')

dtype([('f0', 'S10'), ('f1', '<i4'), ('f2', '<f8')])
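
The automatically generated names behave like any other field names. The short sketch below (with illustrative variable names `dt` and `arr`) creates an array from the comma-separated specification and accesses its fields. Note that `'S10'` holds byte strings, not Unicode strings.

```python
import numpy as np

dt = np.dtype('S10,i4,f8')
print(dt.names)  # ('f0', 'f1', 'f2')

# Fields are addressed by their default names
arr = np.zeros(2, dtype=dt)
arr['f0'] = [b'ab', b'cd']
arr['f1'] = [7, 9]

print(arr['f0'])  # [b'ab' b'cd']
print(arr['f1'])  # [7 9]
```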

### String format

- First (optional) character: specifies the byte order (endianness) of the stored value

  | Character | Meaning       |
  | --------- | ------------- |
  | `<`       | little endian |
  | `>`       | big endian    |

- Next character: specifies the type of data

  | Character    | Description            | Example                            |
  | :----------- | :--------------------- | :--------------------------------- |
  | `'b'`        | Byte                   | `np.dtype('b')`                    |
  | `'i'`        | Signed integer         | `np.dtype('i4') == np.int32`       |
  | `'u'`        | Unsigned integer       | `np.dtype('u1') == np.uint8`       |
  | `'f'`        | Floating point         | `np.dtype('f8') == np.float64`     |
  | `'c'`        | Complex floating point | `np.dtype('c16') == np.complex128` |
  | `'S'`, `'a'` | Byte string (`'a'` is a deprecated alias in recent NumPy versions) | `np.dtype('S5')` |
  | `'U'`        | Unicode string         | `np.dtype('U') == np.str_`         |
  | `'V'`        | Raw data (void)        | `np.dtype('V') == np.void`         |

- Trailing digits: give the size of the object in bytes (for `'U'`, the number of characters, each stored in 4 bytes)
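
The pieces of a format string can be inspected on the resulting `dtype`. The sketch below is an illustration only, using the standard `kind` and `itemsize` attributes; the list of codes is an arbitrary selection.

```python
import numpy as np

# Print the type character and the size in bytes for a few format strings
for code in ('i4', 'u1', 'f8', 'c16', 'S5', 'U5'):
    dt = np.dtype(code)
    print(code, dt.kind, dt.itemsize)

# The size suffix counts bytes, except for 'U', where it counts characters
print(np.dtype('S5').itemsize)  # 5
print(np.dtype('U5').itemsize)  # 20

# Types that differ only in byte order do not compare equal
print(np.dtype('>i4') == np.dtype('<i4'))  # False
```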



## More Advanced Compound Types

We can create a type where each element contains an array or matrix of values.

In [17]:
tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])
X = np.zeros(1, dtype=tp)

In [18]:
X[0]

(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])

In [20]:
X['mat']

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])
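
Selecting the `'mat'` field returns a view with one 3x3 matrix per record, so individual matrices can be filled in place. A small sketch, assuming two records instead of one to make the leading dimension visible:

```python
import numpy as np

tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])
X = np.zeros(2, dtype=tp)  # two records here, for illustration

# The field view stacks the per-record matrices along the first axis
print(X['mat'].shape)  # (2, 3, 3)

# Fill the first record's matrix through the field view
X['id'] = [10, 11]
X['mat'][0] = np.eye(3)

print(X[0]['mat'].trace())  # 3.0
print(X[1]['mat'].sum())    # 0.0
```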

## RecordArrays: Structured Arrays with a Twist

Record arrays (`np.recarray`) are almost identical to the structured arrays just described, but with one additional feature: **fields can be accessed as attributes rather than as dictionary keys.**

In [21]:
data['age']

array([25, 45, 37, 19], dtype=int32)

View as a record array:

In [22]:
data_rec = data.view(np.recarray)

In [24]:
data_rec.age # access age as an attribute

array([25, 45, 37, 19], dtype=int32)
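
Because `view` does not copy, the record array and the original structured array share the same memory. The sketch below rebuilds `data` so it runs on its own and shows that a change made through attribute access is visible through key access.

```python
import numpy as np

# Rebuild the example array so this block runs on its own
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})
data['name'] = ['Alice', 'Bob', 'Cathy', 'Doug']
data['age'] = [25, 45, 37, 19]
data['weight'] = [55.0, 85.5, 68.0, 61.5]

data_rec = data.view(np.recarray)

# Attribute access and key access return the same values
print(np.array_equal(data_rec.age, data['age']))  # True

# The view shares memory, so writes are visible in the original array
data_rec.age[0] = 26
print(data['age'][0])                    # 26
print(np.shares_memory(data, data_rec))  # True
```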