## More data types

### Casting

**“Bigger” type wins in mixed-type operations:**

In [3]:
import numpy as np

np.array([1, 2, 3]) + 1.5

array([ 2.5,  3.5,  4.5])
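As a minimal sketch of the promotion rule above, the result dtype can be inspected directly. The variable names here are illustrative only, and the integer array is given an explicit `int64` dtype so the result does not depend on the platform default.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)

# An integer array combined with a Python float is promoted to float64.
mixed = ints + 1.5
print(mixed.dtype)

# np.result_type reports the dtype NumPy would choose for a combination.
promoted = np.result_type(np.int64, np.float64)
print(promoted)

# A float array combined with a complex scalar is promoted to complex.
cplx = np.array([1.0, 2.0]) + 1j
print(cplx.dtype)
```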

**Assignment never changes the type!**

In [5]:
a = np.array([1, 2, 3])
a.dtype

dtype('int32')

In [7]:
a[0] = 1.9  # <-- float is truncated to integer
a

array([1, 2, 3])

In [8]:
a[0] = 4
a

array([4, 2, 3])
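If the fractional part needs to survive, one option is to convert the array to a floating-point dtype before assigning. The sketch below assumes that a copy is acceptable; `a_float` is an illustrative name.

```python
import numpy as np

a = np.array([1, 2, 3])      # integer dtype (int32 or int64, depending on platform)

# astype returns a new array; the original integer array is left untouched.
a_float = a.astype(float)
a_float[0] = 1.9             # stored as 1.9, since the array now holds floats

print(a.dtype.kind, a)       # still an integer array
print(a_float.dtype, a_float)
```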

**Forced casts:**

In [9]:
a = np.array([1.7, 1.2, 1.6])
b = a.astype(int)  # truncates to int
b

array([1, 1, 1])

**Rounding:**

In [10]:
a = np.array([1.7, 1.2, 1.6, 2.2, 3.8])
b = np.around(a)  # result is still floating-point
b

array([ 2.,  1.,  2.,  2.,  4.])

In [11]:
c = np.around(a).astype(int)
c

array([2, 1, 2, 2, 4])
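Rounding and truncation differ most visibly on negative values and on exact halves. The following sketch compares the common options on a small illustrative array (the values are chosen for this example, not taken from the cells above).

```python
import numpy as np

a = np.array([-1.7, -0.5, 0.5, 1.5, 2.5])

# np.around rounds exact halves to the nearest even value:
# 0.5 -> 0, 1.5 -> 2, 2.5 -> 2
rounded = np.around(a)

floored = np.floor(a)        # toward negative infinity
ceiled = np.ceil(a)          # toward positive infinity
truncated = np.trunc(a)      # toward zero, result stays floating-point

# astype(int) also drops the fractional part (toward zero) and changes the dtype.
as_int = a.astype(int)

print(rounded, floored, ceiled, truncated, as_int, sep="\n")
```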

### Different data type sizes

In [12]:
np.array([1], dtype=int).dtype

dtype('int32')

In [13]:
np.iinfo(np.int32).max, 2**31 - 1

(2147483647, 2147483647)

In [15]:
np.iinfo(np.int64).max, 2**63 - 1

(9223372036854775807, 9223372036854775807)
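The practical consequence of a fixed size is that array arithmetic can overflow. The sketch below lists the size and range of a few integer types and shows that adding two `int32` arrays wraps around silently rather than raising an error; the array names are illustrative.

```python
import numpy as np

# Size in bytes and representable range of a few integer dtypes.
for dt in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dt)
    print(np.dtype(dt).name, np.dtype(dt).itemsize, info.min, info.max)

# Adding 1 to the largest int32 wraps around to the smallest int32.
big = np.array([np.iinfo(np.int32).max], dtype=np.int32)
one = np.array([1], dtype=np.int32)
wrapped = big + one
print(wrapped)
```

When values may exceed the range, choosing a wider dtype up front (for example `np.int64`) avoids the wrap-around.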

## Structured data types


Each record in the example below has three fields:

* sensor_code (4-character string)
* position (float)
* value (float)

In [16]:
samples = np.zeros((6,), dtype=[('sensor_code', 'S4'), ('position', float), ('value', float)])

In [17]:
samples.ndim

1

In [18]:
samples.shape

(6,)

In [19]:
samples.dtype.names

('sensor_code', 'position', 'value')

In [22]:
samples[:] = [('ALFA', 1, 0.37), ('BETA', 1, 0.11), ('TAU', 1, 0.13),
              ('ALFA', 1.5, 0.37), ('ALFA', 3, 0.11), ('TAU', 1.2, 0.13)]

In [23]:
samples

array([(b'ALFA',  1. ,  0.37), (b'BETA',  1. ,  0.11),
       (b'TAU',  1. ,  0.13), (b'ALFA',  1.5,  0.37),
       (b'ALFA',  3. ,  0.11), (b'TAU',  1.2,  0.13)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])

**Field access works by indexing with field names:**

In [24]:
samples['sensor_code']

array([b'ALFA', b'BETA', b'TAU', b'ALFA', b'ALFA', b'TAU'],
      dtype='|S4')

In [25]:
samples['value']

array([ 0.37,  0.11,  0.13,  0.37,  0.11,  0.13])

In [26]:
samples[0]

(b'ALFA',  1.,  0.37)

In [27]:
samples[0]['sensor_code']

b'ALFA'

**Multiple fields at once:**

In [28]:
samples[['position', 'value']]

array([( 1. ,  0.37), ( 1. ,  0.11), ( 1. ,  0.13), ( 1.5,  0.37),
       ( 3. ,  0.11), ( 1.2,  0.13)],
      dtype=[('position', '<f8'), ('value', '<f8')])

**Fancy indexing works, as usual:**

In [29]:
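samples[samples['sensor_code'] == b'ALFA']  # 'S4' fields hold bytes, so compare against a bytes literal

array([(b'ALFA',  1. ,  0.37), (b'ALFA',  1.5,  0.37),
       (b'ALFA',  3. ,  0.11)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])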
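On Python 3, an `'S4'` field stores bytes, so a comparison against a `str` such as `'ALFA'` does not match any record. A minimal sketch of two ways around this follows: compare against a bytes literal, or declare the field as Unicode (`'U4'`). The `readings` array and its dtype are illustrative stand-ins, not the `samples` array above.

```python
import numpy as np

# Option 1: keep the byte-string field and compare against bytes.
codes = np.array([b'ALFA', b'BETA', b'ALFA'], dtype='S4')
mask = codes == b'ALFA'
print(mask)

# Option 2: declare the field as a Unicode string and compare against str.
sensor_dtype = [('sensor_code', 'U4'), ('position', float), ('value', float)]
readings = np.array(
    [('ALFA', 1.0, 0.37), ('BETA', 1.0, 0.11), ('ALFA', 1.5, 0.37)],
    dtype=sensor_dtype,
)
alfa = readings[readings['sensor_code'] == 'ALFA']
print(alfa)
```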

## maskedarray: dealing with (propagation of) missing data

**For floats one could use NaNs, but masks work for all types:**

In [31]:
x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])
x

masked_array(data = [1 -- 3 --],
             mask = [False  True False  True],
       fill_value = 999999)

In [32]:
y = np.ma.array([1, 2, 3, 4], mask=[0, 1, 1, 1])
y

masked_array(data = [1 -- -- --],
             mask = [False  True  True  True],
       fill_value = 999999)

In [33]:
x + y

masked_array(data = [2 -- -- --],
             mask = [False  True  True  True],
       fill_value = 999999)
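Beyond arithmetic, reductions on a masked array skip the masked entries. The sketch below reuses the same `x` as above and shows a few commonly used methods; it is an illustration, not a full tour of `np.ma`.

```python
import numpy as np

x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])

print(x.count())     # number of unmasked entries
print(x.sum())       # sum over unmasked entries only
print(x.mean())      # mean over unmasked entries only

# filled() returns a plain ndarray with masked entries replaced by a chosen value.
plain = x.filled(0)
print(plain)
```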

**Masking versions of common functions:**

In [34]:
np.ma.sqrt([1, -1, 2, -2]) 

masked_array(data = [1.0 -- 1.4142135623730951 --],
             mask = [False  True False  True],
       fill_value = 1e+20)
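For floating-point data that already contains NaN or infinity, `np.ma.masked_invalid` builds the mask automatically. A minimal sketch, with an illustrative `data` array:

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0, np.inf])

# Mask every entry that is NaN or infinite.
clean = np.ma.masked_invalid(data)
print(clean.mask)

# Reductions then ignore the invalid entries.
print(clean.mean())
```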