In [2]:
import numpy as np

### np.single

In [4]:
np.array([1, 1.2, 1.3, "22"], dtype=np.single)

array([ 1. ,  1.2,  1.3, 22. ], dtype=float32)

In [5]:
np.array([1, 1.2, 1.3, "22"])

array(['1', '1.2', '1.3', '22'], dtype='<U32')

___
### String Types: Sized Unicode

In [16]:
names = np.array(['bob', 'kim', 'amy'])
names

array(['bob', 'kim', 'amy'], dtype='<U3')

In [17]:
names.itemsize

12

In [12]:
names2 = np.array(['sarah', 'ashly', 'johny'])
names2

array(['sarah', 'ashly', 'johny'], dtype='<U5')

In [13]:
names2.itemsize

20

In [19]:
names2 = np.append(names2, 'josephine')
names2

array(['sarah', 'ashly', 'johny', 'josephine'], dtype='<U9')

In [20]:
names2.itemsize

36

###### Important point when modifying an array of strings:

In [21]:
x = np.array(['bob', 'kim', 'amy'])
x

array(['bob', 'kim', 'amy'], dtype='<U3')

In [22]:
x[1] = "ashly"
x

array(['bob', 'ash', 'amy'], dtype='<U3')

As you can see, 'ashly' did not fit into the array. Why?<br>
The dtype of the array is 'U3', which means each element in the array can hold at most 3<br>
characters. When we replace a 3-character element with 'ashly', which has 5 characters,<br>
it doesn't fit, so NumPy silently truncates it and stores only the first 3 characters.<br>
Note: if we append 'ashly' with `np.append` instead, the result is different, because `np.append` returns a new array whose dtype is wide enough for the longest string.
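
One way to avoid the truncation is to cast the array to a wider string dtype before assigning. This is only a minimal sketch: the width `U10` is an arbitrary choice for illustration, and the names `truncated` and `wide` are made up here.

```python
import numpy as np

x = np.array(['bob', 'kim', 'amy'])   # inferred dtype is '<U3'

# Assigning a longer string to a '<U3' array silently truncates it.
truncated = x.copy()
truncated[1] = 'ashly'

# Casting to a wider dtype first leaves room for the longer string.
# 'U10' is an arbitrary width chosen for this sketch.
wide = x.astype('U10')
wide[1] = 'ashly'

print(truncated[1])   # ash
print(wide[1])        # ashly
print(wide.itemsize)  # 40 (10 characters * 4 bytes each)
```

The trade-off is memory: every element of `wide` takes 40 bytes, even the 3-character ones.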

___
### Structured Arrays
<div class="alert alert-block alert-info"> Important</div>

Originally, you learned that array items all have to be the same data type, but that wasn't entirely correct. NumPy has a special kind of array, called a structured array (or record array), in which you can specify a type and, optionally, a name for each field (column). This makes sorting and filtering even more powerful, and it can feel similar to working with data in Excel, CSVs, or relational databases.

Structured arrays are useful when you need to perform computations while keeping closely related data together. For example, if each incident in a dataset contains geographic coordinates and an occurrence time, you can calculate the final result and still easily find the associated location and time point for further visualization. A structured array lets multiple data types live in one NumPy array, but one NumPy principle still has to be honored: the data type within each field (you can think of a field as a column of the records) must be homogeneous. Here are some simple examples that show how it works:

In [21]:
# np.zeros gives defined initial values (np.empty would leave them uninitialized)
x = np.zeros((2,), dtype='i4, f4, U10')
x

array([(0, 0., ''), (0, 0., '')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<U10')])

In [22]:
x[1] = 12, 1.2, 'hi'
x

array([( 0, 0. , ''), (12, 1.2, 'hi')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<U10')])

In [23]:
x[:] = [(18, 0.5, 'jack'), (24, 3.5, 'adam')]
x

array([(18, 0.5, 'jack'), (24, 3.5, 'adam')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<U10')])

<br>We can use a field name to obtain the values of that field. In the following examples, 'f0', 'f1', and 'f2' are the default field names, and 'f2' returns the string field.

In [17]:
x['f0'] # accessing the data using field name

array([18, 24], dtype=int32)

In [18]:
x['f1']

array([0.5, 3.5], dtype=float32)

In [19]:
x['f2']

array(['jack', 'adam'], dtype='<U10')

<br>
You might be wondering whether the default field names can be changed to something more meaningful for your analysis. Of course they can! This is how:



In [24]:
x.dtype.names = 'age', 'score', 'name'

In [25]:
x

array([(18, 0.5, 'jack'), (24, 3.5, 'adam')],
      dtype=[('age', '<i4'), ('score', '<f4'), ('name', '<U10')])

In [26]:
x['age']

array([18, 24], dtype=int32)

In [27]:
x['name']

array(['jack', 'adam'], dtype='<U10')

___
By assigning the new field names back to the `names` attribute of the dtype object, we get customized field names. You can also set them when you create the array, by passing the dtype as a list of tuples or as a dictionary. In the following examples, we create two identical structured arrays with customized field names, first using a list and then using a dictionary:

In [3]:
list_x = np.ones(
    shape=(2, ),
    dtype=[
        ('id', 'i4'),
        ('value', 'f4', (2, ))
    ]
)
list_x

array([(1, [1., 1.]), (1, [1., 1.])],
      dtype=[('id', '<i4'), ('value', '<f4', (2,))])

In [4]:
dic_x = np.ones(
    shape=(2, ),
    dtype={
        'names': ['id', 'value'],
        'formats': ['i4', '2f4'],
    }
)
dic_x

array([(1, [1., 1.]), (1, [1., 1.])],
      dtype=[('id', '<i4'), ('value', '<f4', (2,))])

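Note: the dictionary must contain the keys 'names' and 'formats', spelled exactly that way. Other keys, such as 'offsets', 'itemsize', 'aligned', and 'titles', are optional.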
___

In [35]:
x

array([(18, 0.5, 'jack'), (24, 3.5, 'adam')],
      dtype=[('age', '<i4'), ('score', '<f4'), ('name', '<U10')])

In [46]:
x[['name']]

array([('jack',), ('adam',)],
      dtype={'names':['name'], 'formats':['<U10'], 'offsets':[8], 'itemsize':48})

In [48]:
x[['name', 'age']]

array([('jack', 18), ('adam', 24)],
      dtype={'names':['name','age'], 'formats':['<U10','<i4'], 'offsets':[8,0], 'itemsize':48})
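
The incident scenario described at the start of this section could be sketched as follows. The field names `lat`, `lon`, and `time`, and all the values, are made up for illustration.

```python
import numpy as np

# Hypothetical incident records: latitude, longitude, and occurrence time.
incidents = np.array(
    [
        (40.71, -74.00, '2022-02-22T08:30'),
        (34.05, -118.24, '2022-02-22T09:15'),
        (41.88, -87.63, '2022-02-23T17:45'),
    ],
    dtype=[('lat', 'f8'), ('lon', 'f8'), ('time', 'datetime64[m]')],
)

# Each field is a homogeneous column, so normal array math applies to it.
mean_lat = incidents['lat'].mean()

# Filtering on one field keeps the related fields attached to each record.
cutoff = np.datetime64('2022-02-23')
late = incidents[incidents['time'] >= cutoff]

print(mean_lat)
print(late['lat'], late['lon'])
```

Because the filter returns whole records, the coordinates of the matching incidents are available right away without a second lookup.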

___
### SQL-Like Queries on Structured Data
<div class="alert alert-block alert-info"> Important</div>

In [58]:
heros = np.array([
    ('batman', 'DC', 80),
    ('superman', 'DC', 100),
    ('iron man', 'Marvel', 85),
    ('flash', 'DC', 70),
    ('hulk', 'Marvel', 95),],
    
    dtype={
    'names': ['name', 'universe', 'power'],
    'formats': ['U10', 'U6', np.int8]
    }
)
heros

array([('batman', 'DC',  80), ('superman', 'DC', 100),
       ('iron man', 'Marvel',  85), ('flash', 'DC',  70),
       ('hulk', 'Marvel',  95)],
      dtype=[('name', '<U10'), ('universe', '<U6'), ('power', 'i1')])

In [59]:
heros[0]

('batman', 'DC', 80)

In [60]:
heros['name']

array(['batman', 'superman', 'iron man', 'flash', 'hulk'], dtype='<U10')

In [65]:
heros[heros['power'] > 80]['name']

array(['superman', 'iron man', 'hulk'], dtype='<U10')

In [73]:
np.sort(heros, order='power')['name']

array(['flash', 'batman', 'iron man', 'hulk', 'superman'], dtype='<U10')
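
The filter and sort shown above can be combined into something that reads like a SQL query with `WHERE`, `ORDER BY`, and `SELECT`. This is a rough sketch rather than a complete query layer. It rebuilds the same `heros` array so the cell runs on its own, and the names `mask`, `selected`, and `ranked` are illustrative.

```python
import numpy as np

heros = np.array(
    [
        ('batman', 'DC', 80),
        ('superman', 'DC', 100),
        ('iron man', 'Marvel', 85),
        ('flash', 'DC', 70),
        ('hulk', 'Marvel', 95),
    ],
    dtype=[('name', 'U10'), ('universe', 'U6'), ('power', np.int8)],
)

# WHERE universe = 'DC' AND power >= 80
mask = (heros['universe'] == 'DC') & (heros['power'] >= 80)
selected = heros[mask]

# ORDER BY power DESC: sort ascending, then reverse the result
ranked = np.sort(selected, order='power')[::-1]

# SELECT name, power
print(ranked[['name', 'power']])
```

Note the parentheses around each condition: `&` binds more tightly than the comparison operators, so leaving them out changes the meaning of the expression.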

<br>
___
### Dates and time in NumPy

In [76]:
x = np.datetime64('2022-02-22')
x.dtype

dtype('<M8[D]')

In [77]:
y = np.datetime64('2022-02')
y.dtype

dtype('<M8[M]')

We can also specify the unit explicitly:

In [78]:
y = np.datetime64('2015-04', 'D')
y

numpy.datetime64('2015-04-01')
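
An existing `datetime64` value can also be converted to another unit with `astype`. Here is a small sketch of that; the variable names are only for illustration.

```python
import numpy as np

day = np.datetime64('2022-02-22')   # unit inferred as days
month = np.datetime64('2022-02')    # unit inferred as months

# Converting to a coarser unit drops the finer detail.
as_month = day.astype('datetime64[M]')

# Converting to a finer unit fills in the earliest possible value.
as_day = month.astype('datetime64[D]')

print(as_month)           # 2022-02
print(as_day)             # 2022-02-01
print(as_month == month)  # True
```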