#### DTYPE
* The data type object 'dtype' is an instance of the numpy.dtype class. It can be created with numpy.dtype.
* So far, our examples of numpy arrays have used only fundamental numeric data types like 'int' and 'float'.
* These numpy arrays contained solely homogeneous data types. dtype objects are
constructed from combinations of fundamental data types.
* With the aid of dtype we are able to create "Structured Arrays", also
known as "Record Arrays".

In [2]:
import numpy as np
i16 = np.dtype(np.int16)
print(i16)

lst = [[3.4, 8.7, 9.9],
       [1.1, -7.8, -0.7],
       [4.1, 12.3, 4.9]]

# Converting to int16 truncates the floats toward zero (8.7 -> 8, -7.8 -> -7)
A = np.array(lst, dtype=i16)
print(A)

int16
[[ 3  8  9]
 [ 1 -7  0]
 [ 4 12  4]]


#### Structured Arrays
* ndarrays are homogeneous data objects, i.e. all elements of an array have to be of the same data type.
* The data type dtype, on the other hand, allows us to define separate data types for each column.

In [3]:
import numpy as np

dt = np.dtype([('density', np.int32)])
x = np.array([(393, ),(337, ),(256, )], dtype=dt)

print(x)
print("\nThe internal representation:")
print(repr(x))


print(x['density'])

[(393,) (337,) (256,)]

The internal representation:
array([(393,), (337,), (256,)], dtype=[('density', '<i4')])
[393 337 256]


In [4]:
intt = np.dtype([('Integers', np.int16)])
# (3) is just the integer 3; a one-field record needs the trailing comma: (3,)
y = np.array([(3,), (4,), (4,)], dtype=intt)
print(y)

print(repr(y))

print(y['Integers'])

[(3,) (4,) (4,)]
array([(3,), (4,), (4,)], dtype=[('Integers', '<i2')])
[3 4 4]


In [5]:
# little-endian ordering ('<'); byteorder prints '=' when this matches the machine's native order
dt = np.dtype('<d')
print(dt.name, dt.byteorder, dt.itemsize)

# big-endian ordering ('>')
dt = np.dtype('>d')
print(dt.name, dt.byteorder, dt.itemsize)

# native byte ordering (no prefix)
dt = np.dtype('d')
print(dt.name, dt.byteorder, dt.itemsize)



float64 = 8
float64 > 8
float64 = 8
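
The output above comes from a little-endian machine, where the explicit '<' prefix is reported as '=' (native). A minimal sketch of that behaviour follows. The names `native_char`, `other_char`, `native_dt` and `swapped_dt` are introduced here for illustration only, and the standard library's `sys.byteorder` is used to detect the machine's own order.

```python
import sys
import numpy as np

# Byte-order character that matches the machine running this code
native_char = '<' if sys.byteorder == 'little' else '>'
other_char = '>' if native_char == '<' else '<'

native_dt = np.dtype(native_char + 'd')
swapped_dt = np.dtype(other_char + 'd')

# An explicit prefix that equals the native order is reported as '='
print(native_dt.byteorder)

# The non-native order keeps its explicit character
print(swapped_dt.byteorder)

# Both still describe 8-byte floats
print(native_dt.itemsize, swapped_dt.itemsize)
```

Comparing `native_dt == np.dtype('d')` should give True, since both describe the same native 64-bit float.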


* You may ask yourself whether tuples and lists can be used interchangeably. They cannot.
* The tuples define the records, in our case consisting solely of a density, and the list is the 'container' for the records; in other words, 'the lists are recursed upon'.
* The tuples define the atomic elements of the structure and the lists define the dimensions.
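
The roles of tuples and lists can be checked with a small sketch. It assumes a hypothetical two-field dtype (`density`, `area`) and reuses a few values from the population table in this section; `flat` and `grid` are illustrative names.

```python
import numpy as np

dt = np.dtype([('density', 'i4'), ('area', 'i4')])

# One list level -> one dimension; each tuple is a single record
flat = np.array([(393, 41536), (337, 30510)], dtype=dt)

# Two list levels -> two dimensions; the tuples are still the records
grid = np.array([[(393, 41536), (337, 30510)],
                 [(256, 243610), (233, 357021)]], dtype=dt)

print(flat.shape)
print(grid.shape)

# Field access keeps the shape that the lists defined
print(grid['density'])
```

Nesting the lists one level deeper adds a dimension, while the tuples stay the indivisible records in both cases.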

In [6]:
dt = np.dtype([('Country', 'S20'), ('density', 'i4'), ('area', 'i4'), ('population', 'i4')])
population_table = np.array([
    ('Netherlands', 393, 41536, 16928800),
    ('Belgium', 337, 30510, 11007020),
    ('United Kingdom', 256, 243610, 62262000),
    ('Germany', 233, 357021, 81799600),
    ('Liechtenstein', 205, 160, 32842),
    ('Italy', 192, 301230, 59715625),
    ('Switzerland', 177, 41290, 7301994),
    ('Luxembourg', 173, 2586, 512000),
    ('France', 111, 547030, 63601002),
    ('Austria', 97, 83858, 8169929),
    ('Greece', 81, 131940, 11606813),
    ('Ireland', 65, 70280, 4581269),
    ('Sweden', 20, 449964, 9515744),
    ('Finland', 16, 338424, 5410233),
    ('Norway', 13, 385252, 5033675)
                             ], dtype = dt)

print(population_table[:4])


[(b'Netherlands', 393,  41536, 16928800)
 (b'Belgium', 337,  30510, 11007020)
 (b'United Kingdom', 256, 243610, 62262000)
 (b'Germany', 233, 357021, 81799600)]


In [7]:
# printing each column individually
print("Printing Each Column individually")
print(f"\nCountry: \n{population_table['Country']}")
print(f"\nDensity: \n{population_table['density']}")
print(f"\nArea: \n{population_table['area']}")
print(f"\nPopulation: \n {population_table['population']}")

Printing Each Column individually

Country: 
[b'Netherlands' b'Belgium' b'United Kingdom' b'Germany' b'Liechtenstein'
 b'Italy' b'Switzerland' b'Luxembourg' b'France' b'Austria' b'Greece'
 b'Ireland' b'Sweden' b'Finland' b'Norway']

Density: 
[393 337 256 233 205 192 177 173 111  97  81  65  20  16  13]

Area: 
[ 41536  30510 243610 357021    160 301230  41290   2586 547030  83858
 131940  70280 449964 338424 385252]

Population: 
 [16928800 11007020 62262000 81799600    32842 59715625  7301994   512000
 63601002  8169929 11606813  4581269  9515744  5410233  5033675]
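
Besides whole columns, single records can be read by position, and a column can be used to select rows. The sketch below rebuilds only the first three rows of the table so that it runs on its own. The names `table`, `first` and `dense` are illustrative, and the threshold of 300 is an arbitrary choice for the example.

```python
import numpy as np

dt = np.dtype([('Country', 'S20'), ('density', 'i4'), ('area', 'i4'), ('population', 'i4')])
table = np.array([
    ('Netherlands', 393, 41536, 16928800),
    ('Belgium', 337, 30510, 11007020),
    ('United Kingdom', 256, 243610, 62262000),
], dtype=dt)

# An integer index returns one record
first = table[0]
print(first)

# Fields of a single record are accessed by name as well
print(first['Country'], first['area'])

# A boolean mask built from one column selects whole records
dense = table[table['density'] > 300]
print(dense['Country'])
```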


#### UNICODE STRINGS IN ARRAY
* You may have noticed that the strings in our previous array are prefixed with a lower case "b".
* This means that the definition "('Country', 'S20')" created byte strings.
* To get unicode strings we replace it with the definition "('country', np.str_, 20)". Older NumPy releases also accepted the alias np.unicode_, which was removed in NumPy 2.0.
* We will redefine our population table now:

In [9]:
dt = np.dtype([('country', np.str_, 20),
               ('density', 'i4'),
               ('area', 'i4'),
               ('population', 'i4')
              ])
population_table = np.array([
    ('Netherlands', 393, 41526, 16928800),
    ('Belgium', 337, 30510, 11007020),
    ('United Kingdom', 256, 243610, 62262000),
    ('Germany', 233, 357021, 81799600),
    ('Liechtenstein', 205, 160, 32842),
    ('Italy', 192, 301230, 59715625),
    ('Switzerland', 177, 41290, 7301994),
    ('Luxembourg', 173, 2586, 512000),
    ('France', 111, 547030, 63601002),
    ('Austria', 97, 83858, 8169929),
    ('Greece', 81, 131940, 11606813),
    ('Ireland', 65, 70280, 4581269),
    ('Sweden', 20, 449964, 9515744),
    ('Finland', 16, 338424, 5410233),
    ('Norway', 13, 385252, 5033675),
    ], dtype=dt )
print(population_table)

[('Netherlands', 393,  41526, 16928800) ('Belgium', 337,  30510, 11007020)
 ('United Kingdom', 256, 243610, 62262000)
 ('Germany', 233, 357021, 81799600)
 ('Liechtenstein', 205,    160,    32842) ('Italy', 192, 301230, 59715625)
 ('Switzerland', 177,  41290,  7301994)
 ('Luxembourg', 173,   2586,   512000) ('France', 111, 547030, 63601002)
 ('Austria',  97,  83858,  8169929) ('Greece',  81, 131940, 11606813)
 ('Ireland',  65,  70280,  4581269) ('Sweden',  20, 449964,  9515744)
 ('Finland',  16, 338424,  5410233) ('Norway',  13, 385252,  5033675)]
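
The difference between the two string types can also be seen on plain arrays. This is a minimal sketch with illustrative names (`as_bytes`, `as_text`, `decoded`), and it assumes the byte strings hold UTF-8 (here plain ASCII) text.

```python
import numpy as np

# 'S20' stores up to 20 raw bytes per element
as_bytes = np.array(['Belgium', 'Germany'], dtype='S20')

# (np.str_, 20) stores up to 20 unicode characters per element
as_text = np.array(['Belgium', 'Germany'], dtype=(np.str_, 20))

print(as_bytes.dtype, as_bytes.itemsize)
print(as_text.dtype, as_text.itemsize)

# Existing byte strings can be converted to unicode by decoding them
decoded = np.char.decode(as_bytes, 'utf-8')
print(decoded)
```

The unicode field takes four bytes per character, so the same 20-character limit costs four times the memory of the byte-string field.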
