## Data types

Data types and their descriptions:
- **bool_** Boolean (True or False) stored as a byte
- **int_**  Default integer type (same as C long; normally either int64 or int32)
- **intc** 	Identical to C int (normally int32 or int64)
- **intp** 	Integer used for indexing (same as C ssize_t; normally either int32 or int64)
- **int8** 	Byte (-128 to 127)
- **int16** Integer (-32768 to 32767)
- **int32** Integer (-2147483648 to 2147483647)
- **int64** Integer (-9223372036854775808 to 9223372036854775807)
- **uint8** Unsigned integer (0 to 255)
- **uint16** Unsigned integer (0 to 65535)
- **uint32** Unsigned integer (0 to 4294967295)
- **uint64** Unsigned integer (0 to 18446744073709551615)
- **float_** Shorthand for float64.
- **float16** Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
- **float32** Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
- **float64** Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
- **complex_** Shorthand for complex128.
- **complex64** Complex number, represented by two 32-bit floats (real and imaginary components)
- **complex128** Complex number, represented by two 64-bit floats (real and imaginary components)
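
As a brief illustration of the types listed above, the sketch below requests a specific data type when creating an array and uses ```np.iinfo``` and ```np.finfo``` to query the limits of a type. The variable names are only illustrative.

```python
import numpy as np

# Ask for an 8-bit signed integer instead of the default integer type
small = np.array([1, 2, 3], dtype=np.int8)
print(small.dtype)      # int8
print(small.itemsize)   # 1 (bytes per element)

# np.iinfo reports the representable range of an integer type
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)   # -128 127

# np.finfo does the same for floating point types
float32_info = np.finfo(np.float32)
print(float32_info.bits)   # 32
```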

## NumPy Arrays

The main data structure in NumPy is the *ndarray* object, an N-dimensional array. NumPy arrays are similar to Python lists, but they represent a collection of values that all share the same data type.

##### Example

A 1-dimensional array (often called a *vector*):
```
vector = [1, 2, 3]
```

A 2-dimensional array (often called a *matrix*):
```
matrix = [[1, 2, 3], 
          [4, 5, 6], 
          [7, 8, 9]]
```
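
The two examples above are plain Python lists. As a minimal sketch, passing them to ```np.array()``` produces ndarray objects, and the ```ndim``` attribute reports how many dimensions each one has:

```python
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

print(vector.ndim)   # 1
print(matrix.ndim)   # 2
```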

## Creating arrays
### Converting from Python structures

We can convert Python lists (and nested lists) to an ndarray using the ```numpy.array()``` function:

In [2]:
import numpy as np

vector = np.array([1, 2, 3, 4, 5])
print(vector)
type(vector)

[1 2 3 4 5]


numpy.ndarray

In [3]:
matrix = np.array([
    [1, 2, 3, 4],
    ['a', 'b', 'c', 'd'],
    [3.14, 2.7, 10.0, -1.0]
])
print(matrix)
type(matrix)

[['1' '2' '3' '4']
 ['a' 'b' 'c' 'd']
 ['3.14' '2.7' '10.0' '-1.0']]


numpy.ndarray
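
In the matrix above, the numbers were stored as strings because an ndarray holds a single data type, so NumPy picks one type that can represent every element. The following sketch (with illustrative variable names) shows this conversion on smaller inputs:

```python
import numpy as np

# Integers mixed with floats are all stored as floats
ints_and_floats = np.array([1, 2, 3.14])
print(ints_and_floats.dtype)          # float64

# Numbers mixed with text are all stored as strings
numbers_and_text = np.array([1, 3.14, 'a'])
print(numbers_and_text.dtype.kind)    # U (unicode string)
```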

We can get the number of rows and columns of an ndarray with the ```np.shape()``` function. For a 2-dimensional array, the first value is the number of rows and the second is the number of columns.

In [5]:
print(np.shape(vector))
print(np.shape(matrix))

(5,)
(3, 4)
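
The same information is also available as the ```shape``` attribute of the array itself. A short sketch, using a small numeric matrix made up for illustration:

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# shape is a tuple, so it can be unpacked directly
rows, cols = matrix.shape
print(rows, cols)     # 3 4

# size is the total number of elements
print(matrix.size)    # 12
```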


### Reading from files

It is possible to read data from files using standard Python tools and libraries. In NumPy, data can be imported from a CSV file with the
[```numpy.genfromtxt()```](https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.genfromtxt.html "NumPy Documentation") 
function.

In [6]:
world_alcohol = np.genfromtxt('world_alcohol.csv', delimiter = ',')
world_alcohol

array([[             nan,              nan,              nan,
                     nan,              nan],
       [  1.98600000e+03,              nan,              nan,
                     nan,   0.00000000e+00],
       [  1.98600000e+03,              nan,              nan,
                     nan,   5.00000000e-01],
       ..., 
       [  1.98600000e+03,              nan,              nan,
                     nan,   2.54000000e+00],
       [  1.98700000e+03,              nan,              nan,
                     nan,   0.00000000e+00],
       [  1.98600000e+03,              nan,              nan,
                     nan,   5.15000000e+00]])

As we can see, NumPy converted all values to the same data type, and values that could not be parsed as numbers appear as ```nan```. We can check which type this is using the ndarray property ```array.dtype```:

In [7]:
world_alcohol.dtype

dtype('float64')
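
To see the conversion to ```float64``` in isolation, here is a small self-contained sketch. It reads CSV text from an in-memory ```io.StringIO``` object instead of a file; the three-row sample data is made up for illustration and is not the real ```world_alcohol.csv```.

```python
import io
import numpy as np

# Hypothetical sample data: a header row, then two data rows
csv_text = "Year,Country,Value\n1986,Uruguay,0.5\n1985,Angola,0.39\n"

data = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(data.dtype)    # float64

# The header row cannot be parsed as numbers, so every value is nan
print(np.isnan(data[0]).all())   # True

# In data rows only the text column becomes nan
print(np.isnan(data[1]))         # [False  True False]
```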

NumPy does this because it needs all values in an array to be of the same type (unlike Python lists, which can store values of different data types). We can check what the file looks like:

In [13]:
import csv
# The with statement closes the file automatically
with open('world_alcohol.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

['Year', 'WHO region', 'Country', 'Beverage Types', 'Display Value']
['1986', 'Western Pacific', 'Viet Nam', 'Wine', '0']
['1986', 'Americas', 'Uruguay', 'Other', '0.5']
['1985', 'Africa', "Cte d'Ivoire", 'Wine', '1.62']
['1986', 'Americas', 'Colombia', 'Beer', '4.27']
['1987', 'Americas', 'Saint Kitts and Nevis', 'Beer', '1.98']
['1987', 'Americas', 'Guatemala', 'Other', '0']
['1987', 'Africa', 'Mauritius', 'Wine', '0.13']
['1985', 'Africa', 'Angola', 'Spirits', '0.39']
['1986', 'Americas', 'Antigua and Barbuda', 'Spirits', '1.55']
['1984', 'Africa', 'Nigeria', 'Other', '6.1']
['1987', 'Africa', 'Botswana', 'Wine', '0.2']
['1989', 'Americas', 'Guatemala', 'Beer', '0.62']
['1985', 'Western Pacific', "Lao People's Democratic Republic", 'Beer', '0']
['1984', 'Eastern Mediterranean', 'Afghanistan', 'Other', '0']
['1985', 'Western Pacific', 'Viet Nam', 'Spirits', '0.05']
['1987', 'Africa', 'Guinea-Bissau', 'Wine', '0.07']
['1984', 'Americas', 'Costa Rica', 'Wine', '0.06']
['1989', 'Africa'

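The first row is a header, and it consists of string values. We need to skip it with ```skip_header = 1```. Because the remaining columns mix text and numbers, we also read every value as a string of up to 75 characters with ```dtype = 'U75'```: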

In [16]:
world_alcohol = np.genfromtxt('world_alcohol.csv', dtype = 'U75', skip_header = 1, delimiter = ',')
world_alcohol

array([['1986', 'Western Pacific', 'Viet Nam', 'Wine', '0'],
       ['1986', 'Americas', 'Uruguay', 'Other', '0.5'],
       ['1985', 'Africa', "Cte d'Ivoire", 'Wine', '1.62'],
       ..., 
       ['1986', 'Europe', 'Switzerland', 'Spirits', '2.54'],
       ['1987', 'Western Pacific', 'Papua New Guinea', 'Other', '0'],
       ['1986', 'Africa', 'Swaziland', 'Other', '5.15']], 
      dtype='<U75')
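
After reading everything as strings, a numeric column can be converted back to numbers with ```astype()```. The sketch below again uses made-up in-memory sample data rather than the real file, and it selects a column with ```data[:, 2]``` (all rows, third column):

```python
import io
import numpy as np

# Hypothetical sample data with the same layout idea: text and numbers mixed
csv_text = "Year,Country,Value\n1986,Uruguay,0.5\n1985,Angola,0.39\n"

data = np.genfromtxt(io.StringIO(csv_text), dtype='U75',
                     skip_header=1, delimiter=',')
print(data.shape)    # (2, 3)

# Convert the last column from strings to floats
values = data[:, 2].astype(float)
print(values.dtype)  # float64
```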

## Indexing and slicing

As in Python, indexing in NumPy begins at 0.

For a 1-dimensional array (vector), indexing works exactly as in Python lists:

In [17]:
l = [1, 2, 3, 4, 5]
print(l[0])

vector = np.array([1, 2, 3, 4, 5])
print(vector[0])

1
1
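
Negative indices and slices also use the same syntax as Python lists. One difference worth knowing is that a basic slice of an ndarray is a view of the original data rather than a copy, so writing to the slice changes the original array. A minimal sketch:

```python
import numpy as np

vector = np.array([1, 2, 3, 4, 5])

print(vector[-1])     # 5 (last element)
print(vector[1:4])    # [2 3 4]

# A slice is a view: modifying it modifies the original array
view = vector[1:4]
view[0] = 20
print(vector[1])      # 20
```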
