# NumPy Basics
A series of fundamental lessons on the syntax and usage of the NumPy module.

<br>

## Arrays
How to create and use arrays.

### Creating Arrays

In [24]:
import sys
import numpy as np

In [6]:
# creating array
ar = np.array([1,2,3,4,4.5,5,8.2])

In [8]:
# accessing data
ar[3]

4.0

In [10]:
# access a range of elements with a slice
ar[0:]

array([1. , 2. , 3. , 4. , 4.5, 5. , 8.2])

In [11]:
ar[:-1]

array([1. , 2. , 3. , 4. , 4.5, 5. ])

In [12]:
ar[1:-1]

array([2. , 3. , 4. , 4.5, 5. ])

In [13]:
ar[::-1]

array([8.2, 5. , 4.5, 4. , 3. , 2. , 1. ])

In [14]:
# access several elements at once by passing a list of indices
ar[[0, 3 ,-1]]

array([1. , 4. , 8.2])
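The access patterns above can be collected into one short sketch. The variable names `every_other`, `last_two` and `picked` are only illustrative; the slice follows the usual `start:stop:step` form.

```python
import numpy as np

ar = np.array([1, 2, 3, 4, 4.5, 5, 8.2])

every_other = ar[::2]     # from index 0 to the end, taking every second element
last_two = ar[-2:]        # negative indices count from the end
picked = ar[[0, 3, -1]]   # a list of indices returns a new array with those elements

print(every_other)
print(last_two)
print(picked)
```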

### Array Types

An array of whole numbers has a different type than an array that contains floats

In [16]:
b = np.array([5,6,7,8])

In [18]:
ar.dtype

dtype('float64')

In [19]:
b.dtype

dtype('int32')

In [22]:
c = np.array(['a', 'b', 'c', 'd'])

In [23]:
c.dtype

dtype('<U1')

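You can set an array's data type explicitly when creating it

In [21]:
# np.float was removed from recent NumPy releases; np.float64 is the explicit equivalent
np.array([334,224,554], dtype=np.float64)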

array([334., 224., 554.])
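To convert an array that already exists, `astype` can be used. A minimal sketch, with `ints`, `floats` and `truncated` as illustrative names:

```python
import numpy as np

ints = np.array([334, 224, 554])

# astype returns a converted copy; the original array keeps its dtype
floats = ints.astype(np.float64)

# converting floats to integers drops the fractional part (rounds toward zero)
truncated = np.array([1.9, -1.9]).astype(np.int64)

print(floats.dtype)
print(truncated)
```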

### Dimensions and Shapes

Besides the 1D arrays created above, NumPy supports multi-dimensional arrays. A 2D array is a matrix, and each additional dimension adds one more index.

In [30]:
# 2D array
d = np.array([
    [12,13,14],
    [23,34,45]
])

In [31]:
# returns the shape of the array: here, 2 rows and 3 columns
d.shape

(2, 3)

In [32]:
# returns the number of dimensions
d.ndim

2

In [33]:
# returns the total number of elements
d.size

6

#### Creating a 3D array

In [35]:
# 3D array
e = np.array([
    [
        [0,9,8],
        [7,6,5]
    ],
    [
        [4,3,2],
        [1,-1,-2]
    ]
])

In [37]:
e.shape # 2 groups, 2 rows, 3 columns

(2, 2, 3)

In [38]:
e.ndim

3

In [39]:
e.size

12
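The three attributes are tied together: `ndim` is the length of `shape`, and `size` is the product of the entries in `shape`. A small check, assuming Python 3.8+ for `math.prod`:

```python
import math

import numpy as np

e = np.array([
    [[0, 9, 8], [7, 6, 5]],
    [[4, 3, 2], [1, -1, -2]],
])

print(e.shape, e.ndim, e.size)
print(len(e.shape) == e.ndim)        # one shape entry per dimension
print(math.prod(e.shape) == e.size)  # 2 * 2 * 3 elements in total
```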

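Nested lists must have consistent lengths. Otherwise NumPy cannot build a regular multi-dimensional array, and the best it can do is store the inner lists as regular Python objects

In [40]:
# the two groups have different numbers of rows;
# recent NumPy versions require dtype=object here instead of falling back silently
f = np.array([
    [
        [12,13,24]
    ],
    [
        [23,34,14],
        [21,32,43]
    ]
], dtype=object)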

In [42]:
f.dtype # 'O' type means object

dtype('O')

In [43]:
f.shape

(2,)

In [44]:
f.size

2
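A smaller sketch of the same effect, contrasting a regular and a ragged nested list. The names `regular` and `ragged` are illustrative. As far as I can tell, NumPy 1.24 and later raise a `ValueError` for ragged input unless `dtype=object` is passed, so it is passed explicitly here.

```python
import numpy as np

# rows of equal length give a true 2D array
regular = np.array([[1, 2, 3], [4, 5, 6]])

# rows of unequal length only work as a 1D array of Python objects
ragged = np.array([[1, 2, 3], [4, 5]], dtype=object)

print(regular.shape, regular.dtype.kind)
print(ragged.shape, ragged.dtype)
print(type(ragged[0]))  # each element is still a plain Python list
```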

### Indexing and Slicing of Matrices

In [46]:
g = np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9]
])

In [47]:
# indexing with one index per dimension (d1, d2, d3, d4, ...)
g[1, 1] # accessing the 5

5

In [48]:
# selecting every row, and the columns before index 2
g[:, :2]

array([[1, 2],
       [4, 5],
       [7, 8]])

In [52]:
# assigning a single value to a row fills every element of that row
g[2] = 34

In [53]:
g

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [34, 34, 34]])

In [54]:
g[2, 1] = 88

In [55]:
g

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [34, 88, 34]])
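Assignment works with slices as well as with single indices. A short sketch on a fresh copy of the same matrix:

```python
import numpy as np

g = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

g[:, 0] = 0           # every row, first column
g[1, 1:] = [50, 60]   # second row, last two columns

print(g)
```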

### Summary Statistics

In [56]:
d.sum()

141

In [57]:
d.mean()

23.5

In [58]:
d.std() #standard deviation

12.284814474246922

In [59]:
d.var()

150.91666666666666

These methods also accept an `axis` argument

In [60]:
# axis 0 means vertical: one result per column
d.sum(axis = 0)

array([35, 47, 59])

In [62]:
# axis 1 means horizontal: one result per row
d.mean(axis = 1)

array([13., 34.])
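One way to remember the axis convention is that the axis you pass is the one that disappears from the result. A sketch using the same values as `d`:

```python
import numpy as np

d = np.array([
    [12, 13, 14],
    [23, 34, 45],
])

col_sums = d.sum(axis=0)   # collapses the 2 rows, leaving 3 column totals
row_sums = d.sum(axis=1)   # collapses the 3 columns, leaving 2 row totals
row_max = d.max(axis=1)    # other reductions take the same argument

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)
print(row_max)
```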

<br>

## Broadcasting and Vectorized Operations

**Vectorized Operations** - operations applied to a whole NumPy array at once through optimized, pre-compiled routines instead of an explicit Python loop. They typically run much faster than the equivalent element-by-element Python code.

In [18]:
# creating array through range
a = np.arange(4)

In [19]:
a

array([0, 1, 2, 3])

In [20]:
# vectorized operation example
a + 10

array([10, 11, 12, 13])

In [21]:
a

array([0, 1, 2, 3])

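**Broadcasting operations** - how NumPy combines operands of different shapes: the smaller operand (here the scalar `10`) is stretched to match the array, so `a + 10` is already a broadcast. An augmented assignment such as `+=` performs the same operation in place, overwriting the array's contents.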

In [22]:
a += 10

In [23]:
a

array([10, 11, 12, 13])
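Broadcasting is not limited to scalars. In this sketch a row and a column are each stretched to match a 2D array; `m`, `row` and `col` are illustrative names.

```python
import numpy as np

m = np.array([
    [1, 2, 3],
    [4, 5, 6],
])
row = np.array([10, 20, 30])    # shape (3,): added to every row of m
col = np.array([[100], [200]])  # shape (2, 1): added to every column of m

plus_row = m + row
plus_col = m + col

print(plus_row)
print(plus_col)
```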

<br>

## Boolean Array

**Booleans** (True, False) can also be stored in an array. A boolean array can be used to select elements instead of indices, most usefully when it is produced by a condition.

In [25]:
a = np.array([1,2,3,4])

In [28]:
# accessing with booleans
a[[False, True, False, True]]

array([2, 4])

In [32]:
# example of boolean array
a % 2 == 0

array([False,  True, False,  True])

In [31]:
# accessing through conditions (filtering)
a[a % 2 == 0]

array([2, 4])

>Note: Comparison operators applied to an array return boolean arrays, which is useful for filtering.
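Conditions can be combined with `&` (and), `|` (or) and `~` (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A brief sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

mask = (a % 2 == 0) & (a > 2)     # even AND greater than 2
selected = a[mask]
rejected = a[~mask]               # everything the mask did not select
extremes = a[(a < 2) | (a > 5)]   # smaller than 2 OR greater than 5

print(selected)
print(rejected)
print(extremes)
```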

<br>

## Algebra and Size

Standard Python objects take up more memory than the elements of a NumPy array, and operations on NumPy arrays complete much faster than the equivalent loops over standard Python objects.

In [49]:
# size in bytes of a number in Regular Python
sys.getsizeof(1)

14

In [50]:
# size in bytes of a 32-bit integer in NumPy (np.int was removed from recent NumPy releases)
np.dtype(np.int32).itemsize

4

Above, we can see how differently regular Python and NumPy store a number.

We can also choose how many bytes each element uses by picking a different dtype.

In [52]:
np.dtype(np.int16).itemsize

2

In [53]:
sys.getsizeof([1])

32

In [54]:
np.array([1]).itemsize

4
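For whole containers, a rough comparison can be made with `nbytes` on the array side. On the list side, `sys.getsizeof` of the list covers only the list object itself, so this sketch also adds the size of each integer it refers to. The exact list figure depends on the Python build, so no value is shown for it.

```python
import sys

import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int32)

# list object plus the integer objects it points to
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# raw data buffer of the array: n elements * 4 bytes each
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```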

Performance is also much better in NumPy

In [82]:
a = list(range(1000000))

In [83]:
%time sum([x**2 for x in a])

Wall time: 1.54 s


333332833333500000

In [84]:
b = np.arange(1000000)

In [85]:
%time np.sum(b ** 2)

Wall time: 10 ms


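584144992

>Note: This result differs from the pure-Python sum because the array holds 32-bit integers on this platform (compare `b.dtype` in the Array Types section), and the arithmetic silently wraps around at 2^32. Creating the array with `np.arange(1000000, dtype=np.int64)` gives the correct 333332833333500000.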
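Outside a notebook, the same comparison can be sketched with the standard `timeit` module in place of the `%time` magic. The array is created with an explicit 64-bit dtype so that both approaches return the same total. Timings vary by machine, so none are quoted here.

```python
import timeit

import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # 64-bit so the squares cannot overflow

py_total = sum(x ** 2 for x in py_list)
np_total = int(np.sum(np_arr ** 2))
print(py_total == np_total)

# total time for three runs of each approach
py_time = timeit.timeit(lambda: sum(x ** 2 for x in py_list), number=3)
np_time = timeit.timeit(lambda: np.sum(np_arr ** 2), number=3)
print(f"pure Python: {py_time:.3f} s, NumPy: {np_time:.3f} s")
```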