# This Notebook...
Gives a bird's-eye view of some advanced NumPy topics. We will not go into detail, since most of these are just functions, but they may save the day sometime.
___
This section covers the topics from Appendix A from the Pandas Book. <br>
Happy to see you here!

In [1]:
import numpy as np

# 

* In NumPy *every array* is **strided**. This means it holds a pointer to its data and knows, from its dtype and strides, how many bytes to move at each step.

* This is also what lets NumPy return a VIEW rather than a copy when you slice an array with a step.

* So every array carries "striding" information.

**A NumPy array stores at least these pieces of information**:

- A _pointer_
- The _dtype_
- The _tuple_ for shape property
- A _tuple of strides_
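
The view-versus-copy point above can be checked directly. This is a minimal sketch, assuming a small `int64` array; the names `base` and `every_other` are made up for illustration.

```python
import numpy as np

base = np.arange(12, dtype=np.int64)   # 8 bytes per element
every_other = base[::2]                # slice with a step of 2

print(base.strides)                         # (8,)
print(every_other.strides)                  # (16,): same buffer, bigger step
print(np.shares_memory(base, every_other))  # True: it is a view, not a copy

every_other[0] = 99                    # writing through the view...
print(base[0])                         # 99: ...changes the original too
```

Only the stride changed between the two arrays; no data was copied.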

# 

# ndarray Object Internals 

### Look at the strides in this way
## `arr.strides`

In [2]:
arr = np.ones((3,4,2)) # 3 blocks with 4 rows and 2 columns
arr

array([[[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]]])

In [3]:
arr.strides

(64, 16, 8)

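Reading the strides from right to left: each element is `8 bytes`; `8 bytes x 2 columns` = `16 bytes` per row; `16 bytes x 4 rows` = `64 bytes` per block; and `64 bytes x 3 blocks` = `192 bytes` for the whole array.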

In [5]:
arr.nbytes

192
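
The same arithmetic can be written as code. A small sketch, assuming a C-contiguous `float64` array like the one above; `expected` is just an illustrative name.

```python
import numpy as np
from math import prod

arr = np.ones((3, 4, 2))   # float64, so itemsize is 8 bytes

# For a C-contiguous array, the stride of axis k is
# itemsize * (product of the sizes of all later axes).
expected = tuple(
    arr.itemsize * prod(arr.shape[k + 1:]) for k in range(arr.ndim)
)

print(expected)                               # (64, 16, 8)
print(arr.strides == expected)                # True
print(arr.size * arr.itemsize == arr.nbytes)  # True (24 elements * 8 bytes = 192)
```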

# 

### Check if the array has "THAT" dtype?
*Forget about using `if arr.dtype == int`... use this ↓*

## `np.issubdtype` 

In [7]:
arr.dtype

dtype('float64')

In [9]:
np.issubdtype(arr.dtype, np.integer)  # the np.int alias was removed in NumPy 1.24

False

In [10]:
np.issubdtype(arr.dtype, np.floating)  # the np.float alias was removed in NumPy 1.24

True

In [11]:
np.issubdtype(arr.dtype, np.float32)

False

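See? `float64` counts as a floating type, but it is not a subtype of `float32`. The two are siblings under the abstract `np.floating` type.

The dtypes are organized in a hierarchy.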
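
A short sketch of checking against the abstract parent types (`np.integer`, `np.floating`, `np.number`). The arrays `floats` and `ints` are made up for illustration.

```python
import numpy as np

floats = np.ones(3, dtype=np.float64)
ints = np.arange(3, dtype=np.int32)

is_float = np.issubdtype(floats.dtype, np.floating)      # float64 is a floating type
float_is_int = np.issubdtype(floats.dtype, np.integer)   # but not an integer type
is_int = np.issubdtype(ints.dtype, np.integer)           # int32 is an integer type
both_numbers = (np.issubdtype(ints.dtype, np.number)
                and np.issubdtype(floats.dtype, np.number))

print(is_float, float_is_int, is_int, both_numbers)  # True False True True
```

Checking against the abstract type means the test does not care whether the array is `int8`, `int32`, or `int64`.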

#### See the hierarchy by... 

## `np.<dtype>.mro()` 


In [12]:
float.mro()

[float, object]

This is Python's built-in `float` (the old `np.float` was only an alias for it and was removed in NumPy 1.24), so the only class above it is `object`.

In [13]:
np.int8.mro()

[numpy.int8,
 numpy.signedinteger,
 numpy.integer,
 numpy.number,
 numpy.generic,
 object]

This means there are several classes above `int8`: `signedinteger`, `integer`, `number`, and `generic`.

# 

# Array Manipulation 

<center> "There are many ways to work with arrays beyond fancy indexing, slicing, and boolean indexing" </center>

## `arr.reshape()` 

. . . 

You can pass `-1` for one dimension. Its size will be inferred automatically from the others.

In [15]:
arr = np.arange(20).reshape(5, 4)
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

Now, there are several shapes available: (4,5), (10,2), (2,10), (5,2,2), (2,5,2), (2,2,5) ... If we pass one of them, it will work, but if you want NumPy to figure out one of the dimensions, pass `-1` for it.

This means **the dimension will be *inferred* from the data**.

In [16]:
arr.reshape(-1, 5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

See? It became (4,5).

In [17]:
arr.reshape(2, -1, 5)

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]],

       [[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]])

So cool!
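
The inference has limits: only one dimension may be `-1`, and the remaining sizes must divide the total number of elements. A minimal sketch of both failure cases; the list `errors` is only there to collect the shapes that failed.

```python
import numpy as np

arr = np.arange(20)

print(arr.reshape(-1, 5).shape)     # (4, 5)
print(arr.reshape(2, -1, 5).shape)  # (2, 2, 5)

errors = []
for shape in [(-1, 3), (-1, -1)]:   # 20 is not divisible by 3; two unknowns
    try:
        arr.reshape(shape)
    except ValueError as exc:
        errors.append(shape)
        print(shape, "failed:", exc)
```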

# 

## `arr.flatten` vs `arr.ravel` 

**flatten**: Always returns a copy <br>
**ravel**: Returns a view when possible <br>
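
Whether a call copies data can be checked with `np.shares_memory`. A small sketch comparing `arr.flatten()` and `arr.ravel()` on an illustrative 2 x 3 array:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

flat_copy = arr.flatten()   # always a new buffer
flat_view = arr.ravel()     # a view here, because arr is contiguous

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

# A transposed array is not contiguous in C order, so ravel has to copy.
transposed_flat = arr.T.ravel()
print(np.shares_memory(arr, transposed_flat))  # False
```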

# 

### C vs Fortran order 

The order in which an array's values are stored in memory may differ from the order in which they are displayed.

Basically there are two main ways to store values.
1. Row major
2. Column major

For historical reasons, 
1. C order == Row major
2. Fortran order == Column major

**Here is where the `order` argument comes in**

The `order` parameter is available in several methods. For now, see this.

In [18]:
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

## This may be asked in an interview!

In [19]:
arr.ravel(order= 'F')

array([ 0,  4,  8, 12, 16,  1,  5,  9, 13, 17,  2,  6, 10, 14, 18,  3,  7,
       11, 15, 19])

There are less common orders too: 'A' and 'K'. But we will stick to `'C'` for C and `'F'` for Fortran.
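
The two layouts show up in the strides. A minimal sketch, assuming an `int64` array so that each element is 8 bytes; `c_arr` and `f_arr` are illustrative names.

```python
import numpy as np

c_arr = np.arange(6, dtype=np.int64).reshape(2, 3)  # row major (the default)
f_arr = np.asfortranarray(c_arr)                    # same values, column-major memory

print(c_arr.strides)                 # (24, 8): moving along a row is the small step
print(f_arr.strides)                 # (8, 16): moving down a column is the small step
print(np.array_equal(c_arr, f_arr))  # True: they look identical from the outside

print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # True True
print(c_arr.ravel(order='F').tolist())  # [0, 3, 1, 4, 2, 5]
```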

# 

### Concatenating and Splitting arrays 

## `np.concatenate()`
## `np.vstack()`
## `np.hstack()`
## ___
## `np.split()`
## `np.hsplit()`
## `np.vsplit()`

Split is interesting... let's take a look.

In [22]:
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [23]:
np.split(arr, [1, 4])

[array([[0, 1, 2, 3]]),
 array([[ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]]),
 array([[16, 17, 18, 19]])]

Done.
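
Here is a rough sketch of how the joining and splitting functions relate to each other, using two made-up 2 x 3 arrays `a` and `b`:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

rows = np.concatenate([a, b], axis=0)   # same result as np.vstack([a, b])
cols = np.concatenate([a, b], axis=1)   # same result as np.hstack([a, b])
print(rows.shape, cols.shape)           # (4, 3) (2, 6)

# Splitting into 2 equal parts undoes the join.
top, bottom = np.vsplit(rows, 2)
left, right = np.hsplit(cols, 2)
print(np.array_equal(top, a), np.array_equal(right, b))  # True True
```

Passing an integer to the split functions asks for that many equal parts, while passing a list (as in `np.split(arr, [1, 4])`) gives the indices at which to cut.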


# 

### Better syntax of `vstack` and `hstack`... 

## `np.r_` & `np.c_` 

These might look out of this world... but they are quite useful and easier to read than v/h stack.

In [25]:
# Consider this example
a, b = np.arange(20).reshape(2,2,5)

In [26]:
a

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [27]:
b

array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

## `hstack` == `c_` 

In [28]:
np.c_[a,b]

array([[ 0,  1,  2,  3,  4, 10, 11, 12, 13, 14],
       [ 5,  6,  7,  8,  9, 15, 16, 17, 18, 19]])

In [32]:
np.hstack([a, b])

array([[ 0,  1,  2,  3,  4, 10, 11, 12, 13, 14],
       [ 5,  6,  7,  8,  9, 15, 16, 17, 18, 19]])

## `vstack` == `r_` 

In [33]:
np.r_[a,b]

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [34]:
np.vstack([a,b])

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

Both will fail if the sizes do not match.

In [37]:
np.r_[a, b.T]

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 5 and the array at index 1 has size 2

# 

`r_` and `c_` are also useful when you want to **PRODUCE** some values in some shape.

In [40]:
np.c_[5:10, 15:20]

array([[ 5, 15],
       [ 6, 16],
       [ 7, 17],
       [ 8, 18],
       [ 9, 19]])

In [41]:
np.c_[5:10, 15:20, 30:35]

array([[ 5, 15, 30],
       [ 6, 16, 31],
       [ 7, 17, 32],
       [ 8, 18, 33],
       [ 9, 19, 34]])

In [42]:
np.c_[5:10, 15:20, 30:35, -20:-15]

array([[  5,  15,  30, -20],
       [  6,  16,  31, -19],
       [  7,  17,  32, -18],
       [  8,  18,  33, -17],
       [  9,  19,  34, -16]])

Row wise...

In [45]:
np.r_[5:10, 15:20, 30:35, -20:-15]

array([  5,   6,   7,   8,   9,  15,  16,  17,  18,  19,  30,  31,  32,
        33,  34, -20, -19, -18, -17, -16])
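
The slice syntax inside `np.r_` also accepts a step, and an imaginary "step" such as `5j` is read as a number of points (like `np.linspace`, endpoint included). A short sketch with made-up values:

```python
import numpy as np

# Slice with a step, followed by plain scalars.
stepped = np.r_[0:10:2, 100, 200]
print(stepped.tolist())   # [0, 2, 4, 6, 8, 100, 200]

# Imaginary step: 5 evenly spaced points from 0 to 1, endpoint included.
spaced = np.r_[0:1:5j]
print(spaced.tolist())    # [0.0, 0.25, 0.5, 0.75, 1.0]
```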

# 

### Repeating elements with...

## `np.tile()` and `np.repeat()`

#### `np.repeat()` 

In [46]:
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [47]:
np.repeat(arr, 3)

array([ 0,  0,  0,  1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4,  5,  5,
        5,  6,  6,  6,  7,  7,  7,  8,  8,  8,  9,  9,  9, 10, 10, 10, 11,
       11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16, 16,
       17, 17, 17, 18, 18, 18, 19, 19, 19])

When no axis is given, the array is flattened first.

In [48]:
np.repeat(arr, 3, axis= 0)

array([[ 0,  1,  2,  3],
       [ 0,  1,  2,  3],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [12, 13, 14, 15],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [16, 17, 18, 19],
       [16, 17, 18, 19]])

In [49]:
np.repeat(arr, 3, axis= 1)

array([[ 0,  0,  0,  1,  1,  1,  2,  2,  2,  3,  3,  3],
       [ 4,  4,  4,  5,  5,  5,  6,  6,  6,  7,  7,  7],
       [ 8,  8,  8,  9,  9,  9, 10, 10, 10, 11, 11, 11],
       [12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15],
       [16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19]])

##### We can give a different repeat count for each element along the axis. 

In [50]:
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [52]:
np.repeat(arr, [1,5,2,4,3], axis= 0)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [12, 13, 14, 15],
       [12, 13, 14, 15],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [16, 17, 18, 19],
       [16, 17, 18, 19]])

#### `np.tile()` 

This repeats the whole block. 

In [54]:
np.tile(arr, 2)

array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11],
       [12, 13, 14, 15, 12, 13, 14, 15],
       [16, 17, 18, 19, 16, 17, 18, 19]])

In [56]:
np.tile(arr, 2, axis= 0)

TypeError: _tile_dispatcher() got an unexpected keyword argument 'axis'

Whoops! `tile()` doesn't have an `axis` keyword. By default it repeats along the columns (the last axis). To repeat along the rows, we can **pass a tuple**.

In [58]:
np.tile(arr, (2,1))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

Giving both!

In [59]:
np.tile(arr, (2,2))


array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11],
       [12, 13, 14, 15, 12, 13, 14, 15],
       [16, 17, 18, 19, 16, 17, 18, 19],
       [ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11],
       [12, 13, 14, 15, 12, 13, 14, 15],
       [16, 17, 18, 19, 16, 17, 18, 19]])
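
The difference between the two is easiest to see on a tiny 1-D array. A minimal sketch with a made-up array `x`:

```python
import numpy as np

x = np.array([1, 2, 3])

repeated = np.repeat(x, 2)      # each element repeated in place
tiled = np.tile(x, 2)           # the whole array laid down twice
stacked = np.tile(x, (2, 1))    # 2 copies along rows, 1 along columns

print(repeated.tolist())  # [1, 1, 2, 2, 3, 3]
print(tiled.tolist())     # [1, 2, 3, 1, 2, 3]
print(stacked.tolist())   # [[1, 2, 3], [1, 2, 3]]
```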

# 

### Fancy indexing in Non-Fancy way. 

## `np.take()` and `np.put()` 

*Yes ↑ the same take() method that we have seen.*

```python
# for a 1-D array, this take call
arr.take(inds)
# is equivalent to
arr[inds]
```

```python
# for a 1-D array, this put call
arr.put(inds, value)
# is equivalent to
arr[inds] = value
```
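
A runnable sketch of both methods, assuming a small 1-D array; `arr`, `inds`, and `grid` are illustrative names. Note that `take` on a multi-dimensional array works on the flattened data unless you pass `axis`.

```python
import numpy as np

arr = np.arange(10) * 100
inds = [7, 1, 2, 6]

taken = arr.take(inds)
print(taken.tolist())       # [700, 100, 200, 600]
print(arr[inds].tolist())   # [700, 100, 200, 600]

arr.put(inds, 42)           # in-place, like arr[inds] = 42
print(arr.tolist())         # [0, 42, 42, 300, 400, 500, 42, 42, 800, 900]

grid = np.arange(6).reshape(2, 3)
picked = grid.take([0, 2], axis=1)   # pick columns 0 and 2
print(picked.tolist())      # [[0, 2], [3, 5]]
```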

# 

# Next up. 
ufuncs!