# Basic Rules:

## Must Have Same Data Type  

***
***


In NumPy, all elements of an array must share the same data type: integers, floats, etc.  
This is a key difference from Python lists, which can mix types freely.  

Because of that constraint, NumPy is typically much faster than plain Python for numerical operations. 

In [3]:
import numpy as np
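
As a minimal sketch of what the single-data-type rule means in practice (the variable names are illustrative, not from the notebook): a Python list can mix types, while `np.array()` converts every value to one common type.

```python
import numpy as np

# A Python list happily mixes an int, a float and another int
mixed_list = [1, 2.5, 3]  # illustrative values
print([type(value).__name__ for value in mixed_list])  # ['int', 'float', 'int']

# NumPy converts all values to one common type (here: a 64-bit float)
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)  # float64
```

The integers were silently converted to floats so that the whole array could share one type.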

### Declaring Arrays  

***


In [4]:
simple_array = np.array([1, 2, 3, 5, 7])

type(simple_array)

numpy.ndarray

In [5]:
simple_array_list = np.array((2, 3, 5))

type(simple_array_list)

numpy.ndarray

## Type of Data:  

***
***


### Checking data type of arrays:  

***


In [6]:
simple_array.dtype

dtype('int32')

### Assigning data type when declaring arrays:  

***


In [7]:
simple_array = np.array([1, 2, 3, 5, 7], dtype='i')

simple_array.dtype

dtype('int32')

In [8]:
simple_array_list = np.array((2, 3, 5), dtype='f')

simple_array_list.dtype

dtype('float32')

More data types are available, for example 64-bit integers and floats (`np.int64`, `np.float64`).
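
A short sketch of other ways to choose a data type, using standard NumPy type names; the variable names are illustrative. `.astype()` converts an existing array and returns a new one.

```python
import numpy as np

# Request the data type explicitly with NumPy type objects
wide_ints = np.array([1, 2, 3], dtype=np.int64)
wide_floats = np.array([1, 2, 3], dtype=np.float64)

# Convert an existing array to another type (the original is left untouched)
as_float32 = wide_ints.astype(np.float32)

print(wide_ints.dtype, wide_floats.dtype, as_float32.dtype)  # int64 float64 float32
```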

## Characteristics of Arrays:  

***
***


### Dimensions  

***


In [9]:
two_dim_array = np.array([[1, 2, 3], 
                          [4, 5, 6]
                         ])

two_dim_array.ndim

2

The attribute `.ndim` tells us how many dimensions (axes) the array has, that is, how many levels of nesting there are.  
Here the outer array holds two inner arrays: the 1st one, `[1, 2, 3]`, is accessed via the index `[0]` and the 2nd one via `[1]`.  

The number 3 from the array indexed as `[0]` can be accessed like so:

In [10]:
two_dim_array[0,2]

3

The Python type of this multi-dimensional array is still `ndarray`:

In [11]:
type(two_dim_array)

numpy.ndarray

#### Same Size Arrays:

Note that the arrays indexed `[0]` and `[1]` are of the same size, each holding 3 values.  
This is a requirement for multi-dimensional arrays.  

**They MUST hold arrays of the same size.** 

The following array looks like it has 2 dimensions because it holds 2 arrays. However, the inner arrays have different lengths, so NumPy cannot build a 2-dimensional array from them:

In [12]:
false_multidim_array = np.array([[0, 1, 2], 
                               [0, 1, 2, 3]
                                ])
false_multidim_array.ndim

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

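>This results in an ERROR.  
NumPy needs homogeneous arrays, meaning inner arrays of the same size. Recent NumPy versions raise this error; older versions only warned and built a 1-dimensional array of Python lists.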
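
A hedged sketch of how this situation can be handled, with illustrative variable names. The first call may raise or only warn, depending on the NumPy version. Passing `dtype=object` explicitly gives a 1-dimensional array of Python lists, which is not a true matrix.

```python
import numpy as np

ragged_rows = [[0, 1, 2], [0, 1, 2, 3]]  # inner lists of different lengths

try:
    np.array(ragged_rows)
except ValueError as error:
    print(f"NumPy refused to build the array: {error}")

# Asking explicitly for dtype=object gives a 1-dimensional array
# whose 2 entries are ordinary Python lists
object_array = np.array(ragged_rows, dtype=object)
print(object_array.ndim)   # 1
print(object_array.shape)  # (2,)
```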

#### Multiple ndarrays:

If for example an array holds two ndarrays that each are 2 dimensional (meaning they each hold 2 arrays), Numpy sees it **as a 3 dimensional array**:

In [13]:
three_dim_madeof_2multidims = np.array([
    [[0, 1, 2], [3, 4, 5]],  # the first 2 dimensional array
    [[0, -1, -2], [-3, -4, -5]]  # the second 2 dimensional array
])

print(three_dim_madeof_2multidims.ndim)

type(three_dim_madeof_2multidims)

3


numpy.ndarray

### Shapes of ndarrays:  

***


So we just saw that an array holding multiple 2-dimensional arrays has 3 dimensions.  

`.shape` is an attribute that returns a tuple with the size of the array along each dimension, from the outermost level to the innermost. 

In [14]:
three_dim_madeof_2multidims.shape

(2, 2, 3)

So the tuple that is returned gives us 3 values:  
`(2, 2, 3)`  

The first 2 is the number of 2-dimensional arrays that are held.  
The second 2 is the number of arrays (rows) that each of these 2-dimensional arrays holds.  
And the 3 is the size of each innermost array, meaning the number of values it holds (so 3 means indexes 0 to 2).  

So each of these 3 values describes one level deeper, giving information about the next contained array or data.  
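
To make the link between `.shape` and indexing concrete, here is a small sketch that rebuilds the same 3-dimensional array under an illustrative name (`cube`) and uses one index per dimension.

```python
import numpy as np

cube = np.array([
    [[0, 1, 2], [3, 4, 5]],
    [[0, -1, -2], [-3, -4, -5]],
])

blocks, rows, columns = cube.shape  # unpack the tuple (2, 2, 3)
print(blocks, rows, columns)        # 2 2 3

# One index per dimension: 2nd block, 1st row, 3rd value
print(cube[1, 0, 2])  # -2

# The total number of values is the product of the shape entries
print(cube.size)  # 12
```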


## Useful Methods to create ndarrays (matrices)  

***
***


#### .arange():

The `np.arange()` function creates an array of evenly spaced numbers from a start value up to, but not including, a stop value, with an optional step.  
For example, from 0 to 100 (not included) in steps of 2:

In [15]:
array_0to100_step2 = np.arange(0, 100, 2)
array_0to100_step2

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
       34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66,
       68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98])

This mirrors Python's built-in `range()`, when used like this:  

`for i in range(0, 100, 2)`
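
A quick sketch to check that the two really produce the same numbers (variable names are illustrative):

```python
import numpy as np

from_arange = np.arange(0, 100, 2)
from_range = np.array(list(range(0, 100, 2)))
print(np.array_equal(from_arange, from_range))  # True

# With a single argument, counting starts at 0 and the step is 1
first_five = np.arange(5)
print(first_five.tolist())  # [0, 1, 2, 3, 4]
```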

#### .random.permutation():

`np.random.permutation()` returns a randomly shuffled copy of an array, so values generally end up at new indexes.  

Here I create an array of the numbers 0 to 10 and shuffle them.  

In [16]:
array_0toten = np.random.permutation(np.arange(11))
array_0toten

array([ 8,  1,  3,  5,  4, 10,  7,  9,  2,  0,  6])
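
A small sketch, with illustrative names, showing that `np.random.permutation()` leaves its input untouched and only reorders the values in the array it returns:

```python
import numpy as np

ordered = np.arange(11)
shuffled = np.random.permutation(ordered)

print(ordered.tolist())  # still [0, 1, 2, ..., 10]: the input is not modified
print(shuffled)          # same values in a random order

# Sorting the shuffled copy gives back the original values
print(np.array_equal(np.sort(shuffled), ordered))  # True
```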

#### .random.randint():

`np.random.randint()` is another function available in the `random` module.  
It returns a random integer from the range given as arguments; the lower bound is included and the upper bound is excluded.  

In [17]:
single_random_integer = np.random.randint(111, 222)
print(single_random_integer)
type(single_random_integer)

153


int

**Note that the returned value is a plain integer, not an array.**
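
If several random integers are needed at once, `np.random.randint()` also accepts a `size` argument. A minimal sketch with an illustrative variable name:

```python
import numpy as np

# size=5 asks for 5 draws, returned as an array instead of a single int
random_integers = np.random.randint(111, 222, size=5)
print(random_integers)
print(type(random_integers).__name__)  # ndarray

# The lower bound (111) can be drawn, the upper bound (222) cannot
print(random_integers.min() >= 111, random_integers.max() < 222)  # True True
```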

#### .random.rand() to create arrays:

`np.random.rand()` is very useful for creating arrays of any shape filled with random values; each argument is the size of one dimension.  

In [18]:
four_dim_array_of_ndarrays = np.random.rand(2, 3, 4, 2)

print(four_dim_array_of_ndarrays)
four_dim_array_of_ndarrays.ndim

[[[[0.22735851 0.58159459]
   [0.12663096 0.28940161]
   [0.38408653 0.08136789]
   [0.6185699  0.52575921]]

  [[0.05214658 0.45185709]
   [0.07204307 0.6412778 ]
   [0.72417532 0.24478105]
   [0.74074992 0.96322167]]

  [[0.96796628 0.6500481 ]
   [0.78264356 0.21384532]
   [0.40229783 0.85843745]
   [0.00758543 0.12100009]]]


 [[[0.65546537 0.87238653]
   [0.29754638 0.89869883]
   [0.81858412 0.0807973 ]
   [0.9607609  0.27031525]]

  [[0.15864224 0.70972061]
   [0.61825745 0.07745101]
   [0.06838418 0.07841767]
   [0.35997192 0.14213282]]

  [[0.33995151 0.27445522]
   [0.12342669 0.24337768]
   [0.32593736 0.1648945 ]
   [0.01872192 0.17960553]]]]


4

Here's how to read it: the shape `(2, 3, 4, 2)` means 2 outer blocks, each holding 3 matrices of 4 rows and 2 columns:

![image.png](attachment:49bef688-262b-475a-869c-ff36473dad43.png)

Note that all the randomly created values are floats in the range 0 (included) to 1 (excluded).  
It is common to multiply by 10 or 100 to get higher values, like so:  

`four_dim_array_of_ndarrays = 10*np.random.rand(2, 3)`

In [19]:
four_dim_array_of_ndarrays= 100*np.random.rand(2, 3)
four_dim_array_of_ndarrays

array([[18.36357621, 84.85991956, 59.98814156],
       [ 6.77268815, 13.12377321, 35.08145051]])
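
The same idea extends to any interval: multiply by the width of the interval and add its lower bound. A sketch, where the bounds `low` and `high` are hypothetical values chosen for illustration:

```python
import numpy as np

low, high = 50, 60  # hypothetical bounds chosen for illustration

# rand() gives values between 0 and 1; stretch them to the width of the
# interval (high - low) and shift them so they start at low
scaled = low + (high - low) * np.random.rand(2, 3)

print(scaled)
print(scaled.min() >= low, scaled.max() <= high)  # True True
```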

#### .reshape() to create matrices:

Now to create a matrix of 4 rows and 25 columns, we can start by creating an array of numbers ranging from 0 to 100 (excluded) and then use  
`.reshape()` to shape it into a 4-row, 25-column matrix:

In [20]:
zero_to99_array = np.arange(100).reshape(4, 25)

print(zero_to99_array.shape)
print(zero_to99_array)

(4, 25)
[[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  24]
 [25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
  49]
 [50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
  74]
 [75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98
  99]]


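>Note that `.reshape()` does not have to follow an `np.arange()`; it works on any array.  
The only requirement is that the new shape holds exactly as many values as the original array (here 4 x 25 = 100), otherwise NumPy raises an error.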
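
As a sketch of what `.reshape()` accepts, the example below calls it on an array that was not created with `np.arange()`. It works whenever the element count matches the target shape, and `-1` asks NumPy to work out one size itself. The values and names are illustrative.

```python
import numpy as np

values = np.array([5, 3, 8, 1, 9, 2])  # 6 values, not created by np.arange()

grid = values.reshape(2, 3)       # 2 rows x 3 columns = 6 values
print(grid.shape)                 # (2, 3)

inferred = values.reshape(3, -1)  # -1 lets NumPy work out the missing size
print(inferred.shape)             # (3, 2)

try:
    values.reshape(4, 2)          # 4 x 2 = 8 values, but only 6 exist
except ValueError as error:
    print(f"reshape failed: {error}")
```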

## Slicing:  

***
***


### Slicing an array does not create a copy of it:  

***


In Python, slicing a list is a quick way to get an independent copy of that list.  
**HOWEVER** in NumPy this is not the case.  
>A slice is a view: it still refers to the same memory as the original array, the one being sliced.  
So the array that results from the slicing depends on the original one. 

In [21]:
array_tobe_sliced = np.arange(100)  # the array is 1 dim, with values ranging 0 to 99

slicing_array = array_tobe_sliced[3:10]  # we slice the array into one with range 3 to 9
print(slicing_array)

[3 4 5 6 7 8 9]


Now we change one value so that we can see that it also affected the array that was sliced (so the first one).  

In [22]:
slicing_array[0] = -999  # we change the first entry, so the number 3
slicing_array

array([-999,    4,    5,    6,    7,    8,    9])

We now print `array_tobe_sliced` to confirm that it is affected by the change made in its subarray (slicing_array):

In [23]:
print(array_tobe_sliced)

[   0    1    2 -999    4    5    6    7    8    9   10   11   12   13
   14   15   16   17   18   19   20   21   22   23   24   25   26   27
   28   29   30   31   32   33   34   35   36   37   38   39   40   41
   42   43   44   45   46   47   48   49   50   51   52   53   54   55
   56   57   58   59   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79   80   81   82   83
   84   85   86   87   88   89   90   91   92   93   94   95   96   97
   98   99]


To make the slicing_array an **actual INDEPENDENT copy**, we can use the `.copy()` method:

In [24]:
indepdt_slicing_array = array_tobe_sliced[3:10].copy()

indepdt_slicing_array[0] = 333  # changed first entry, the 3, to a value of 333

print(array_tobe_sliced)  # printing the original array to show the entry at index 3 (still -999) did not change to 333
print(indepdt_slicing_array)  # printing the independent copy

[   0    1    2 -999    4    5    6    7    8    9   10   11   12   13
   14   15   16   17   18   19   20   21   22   23   24   25   26   27
   28   29   30   31   32   33   34   35   36   37   38   39   40   41
   42   43   44   45   46   47   48   49   50   51   52   53   54   55
   56   57   58   59   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79   80   81   82   83
   84   85   86   87   88   89   90   91   92   93   94   95   96   97
   98   99]
[333   4   5   6   7   8   9]


**NOTE:**  
>There is another way to make an independent copy of an array: indexing it with a list of indexes, like:  
`new_ind_arraycopy = array_tobe_sliced[[0, 1, 2, 3, 4, 5]]`

**Note that 2 pairs of square brackets are needed:** the outer pair indexes the array, the inner pair is the list of indexes.

This creates a smaller independent copy of array_tobe_sliced, with just the values at indexes 0 to 5. Any change made to that new copy does not impact the original one.  

More about this in the Manipulating Data section down below.
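
One way to check whether two arrays refer to the same memory is `np.shares_memory()`. The sketch below, with illustrative names, compares a plain slice, a `.copy()` of a slice, and a list-of-indexes selection.

```python
import numpy as np

base = np.arange(10)

plain_slice = base[3:6]          # a view on base
copied_slice = base[3:6].copy()  # an independent copy
index_list = base[[3, 4, 5]]     # indexing with a list also copies

print(np.shares_memory(base, plain_slice))   # True
print(np.shares_memory(base, copied_slice))  # False
print(np.shares_memory(base, index_list))    # False

# Writing through the view changes the original array
plain_slice[0] = -1
print(base[3])  # -1
```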

### Slicing syntax:  

***


#### from Start to end with jumps:

The syntax for that is:  

`array[::x]`  
where x is the jump (step) value. So if x = 2, it starts at index 0 and takes every 2nd entry up to the end of the array. 

In [25]:
array_tobe_sliced[::3]  # slicing from Start to End with a jump of 3

array([   0, -999,    6,    9,   12,   15,   18,   21,   24,   27,   30,
         33,   36,   39,   42,   45,   48,   51,   54,   57,   60,   63,
         66,   69,   72,   75,   78,   81,   84,   87,   90,   93,   96,
         99])

So we sliced from the start (0) to the end (99) and jumped 3 indexes every time.

#### from End to start with jumps:

This is almost the same syntax as the previous one:  

`array[::-x]`  
The -x indicates to numpy that it should start at the end and go backward down to the beginning. 

In [26]:
array_tobe_sliced[::-3]  # slicing from the end (99) down to 0 with jumps of 3

array([  99,   96,   93,   90,   87,   84,   81,   78,   75,   72,   69,
         66,   63,   60,   57,   54,   51,   48,   45,   42,   39,   36,
         33,   30,   27,   24,   21,   18,   15,   12,    9,    6, -999,
          0])

So if instead of -3 we use -1 we will get every entry in a reverse order.  
This is a way to reverse an array.
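
A compact sketch of the step syntax on a small array (the name `digits` is illustrative), including the full `start:stop:step` form:

```python
import numpy as np

digits = np.arange(10)  # 0 to 9

every_second = digits[::2]   # start to end, step 2
middle_steps = digits[1:8:3] # from index 1 up to (not including) 8, step 3
reversed_all = digits[::-1]  # end to start, step 1
backward_4 = digits[::-4]    # end to start, step 4

print(every_second.tolist())  # [0, 2, 4, 6, 8]
print(middle_steps.tolist())  # [1, 4, 7]
print(reversed_all.tolist())  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(backward_4.tolist())    # [9, 5, 1]
```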

A plain Python loop can also find the index of a given value, here -999:

In [27]:
val_index_tofind = -999

# enumerate() yields (index, value) pairs, so the values are never used as indexes
for index, each_entry in enumerate(array_tobe_sliced):
    if each_entry == val_index_tofind:
        print(f"the index of the value -999 is {index}")
        break

the index of the value -999 is 3

## Extracting information from a Matrix:

***
***


### Accessing rows:  

***


To return a whole row from a matrix the syntax is:  

`array[row_index,:]`  
where `row_index` is the index of the row one wants to get returned. 

>REMEMBER that the **first index is always 0**, so the first row is indexed at value 0.  

This will give this example:

In [28]:
array_a = np.round(10*np.random.rand(5, 4))  # np.round rounds each value to the nearest whole number (the dtype stays float); we get a 5-row, 4-column matrix
array_a

array([[ 4.,  4.,  1.,  3.],
       [ 3.,  0.,  7., 10.],
       [ 4.,  0.,  4.,  8.],
       [ 5.,  1.,  4.,  7.],
       [ 3.,  9.,  5.,  7.]])

In [29]:
array_a[0,:]

array([4., 4., 1., 3.])

So the 0 in `[0,:]` references the 1st row and the ":" tells numpy to return the whole row. 

### Accessing Columns:  

***


Now to access columns one just needs to swap the two positions. So instead of:  
`array_a[0,:]` for the whole 1st row  

we write:  
`array_a[:,0]` to get the whole 1st column

In [30]:
array_a[:,0]

array([4., 3., 4., 5., 3.])

#### Accessing both:

So now that we know that the first index references the row and the second one the column,  
we can follow that structure to access any slice of any row or column. 

In [31]:
array_a[1, 2:4]  # returning from the 2nd row, columns 3 & 4

array([ 7., 10.])

In [32]:
array_a[0:2, 1:4]  # returning from rows 1 to 2 the columns 2 to 4

array([[ 4.,  1.,  3.],
       [ 0.,  7., 10.]])
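
Because `array_a` is random, its values change on every run. The sketch below repeats the same row and column selections on a small deterministic matrix (the name `grid` is illustrative) so the results can be checked by hand.

```python
import numpy as np

# 5 rows x 4 columns holding 0..19, so each value reveals its position
grid = np.arange(20).reshape(5, 4)

first_row = grid[0, :]       # whole 1st row
first_column = grid[:, 0]    # whole 1st column
row_part = grid[1, 2:4]      # 2nd row, columns 3 & 4
sub_matrix = grid[0:2, 1:4]  # rows 1 to 2, columns 2 to 4

print(first_row.tolist())     # [0, 1, 2, 3]
print(first_column.tolist())  # [0, 4, 8, 12, 16]
print(row_part.tolist())      # [6, 7]
print(sub_matrix.tolist())    # [[1, 2, 3], [5, 6, 7]]
```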

### .argwhere() to find the index of a value:  

***


When we want to find the index of a value, looking at the matrix itself is sometimes tedious, especially if it is a big matrix with tens of rows and columns.  

In this case the `np.argwhere()` function returns the indexes of all the entries that satisfy a given condition:

Let's try it with the value -999 in the array array_tobe_sliced, to which I will give a shorter name to save some time (this is a second name for the same array, not a copy):

In [33]:
atbs = array_tobe_sliced
atbs

array([   0,    1,    2, -999,    4,    5,    6,    7,    8,    9,   10,
         11,   12,   13,   14,   15,   16,   17,   18,   19,   20,   21,
         22,   23,   24,   25,   26,   27,   28,   29,   30,   31,   32,
         33,   34,   35,   36,   37,   38,   39,   40,   41,   42,   43,
         44,   45,   46,   47,   48,   49,   50,   51,   52,   53,   54,
         55,   56,   57,   58,   59,   60,   61,   62,   63,   64,   65,
         66,   67,   68,   69,   70,   71,   72,   73,   74,   75,   76,
         77,   78,   79,   80,   81,   82,   83,   84,   85,   86,   87,
         88,   89,   90,   91,   92,   93,   94,   95,   96,   97,   98,
         99])

Now we can use the function; the `[0][0]` picks the first index of the first match:

In [35]:
index_of_minus999 = np.argwhere(atbs==-999)[0][0]
index_of_minus999

3

The variable now holds the index value of -999, which is 3. 
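
On a 2-dimensional array, `np.argwhere()` returns one `[row, column]` pair per match. A small sketch on an illustrative matrix, also showing what comes back when nothing matches:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # 3 rows x 4 columns holding 0..11

single_match = np.argwhere(grid == 7)
several_matches = np.argwhere(grid % 5 == 0)  # matches 0, 5 and 10
no_match = np.argwhere(grid == 99)

print(single_match.tolist())     # [[1, 3]]
print(several_matches.tolist())  # [[0, 0], [1, 1], [2, 2]]
print(no_match.tolist())         # []
```

When there is no match the result is empty, so indexing it with `[0][0]` would raise an `IndexError`; checking the length first avoids that.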

## Manipulating Data:  

***
***


### Transposing:  

***


To transpose a matrix, use the `.T` attribute, as in:  
`array_a.T`

In [74]:
array_a.T

array([[ 3.,  9.,  1.,  5.,  5.],
       [ 7.,  6.,  8.,  2., 10.],
       [ 4.,  2.,  0.,  6.,  1.],
       [ 8., 10.,  5.,  5.,  8.]])
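
A minimal sketch of what `.T` does to the shape and to individual entries, on an illustrative matrix. It also shows that the transpose is a view on the same data, not a copy.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
transposed = matrix.T

print(matrix.shape, transposed.shape)  # (2, 3) (3, 2)
print(transposed.tolist())             # [[0, 3], [1, 4], [2, 5]]

# Row and column indexes are swapped: transposed[i, j] is matrix[j, i]
print(transposed[2, 0] == matrix[0, 2])  # True

# The transpose shares memory with the original array
print(np.shares_memory(matrix, transposed))  # True
```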

### Sorting in ascending order:  

***


#### Columns:

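To sort each column of array_a in ascending order, we can use the `.sort()` method, which sorts the array in place. The array is shown below; its random values were regenerated, so they differ from the earlier examples: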

In [75]:
array_a

array([[ 3.,  7.,  4.,  8.],
       [ 9.,  6.,  2., 10.],
       [ 1.,  8.,  0.,  5.],
       [ 5.,  2.,  6.,  5.],
       [ 5., 10.,  1.,  8.]])

In [78]:
array_a.sort(axis=0)  # the axis 0 indicates to sort the columns

In [77]:
array_a

array([[ 1.,  2.,  0.,  5.],
       [ 3.,  6.,  1.,  5.],
       [ 5.,  7.,  2.,  8.],
       [ 5.,  8.,  4.,  8.],
       [ 9., 10.,  6., 10.]])

So now each column starts from its lowest value and ascends to its highest.

#### Rows:

For rows we just need to change the axis value to 1, like so:  

In [79]:
array_a.sort(axis=1)  # the axis 1 indicates to sort the rows
array_a

array([[ 0.,  1.,  2.,  5.],
       [ 1.,  3.,  5.,  6.],
       [ 2.,  5.,  7.,  8.],
       [ 4.,  5.,  8.,  8.],
       [ 6.,  9., 10., 10.]])
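
The `.sort()` method changes the array itself. When the original order must be kept, the function `np.sort()` returns a sorted copy instead. A sketch with illustrative values:

```python
import numpy as np

data = np.array([[9, 1, 8],
                 [3, 7, 2]])

sorted_columns = np.sort(data, axis=0)  # each column sorted, as a new array
sorted_rows = np.sort(data, axis=1)     # each row sorted, as a new array

print(sorted_columns.tolist())  # [[3, 1, 2], [9, 7, 8]]
print(sorted_rows.tolist())     # [[1, 8, 9], [2, 3, 7]]
print(data.tolist())            # [[9, 1, 8], [3, 7, 2]] (unchanged)
```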

#### Copying via the Array of Indexes: Fancy Indexing and Masking 

As introduced in the Slicing section above, it is possible to create an independent copy via the array index syntax: 

`ind_copyarray = original_array[[0, 1, 2, 3]]`

This technique, known as **"Fancy Indexing"**, creates an **independent, smaller copy** of the original_array: changes made to the copy do not affect the original.

Another way to select entries, known as **"Masking"** (boolean indexing), is to use **boolean conditions such as "<" or ">"**, which is convenient when many entries are wanted. It also returns an independent copy.  
We assume the original array is 1 dim with 100 entries, all of them <= 100.  

`ind_copyarray = original_array[original_array<=100]`

Here we are effectively bringing all 100 entries over to a new variable that is now an independent copy.  
Let's demonstrate with array_a as it was first created above (note that masking a 2-dimensional array returns a flat, 1-dimensional result):

In [43]:
print(array_a)

ind_copy = array_a[array_a<11]
print(f"This is the copy:\n{ind_copy}")

[[ 4.  4.  1.  3.]
 [ 3.  0.  7. 10.]
 [ 4.  0.  4.  8.]
 [ 5.  1.  4.  7.]
 [ 3.  9.  5.  7.]]
This is the copy:
[ 4.  4.  1.  3.  3.  0.  7. 10.  4.  0.  4.  8.  5.  1.  4.  7.  3.  9.
  5.  7.]


Now let's modify an entry of ind_copy and see that array_a is not affected by this change:

In [45]:
ind_copy[-1] = 111
print(ind_copy)
print(array_a)

[  4.   4.   1.   3.   3.   0.   7.  10.   4.   0.   4.   8.   5.   1.
   4.   7.   3.   9.   5. 111.]
[[ 4.  4.  1.  3.]
 [ 3.  0.  7. 10.]
 [ 4.  0.  4.  8.]
 [ 5.  1.  4.  7.]
 [ 3.  9.  5.  7.]]


The last entry of array_a did not change to 111. 

#### Copying with 2 conditionals:

In this Array Index syntax we are using 2 conditionals:  

`ind_copy_B = array_a[(array_a < 111) & (array_a > 2)]`  

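The "&" is the element-wise logical AND operator: it keeps only the entries that satisfy both the 1st condition "<111" and the 2nd one ">2". Each condition must be wrapped in parentheses because "&" binds more tightly than "<" and ">".  
This creates an independent copy too. 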

In [46]:
ind_copy_B = array_a[(array_a < 111) & (array_a > 2)]  
ind_copy_B

array([ 4.,  4.,  3.,  3.,  7., 10.,  4.,  4.,  8.,  5.,  4.,  7.,  3.,
        9.,  5.,  7.])

As expected the copy array has all values < 111 and > 2. 

**NOTE:**  
>There is a difference between "&" and "and" when combining multiple conditions.  
"and" works on single truth values, not on arrays (using it on arrays with more than one element raises a ValueError), while "&" works element by element on arrays, just like we did. 
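
A sketch that makes the mask itself visible and shows the related operators `|` (element-wise OR) and `~` (element-wise NOT). Values and names are illustrative.

```python
import numpy as np

values = np.array([1, 4, 7, 10])

# The condition produces an array of booleans, one per entry
mask = (values > 2) & (values < 10)
print(mask.tolist())          # [False, True, True, False]
print(values[mask].tolist())  # [4, 7]

outside = values[(values < 2) | (values > 8)]  # element-wise OR
inverted = values[~mask]                       # element-wise NOT
print(outside.tolist())   # [1, 10]
print(inverted.tolist())  # [1, 10]

# "and" tries to reduce each array to a single True/False, which fails
try:
    (values > 2) and (values < 10)
except ValueError as error:
    print(f"'and' failed: {error}")
```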

### Broadcasting: Adding a Scalar to a whole Matrix:  

***


When one value needs to be added to each entry of a matrix, the technique called Broadcasting is used. Its syntax is very simple:  

`twodim_array+X`

Here the X is the scalar we want to add to each entry of the matrix twodim_array. 

In [55]:
twodim_array = np.round(10*np.random.rand(2,3))  # creating a 2-row, 3-column matrix populated with random whole-number values (stored as floats)
twodim_array

array([[ 9.,  5.,  1.],
       [ 8.,  3., 10.]])

In [58]:
twodim_array+100

array([[109., 105., 101.],
       [108., 103., 110.]])
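
Broadcasting is not limited to scalars. As a sketch with illustrative values, a 1-dimensional array with one value per column is added to every row, and a column-shaped array with one value per row is added to every column.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

per_column = np.array([10, 20, 30])  # one value per column
per_row = np.array([[100], [200]])   # one value per row (shape 2 x 1)

added_to_rows = matrix + per_column
added_to_columns = matrix + per_row

print(added_to_rows.tolist())     # [[11, 22, 33], [14, 25, 36]]
print(added_to_columns.tolist())  # [[101, 102, 103], [204, 205, 206]]
```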

### Concatenating: horizontally & vertically:  

***


So if we want to extend a matrix to its right side (horizontally) we can just use this function (both arrays must have the same number of rows):  

`np.hstack((array_a, array_b))`  

And to extend vertically (toward the bottom side) we use this one (both arrays must have the same number of columns):  

`np.vstack((array_a, array_b))`

In [68]:
copy_twodim_array = twodim_array+100  # creating a copy of twodim_array that has +100 to all entries

concat_h = np.hstack((twodim_array, copy_twodim_array))
print(f"This is horizontal concatenation:\n{concat_h}") 

concat_v = np.vstack((twodim_array, copy_twodim_array))
print(f"This is vertical concatenation:\n{concat_v}") 

This is horizontal concatenation:
[[  9.   5.   1. 109. 105. 101.]
 [  8.   3.  10. 108. 103. 110.]]
This is vertical concatenation:
[[  9.   5.   1.]
 [  8.   3.  10.]
 [109. 105. 101.]
 [108. 103. 110.]]
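
For 2-dimensional arrays, the same results can be obtained with `np.concatenate()` and an explicit `axis` argument. A short sketch with illustrative names:

```python
import numpy as np

left = np.array([[1, 2, 3],
                 [4, 5, 6]])
right = left + 100

side_by_side = np.concatenate((left, right), axis=1)  # like np.hstack
stacked = np.concatenate((left, right), axis=0)       # like np.vstack

print(side_by_side.shape)  # (2, 6)
print(stacked.shape)       # (4, 3)
print(np.array_equal(side_by_side, np.hstack((left, right))))  # True
print(np.array_equal(stacked, np.vstack((left, right))))       # True
```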
