# Creating ndarray

NumPy offers several array creation functions. We have already used the `np.array` function to create an array from a list. NumPy offers several other built-in functions for creating arrays quickly.

In [1]:
import numpy as np

a = [1,2,3,4,6]   # This is a list
arr = np.array(a) # Passing the list, converting to numpy array

arr

array([1, 2, 3, 4, 6])

This is the default function we have used so far to create an array. 

Let's try something else. 

In [2]:
a = np.arange(5)
print(a)

[0 1 2 3 4]


Nice, huh? We quickly created a 1D array of 5 integers, starting at 0 and stopping just before the value passed to `arange` (the stop value itself is excluded).

Let's try again. 

In [3]:
a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]


Since we specified `10` as input to `arange`, we got an array with 10 elements in it.
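`arange` also accepts an optional start value and step, in the same spirit as Python's built-in `range`. The short sketch below illustrates this; the variable names `evens` and `countdown` are ours and chosen only for the example.

```python
import numpy as np

# arange(start, stop, step): the stop value is always excluded
evens = np.arange(2, 11, 2)       # start at 2, stop before 11, step of 2
countdown = np.arange(5, 0, -1)   # a negative step counts downwards

print(evens)
print(countdown)
```

Because the stop value is excluded, `11` is used above so that `10` is included in the result.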

Let's try some more examples.

In [4]:
a = np.zeros(5)
print(a)

[0. 0. 0. 0. 0.]


Here you go. We now have an array containing 5 elements, each with the value `0`. Note that these zeros are floats: the value prints as `0.` rather than `0`, and `0.` is the same as `0.0`.

So how do we produce an array with integer zeros? We do so by explicitly specifying the datatype.

In [5]:
a = np.zeros(5, dtype=np.int32)
print(a)

[0 0 0 0 0]


Many array creation functions in NumPy accept an extra `dtype` parameter. Here `dtype=np.int32` explicitly specifies that we need the elements to be of type `int32`. We can see that the result now contains `0` and not `0.`

# Array Creation Functions

The table below shows an exhaustive list of array creation functions found in NumPy. We strongly encourage you to try them and see how each one works.

![](./images/array_creation.JPG)
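As a starting point for experimenting, here is a small sketch using four creation functions that NumPy provides: `np.ones`, `np.full`, `np.eye` and `np.linspace`. The variable names are illustrative only.

```python
import numpy as np

ones = np.ones((2, 3))           # 2x3 array filled with 1.0
sevens = np.full((2, 2), 7)      # 2x2 array filled with the integer 7
identity = np.eye(3)             # 3x3 identity matrix
points = np.linspace(0, 1, 5)    # 5 evenly spaced values from 0 to 1, both ends included

for name, arr in [("ones", ones), ("sevens", sevens),
                  ("identity", identity), ("points", points)]:
    print(name, arr.shape, arr.dtype)
```

Printing the shape and dtype of each result is a quick way to see what a creation function gives you by default.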

# Data Types

The table below shows an exhaustive list of data types. Keep in mind that you can also create 1D, 2D and 3D arrays with string data. You are not restricted to numerical data. In fact, elements of the array can be of any type, including objects.

Do experiment with creating arrays of these various data types.

![](./images/data_types.JPG)
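To see a few data types in action, the sketch below builds a string array, an object array and a small-integer array, then inspects their `dtype`. The array contents and names are made up for illustration.

```python
import numpy as np

words = np.array(["red", "green", "blue"])       # Unicode string data
mixed = np.array([1, "two", 3.0], dtype=object)  # arbitrary Python objects
small = np.array([1, 2, 3], dtype=np.int8)       # 8-bit integers

print(words.dtype)     # a fixed-width Unicode string type
print(mixed.dtype)     # object
print(small.itemsize)  # bytes used per element
```

An object array keeps each element as the original Python object, which is flexible but gives up most of the speed advantage of numeric arrays.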

# Datatype Conversions

We can convert the datatype of elements in an array. NumPy provides a convenient `astype` method that returns the array converted to the desired datatype. For the operation to succeed, it must be possible to convert all elements to the specified datatype.

In [6]:
a = np.array(['1', '2', '3'])
print(a)
b = a.astype(np.int32)
print(b)

['1' '2' '3']
[1 2 3]


In the above example, we started with an array of strings. Every element of the array `a` is a string, as the first printed line shows.

We then used the `astype` method to create a new array of type int. All elements of the array were converted to int and returned as the array `b`.

Keep in mind that calling `astype` creates a new array. The original array remains unchanged.
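The two points above (a new array is returned, and every element must be convertible) can be checked with a short sketch. The array `bad` and the flag `failed` are names introduced here only for the demonstration.

```python
import numpy as np

a = np.array(['1', '2', '3'])
b = a.astype(np.int32)

# astype returned a new array; `a` still holds strings
print(a.dtype.kind)   # 'U' means Unicode string
print(b.dtype)        # int32

# A value that cannot be parsed as an integer makes the whole conversion fail
bad = np.array(['1', 'two', '3'])
try:
    bad.astype(np.int32)
    failed = False
except ValueError as err:
    failed = True
    print("Conversion failed:", err)
```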

# Arithmetic Operations

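Let's perform some basic arithmetic operations like **addition, subtraction, multiplication and division** on the two arrays below using NumPy. Each operation is applied element by element, so `a*b` multiplies matching elements and is not a matrix product.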

In [7]:
import numpy as np

a = np.array([[1,1],[1,1]])
b = np.array([[2,2],[2,2]])

## Addition

In [8]:
a+b

array([[3, 3],
       [3, 3]])

## Subtraction

In [9]:
a-b

array([[-1, -1],
       [-1, -1]])

## Multiplication

In [10]:
a*b

array([[2, 2],
       [2, 2]])

## Division

In [11]:
a/b

array([[0.5, 0.5],
       [0.5, 0.5]])

## Raising powers

We can also raise the values of an array to a power.

In [12]:
# Array 'a' values raised to the power of 2.
np.power(a,2)

array([[1, 1],
       [1, 1]])

Let's try the same thing with the `b` matrix and see if the power raise works. 

In [13]:
# Array 'b' values raised to the power of 2.
np.power(b,2)

array([[4, 4],
       [4, 4]])
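The `**` operator gives the same result as `np.power`, and the exponent can itself be an array, in which case each element is raised to the matching exponent. A minimal sketch (the `exponents` array is made up for this example):

```python
import numpy as np

b = np.array([[2, 2],
              [2, 2]])

squared_func = np.power(b, 2)   # function form
squared_op = b ** 2             # operator form, same result

# Element-wise exponents: each 2 is raised to the exponent in the same position
exponents = np.array([[1, 2],
                      [3, 4]])
varied = np.power(b, exponents)

print(squared_op)
print(varied)
```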

# Reshaping a matrix

We can also reshape our matrix using the **reshape** function.

## Reshaping a 1D matrix
Let's reshape a 1D array of shape (3,) into a 2D array of shape (3,1). <br>
**Example 1**

In [14]:
a = np.array([1,2,3])
a

array([1, 2, 3])

In [15]:
a = np.reshape(a, (3,1))
a

array([[1],
       [2],
       [3]])

**Example 2**<br>
Reshaping an array from shape (8,) to (8,1).

In [16]:
a = np.array([4,5,6,12,76,26,67,73])
a
# Shape : (8,)

array([ 4,  5,  6, 12, 76, 26, 67, 73])

In [17]:
np.reshape(a, (8,1))

array([[ 4],
       [ 5],
       [ 6],
       [12],
       [76],
       [26],
       [67],
       [73]])

## Reshaping a 2D matrix
Let's reshape a matrix with 2 dimensions from shape (2,3) to (3,2).

**Example 1**

In [18]:
b = np.array([[1,2,3],
              [4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
b = np.reshape(b, (3,2))
b

array([[1, 2],
       [3, 4],
       [5, 6]])

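It is important to closely observe how the matrix was actually reshaped. The elements keep their original row-by-row order and are simply refilled into the new shape, which is why 3 and 4 ended up on the same row after the reshape operation. This is not the same as a transpose.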
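One way to picture this is that reshape reads the elements in row-by-row order and pours them into the new shape. The sketch below compares reshape with a transpose, which genuinely swaps rows and columns, and also shows `-1`, which asks NumPy to work out one dimension for you. The variable names are ours.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])

flat = b.ravel()              # elements in row-by-row order
reshaped = b.reshape(3, 2)    # same order, refilled into 3 rows of 2
transposed = b.T              # rows and columns swapped: a different result
inferred = b.reshape(-1, 2)   # -1 lets NumPy infer the number of rows

print(flat)
print(reshaped)
print(transposed)
```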

**Example 2**<br>
Reshaping a matrix from shape (3,5) to (5,3).

In [20]:
b = np.array([[56,76,31,44,55],
              [33,78,123,33,28],
              [12,23,43,1,2]])
b
# Shape : (3,5)

array([[ 56,  76,  31,  44,  55],
       [ 33,  78, 123,  33,  28],
       [ 12,  23,  43,   1,   2]])

In [21]:
np.reshape(b, (5,3))

array([[ 56,  76,  31],
       [ 44,  55,  33],
       [ 78, 123,  33],
       [ 28,  12,  23],
       [ 43,   1,   2]])

## Reshaping a 3D matrix

Let's reshape a 3D matrix from shape (2,2,3) to (3,2,2).

**Example 1**

In [22]:
c = np.array([[[1,2,3],
                [4,5,6]],
                
                [[7,8,9],
                [10,11,12]]])

c

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [23]:
np.reshape(c, (3,2,2))

array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])

**Example 2**<br>
Reshaping a matrix from shape (1,2,3) to (1,3,2). This should effectively just reshape the inside 2D matrix. Let's give it a try. 

In [24]:
c = np.array([[[1,2,3],
               [4,5,6]]])
c

array([[[1, 2, 3],
        [4, 5, 6]]])

In [25]:
np.reshape(c, (1,3,2))

array([[[1, 2],
        [3, 4],
        [5, 6]]])

**NOTE** : You can reshape the 2D matrices inside a 3D matrix. In the example above, we reshaped a 3D matrix from shape (1,2,3) to (1,3,2); that is, we simply reshaped the inner 2D matrix from (2,3) to (3,2).

---
# Joining

Of the different join functions available in NumPy, we mostly use `concatenate` to join 2 or more arrays. Make sure the arrays have the same shape along every axis except the one being joined.

**Use Case** This is one of the most useful functions which can be used to combine two or more datasets to work on a single dataset.

**Note** : In NumPy, for a 2D array,

`axis=0 runs down the rows (joining along it adds rows)`
`axis=1 runs across the columns (joining along it adds columns)`

## Joining two 1D arrays


In [26]:
a = np.array([1,2,3])
b = np.array([4,5,6,12,76,26,67,73])

np.concatenate([a,b])

array([ 1,  2,  3,  4,  5,  6, 12, 76, 26, 67, 73])

The concatenate function produces a new matrix. The original matrices `a` and `b` remain unchanged after the concatenate operation.

## Joining two 2D arrays


In [27]:
a = np.array([[1,2,3],
              [4,5,6]])
b = np.array([[ 56,  76,  31],
       [ 44,  55,  33],
       [ 78, 123,  33],
       [ 28,  12,  23],
       [ 43,   1,   2]])

np.concatenate([a,b], axis=0)

array([[  1,   2,   3],
       [  4,   5,   6],
       [ 56,  76,  31],
       [ 44,  55,  33],
       [ 78, 123,  33],
       [ 28,  12,  23],
       [ 43,   1,   2]])

When we specify `axis=0`, the arrays are joined along the first axis, i.e. stacked vertically. We can see that the rows of `b` got added below the rows of `a`, so every column received additional values.

Let us try performing a concatenation for `axis=1`

In [28]:
a = np.array([[1,2,3],
              [4,5,6]])
b = np.array([[ 56,  76,  31],
       [ 44,  55,  33]])

np.concatenate([a,b], axis=1)

array([[ 1,  2,  3, 56, 76, 31],
       [ 4,  5,  6, 44, 55, 33]])
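A quick way to keep the axes straight is to look at which dimension of the shape grows. The sketch below uses placeholder arrays of zeros and ones (not the data above) and also shows what happens when the other dimensions do not match.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((5, 3))
c = np.ones((2, 4))

rows_joined = np.concatenate([a, b], axis=0)   # first dimension grows: 2 + 5
cols_joined = np.concatenate([a, c], axis=1)   # second dimension grows: 3 + 4

print(rows_joined.shape)
print(cols_joined.shape)

# Joining a (2,3) and a (5,3) array along axis=1 fails: the row counts differ
try:
    np.concatenate([a, b], axis=1)
    mismatch = False
except ValueError as err:
    mismatch = True
    print("Cannot concatenate:", err)
```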

## Joining two 3D arrays

In [29]:
a = np.array([[[1,2,3],
                [4,5,6]],
                
                [[7,8,9],
                [10,11,12]]]) # Shape : (2,2,3)

b = np.array([[[99,199,299]],[[88,188,288]]]) # Shape : (2,1,3)

np.concatenate([a,b],axis=1)

array([[[  1,   2,   3],
        [  4,   5,   6],
        [ 99, 199, 299]],

       [[  7,   8,   9],
        [ 10,  11,  12],
        [ 88, 188, 288]]])

# Splitting
We use `array_split` to split an array into `n` number of parts.

**Use Case** Splitting can be used to divide a dataset so that you can work or experiment on various parts of the data to understand its behaviour.

In [30]:
a = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(a, 3)

print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


As you can see above, `array_split` returns a list that contains `3` new arrays, each holding one part of `a`.
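`array_split` also works when the array cannot be divided evenly; the earlier parts simply receive one extra element. By contrast, `np.split` insists on an equal division. A small sketch with a 7-element array chosen for illustration:

```python
import numpy as np

a = np.arange(7)               # 7 elements cannot be split evenly into 3 parts

parts = np.array_split(a, 3)   # uneven split is allowed
sizes = [len(p) for p in parts]
print(parts)
print(sizes)

# np.split requires an equal division and raises an error otherwise
try:
    np.split(a, 3)
    strict_failed = False
except ValueError as err:
    strict_failed = True
    print("np.split failed:", err)
```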


# Random

The `numpy.random` module is responsible for generating arrays of random numbers. Each time you run the cells below, different numbers are generated.

## np.random.randn

The **randn** function generates an array of random floats of the given shape, drawn from a **standard normal distribution.**

In [31]:
import numpy as np

# Below (3,2) represents the shape of the array.
np.random.randn(3,2)

array([[ 0.34291505,  0.15846929],
       [-0.28041854,  1.37424051],
       [-1.00608324, -1.23477004]])

## np.random.randint
The **randint** function generates an array of random integers from a lower bound (included) up to an upper bound (excluded).


In [32]:
# Below, an array of shape (3,2) is generated with random integers from 1 to 9 (the upper bound 10 is excluded).
np.random.randint(1,10, size=(3,2))

array([[4, 7],
       [3, 6],
       [2, 9]])

## np.random.rand
The **rand** function generates an array of random floats of the given shape, drawn from a **uniform distribution** over the interval from 0 (included) to 1 (excluded).

In [33]:
# Below (2,4) represents the shape of the array.
np.random.rand(2,4)

array([[0.61846814, 0.38662633, 0.25613095, 0.5793173 ],
       [0.31091698, 0.79434679, 0.02547833, 0.11601057]])

## np.random.seed

The **seed** function fixes the starting state of the random number generator. Every **np.random** call made after it then produces the same sequence of values, so no matter how many times the cell is run, the generated numbers will not change. <br> Simply put, the random arrays generated are determined by the seed value.
<br><br> In the below code block, 
1. First we set the seed value to 5. 
2. Next, we generate an array of random integers from 1 to 9 (the upper bound 10 is excluded).
3. Now run the code block at least 2 times and you'll notice that the values are unchanged.
4. Now set a different seed value and run the code block at least 2 times to generate new random values.

In [34]:
np.random.seed(seed=5)

np.random.randint(1,10, size=(3,2))

array([[4, 7],
       [7, 1],
       [9, 5]])

You can try running the above code multiple times, and you will always get the same matrix. This is because we have explicitly specified a seed value. This is why a random number generator in computer programming is called a pseudo-random number generator: it does not generate numbers that are truly random.
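The reproducibility can also be checked in code rather than by re-running the cell. The sketch below resets the seed and compares two draws. The second half uses `np.random.default_rng`, a newer NumPy interface (available from NumPy 1.17) that is not covered in this tutorial and is included here only as an aside.

```python
import numpy as np

np.random.seed(5)
first = np.random.randint(1, 10, size=(3, 2))

np.random.seed(5)   # reset to the same seed
second = np.random.randint(1, 10, size=(3, 2))

print(np.array_equal(first, second))   # True: same seed, same numbers

# Aside: generator objects carry their own seed instead of a global one
rng_a = np.random.default_rng(5)
rng_b = np.random.default_rng(5)
same = np.array_equal(rng_a.integers(1, 10, size=(3, 2)),
                      rng_b.integers(1, 10, size=(3, 2)))
print(same)
```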

---
# Broadcasting

This is one of the most essential topics in numpy. The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations.

NumPy operations are usually done element by element, which requires two arrays to have exactly the same shape. However, broadcasting lets us relax this requirement in some cases. Let's say we want to multiply the elements of an array by the number 2. How would we do it?

In [35]:
a = np.array([1, 2, 3])
b = 2

print(a * b)

[2 4 6]


From the above result, we can infer that the scalar `b` is stretched during the arithmetic operation into an array with the same shape as `a`, so the shapes are compatible for element-by-element multiplication.
![Array Broadcasting image](https://numpy.org/doc/stable/_images/theory.broadcast_1.gif "Array Broadcasting image") [source](https://numpy.org/doc/stable/user/theory.broadcasting.html#array-broadcasting-in-numpy)


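Strictly speaking, element-by-element multiplication needs two arrays of the same shape, and `2` is not an array. This is why we say that NumPy conceptually creates an array from the `2` by stretching it, and then performs the element-by-element multiplication. The stretching is only a way to picture the operation; NumPy does not need to make actual copies of the scalar.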
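The stretching can be made visible with `np.broadcast_to`, which returns a view of a value broadcast to a given shape. This is only a sketch to illustrate the idea; the name `stretched` is ours.

```python
import numpy as np

a = np.array([1, 2, 3])
b = 2

# Show what the scalar looks like once stretched to the shape of `a`
stretched = np.broadcast_to(b, a.shape)
print(stretched)

# Multiplying by the scalar and by the stretched array gives the same result
print(np.array_equal(a * b, a * stretched))
```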

## The Broadcasting Rule

> In order to broadcast, the sizes of the corresponding axes of the two arrays, compared starting from the trailing (last) axis, must either be equal or one of them must be one.


Consider an example below,

Suppose there's a matrix `a` and you would like to scale its `1st column by 3 times, 2nd column by 4 times and 3rd column by 5 times`.

    a   =  [[1,2,3],
            [4,5,6],
            [7,8,9]]

    Shape of a = (3,3)

Now in order to scale the matrix `a` by the given values, we'll make another array `b`,

    b   =   [3, 4, 5]

    Shape of b = (3,)

Applying the broadcasting rule, comparing shapes from the trailing axis,
1. The trailing axes have the same size, (i.e) last axis of `a` = 3, last axis of `b` = 3.
2. `b` has no other axis, so NumPy treats the missing axis as having size `1` and stretches `b` across every row of `a`.


Now since the **Broadcasting Rule** is met, the broadcasting can happen.

    Output = [[3,8,15],
              [12,20,30],
              [21,32,45]]

In [36]:
a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9]])

b = np.array([3,4,5])

print(a*b)

[[ 3  8 15]
 [12 20 30]
 [21 32 45]]


---
**Case when broadcast fails** <br>
When the trailing dimensions of the arrays are unequal and neither of them is 1, broadcasting fails because it is impossible to align the values in the rows of the 1st array with the elements of the 2nd array for element-by-element addition.
![Broadcast Fail](https://numpy.org/doc/stable/_images/theory.broadcast_3.gif "Broadcast Fail")

[source](https://numpy.org/doc/stable/user/theory.broadcasting.html#array-broadcasting-in-numpy)
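A failing case can be reproduced with a few lines. The arrays below are made up for the demonstration: a (4,3) matrix and a 1D array of length 4, whose trailing sizes 3 and 4 do not match. Giving the second array a trailing axis of size 1 makes the shapes compatible.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)   # shape (4, 3)
b = np.array([1, 2, 3, 4])        # shape (4,)

try:
    a + b                          # trailing axes are 3 and 4: incompatible
    failed = False
except ValueError as err:
    failed = True
    print("Broadcast failed:", err)

# Shapes (4, 3) and (4, 1) are compatible: b now supplies one value per row
fixed = a + b.reshape(4, 1)
print(fixed)
```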