In [2]:
import numpy as np

## Initializing Arrays

NumPy offers various ways to create and initialize arrays. You may have learned about a few of them. 


Here are some problems to test your knowledge. 

- Start by creating a $(2 \times 3)$ array identical to

$$
\begin{bmatrix}
1 & 2 & 4 \\
7 & 13 & 21
\end{bmatrix}
$$

  and assign it to a variable `arr`.

In [None]:

# Write your code below

print(arr.shape) 
# Note that shape is an attribute of a NumPy array, not a method, so it is accessed without parentheses

The `shape` attribute of an array gives its size along each dimension. Checking the dimensions is often a useful debugging technique when working with complex code, as it helps ensure that arrays are shaped as expected.
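As a small illustration of this kind of shape check, the sketch below uses a made-up array `demo` (not part of the exercises) and inspects `shape` together with the related `ndim` and `size` attributes.

```python
import numpy as np

# `demo` is a hypothetical array used only for illustration
demo = np.zeros((2, 3, 4))

print(demo.shape)  # (2, 3, 4): length along each axis
print(demo.ndim)   # 3: number of axes
print(demo.size)   # 24: total number of elements

# A simple guard that fails early if an array is not shaped as expected
assert demo.shape == (2, 3, 4), f"unexpected shape: {demo.shape}"
```

A guard like the final `assert` can be placed before a computation that silently gives wrong results on mis-shaped input.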

- Create an array of size $(n \times m)$ with each value being equal to 7.

In [None]:
n = 5
m = 7

# Write your code below

Here are a few more exercises. Check the NumPy documentation if you get stuck.

In [None]:
# Create an array RANDOM_ARR of shape (4, 3, 4), initialized with random values

# Your code goes here

print(RANDOM_ARR)


# Create an identity matrix EYE of size n x n (check your output for different values of n)
n = 3

print(EYE)

You can also change the shape of an array using the `array.reshape()` method.
Here are a few problems based on it.

In [None]:
y = np.array([[1,2,3],[4,5,6]])

# Create a new array y_transpose which is the transpose of y.

# Code goes here


print(y_transpose)

# Now create an array y_flattened which is a 1D-array with the same elements as y.

# Code goes here

print(y_flattened)


# It turns out that there are much more direct ways to flatten or transpose an array in numpy. 

y_transpose = y.T
y_flattened = y.flatten()

print(y_transpose)
print(y_flattened)
# Check that the output is the same as before.
# Note: reshape alone can flatten y, but it cannot transpose it. y.reshape(3, 2) keeps the
# elements in row-major order, giving [[1, 2], [3, 4], [5, 6]] rather than y.T.

One handy feature of `reshape` is that you need not specify the size of every dimension. You can pass `-1` for one dimension, and NumPy will calculate it from the total number of elements and the sizes of the other dimensions. Still confused? Here's an example.

In [6]:
arr = np.random.randint(1,10,size=(4,5,7,3))

# Let's say I want to reshape this array into a row vector.
# The new NumPy array must have 2 dimensions, and the first of them is 1 (since it's a row vector).
# This means I can skip the size of the other dimension (NumPy can calculate it on its own) and write -1 instead

arr_new = arr.reshape(1,-1)
print(arr_new.shape)

(1, 420)
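To see how the position of `-1` decides the result, here is a minimal sketch on a small made-up array `values`. Putting `-1` second gives a single row, and putting it first gives a single column.

```python
import numpy as np

values = np.arange(6)          # hypothetical 1D array: [0 1 2 3 4 5]

row = values.reshape(1, -1)    # one row, NumPy infers 6 columns
col = values.reshape(-1, 1)    # NumPy infers 6 rows, one column

print(row.shape)  # (1, 6)
print(col.shape)  # (6, 1)
```

Only one dimension may be given as `-1`, because NumPy can infer at most one unknown size.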


In [None]:
# What's the easiest way to create an array with shape (3,4) that is identical to
# the array in the statement below?
# np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
arr = 

# numpy.ndarray is a homogeneous array, which means every array has a
# particular data type. What datatype does "arr" have?
print(arr.dtype)

In [8]:
# Another very useful function is numpy.linspace that generates an array of equally spaced numbers over any given interval.

start = 0.1
end =   0.9
np.linspace(start,end,num=10,endpoint=True,axis=0)

array([0.1       , 0.18888889, 0.27777778, 0.36666667, 0.45555556,
       0.54444444, 0.63333333, 0.72222222, 0.81111111, 0.9       ])
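The `endpoint` argument controls whether the end of the interval is itself one of the samples. The following sketch, using an arbitrary interval from 0 to 1, compares the two settings.

```python
import numpy as np

# With endpoint=True (the default) the stop value is the last sample
with_end = np.linspace(0.0, 1.0, num=5)

# With endpoint=False the stop value is excluded, so the spacing shrinks
without_end = np.linspace(0.0, 1.0, num=5, endpoint=False)

print(with_end)     # [0.   0.25 0.5  0.75 1.  ]
print(without_end)  # [0.  0.2 0.4 0.6 0.8]
```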

## Slicing

In [None]:
# Like a regular list, a NumPy array can be sliced using start:stop:step
# Note that indexing starts at 0 (as in C), and that the stop index is excluded

arr1 = np.array([1,2,3,4,5])

print(arr1)
print(arr1[0:5])
print(arr1[0:4])
print(arr1[0:4:1])
print(arr1[0:4:2])
print(arr1[::-1])

In [None]:
# Unlike nested Python lists, NumPy arrays can be sliced along several dimensions at once.
arr = np.arange(40).reshape(4,-1)
print(arr)
print(arr[1:4,2:5])

# Try changing the above values to get a better understanding of slicing in numpy arrays 

You can learn about slicing [here](https://www.programiz.com/python-programming/numpy/array-slicing)

## Copies

In [None]:
# Try running this piece of code. Is the output the same as the one you expected?

arr = np.random.randint(10,size=6) # size can be an integer (for a 1D array) or a tuple (for multi-dimensional arrays)
print(arr)

new_arr = arr

arr[0] = 100
print(arr)
print(new_arr)



In [None]:
# When you assign an array to a new variable in Python (using =), it does not create a copy of the original array. 
# Instead, it creates a reference to the same memory location. 
# This means that any changes made to the new variable (new_arr in this case) will also affect the original variable (arr), as they both point to the same data.


# The interesting question now is how do you create a copy of an array which is completely independent of the original array.
# Luckily, numpy provides a function called copy() which does exactly that.

arr = np.random.randint(10,size=6)
print(arr)

new_arr = arr.copy()
arr[0] = -39
print(arr)
print(new_arr)
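One way to check whether two names refer to the same data is `np.shares_memory`. The sketch below uses made-up names (`original`, `alias`, `independent`) to contrast plain assignment with `copy()`.

```python
import numpy as np

original = np.arange(5)        # hypothetical example array
alias = original               # same object, no data copied
independent = original.copy()  # new array with its own data

print(alias is original)                        # True
print(np.shares_memory(original, alias))        # True
print(np.shares_memory(original, independent))  # False

original[0] = 99
print(alias[0])        # 99: the alias sees the change
print(independent[0])  # 0: the copy does not
```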

## Random Number Generation

Random number generation is an important tool in data science. For example, it is used to generate random events during simulations.

In [None]:
# Create an array of shape (3,4) containing random numbers from 0 to 1

# Your code goes here

print(arr)

# Try running the above code multiple times. Do you see the same output every time?
# Can you keep the output the same every time you run the code?

In [None]:
# You see different outputs every time because the random number generator starts from a different seed on each run.
# The seed is the starting point for the random number generator algorithm and its value determines the sequence of random numbers generated.
# By default, no seed is set, so NumPy initializes the generator with fresh entropy from the operating system.

# But you can bypass this behaviour. You can set the seed to a specific value using the np.random.seed() function

np.random.seed(31) # The seed can be set to any integer between 0 and 2**32 - 1


# Copy the code from the previous cell here and run this cell multiple times. What do you observe now?
print(arr)


In [None]:
# Create an array of shape (2,4) which contains random integers from the range [5,12]
# Hint: Use np.random.randint(), and note that its upper bound is exclusive

The `numpy.random` module provides many other features too, such as shuffling an array or choosing a random value from one.


It also allows you to sample numbers from various other distributions, such as the normal and binomial distributions.


You can learn more about it [here](https://numpy.org/doc/stable/reference/random/legacy.html)
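As a rough sketch of these features, the example below shuffles a made-up array `deck`, picks one element from it, and draws a few normally distributed samples. The seed value 0 is an arbitrary choice; it is reset halfway through to show that the same seed reproduces the same shuffle.

```python
import numpy as np

np.random.seed(0)                 # fix the seed (0 is an arbitrary choice)
deck = np.arange(10)
np.random.shuffle(deck)           # shuffles the array in place
pick = np.random.choice(deck)     # one element chosen uniformly at random
samples = np.random.normal(loc=0.0, scale=1.0, size=3)  # normal draws

np.random.seed(0)                 # re-seeding restarts the same sequence
deck_again = np.arange(10)
np.random.shuffle(deck_again)

print(np.array_equal(deck, deck_again))  # True
```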

## Aggregation Functions

In [21]:
arr = np.arange(10).reshape(2,5)

print(arr.sum())

45


In [None]:
# Similar to .sum(), many other aggregation methods can be used on NumPy arrays. A few of them are listed below

print(arr.mean())
print(arr.std())
print(arr.min())

# Run this cell and check the output

In [31]:
# Things get a little interesting when you want to use these operations on Multi-Dimensional arrays.

arr = np.array([[1,2,9],
                [3,4,5],
                [-3,8,7]])
print(arr,"\n\n")

print(arr.max(axis=0),"\n\n")

print(arr.max(axis=1),"\n\n")

# In the case of multi-dimensional arrays, the above functions can take an additional parameter 'axis'
# to define the dimension along which the operation is to be performed.

# You can also run the above functions without the axis parameter to get the result for the entire array as if it were a 1D array.

print(arr.max(),arr.min(),arr.mean())

[[ 1  2  9]
 [ 3  4  5]
 [-3  8  7]] 


[3 8 9] 


[9 5 8] 


9 -3 4.0
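A useful way to read the `axis` parameter is that it names the dimension that gets collapsed. The following sketch applies `sum` to a small made-up array `grid` to show this.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

col_sums = grid.sum(axis=0)   # collapse the rows: one value per column
row_sums = grid.sum(axis=1)   # collapse the columns: one value per row

print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
print(col_sums.shape, row_sums.shape)  # (3,) (2,)
```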


## Broadcasting

NumPy is much faster than equivalent matrix operations written in pure Python. Here are a few reasons:

1. It runs compiled C code instead of interpreted Python loops, which are quite slow.


2. All elements in a NumPy array have the same data type, which is not the case with Python lists or tuples.


3. It stores the elements in one contiguous block of memory, even if the array has multiple dimensions.
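The speed difference can be measured with the standard `timeit` module. The sketch below is only a rough comparison on made-up data: the function names and the array size are illustrative choices, and the actual timings depend on the machine.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # int64 avoids overflow when squaring

def python_sum_of_squares():
    # Pure Python loop: every iteration is interpreted
    total = 0
    for x in py_list:
        total += x * x
    return total

def numpy_sum_of_squares():
    # Vectorized: the loop runs inside compiled code
    return int((np_arr * np_arr).sum())

py_time = timeit.timeit(python_sum_of_squares, number=5)
np_time = timeit.timeit(numpy_sum_of_squares, number=5)

print(f"pure Python: {py_time:.4f} s")
print(f"NumPy:       {np_time:.4f} s")
```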

The true strength of NumPy lies in its ability to seamlessly perform arithmetic operations between arrays, between arrays and scalars, and more. The mechanism that lets arrays of different shapes be combined in this way is called broadcasting.

For a detailed explanation of broadcasting, check out the following link: [Broadcasting in NumPy](https://numpy.org/doc/stable/user/basics.broadcasting.html).

In [None]:
# Let's see what happens when we multiply an array with
# 1) a scalar
# 2) an array of same shape

arr1 = np.array([1,2,3])
arr2 = np.array([4,5,6])
print(arr1,"\n-----")

print(arr1*2,"\n-----")  # Scalar multiplication
print(arr1*arr2)           # Element-wise multiplication


# Observe the outputs of the code above.
# Now think of a way to create arr_square which is the square of each element of arr.

arr = np.random.randint(12,size=3)

print(arr)

# Your code goes here

arr_square = 

print(arr_square)

In [None]:
# Just like the multiplication operator, you can use the addition, subtraction, division, and exponentiation operators with constants on numpy arrays as well.

arr = np.arange(12).reshape(3,4)

print(arr)
print(arr+1)
print(arr-1)
print(arr*2)
print(arr/2)
print(arr//2)
print(arr**3)

In [None]:
# arr is the (3,4) array defined in the previous cell
arr1 = np.array([1,2,3,4])
print(arr*arr1)


arr2 = np.array([7,2,5,8])
print(arr2*arr)

### When performing operations between arrays of different shapes, NumPy applies the following rules:

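NumPy compares the shapes axis by axis, starting from the rightmost axis. If one array has fewer dimensions, its shape is treated as if it were padded with 1s on the left. For each pair of axes:
1. If the sizes are equal, or one of them is 1, the axes are compatible. An axis of size 1 is stretched to match the other array's size.
2. If the sizes differ and neither is 1, broadcasting fails, raising a `ValueError`.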
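These rules can be sketched on a few small arrays. The shapes below are made up for illustration and differ from the ones used in the exercises that follow.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Both size-1 axes are stretched, giving a (3, 4) result
total = col + row
print(total.shape)  # (3, 4)

# Shapes (2, 3) and (2,) clash on the rightmost axis: 3 vs 2, and neither is 1
a = np.ones((2, 3))
b = np.ones(2)
try:
    a + b
except ValueError as err:
    print("broadcasting failed:", err)
```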

In [None]:
# Try to figure out what happens in the following example
arr1 = np.arange(12).reshape(3,4)
arr2 = np.arange(4)
print(arr1,"\n")
print(arr2,"\n")

print(arr1+arr2)

In [None]:
# Can you explain why the following code throws an error?
arr3 = np.arange(12).reshape(3,4)
arr4 = np.arange(3)
print(arr3,"\n")
print(arr4,"\n")

print(arr3+arr4)