![AI_course.png](attachment:AI_course.png)

# Module_02, Lab_01, NumPy in Python
NumPy is one of the fundamental packages for scientific computing in Python. It contains functionality for multidimensional arrays, high-level mathematical functions such as linear algebra operations and the Fourier transform, and pseudorandom number generators. 

In [3]:
import numpy as np

In [5]:
x=np.array([[1,2,3],[4,5,6]])
x.shape

(2, 3)

In [6]:
x[:, 0]  # 0th column

array([1, 4])

In [7]:
x[:,1] # 1st column

array([2, 5])

In [8]:
x[0,:] # 0th row

array([1, 2, 3])

In [9]:
x[1,:] # 1st row

array([4, 5, 6])

In [12]:
x=np.ones((3,3))
x

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [13]:
x = np.arange(5)  # create an array with the values 0 through 4
x

array([0, 1, 2, 3, 4])

In [1]:
import numpy as np  # import the NumPy module
x = np.array([[1, 2, 3], [4, 5, 6]])  # create a two-dimensional array with 2 rows and 3 columns
print("x:\n{}".format(x))  # print the array

x:
[[1 2 3]
 [4 5 6]]


## Using Loop

In [2]:
# Create an array
a = np.array([1, 2, 3, 4, 5])

# Calculate the square of each element using a for loop
squared = []
for i in a:
    squared.append(i ** 2)

print("Squared values using a for loop:", squared)

Squared values using a for loop: [1, 4, 9, 16, 25]


## 1. Vectorization

### Vector Operation:

In [3]:
# Calculate the square of each element using vectorized operation
squared = a ** 2

print("Squared values using vectorized operation:", squared)

Squared values using vectorized operation: [ 1  4  9 16 25]


#### The output values are the same, but the vectorized operation is typically much faster than the loop, especially on large arrays.
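The speed difference can be checked with a rough timing sketch like the one below. The array size, the repeat count, and the helper names `square_with_loop` and `square_vectorized` are assumptions made for illustration, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

# Hypothetical larger array so that the difference is measurable
big = np.arange(10_000, dtype=np.int64)

def square_with_loop(values):
    """Square each element one at a time in a Python loop."""
    result = []
    for v in values:
        result.append(v ** 2)
    return result

def square_vectorized(values):
    """Square every element with a single NumPy operation."""
    return values ** 2

# Time 5 runs of each approach (numbers depend on your machine)
loop_seconds = timeit.timeit(lambda: square_with_loop(big), number=5)
vector_seconds = timeit.timeit(lambda: square_vectorized(big), number=5)

print("loop:      ", loop_seconds)
print("vectorized:", vector_seconds)

# Both approaches compute the same values
same_values = np.array_equal(np.array(square_with_loop(big)), square_vectorized(big))
print("same values:", same_values)
```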

## 2. Broadcasting
Fundamentally, it functions around one rule: two arrays can be broadcast against each other if, in every dimension, their sizes match or one of the sizes is 1. Shapes are compared dimension by dimension, starting from the last one.

Here’s a quick example. Array A has the shape (4, 1, 8), and array B has the shape (1, 6, 8).

In [4]:
import numpy as np

A = np.arange(32).reshape(4, 1, 8)

A


array([[[ 0,  1,  2,  3,  4,  5,  6,  7]],

       [[ 8,  9, 10, 11, 12, 13, 14, 15]],

       [[16, 17, 18, 19, 20, 21, 22, 23]],

       [[24, 25, 26, 27, 28, 29, 30, 31]]])

In [5]:
import numpy as np
B = np.arange(48).reshape(1,6,8)
B

array([[[ 0,  1,  2,  3,  4,  5,  6,  7],
        [ 8,  9, 10, 11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20, 21, 22, 23],
        [24, 25, 26, 27, 28, 29, 30, 31],
        [32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47]]])

A has 4 planes, each with 1 row and 8 columns. B has only 1 plane with 6 rows and 8 columns. Watch what NumPy does for you when you try to do a calculation between them!

Add the two arrays together:

In [6]:
A + B

array([[[ 0,  2,  4,  6,  8, 10, 12, 14],
        [ 8, 10, 12, 14, 16, 18, 20, 22],
        [16, 18, 20, 22, 24, 26, 28, 30],
        [24, 26, 28, 30, 32, 34, 36, 38],
        [32, 34, 36, 38, 40, 42, 44, 46],
        [40, 42, 44, 46, 48, 50, 52, 54]],

       [[ 8, 10, 12, 14, 16, 18, 20, 22],
        [16, 18, 20, 22, 24, 26, 28, 30],
        [24, 26, 28, 30, 32, 34, 36, 38],
        [32, 34, 36, 38, 40, 42, 44, 46],
        [40, 42, 44, 46, 48, 50, 52, 54],
        [48, 50, 52, 54, 56, 58, 60, 62]],

       [[16, 18, 20, 22, 24, 26, 28, 30],
        [24, 26, 28, 30, 32, 34, 36, 38],
        [32, 34, 36, 38, 40, 42, 44, 46],
        [40, 42, 44, 46, 48, 50, 52, 54],
        [48, 50, 52, 54, 56, 58, 60, 62],
        [56, 58, 60, 62, 64, 66, 68, 70]],

       [[24, 26, 28, 30, 32, 34, 36, 38],
        [32, 34, 36, 38, 40, 42, 44, 46],
        [40, 42, 44, 46, 48, 50, 52, 54],
        [48, 50, 52, 54, 56, 58, 60, 62],
        [56, 58, 60, 62, 64, 66, 68, 70],
        [64, 66, 68, 70, 72, 74, 76, 78]]])

Conceptually, NumPy duplicates the plane in B three times so that you have a total of four, matching the number of planes in A. It also duplicates the single row in A five times for a total of six, matching the number of rows in B. Then it adds each element in the newly expanded A array to its counterpart in the same location in B. The result of each calculation shows up in the corresponding location of the output, which has the shape (4, 6, 8). Internally, NumPy does not actually copy the data to do this.
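The expansion described above can be made explicit with `np.broadcast_to()`, which returns read-only views of the requested shape. This is only a sketch to confirm the behavior. The names `A_expanded` and `B_expanded` and the incompatible shapes `(4, 3)` and `(4, 2)` are chosen for illustration.

```python
import numpy as np

A = np.arange(32).reshape(4, 1, 8)
B = np.arange(48).reshape(1, 6, 8)

# Expand both arrays to the common shape explicitly (views, no data copied)
A_expanded = np.broadcast_to(A, (4, 6, 8))
B_expanded = np.broadcast_to(B, (4, 6, 8))

result = A + B
print(result.shape)  # (4, 6, 8)

# Adding the expanded arrays gives the same result as broadcasting
print(np.array_equal(result, A_expanded + B_expanded))  # True

# Shapes that differ in a dimension where neither size is 1 cannot be broadcast
try:
    np.ones((4, 3)) + np.ones((4, 2))
except ValueError as error:
    print("Cannot broadcast:", error)
```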

In [7]:
# Create a 2D NumPy array with ones on the diagonal and zeros everywhere else
eye = np.eye(4)  # 4 x 4 identity matrix
print("NumPy array:\n{}".format(eye))  # print the result


NumPy array:
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


# Data Science Operations: Filter, Order, Aggregate
This section walks through some examples of real, useful data science operations: filtering, sorting, and aggregating data.

### Indexing
Indexing uses many of the same idioms that normal Python code uses. You can use positive or negative indices to index from the front or back of the array. You can use a colon (:) to specify “the rest” or “all,” and you can even use two colons to skip elements as with regular Python lists.

Here’s the difference: NumPy arrays use commas between axes, so you can index multiple axes in one set of square brackets. An example is the easiest way to show this off. It’s time to confirm Dürer’s magic square!

The number square below has some amazing properties. If you add up any of the rows, columns, or diagonals, then you’ll get the same number, 34. That’s also what you’ll get if you add up each of the four quadrants, the center four squares, the four corner squares, or the four corner squares of any of the contained 3 × 3 grids. You’re going to prove it!

In [8]:
import numpy as np

square = np.array([
    [16, 3, 2, 13],
    [5, 10, 11, 8],
    [9, 6, 7, 12],
    [4, 15, 14, 1]
])

for i in range(4):
    assert square[:, i].sum() == 34
    assert square[i, :].sum() == 34


assert square[:2, :2].sum() == 34

assert square[2:, :2].sum() == 34

assert square[:2, 2:].sum() == 34

assert square[2:, 2:].sum() == 34

Inside the for loop, you verify that all the rows and all the columns add up to 34. After that, using selective indexing, you verify that each of the quadrants also adds up to 34.
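The remaining properties mentioned earlier (diagonals, center four squares, corner squares, and the corners of each contained 3 × 3 grid) can be checked in the same style. This is one possible sketch. It uses `np.trace()` and `np.fliplr()` for the diagonals and stepped slices for the corners, which is just one of several ways to select those elements.

```python
import numpy as np

square = np.array([
    [16, 3, 2, 13],
    [5, 10, 11, 8],
    [9, 6, 7, 12],
    [4, 15, 14, 1]
])

# Main diagonal and anti-diagonal
assert np.trace(square) == 34
assert np.trace(np.fliplr(square)) == 34

# Center four squares
assert square[1:3, 1:3].sum() == 34

# Four corner squares: rows 0 and 3, columns 0 and 3
assert square[::3, ::3].sum() == 34

# Corners of each of the four contained 3 x 3 grids
for row_start in range(2):
    for col_start in range(2):
        grid = square[row_start:row_start + 3, col_start:col_start + 3]
        assert grid[::2, ::2].sum() == 34

print("All checks passed")
```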

One last thing to note is that you’re able to take the sum of any array to add up all of its elements globally with square.sum(). This method can also take an axis argument to do an axis-wise summing instead.
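As a short illustration of the axis argument, the sketch below sums the same magic square globally, down each column (axis=0), and across each row (axis=1).

```python
import numpy as np

square = np.array([
    [16, 3, 2, 13],
    [5, 10, 11, 8],
    [9, 6, 7, 12],
    [4, 15, 14, 1]
])

total = square.sum()               # sum of all 16 elements
column_sums = square.sum(axis=0)   # one sum per column
row_sums = square.sum(axis=1)      # one sum per row

print(total)        # 136
print(column_sums)  # [34 34 34 34]
print(row_sums)     # [34 34 34 34]
```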

### Masking and Filtering
Index-based selection is great, but what if you want to filter your data based on more complicated nonuniform or nonsequential criteria? This is where the concept of a mask comes into play.

A mask is an array that has the exact same shape as your data, but instead of your values, it holds Boolean values: either True or False. You can use this mask array to index into your data array in nonlinear and complex ways. It will return all of the elements where the Boolean array has a True value.

Here’s an example showing the process, first in slow motion and then how it’s typically done, all in one line:

In [9]:
import numpy as np

numbers = np.linspace(5, 50, 24, dtype=int).reshape(4, -1)

numbers
mask = numbers % 4 == 0

mask

numbers[mask]

by_four = numbers[numbers % 4 == 0]

by_four

array([ 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48])

np.linspace() generates n numbers evenly distributed between a minimum and a maximum, which is useful for evenly distributed sampling in scientific plotting.
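A small sketch of `np.linspace()` on its own may help. The range from 0 to 1 and the count of 5 are arbitrary choices for illustration. The `endpoint` parameter controls whether the maximum is included.

```python
import numpy as np

# Five evenly spaced values from 0 to 1, both ends included
with_end = np.linspace(0, 1, 5)
print(with_end)  # values: 0.0, 0.25, 0.5, 0.75, 1.0

# Five evenly spaced values starting at 0, stopping before 1
without_end = np.linspace(0, 1, 5, endpoint=False)
print(without_end)  # values: 0.0, 0.2, 0.4, 0.6, 0.8

# With dtype=int, as in the example above, the fractional parts are dropped
as_int = np.linspace(5, 50, 24, dtype=int)
print(as_int[:4])  # [ 5  6  8 10]
```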

### Transposing, Sorting, and Concatenating
Other manipulations, while not quite as common as indexing or filtering, can also be very handy depending on the situation you’re in. You’ll see a few examples in this section.

Here’s transposing an array:

In [10]:
import numpy as np

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
])

a.T

array([[1, 3, 5],
       [2, 4, 6]])

In [11]:
a.transpose()

array([[1, 3, 5],
       [2, 4, 6]])

In [12]:
import numpy as np

data = np.array([
    [7, 1, 4],
    [8, 6, 5],
    [1, 2, 3]
])

np.sort(data)
np.sort(data, axis=None)
np.sort(data, axis=0)

array([[1, 1, 3],
       [7, 2, 4],
       [8, 6, 5]])
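Only the result of the last call is displayed above. The sketch below shows the three calls side by side: with no axis argument `np.sort()` sorts along the last axis (each row), with `axis=None` it flattens the array first, and with `axis=0` it sorts each column.

```python
import numpy as np

data = np.array([
    [7, 1, 4],
    [8, 6, 5],
    [1, 2, 3]
])

by_row = np.sort(data)             # default: sort within each row
flat = np.sort(data, axis=None)    # flatten, then sort everything
by_column = np.sort(data, axis=0)  # sort within each column

print(by_row)
# [[1 4 7]
#  [5 6 8]
#  [1 2 3]]
print(flat)
# [1 1 2 3 4 5 6 7 8]
print(by_column)
# [[1 1 3]
#  [7 2 4]
#  [8 6 5]]
```

Note that `np.sort()` returns a sorted copy and leaves `data` unchanged.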

Finally, here’s an example of concatenation. While there’s a np.concatenate() function, there are also a number of helper functions that are sometimes easier to read.

Here are some examples:

In [13]:
import numpy as np

a = np.array([
    [4, 8],
    [6, 1]
])

b = np.array([
    [3, 5],
    [7, 2],
])

np.hstack((a, b))

np.vstack((b, a))

np.concatenate((a, b))

np.concatenate((a, b), axis=None)

array([4, 8, 6, 1, 3, 5, 7, 2])

The first two calls show the slightly more intuitive functions hstack() and vstack(). The last two show the more generic concatenate(), first without an axis argument and then with axis=None. This flattening behavior is similar in form to what you just saw with sort().
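For 2D arrays like these, the helper functions can be related back to `np.concatenate()` through its axis argument. The sketch below shows the results and checks the equivalences. The variable names `side_by_side` and `stacked` are chosen for illustration.

```python
import numpy as np

a = np.array([
    [4, 8],
    [6, 1]
])

b = np.array([
    [3, 5],
    [7, 2],
])

side_by_side = np.hstack((a, b))
print(side_by_side)
# [[4 8 3 5]
#  [6 1 7 2]]

stacked = np.vstack((b, a))
print(stacked)
# [[3 5]
#  [7 2]
#  [4 8]
#  [6 1]]

# For 2D inputs, hstack joins along axis 1 and vstack joins along axis 0
print(np.array_equal(side_by_side, np.concatenate((a, b), axis=1)))  # True
print(np.array_equal(stacked, np.concatenate((b, a), axis=0)))       # True
```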

### Aggregating
Your last stop on this tour of functionality before diving into some more advanced topics and examples is aggregation. You’ve already seen the aggregating method .sum(), and others include .max(), .mean(), and .std(). You can reference NumPy’s larger library of functions to see more. Many of the mathematical and statistical functions use aggregation to help you reduce the number of dimensions in your data.
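A minimal sketch of these aggregating methods follows. The `scores` array is made-up data for illustration only.

```python
import numpy as np

# Hypothetical data: 2 rows, 3 columns
scores = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print(scores.sum())   # 21
print(scores.max())   # 6
print(scores.mean())  # 3.5
print(scores.std())   # population standard deviation, roughly 1.708

# Aggregating along an axis reduces the number of dimensions from 2 to 1
column_means = scores.mean(axis=0)
print(column_means)        # values: 2.5, 3.5, 4.5
print(column_means.shape)  # (3,)
```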

## Getting Into Shape: Array Shapes and Axes
There are a few concepts that are important to keep in mind, especially as you work with arrays in higher dimensions.

Vectors, which are one-dimensional arrays of numbers, are the least complicated to keep track of. Two dimensions aren’t too bad, either, because they’re similar to spreadsheets. But things start to get tricky at three dimensions, and visualizing four? Forget about it.

### 1. Mastering Shape
Shape is a key concept when you’re using multidimensional arrays. At a certain point, it’s easier to forget about visualizing the shape of your data and to instead follow some mental rules and trust NumPy to tell you the correct shape.

All arrays have a property called .shape that returns a tuple of the size in each dimension. It’s less important which dimension is which, but it’s critical that the arrays you pass to functions are in the shape that the functions expect. A common way to confirm that your data has the proper shape is to print the data and its shape until you’re sure everything is working like you expect.

In [14]:
import numpy as np
temperatures = np.array([
    29.3, 42.1, 18.8, 16.1, 38.0, 12.5,
    12.6, 49.9, 38.6, 31.3, 9.2, 22.2
]).reshape(2, 2, 3)
temperatures.shape
temperatures
np.swapaxes(temperatures, 1, 2)

array([[[29.3, 16.1],
        [42.1, 38. ],
        [18.8, 12.5]],

       [[12.6, 31.3],
        [49.9,  9.2],
        [38.6, 22.2]]])

Here, you use a numpy.ndarray method called .reshape() to form a 2 × 2 × 3 block of data. When you check the shape of your array with .shape, it’s exactly what you told it to be. However, you can see how printed arrays quickly become hard to visualize in three or more dimensions. After you swap axes with np.swapaxes(), it becomes little clearer which dimension is which.
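Since printed 3D arrays are hard to read, comparing shapes and individual elements is often an easier way to see what swapping axes does. The sketch below reuses the same data. The variable name `swapped` is introduced here for illustration.

```python
import numpy as np

temperatures = np.array([
    29.3, 42.1, 18.8, 16.1, 38.0, 12.5,
    12.6, 49.9, 38.6, 31.3, 9.2, 22.2
]).reshape(2, 2, 3)

swapped = np.swapaxes(temperatures, 1, 2)

print(temperatures.shape)  # (2, 2, 3)
print(swapped.shape)       # (2, 3, 2)

# The element at [i, j, k] in the original sits at [i, k, j] after the swap
print(temperatures[0, 1, 2], swapped[0, 2, 1])  # 12.5 12.5
```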

### 2. Understanding Axes
The example above shows how important it is to know not only what shape your data is in but also which data is in which axis. In NumPy arrays, axes are zero-indexed and identify which dimension is which. For example, a two-dimensional array has a vertical axis (axis 0) and a horizontal axis (axis 1). Lots of functions and commands in NumPy change their behavior based on which axis you tell them to process.

This example will show how .max() behaves by default, with no axis argument, and how it changes functionality depending on which axis you specify when you do supply an argument:

In [15]:
import numpy as np

table = np.array([
    [5, 3, 7, 1],
    [2, 6, 7, 9],
    [1, 1, 1, 1],
    [4, 3, 2, 0],
])
table

array([[5, 3, 7, 1],
       [2, 6, 7, 9],
       [1, 1, 1, 1],
       [4, 3, 2, 0]])

In [16]:
table.max()

9

In [17]:
table.max(axis=0)

array([5, 6, 7, 9])

In [18]:
table.max(axis=1)

array([7, 9, 1, 4])

By default, .max() returns the largest value in the entire array, no matter how many dimensions there are. However, once you specify an axis, it performs that calculation for each set of values along that particular axis. For example, with an argument of axis=0, .max() selects the maximum value in each of the four vertical sets of values in table and returns an array that has been flattened, or aggregated into a one-dimensional array.

In fact, many of NumPy’s functions behave this way: If no axis is specified, then they perform an operation on the entire dataset. Otherwise, they perform the operation in an axis-wise fashion.
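The same pattern holds in more than two dimensions. As a sketch, the example below applies .max() to the 2 × 2 × 3 `temperatures` array from the shape section. The axis you name is the one that disappears from the shape of the result.

```python
import numpy as np

temperatures = np.array([
    29.3, 42.1, 18.8, 16.1, 38.0, 12.5,
    12.6, 49.9, 38.6, 31.3, 9.2, 22.2
]).reshape(2, 2, 3)

# No axis: a single value for the entire array
overall_max = temperatures.max()
print(overall_max)  # 49.9

# axis=0: compare the two 2 x 3 planes element by element
max_over_planes = temperatures.max(axis=0)
print(max_over_planes.shape)  # (2, 3)

# axis=2: largest value within each row of three numbers
max_over_last = temperatures.max(axis=2)
print(max_over_last.shape)  # (2, 2)
```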