> This notebook is inspired by [1-python-numpy-tutorial](https://colab.research.google.com/github/ffund/ml-notebooks/blob/master/notebooks/1-python-numpy-tutorial.ipynb)

<img src="../assets/Header_NovaAi_Camp2.0.png" alt="NovaAi Camp 2.0 Header" width="100%">

# NumPy for Machine Learning

## Why NumPy for Machine Learning?

Behind tools such as ChatGPT, face recognition, and Netflix recommendations is the kind of array math that NumPy provides. Here's why it matters:

🧠 **Neural Networks = Matrix Math**  
Deep learning models are built on matrix multiplications and related operations that NumPy handles efficiently.

🖼️ **Images = 3D Arrays**  
A color image can be stored as a NumPy array of shape `(height, width, channels)`. Computer vision starts here.

📊 **Datasets = 2D Arrays**  
Tabular machine learning data is typically stored with shape `(samples, features)`, a natural fit for NumPy.

⚡ **Speed Matters**  
Vectorized NumPy operations are often **10-100x faster** than equivalent loops over Python lists. When training models on millions of data points, this is critical.

## What You'll Learn

In this lab, you'll master NumPy operations that you'll use **every single day** in ML/AI:

✅ **Represent ML data** - Images, datasets, batches  
✅ **Manipulate data** - Slicing, filtering, selecting samples  
✅ **Transform data** - Normalization, standardization, reshaping  
✅ **Understand ML operations** - Matrix multiplication, broadcasting, aggregations  
✅ **Prepare data for models** - The exact operations used before feeding data to neural networks

By the end, you'll understand what's happening inside TensorFlow and PyTorch!

## How to Use This Notebook

💡 **Run each cell** - Click the play button (▶) on the left of each code cell  
🤔 **Predict first** - Try to guess the output before running  
🔧 **Experiment** - Modify the code and see what happens  
📊 **Visualize** - We've included diagrams to help you see what's happening

Let's dive in! 🚀

---
## Part 1: NumPy Arrays - How ML Represents Data

Before neural networks can process data, everything must be converted to numeric arrays, and in Python that usually means **NumPy arrays** (or tensors that behave much like them):

- Text → Numbers → Arrays
- Images → Pixel values → Arrays  
- Tabular data → Feature vectors → Arrays

NumPy provides the **high-performance array object** that makes ML possible. It's fast, memory-efficient, and designed for the math operations ML needs.

### Importing NumPy

By convention, we import NumPy as `np`. You'll see this in every ML project:

In [None]:
import numpy as np

### Creating Arrays - The Building Blocks of ML

A **NumPy array** is like a container that holds data in a grid format. Think of it as:

- **1D array** → A list of features for one sample: `[age, income, credit_score]`
- **2D array** → A dataset with multiple samples: Each row is one person, each column is a feature
- **3D array** → An image: `(height, width, RGB_channels)`
- **4D array** → A batch of images: `(batch_size, height, width, RGB)`

**Key concepts:**
- **Shape**: The dimensions of the array `(rows, columns, ...)`
- **Rank**: Number of dimensions (1D, 2D, 3D, etc.)
- **Type** (`dtype`): All elements share one data type (for example, all `int64` or all `float64`)
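
The three concepts above map onto the array attributes `shape`, `ndim`, and `dtype`. The following is a minimal sketch; the variable names and values are made up for illustration.

```python
import numpy as np

# Hypothetical sample: three features for one person
sample = np.array([25, 175, 70])

# Hypothetical dataset: 2 samples x 3 features, stored as floats
dataset = np.array([[25.0, 175.0, 70.0],
                    [30.0, 180.0, 82.5]])

# shape = size along each dimension, ndim = rank, dtype = element type
print(sample.shape, sample.ndim, sample.dtype)
print(dataset.shape, dataset.ndim, dataset.dtype)

# Mixing integers and floats upcasts every element to one common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```

The integer `dtype` printed for `sample` can differ between platforms and NumPy versions, which is one reason to set `dtype` explicitly when it matters.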

#### Creating a 1D Array - Single Sample

Let's create an array representing one person's features:

In [None]:
# This could represent one person's features: [age, height, weight]
features = np.array([25, 175, 70])
print("Features:", features)
print("Shape:", features.shape)  # (3,) means 1D array with 3 elements

The figure below shows how a 1D array is created:

![](http://jalammar.github.io/images/numpy/create-numpy-array-1.png)

In [None]:
# You can access individual elements like a list
print("Age:", features[0])
print("Height:", features[1])
print("Weight:", features[2])

# And modify them
features[0] = 26  # Birthday! Age increased
print("\nUpdated features:", features)

#### Creating 2D Arrays - Multiple Samples (A Dataset!)

In ML, we usually work with **multiple samples** at once. A 2D array is perfect for this:

![](http://jalammar.github.io/images/numpy/numpy-array-create-2d.png)

For 3D arrays (like images), we add depth:

![](http://jalammar.github.io/images/numpy/numpy-3d-array.png)

In [None]:
# A mini-dataset: 2 people, 2 features each
# Row 1: Person 1 [age=25, income=50000]
# Row 2: Person 2 [age=30, income=60000]
dataset = np.array([[25, 50000], 
                     [30, 60000]])
print(dataset)

In [None]:
print("Shape:", dataset.shape)  # (2, 2) means 2 rows, 2 columns
print("This is a dataset with", dataset.shape[0], "samples and", dataset.shape[1], "features")

#### Array Creation Functions - Essential for ML

In ML, we often need to create arrays filled with specific values. NumPy has handy functions for this:

![](http://jalammar.github.io/images/numpy/create-numpy-array-ones-zeros-random.png)

**When do we use these in ML?**
- `zeros()` → Initialize weights, create empty batches
- `ones()` → Create bias vectors, masks
- `random.random()` → Initialize neural network weights with random values

These functions work with any shape - 1D, 2D, 3D, even 4D for batches of images!

![](http://jalammar.github.io/images/numpy/numpy-matrix-ones-zeros-random.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array-creation.png)

#### Examples: Creating Arrays for ML Tasks

In [None]:
# Initialize an empty batch of data (we'll fill it later)
empty_batch = np.zeros((32, 10))  # 32 samples, 10 features each
print("Empty batch shape:", empty_batch.shape)
print(empty_batch)

In [None]:
# Create a bias vector (commonly used in neural networks)
bias = np.ones(5)  # 5 bias values, all initialized to 1
print("Bias vector:", bias)

In [None]:
# Fill an array with a specific value
labels = np.full((10,), -1)  # 10 labels, all marked as -1 (unlabeled)
print("Labels:", labels)

In [None]:
# Identity matrix (useful in linear algebra for ML)
identity = np.eye(3)  # 3x3 identity matrix
print("Identity matrix:\n", identity)

In [None]:
# Random initialization (THIS IS HOW NEURAL NETWORKS START!)
# Initialize weights with small random values
weights = np.random.random((3, 2))  # 3 inputs, 2 outputs
print("Random weights:\n", weights)
print("\nThese random values will be adjusted during training!")

#### 🎯 ML Example: Creating a Mini-Batch of Images

In computer vision, images are represented as 3D arrays. Let's create a batch:

In [None]:
# Create a batch of 32 grayscale images, each 28x28 pixels (like MNIST digits)
batch_images = np.zeros((32, 28, 28))

print(f"Batch shape: {batch_images.shape}")
print(f"This represents:")
print(f"  - {batch_images.shape[0]} images")
print(f"  - Each image is {batch_images.shape[1]}×{batch_images.shape[2]} pixels")
print(f"  - Total elements: {batch_images.size:,}")

In [None]:
# For RGB color images, we need 4D: (batch, height, width, channels)
rgb_batch = np.zeros((32, 224, 224, 3))

print(f"RGB batch shape: {rgb_batch.shape}")
print(f"This is {rgb_batch.shape[0]} color images of size {rgb_batch.shape[1]}×{rgb_batch.shape[2]} with {rgb_batch.shape[3]} color channels (R,G,B)")

#### Creating Sequences - Useful for Labels and Ranges

**`arange(start, stop, step)`** - Like Python's range, but returns a NumPy array  
**`linspace(start, stop, num)`** - Creates evenly spaced values

**ML uses:**
- `arange` → Create class labels [0, 1, 2, 3, 4, ...]
- `linspace` → Create learning rate schedules, plot ranges

In [None]:
# Create a range of values
values = np.arange(0, 10, 2)  # Start at 0, stop before 10, step by 2
print("Range:", values)

# ML example: Create class labels for 10 classes
class_labels = np.arange(10)
print("Class labels:", class_labels)

**Note:** `arange` doesn't include the stop value (just like Python's `range`).

In [None]:
# Create evenly spaced values (INCLUDES both endpoints)
evenly_spaced = np.linspace(0, 1, num=5)
print("Evenly spaced:", evenly_spaced)

# ML example: Create a learning rate decay schedule
learning_rates = np.linspace(0.1, 0.001, num=10)
print("\nLearning rate schedule:", learning_rates)

#### Stacking Arrays - Combining Data

In ML, we often need to combine datasets or features:

**`vstack`** (vertical) → Stack rows (add more samples)  
**`hstack`** (horizontal) → Stack columns (add more features)

In [None]:
# Stack vertically - add more samples to a dataset
batch1 = np.array([1, 2, 3])
batch2 = np.array([4, 5, 6])
combined = np.vstack((batch1, batch2))
print("Stacked vertically (more samples):\n", combined)

In [None]:
# Stack horizontally - add more features to each sample
ages = np.array([[25], [30], [35]])
incomes = np.array([[50000], [60000], [70000]])
dataset_with_features = np.hstack((ages, incomes))
print("Stacked horizontally (more features):\n", dataset_with_features)

---
## Part 2: Array Indexing - Selecting and Manipulating ML Data

**Indexing is essential in ML for:**
- Splitting data into training/validation/test sets
- Selecting specific samples or features
- Extracting image regions (cropping)
- Filtering data based on conditions
- Accessing predictions for specific classes

NumPy offers powerful ways to index and slice arrays!

### Slicing Arrays - Like Slicing a Dataset

You can slice NumPy arrays just like Python lists, but in multiple dimensions:

![](http://jalammar.github.io/images/numpy/numpy-array-slice.png)

For 2D arrays (datasets), you can slice rows AND columns:

![](http://jalammar.github.io/images/numpy/numpy-matrix-indexing.png)

**Syntax:** `array[row_slice, column_slice]`

In [None]:
# Create a sample dataset
# Think of this as: 3 samples, 4 features each
dataset = np.array([[1,  2,  3,  4], 
                     [5,  6,  7,  8], 
                     [9, 10, 11, 12]])

print("Full dataset:")
print(dataset)

# Select first 2 samples (rows), features 1 and 2 (columns index 1:3)
subset = dataset[:2, 1:3]
print("\nFirst 2 samples, features 1-2:")
print(subset)

⚠️ **Important:** Slices are **views**, not copies! Modifying a slice modifies the original array.

In [None]:
# Demonstrate that slices are views
print("Original value:", dataset[0, 1])

subset[0, 0] = 99  # Modify the slice

print("After modifying slice:", dataset[0, 1])  # Original array changed too!
print("\nThis is memory-efficient but be careful!")
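
When an independent slice is needed, an explicit `.copy()` avoids writing through to the original. The sketch below uses a made-up array and `np.shares_memory` to check which arrays share data.

```python
import numpy as np

original = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8]])

view = original[:, 1:3]                 # slicing returns a view
independent = original[:, 1:3].copy()   # an explicit copy owns its own data

independent[0, 0] = -1   # does not touch `original`
view[1, 1] = 99          # writes through to original[1, 2]

print(original)
print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
```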

#### 🎯 ML Example: Splitting Data into Train/Test Sets

In [None]:
# Create a dataset: 100 samples, 5 features each
data = np.random.rand(100, 5)
labels = np.random.randint(0, 2, 100)  # Binary labels (0 or 1)

# Split: 80% training, 20% testing
split_point = 80

X_train = data[:split_point]      # First 80 samples
y_train = labels[:split_point]

X_test = data[split_point:]       # Last 20 samples  
y_test = labels[split_point:]

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"Features per sample: {X_train.shape[1]}")
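
The split above keeps samples in their original order. If the data happens to be sorted (for example by label), shuffling before the split is usually safer. Here is one possible sketch, not the only way to do it; the seed and variable names are arbitrary choices, and it indexes with an array of indices, a technique covered later in this part.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for reproducibility

data = rng.random((100, 5))
labels = rng.integers(0, 2, size=100)  # binary labels (0 or 1)

# Shuffle the sample indices once, then apply the same order to data and labels
order = rng.permutation(len(data))
split_point = (len(data) * 4) // 5     # 80% of the samples

train_idx, test_idx = order[:split_point], order[split_point:]
X_train, y_train = data[train_idx], labels[train_idx]
X_test, y_test = data[test_idx], labels[test_idx]

print(X_train.shape, X_test.shape)  # (80, 5) (20, 5)
```

Using one shared index array keeps every row of `data` paired with its own label after shuffling.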

You can also mix integer indexing with slice indexing. However, doing so will yield an array of lower rank than the original array. Note that this is quite different from the way that MATLAB handles array slicing:

In [None]:
# Create the following rank 2 array with shape (3, 4)
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

Two ways of accessing the data in the middle row of the array. Mixing integer indexing with slices yields an array of lower rank, while using only slices yields an array of the same rank as the original array:

In [None]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
row_r3 = a[[1], :]  # Rank 2 copy of the second row of a (integer-array indexing copies)
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)
print(row_r3, row_r3.shape)

In [None]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print()
print(col_r2, col_r2.shape)

Integer array indexing: When you index into numpy arrays using slicing, the resulting array view will always be a subarray of the original array. In contrast, integer array indexing allows you to construct arbitrary arrays using the data from another array. Here is an example:

In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])

# An example of integer array indexing.
# The returned array has shape (3,) and contains [1 4 5]
print(a[[0, 1, 2], [0, 1, 0]])

# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))

In [None]:
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))

One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

In [None]:
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
print(a)

In [None]:
# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"

In [None]:
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10
print(a)
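
The same one-element-per-row trick shows up in classification losses, where we need the predicted probability of the true class for each sample. The sketch below uses made-up probabilities and labels.

```python
import numpy as np

# Hypothetical predicted probabilities: 4 samples x 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.5, 0.25, 0.25]])
true_labels = np.array([0, 1, 2, 1])  # hypothetical ground-truth classes

# Pick probs[i, true_labels[i]] for every row i
rows = np.arange(len(true_labels))
true_class_probs = probs[rows, true_labels]
print(true_class_probs)  # values 0.7, 0.8, 0.4, 0.25

# Average negative log-likelihood (cross-entropy) over the batch
loss = -np.mean(np.log(true_class_probs))
print(loss)
```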

Boolean array indexing: Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition. Here is an example:

In [None]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)

In [None]:
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])

# We can do all of the above in a single concise statement:
print(a[a > 2])
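
In ML, boolean masks are commonly used to keep whole samples that match a label or a feature condition. The sketch below uses a small made-up dataset `X` with labels `y`.

```python
import numpy as np

# Hypothetical dataset: 6 samples x 2 features, with binary labels
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0],
              [9.0, 10.0],
              [11.0, 12.0]])
y = np.array([0, 1, 0, 1, 1, 0])

# A 1D boolean mask over the rows keeps entire samples
positives = X[y == 1]
print(positives.shape)  # (3, 2)

# Conditions combine with & (and), | (or), ~ (not); the parentheses are required
mask = (y == 1) & (X[:, 0] > 4)
print(X[mask])
```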

When working with numpy arrays, it’s often helpful to get the *indices* (not only the values) of array elements that meet certain conditions. There are a few numpy functions that you’ll definitely want to remember:

-   [`argmax`](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html) (get index of maximum element in array)
-   [`argmin`](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html) (get index of minimum element in array)
-   [`argsort`](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html) (get sorted list of indices, by element value, in ascending order)
-   [`where`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) (get indices of elements that meet some condition)

In [None]:
a = np.array([1, 8, 9, -3, 2, 4, 7, 9])

# Get the index of the maximum element in a
# (this array has two elements with the maximum value -
# only the index of the first one is returned)
print(np.argmax(a))

# Get the index of the minimum element in a
print(np.argmin(a))

# Get sorted list of indices
print(np.argsort(a))

# Get sorted list of indices in descending order
# [::-1] is a special slicing index that returns the reversed list
print(np.argsort(a)[::-1])

# Get indices of elements that meet some condition
# this returns a tuple, the list of indices is the first entry
# so we use [0] to get it
print(np.where(a > 5)[0])

# Get indices of elements that meet some condition
# this example shows how to get the index of *all* the max values
print(np.where(a >= a[np.argmax(a)])[0])
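
These functions also accept an `axis` argument, which is how class predictions are typically read off a matrix of per-class scores. The sketch below uses made-up scores for 3 samples and 4 classes.

```python
import numpy as np

# Hypothetical model outputs: 3 samples x 4 class scores
scores = np.array([[0.1, 0.6, 0.2, 0.1],
                   [0.5, 0.1, 0.3, 0.1],
                   [0.2, 0.25, 0.05, 0.5]])

# Predicted class = index of the highest score in each row
predicted = np.argmax(scores, axis=1)
print(predicted)  # [1 0 3]

# Top-2 classes per sample, best first:
# sort each row ascending, reverse the columns, keep the first two
top2 = np.argsort(scores, axis=1)[:, ::-1][:, :2]
print(top2)
```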

For brevity we have left out a lot of details about numpy array indexing; if you want to know more you should read the documentation.

### Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [None]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).
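
An existing array can be converted with `astype()`. A common case is turning 8-bit image pixels into floats before scaling. The sketch below uses a made-up 2x2 "image".

```python
import numpy as np

# Hypothetical 2x2 grayscale image with 8-bit pixel values (0-255)
pixels = np.array([[0, 64],
                   [128, 255]], dtype=np.uint8)

# Convert to float32 first, then scale into the range [0, 1]
scaled = pixels.astype(np.float32) / 255.0
print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0

# Fixed-width integer arrays wrap around on overflow: 255 + 10 becomes 9
print(pixels + np.uint8(10))
```

The wrap-around in the last line is why pixel data is usually converted to a float type before doing arithmetic on it.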

### Array math

What makes working with `numpy` so powerful and convenient is that it comes with many *vectorized* math functions for computation over elements of an array. These functions are highly optimized and are *very* fast - much, much faster than using an explicit `for` loop.

For example, let’s create a large array of random values and then sum it both ways. We’ll use a `%%time` *cell magic* to time them.

In [None]:
a = np.random.random(100000000)

In [None]:
%%time
x = np.sum(a)

In [None]:
%%time
x = 0
for element in a:
  x = x + element

Look at the “Wall Time” in the output - note how much faster the vectorized version of the operation is! This type of fast computation is a major enabler of machine learning, which requires a *lot* of computation.

Whenever possible, we will try to use these vectorized operations.
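
The `%%time` magic only works inside IPython or Jupyter. In a plain Python script, a rough equivalent can be sketched with `time.perf_counter`. The array here is deliberately much smaller than the one above so that the loop finishes quickly, and the timings will vary from machine to machine.

```python
import time
import numpy as np

values = np.random.random(100_000)  # smaller array so the Python loop stays fast

# Time an explicit Python loop
start = time.perf_counter()
loop_total = 0.0
for element in values:
    loop_total += element
loop_seconds = time.perf_counter() - start

# Time the vectorized sum
start = time.perf_counter()
vector_total = np.sum(values)
vector_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.6f} s, np.sum: {vector_seconds:.6f} s")
print("results agree:", np.isclose(loop_total, vector_total))
```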

Some mathematical functions are available both as operator overloads and as functions in the numpy module.

For example, you can perform an elementwise sum on two arrays using either the + operator or the `add()` function.

![](http://jalammar.github.io/images/numpy/numpy-arrays-adding-1.png)

![](http://jalammar.github.io/images/numpy/numpy-matrix-arithmetic.png)

In [None]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.  8.]
#  [10. 12.]]
print(x + y)
print(np.add(x, y))

And this works for other operations as well, not only addition:

![](http://jalammar.github.io/images/numpy/numpy-array-subtract-multiply-divide.png)

In [None]:
# Elementwise difference; both produce the array
# [[-4. -4.]
#  [-4. -4.]]
print(x - y)
print(np.subtract(x, y))

In [None]:
# Elementwise product; both produce the array
# [[ 5. 12.]
#  [21. 32.]]
print(x * y)
print(np.multiply(x, y))

In [None]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

In [None]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

Note that unlike MATLAB, `*` is elementwise multiplication, not matrix multiplication. We instead use the `dot()` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. `dot()` is available both as a function in the numpy module and as an instance method of array objects:

![](http://jalammar.github.io/images/numpy/numpy-matrix-dot-product-1.png)

In [None]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

You can also use the `@` operator, which for 1D and 2D arrays like these gives the same result as `dot()`.

In [None]:
print(v @ w)

In [None]:
# Matrix / vector product; all three produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)

In [None]:
# Matrix / matrix product; all three produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)
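
Matrix multiplication is the core of a fully connected neural network layer. The following is a minimal sketch of one forward pass under made-up shapes: `X`, `W`, and `b` are illustrative names, and adding `b` to every row relies on broadcasting, which is covered in the Broadcasting section.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility

X = rng.random((4, 3))   # hypothetical batch: 4 samples, 3 features
W = rng.random((3, 2))   # hypothetical weights: 3 inputs -> 2 outputs
b = np.zeros(2)          # one bias value per output

# (4, 3) @ (3, 2) -> (4, 2); b is then added to every row
outputs = X @ W + b
print(outputs.shape)  # (4, 2)
```

The inner dimensions must match (3 and 3 here); the result keeps one row per sample and one column per output.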

Besides the functions that overload operators, NumPy also provides many useful functions for performing computations on arrays, such as `min()`, `max()`, `sum()`, and others:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-1.png)

In [None]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(np.max(x))  # Compute max of all elements; prints "6"
print(np.min(x))  # Compute min of all elements; prints "1"
print(np.sum(x))  # Compute sum of all elements; prints "21"

Not only can we aggregate all the values in a matrix using these functions, but we can also aggregate across the rows or columns by using the `axis` parameter:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-4.png)

In [None]:
x = np.array([[1, 2], [5, 3], [4, 6]])

print(np.max(x, axis=0))  # Compute max of each column; prints "[5 6]"
print(np.max(x, axis=1))  # Compute max of each row; prints "[2 5 6]"
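
For a dataset laid out as `(samples, features)`, `axis=0` gives one value per feature and `axis=1` gives one value per sample. The sketch below uses a small made-up dataset.

```python
import numpy as np

# Hypothetical dataset: 3 samples (rows) x 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

feature_means = data.mean(axis=0)   # collapse the rows -> one value per feature
sample_sums = data.sum(axis=1)      # collapse the columns -> one value per sample
print(feature_means)  # [ 2. 20.]
print(sample_sums)    # [11. 22. 33.]

# keepdims=True keeps the reduced axis with size 1
print(data.mean(axis=0, keepdims=True).shape)  # (1, 2)
```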

You can find the full list of mathematical functions provided by numpy in the [documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the T attribute of an array object.

![](http://jalammar.github.io/images/numpy/numpy-transpose.png)

In [None]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(x)
print("transpose\n", x.T)

In [None]:
v = np.array([[1,2,3]])
print(v)
print("transpose\n", v.T)

In more advanced use cases, you may need to change the dimensions of an array. This is common in machine learning applications where a model expects an input shape that differs from the shape of your dataset. NumPy's `reshape()` method is useful in these cases.

![](http://jalammar.github.io/images/numpy/numpy-reshape.png)

A common task in this class will be to convert a 1D array to a 2D array, and vice versa. We can use `reshape()` for this.

For example, suppose we had this 2D array, but we need to pass it to a function that expects a 1D array.

In [None]:
w = np.array([[1],[2],[3]])
print(w)
w.shape

We can remove the “unnecessary” extra dimension with:

In [None]:
y = w.reshape(-1,)
print(y)
y.shape

Note that we can pass -1 as one dimension and numpy will infer the correct size based on our matrix size!

There’s also a `squeeze()` function that removes *all* of the “unnecessary” dimensions (dimensions that have size 1) from an array:

In [None]:
z = w.squeeze()
print(z)
z.shape

To go from a 1D to 2D array, we can just add in another dimension of size 1:

In [None]:
y.reshape((-1,1))
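
A typical ML use of `reshape()` is flattening each image in a batch into a single feature vector before passing it to a model that expects 2D input. The sketch below assumes a made-up batch of 32 grayscale images of 28x28 pixels.

```python
import numpy as np

# Hypothetical batch of 32 grayscale images, 28x28 pixels each
batch_images = np.zeros((32, 28, 28))

# Keep the batch axis and let NumPy infer 28 * 28 = 784 for the second axis
flat = batch_images.reshape(batch_images.shape[0], -1)
print(flat.shape)  # (32, 784)

# The reverse reshape restores the image layout
restored = flat.reshape(-1, 28, 28)
print(restored.shape)  # (32, 28, 28)
```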

### Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations.

For example, in basic linear algebra we can only add (or perform similar element-wise operations on) two matrices that have the *same* dimensions. In numpy, if we add two arrays that have different but compatible shapes, numpy will implicitly “extend” the dimensions of one array to match the other so that the operation can be performed.

So these operations will work, instead of returning an error:

![](https://sebastianraschka.com/images/blog/2020/numpy-intro/broadcasting-1.png)

![](https://sebastianraschka.com/images/blog/2020/numpy-intro/broadcasting-2.png)

Broadcasting two arrays together follows these rules:

**Rule 1**: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

For example, in the following cell, `a` will be implicitly extended to shape (1,3):

In [None]:
a = np.array([1,2,3])         # has shape (3,): one dimension
b = np.array([[4], [5], [6]]) # has shape (3,1): two dimensions
c = a + b                     # will have shape (3,3) (two dimensions)

**Rule 2**: If the shapes of the two arrays do not match in some dimension, the array whose size is 1 in that dimension is stretched to match the other array's size.

For example, in the following cell `a` will be implicitly extended to shape (3,2):

In [None]:
a = np.array([[1],[2],[3]])         # has shape (3,1)
b = np.array([[4,5], [6,7], [8,9]]) # has shape (3,2)
c = a + b                           # will have shape (3,2)

**Rule 3**: If in any dimension the sizes disagree and neither is equal to 1, an error is raised:

In [None]:
a = np.array([[1],[2],[3],[4]])      # has shape (4,1)
b = np.array([[4,5], [6,7], [8,9]])  # has shape (3,2)
c = a + b                            # ValueError: operands could not be broadcast

For more detail, you can read the explanation from the [documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).
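
The three rules can be checked without building any arrays by using `np.broadcast_shapes`, assuming NumPy 1.20 or newer. The helper `describe_broadcast` below is a hypothetical name introduced only for this sketch.

```python
import numpy as np

def describe_broadcast(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape, or None if incompatible."""
    try:
        return np.broadcast_shapes(shape_a, shape_b)
    except ValueError:
        return None

print(describe_broadcast((3,), (3, 1)))    # Rule 1 and 2: (3, 3)
print(describe_broadcast((3, 1), (3, 2)))  # Rule 2: (3, 2)
print(describe_broadcast((4, 1), (3, 2)))  # Rule 3: None (incompatible)
```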

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the [documentation](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

Here are a few visual examples involving broadcasting.

![](http://jalammar.github.io/images/numpy/numpy-array-broadcast.png)

Note that these arrays are compatible in each dimension if they have either the same size in that dimension, or if one array has size 1 in that dimension.

![](http://jalammar.github.io/images/numpy/numpy-matrix-broadcast.png)

And here are some more practical applications:

In [None]:
# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:

print(np.reshape(v, (3, 1)) * w)

In [None]:
# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:

print(x + v)

In [None]:
# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:

print((x.T + w).T)

In [None]:
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))

In [None]:
# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
print(x * 2)

Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.
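
Feature scaling is a common place where broadcasting appears in ML preprocessing. The following is a minimal sketch of standardization and min-max normalization on a made-up dataset; it is one common formulation, not the only one.

```python
import numpy as np

# Hypothetical dataset: 4 samples x 2 features on very different scales
data = np.array([[25.0, 50000.0],
                 [30.0, 60000.0],
                 [35.0, 70000.0],
                 [40.0, 80000.0]])

mean = data.mean(axis=0)   # shape (2,): one mean per feature
std = data.std(axis=0)     # shape (2,): one standard deviation per feature

# (4, 2) - (2,) broadcasts the per-feature statistics across all rows
standardized = (data - mean) / std
print(standardized.mean(axis=0))  # approximately [0. 0.]
print(standardized.std(axis=0))   # approximately [1. 1.]

# Min-max normalization to the range [0, 1] works the same way
normalized = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))
print(normalized)
```

Both versions divide by a per-feature quantity, so a constant feature (zero standard deviation or zero range) would need special handling that this sketch leaves out.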

This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the [numpy reference](http://docs.scipy.org/doc/numpy/reference/) to find out much more about numpy.

## Matplotlib

Matplotlib is a plotting library. In this section we give a brief introduction to the `matplotlib.pyplot` module, which provides a plotting system similar to that of MATLAB.

By convention, we typically import this module using the `plt` alias:

In [None]:
import matplotlib.pyplot as plt

### Plotting

The most important function in `matplotlib` is `plot`, which allows you to plot 2D data as a line plot. Here is a simple example:

In [None]:
# Compute the x and y coordinates for points on a sine curve
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)

# Plot the points using matplotlib
plt.plot(x, y)

# Show the figure.
plt.show()

With just a little bit of extra work we can easily plot multiple lines at once, and add a title, legend, and axis labels:

In [None]:
y_sin = np.sin(x)
y_cos = np.cos(x)

# Plot the points using matplotlib
plt.plot(x, y_sin, label='Sine')
plt.plot(x, y_cos, label='Cosine')
plt.legend() # uses the label arguments given above

plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')

# Show the figure.
plt.show()

Another useful basic plot is the `scatter` plot:

In [None]:
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.randint(0, 2, 50)

plt.scatter(x, y, c=colors)

plt.show()

When there is no logical connection between adjacent points (for example: if the horizontal axis is a categorical variable without a logical order), then you should *not* connect them (as in a line plot), since this implies a relationship that does not exist! Instead, you would use a scatter plot.

### Subplots

You can plot different things in the same figure using the `subplot` function. Here is an example:

In [None]:
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)

# Set up a subplot grid that has height 2 and width 1,
# and set the first such subplot as active.
plt.subplot(2, 1, 1)

# Make the first plot
plt.plot(x, y_sin)
plt.title('Sine')

# Set the second subplot as active, and make the second plot.
plt.subplot(2, 1, 2)
plt.plot(x, y_cos)
plt.title('Cosine')

# Adjust spacing between subplots
plt.subplots_adjust(hspace = 0.4)

# Show the figure.
plt.show()

You can read much more about the `subplot` function in the [documentation](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.subplot).

You can use the `imshow` function to display images from a file. Here is an example:

In [None]:
from matplotlib.cbook import get_sample_data

img_file = get_sample_data('grace_hopper.jpg')

img = plt.imread(img_file)
plt.imshow(img)
plt.show()

The `matplotlib` library is very flexible, and you can create virtually any plot and modify its appearance in any way you might think of. If you don’t like the way your plot looks, fix it! Use the [documentation](http://matplotlib.org/api/pyplot_api.html) to learn more.

By Abdulellah Mojalled: [LinkedIn](https://www.linkedin.com/in/abdulellah-mojalled/)