****

# <center> <b> <span style="color:orange;"> Python Proficiency for Scientific Computing and Data Science (PyPro-SCiDaS)  </span> </b></center>

### <center> <b> <span style="color:green;">An Initiation to Programming using Python (Init2Py) </span> </b></center>
    


****

# <center> <b> <span style="color:blue;"> Lecture 11: NumPy </span> </b></center>


****

![NumPy.jpeg](attachment:NumPy.jpeg)

[The NumPy Reference](https://docs.scipy.org/doc/numpy/reference/).

### <left> <b> <span style="color:brown;"> Objective: </span> </b></left>

This lecture aims to introduce the core concepts of `NumPy`, a fundamental library for numerical computing in Python. We will explore how to create and manipulate multidimensional arrays, perform vectorized operations, and efficiently handle large datasets. By understanding these tools, you'll gain the ability to perform high-performance numerical computations and build the foundation for scientific computing and data analysis using `NumPy`.
****

## 🧩 What is NumPy and Why Use It?

**NumPy (Numerical Python)** is the **core library for scientific and numerical computing** in Python.  
It provides a **high-performance multidimensional array object** (`ndarray`) and tools for efficient operations on vectors, matrices, and higher-dimensional data.

---

### 🔹 Key Features

- **Speed & Efficiency:**  
  NumPy's core is implemented largely in compiled **C** code, so vectorized operations on arrays run much faster than equivalent loops over standard Python lists, which makes it well suited to **data science** and **machine learning**.

- **Mathematical Power:**  
  Supports advanced computations such as **linear algebra**, **matrix operations**, **Fourier transforms**, and **statistical calculations** (mean, median, range, etc.).

- **Ecosystem Foundation:**  
  NumPy is the **backbone** of the scientific Python stack: libraries like **Pandas** and **Scikit-learn** are built on top of it, and frameworks such as **TensorFlow** interoperate closely with it.

- **Machine Learning Relevance:**  
  Since ML algorithms rely heavily on **vectors and matrices**, NumPy is essential for tasks like **classification**, **regression**, and **clustering**.

---

✅ **In summary:**  
NumPy is a **fast, efficient, and foundational library** that powers numerical and scientific computing in Python, serving as the **engine** behind most AI and data analysis workflows.


## NumPy arrays

At the core of the NumPy package is the ``ndarray`` object. It encapsulates $n$-dimensional arrays of homogeneous data types, with many operations performed in compiled code for performance.

A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the *rank* of the array (available as `ndim`); the *shape* of an array is a tuple of integers giving the size of the array along each dimension.
  
  
  
Data manipulation in Python is nearly synonymous with NumPy array manipulation: even newer tools like Pandas are built around the NumPy array. This section will present several examples of using NumPy array manipulation to access data and subarrays, and to split, reshape, and join the arrays.   

### To use NumPy, you need to import the module

**Import convention:** The recommended way to import NumPy is:

```python
import numpy as np
```

In [None]:
import numpy as np


There are a number of ways to initialize new NumPy arrays, for example:

* from Python lists or tuples;
* using functions dedicated to generating NumPy arrays, such as `arange`, `linspace`, etc.;
* by reading data from files.

### 🧩 Creating NumPy Arrays: Examples of Different Initialization Methods

NumPy provides several ways to **create arrays**, depending on the data source or the purpose of your computation.  Here are examples for each common creation mode:


#### 🔹 1. From a **Python List or Tuple**

You can convert existing Python lists or tuples into NumPy arrays using `np.array()`.

```python
import numpy as np

# From a Python list
a = np.array([1, 2, 3, 4])
print(a)           # Output: [1 2 3 4]

# From a tuple
b = np.array((5, 6, 7, 8))
print(b)           # Output: [5 6 7 8]
```

In [2]:
import numpy as np

# From a Python list
a = np.array([1, 2, 3, 4])
print(a)           # Output: [1 2 3 4]
print(type(a))
print('=====================================')

# From a tuple
b = np.array((5, 6, 7, 8))
print(b)           # Output: [5 6 7 8]
print(type(b))

[1 2 3 4]
<class 'numpy.ndarray'>
=====================================
[5 6 7 8]
<class 'numpy.ndarray'>


#### 🔹 2. Using **Array-Generating Functions**

NumPy provides built-in functions to generate arrays efficiently:

```python
# Using arange(start, stop, step)
c = np.arange(0, 10, 2)
print(c)           # Output: [0 2 4 6 8]

# Using linspace(start, stop, num)
d = np.linspace(0, 1, 5)
print(d)           # Output: [0.   0.25 0.5  0.75 1.  ]
```

In [4]:
# Using arange(start, stop, step)
c = np.arange(0, 10, 2)
print(c)           # Output: [0 2 4 6 8]
print('=====================================')

# Using linspace(start, stop, num)
d = np.linspace(0, 1, 5)
print(d)           # Output: [0.   0.25 0.5  0.75 1.  ]

[0 2 4 6 8]
=====================================
[0.   0.25 0.5  0.75 1.  ]



### 🧩 NumPy Array Attributes

Each NumPy array comes with useful **attributes** that describe its structure.  
These attributes help you understand how your data is organized in memory and how to manipulate it effectively.

---

#### 🔹 1. `ndim`: Number of Dimensions

The **`ndim`** attribute tells you how many **dimensions (axes)** the array has.

```python
import numpy as np

# 1D array
a = np.array([1, 2, 3])
print(a.ndim)   # Output: 1
```

🧠 **Explanation:**

* This is a **1-dimensional** array (a single row of elements).

```python
# 2D array
b = np.array([[1, 2, 3],
              [4, 5, 6]])
print(b.ndim)   # Output: 2
```

🧩 Here, `b` has **2 dimensions**: rows and columns.

```python
# 3D array
c = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])
print(c.ndim)   # Output: 3
```

💡 A 3D array can be thought of as a **stack of 2D matrices**.

---

#### 🔹 2. `shape`: Size of Each Dimension

The **`shape`** attribute gives a **tuple** representing the **size of each dimension**.

```python
print(a.shape)   # Output: (3,)
print(b.shape)   # Output: (2, 3)
print(c.shape)   # Output: (2, 2, 2)
```

🧠 **Explanation:**

* `(3,)` → one dimension of length 3 (a vector).
* `(2, 3)` → 2 rows, 3 columns (a matrix).
* `(2, 2, 2)` → 2 blocks, each of shape 2×2 (a 3D array).

---

#### 🔹 3. `size`: Total Number of Elements

The **`size`** attribute gives the **total number of elements** in the array,
that is, the product of the sizes of all dimensions.

```python
print(a.size)   # Output: 3
print(b.size)   # Output: 6
print(c.size)   # Output: 8
```

🧮 **Explanation:**

* For `b`: $2 \text{ rows} \times 3 \text{ columns} = 6$
* For `c`: $2 \times 2 \times 2 = 8$

---

#### ✅ **Overview**

| **Attribute** | **Meaning**              | **Example Value** | **Description**      |
| ------------- | ------------------------ | ----------------- | -------------------- |
| `ndim`        | Number of dimensions     | `2`               | 2D array (matrix)    |
| `shape`       | Size of each dimension   | `(2, 3)`          | 2 rows, 3 columns    |
| `size`        | Total number of elements | `6`               | All entries combined |

---

💡 **In short:**

* Use **`.ndim`** to find how many axes the array has.
* Use **`.shape`** to see the layout of dimensions.
* Use **`.size`** to know how many total elements exist.

These attributes are essential for debugging, reshaping arrays, and understanding your data's structure.



* * * *
Exercise: **Creating arrays using functions**

> Experiment with `arange`, `linspace`, `ones`, `zeros`, `eye`, `full` and `diag`.
    Create different kinds of arrays with random numbers.
    Try setting the seed before creating an array with random values.
    Look at the function `np.empty`. What does it do? When might this be useful?


🧮 **These functions** are particularly useful for mathematical modeling, simulations, and ML preprocessing.
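
As a starting point for the exercise, here is a minimal sketch of a few of the creation functions named above. The variable names are illustrative, and the random part assumes the `np.random.default_rng` generator interface (available in recent NumPy versions) rather than the legacy `np.random.seed` approach.

```python
import numpy as np

ones = np.ones((2, 3))          # 2x3 array filled with 1.0
zeros = np.zeros(4)             # 1D array of four 0.0 values
identity = np.eye(3)            # 3x3 identity matrix
sevens = np.full((2, 2), 7)     # 2x2 array filled with the value 7
diagonal = np.diag([1, 2, 3])   # 3x3 matrix with 1, 2, 3 on the diagonal

# Seeding makes "random" arrays reproducible: the same seed gives the same values.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
sample_a = rng_a.random(3)      # three floats in [0, 1)
sample_b = rng_b.random(3)
print(np.array_equal(sample_a, sample_b))   # True
```

Two generators created with the same seed produce identical samples, which is what makes experiments repeatable.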

* * * *

#### 🔹 3. Reading Data from **Files**

### 🧩 Creating and Reading Arrays from Files in NumPy

NumPy allows you not only to **read arrays from files**, but also to **create and write** your own data files first.  
Let's go through the full process: **creating**, **displaying**, and **reading** data from files.


#### 🔹 1. Create a Text File with Data

You can use NumPy's `savetxt()` function to write arrays to a text file.

```python
import numpy as np

# Create a NumPy array
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Save the array to a text file (comma-separated)
np.savetxt('data.txt', data, delimiter=',')

print("✅ File 'data.txt' created successfully!")
```

In [5]:
import numpy as np

# Create a NumPy array
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Save the array to a text file (comma-separated)
np.savetxt('data.txt', data, delimiter=',')

print("✅ File 'data.txt' created successfully!")

✅ File 'data.txt' created successfully!


#### 🔹 2. Display the Contents of the File

To confirm that the file has been created correctly, you can open and display it:

```python
# Display file contents
with open('data.txt', 'r') as file:
    contents = file.read()
    print("📄 File contents:")
    print(contents)
```
**Expected Output:**

```
1.000000000000000000e+00,2.000000000000000000e+00,3.000000000000000000e+00
4.000000000000000000e+00,5.000000000000000000e+00,6.000000000000000000e+00
```

*(`np.savetxt` writes values in scientific notation by default; its default format string is `'%.18e'`.)*

In [6]:
with open('data.txt', 'r') as file:
    contents = file.read()
    print("📄 File contents:")
    print(contents)

📄 File contents:
1.000000000000000000e+00,2.000000000000000000e+00,3.000000000000000000e+00
4.000000000000000000e+00,5.000000000000000000e+00,6.000000000000000000e+00



#### 🔹 3. Read the Data Back into a NumPy Array

Now, let's load the same file into a new array using `np.loadtxt()`:

```python
# Read the array from the text file
g = np.loadtxt('data.txt', delimiter=',')
print("📊 Array loaded from file:")
print(g)
```

**Output:**

```
[[1. 2. 3.]
 [4. 5. 6.]]
```



In [7]:
g = np.loadtxt('data.txt', delimiter=',')
print("📊 Array loaded from file:")
print(g)

📊 Array loaded from file:
[[1. 2. 3.]
 [4. 5. 6.]]




#### ✅ **In a nutshell**

| **Step**    | **Function**        | **Purpose**                  |
| ----------- | ------------------- | ---------------------------- |
| Create file | `np.savetxt()`      | Save array to text/CSV file  |
| View file   | `open()` / `read()` | Display file content         |
| Read file   | `np.loadtxt()`      | Load data into a NumPy array |
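
The round trip above turns integers into floats written in scientific notation. The sketch below shows one way to keep integers as integers, using the `fmt` argument of `np.savetxt` and the `dtype` argument of `np.loadtxt`. The file name `data_int.txt` and the temporary directory are illustrative choices, not part of the lecture's files.

```python
import os
import tempfile

import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_int.txt")   # hypothetical file name

    # fmt="%d" writes plain integers instead of the default "%.18e" notation.
    np.savetxt(path, data, delimiter=",", fmt="%d")

    with open(path, "r") as file:
        text = file.read()

    # dtype=int asks loadtxt for integers instead of the default floats.
    loaded = np.loadtxt(path, delimiter=",", dtype=int)

print(text)
print(loaded.dtype, loaded.shape)
```

The file now contains the lines `1,2,3` and `4,5,6`, and the loaded array has an integer dtype.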



### 🧩 Accessing and Modifying Array Elements in NumPy

Once you've created a NumPy array, you can **access**, **modify**, or **slice** its elements using **indexing**.  
Indexing in NumPy works similarly to Python lists but is far more powerful when working with large datasets.

---

#### 🔹 Example: Accessing Elements by Index

```python
import numpy as np

a = np.array([1, 2, 3])    # Create a 1D NumPy array

# Access individual elements
print(a[0], a[1], a[2])    # Output: 1 2 3
```

🧠 **Explanation:**

* Indexing in NumPy starts at **0** (just like in Python lists).
* `a[0]` refers to the **first** element, `a[1]` to the **second**, and so on.

---

#### 🔹 Example: Modifying Array Elements

You can change the value of an element by assigning a new value to its index.

```python
a[0] = 5                   # Change the first element
print(a)                   # Output: [5 2 3]
```

🧩 **Explanation:**

* NumPy arrays are **mutable**, so you can update elements directly.
* The change is applied in place, so there is no need to recreate the array.

---

#### 🔹 Example: Accessing with Negative Indexing

Just like Python lists, you can use **negative indices** to access elements from the end.

```python
print(a[-1])               # Output: 3  (last element)
print(a[-2])               # Output: 2  (second-to-last element)
```

---

#### 🔹 Example: Multi-Dimensional Indexing

For 2D arrays (matrices), you use **two indices**: one for the row and one for the column.

```python
b = np.array([[1, 2, 3],
              [4, 5, 6]])

print(b[0, 0])   # Output: 1  (row 0, column 0)
print(b[1, 2])   # Output: 6  (row 1, column 2)
```

You can also modify elements the same way:

```python
b[0, 1] = 9
print(b)
# Output:
# [[1 9 3]
#  [4 5 6]]
```
---
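
The introduction to this section also mentions slicing, which the examples above do not show. Here is a minimal sketch, reusing the same 2D array layout, of how the `start:stop` slice syntax selects whole rows, columns, or sub-blocks. The variable names are illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])

first_row = b[0, :]        # all columns of row 0
last_column = b[:, -1]     # all rows of the last column
right_block = b[:, 1:]     # every row, columns 1 onward

print(first_row)      # [1 2 3]
print(last_column)    # [3 6]
print(right_block)
# [[2 3]
#  [5 6]]
```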

### Reshaping of Arrays

Another useful type of operation is reshaping of arrays. The most flexible way of doing this is with the ``reshape`` method. For example, if you want to put the numbers $1$ through $9$ in a $3 \times 3$ grid, you can do the following:

In [8]:
grid = np.arange(1, 10).reshape((3, 3))
print(grid)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


> **Try**: For this to work, the size of the initial array must match the size of the reshaped array (here, $9 = 3 \times 3$). Try a few examples of your own.
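
The size constraint can be checked directly. The sketch below, with illustrative variable names, shows a valid reshape, a reshape where `-1` lets NumPy infer one dimension, and a mismatched reshape that raises a `ValueError`.

```python
import numpy as np

values = np.arange(12)            # 12 elements: 0, 1, ..., 11

# 3 * 4 == 12, so this reshape is valid.
grid = values.reshape((3, 4))

# -1 asks NumPy to infer that dimension: 12 / 2 == 6 columns.
inferred = values.reshape((2, -1))
print(inferred.shape)             # (2, 6)

# 5 * 5 == 25 != 12, so this reshape raises a ValueError.
try:
    values.reshape((5, 5))
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("Reshape failed:", error)
```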

### Fancy indexing

Fancy indexing is the name for using an array or a list in place of an index:

In [9]:
twenty = (np.arange(4 * 5)).reshape(4, 5)
twenty

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [10]:
row_indices = [1, 2, 3]
twenty[row_indices]

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [11]:
col_indices = [1, 2, -1] # remember, index -1 means the last element
twenty[row_indices, col_indices]

array([ 6, 12, 19])

We can also use index **masks**:

If the index mask is a NumPy array of data type `bool`, then an element is selected (`True`) or not (`False`) depending on the value of the index mask at the position of each element.



In [12]:
# 1D array of random integers
# get 10 integers from 0 to 22 (the upper bound, 23, is excluded)

num_samples = 10
integers = np.random.randint(23, size=num_samples)
integers

array([ 8, 20,  8,  4, 16,  6,  7,  8, 15,  9], dtype=int32)

In [13]:
# mask has to be of the same shape as the array to be indexed; else IndexError would be thrown
# mask for indexing alternate elements in the array
row_mask = np.array([True, False, True, False, True, False, True, False, True, False])

integers[row_mask]

array([ 8,  8, 16,  7, 15], dtype=int32)

This feature is very useful to conditionally select elements from an array, using for example comparison operators:

In [14]:
range_arr = np.arange(0, 10, 0.5)
range_arr

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,
       6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])

In [15]:
mask = (range_arr > 5) * (range_arr < 7.5)
mask
# What is happening here?

array([False, False, False, False, False, False, False, False, False,
       False, False,  True,  True,  True,  True, False, False, False,
       False, False])

In [16]:
range_arr[mask]

array([5.5, 6. , 6.5, 7. ])
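
Multiplying two boolean arrays acts as an elementwise logical AND, which is why the mask above selects values strictly between 5 and 7.5. The sketch below compares it with two more explicit spellings, the `&` operator and `np.logical_and`. The variable names are illustrative.

```python
import numpy as np

range_arr = np.arange(0, 10, 0.5)

mask_product = (range_arr > 5) * (range_arr < 7.5)           # boolean product
mask_and = (range_arr > 5) & (range_arr < 7.5)               # elementwise AND
mask_func = np.logical_and(range_arr > 5, range_arr < 7.5)   # function form

print(np.array_equal(mask_product, mask_and))   # True
print(range_arr[mask_and])                      # [5.5 6.  6.5 7. ]
```

The parentheses around each comparison are required with `&`, because `&` binds more tightly than `>` and `<`.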

* * * *
### Exercise:
 > Investigate how to **concatenate, split, and copy** NumPy arrays, including the `repeat`, `tile`, `vstack`, and `hstack` functions.

 > Investigate [Views versus copies in NumPy](https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html).

* * * *
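
As a hint for the second part of the exercise, here is a small sketch of the difference between a view and a copy. It relies on two NumPy behaviors: basic slicing returns a view that shares memory with the original, while fancy indexing returns a copy. The variable names are illustrative.

```python
import numpy as np

original = np.arange(5)          # [0 1 2 3 4]

view = original[1:4]             # basic slicing returns a view
view[0] = 100                    # also changes original[1]

copy = original[[1, 2, 3]]       # fancy indexing returns a copy
copy[0] = -1                     # original is unaffected

print(original)                            # [  0 100   2   3   4]
print(np.shares_memory(original, view))    # True
print(np.shares_memory(original, copy))    # False
```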

## Array math

Basic mathematical functions operate elementwise on arrays, and are available both as operator overloads and as functions in the numpy module:

In [17]:
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print("---------------------------")
print(np.add(x, y))
print("\n \n")

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print("---------------------------")
print(np.subtract(x, y))
print("\n \n")


# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print("---------------------------")
print(np.multiply(x, y))
print("\n \n")


# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print("---------------------------")
print(np.divide(x, y))
print("\n \n")


# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))


[[ 6.  8.]
 [10. 12.]]
---------------------------
[[ 6.  8.]
 [10. 12.]]

 

[[-4. -4.]
 [-4. -4.]]
---------------------------
[[-4. -4.]
 [-4. -4.]]

 

[[ 5. 12.]
 [21. 32.]]
---------------------------
[[ 5. 12.]
 [21. 32.]]

 

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
---------------------------
[[0.2        0.33333333]
 [0.42857143 0.5       ]]

 

[[1.         1.41421356]
 [1.73205081 2.        ]]


Note that `*` performs elementwise multiplication, not matrix multiplication. To compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices, we instead use the ``dot`` function. ``dot`` is available both as a function in the numpy module and as an instance method of array objects:

In [18]:
import numpy as np

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print("---------------------------")
print(np.dot(v, w))
print("\n \n")



# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print("---------------------------")
print(np.dot(x, v))
print("\n \n")

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print("---------------------------")
print(np.dot(x, y))


219
---------------------------
219

 

[29 67]
---------------------------
[29 67]

 

[[19 22]
 [43 50]]
---------------------------
[[19 22]
 [43 50]]
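
Python also provides the `@` operator for matrix multiplication, which NumPy arrays support. The sketch below repeats the three products from the cell above with `@`; for these 1D and 2D inputs it gives the same results as `dot`.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])

print(v @ w)    # 219
print(x @ v)    # [29 67]
print(x @ y)
# [[19 22]
#  [43 50]]
```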


You can explore the complete collection of NumPy's **mathematical functions** in its [official documentation](https://docs.scipy.org/doc/numpy/reference/routines.math.html).  
For more specialized topics, the **linear algebra** functions are listed [here](https://numpy.org/doc/stable/reference/routines.linalg.html), and the **statistical routines** can be found [on this page](https://numpy.org/doc/stable/reference/routines.statistics.html).


## NumPy Arrays vs. Python Lists: Which One Is Faster?

We'll create a large array of one million numbers and multiply each element by $2$.

In [4]:
import numpy as np

In [5]:
# Creating numpy array 
numpy_array = np.arange(1_000_000)

# Creating python list
python_list = list(range(1_000_000))

Now, let's measure the execution time of both using the `%timeit` magic command:

In [6]:
# the execution time of the numpy array
%timeit array = numpy_array * 2

4.03 ms ± 673 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [7]:
# the execution time of the python list
%timeit lst = [num * 2 for num in python_list]

77.8 ms ± 2.98 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



### ⚡ What Just Happened? **Vectorization**

Comparing the two timings shows that **NumPy is dramatically faster** than the plain Python list comprehension: roughly **19x** in the run above (77.8 ms versus 4.03 ms), although the exact ratio depends on your machine.

This speed comes from **vectorization**. Instead of looping through the elements one by one in the Python interpreter (as the list comprehension does), **NumPy applies the operation to the entire array at once**, using highly optimized **pre-compiled C code** under the hood.

---

### 🧠 What Does the Magic Command `%timeit` Do?

The `%timeit` command in Jupyter/IPython is a **benchmarking tool** that measures how long it takes for a line or block of code to run.  
It executes the code **multiple times** to compute an **accurate average execution time**.

---

#### 🔹 Usage Examples

- **Single Line Timing**  
  Use `%timeit` for short, one-line commands:
  ```python
  %timeit [x**2 for x in range(1000)]
  ```

- **Multiple Line Timing**  
  Use `%%timeit` (note the double `%`) at the top of a cell to time multiple lines of code:

  ```python
  %%timeit
  total = 0
  for i in range(1000):
      total += i**2
  ```

---

✅ **Takeaway**

* **Vectorization** makes NumPy fast by performing operations on entire arrays in compiled code.
* **`%timeit`** helps you measure and compare the speed of your code precisely.
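
`%timeit` only works inside Jupyter/IPython. In a plain Python script, a similar comparison can be sketched with the standard-library `timeit` module. The array size and the number of repetitions below are arbitrary choices, kept small so the example runs quickly, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

numpy_array = np.arange(100_000)
python_list = list(range(100_000))

# Each call runs the statement 20 times and returns the total time in seconds.
numpy_seconds = timeit.timeit(lambda: numpy_array * 2, number=20)
list_seconds = timeit.timeit(lambda: [num * 2 for num in python_list], number=20)

print(f"NumPy: {numpy_seconds / 20 * 1e3:.3f} ms per run")
print(f"List:  {list_seconds / 20 * 1e3:.3f} ms per run")

# Both approaches compute the same values.
same = (numpy_array * 2).tolist() == [num * 2 for num in python_list]
print(same)   # True
```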





## Data Types


### 🧩 Data Types for NumPy Arrays (`ndarray`)

Every NumPy array (`ndarray`) has a **data type**, known as its **`dtype`** (data-type object),  
which defines the kind of elements stored in the array, such as integers, floats, strings, or complex numbers.

Understanding data types is important because they determine:
- **How much memory** each element uses,
- **What operations** can be performed,
- And **how data is interpreted** internally.

---

#### 🔹 1. Checking the Data Type

You can use the `.dtype` attribute to check the type of elements in an array:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1.0, 2.0, 3.0])

print(a.dtype)   # Output: int64 (or int32 depending on your system)
print(b.dtype)   # Output: float64
```

🧠 **Explanation:**

* `a` contains integers → NumPy automatically assigns `int64` (or `int32`).
* `b` contains decimals → assigned as `float64`.

---

#### 🔹 2. Specifying the Data Type

You can manually specify a data type using the `dtype` parameter:

```python
c = np.array([1, 2, 3], dtype=float)
print(c)         # Output: [1. 2. 3.]
print(c.dtype)   # Output: float64
```

Or explicitly define the precision:

```python
d = np.array([1, 2, 3], dtype=np.int8)
print(d.dtype)   # Output: int8
```

💡 **Tip:** Choosing smaller data types like `int8` or `float32` can save memory in large datasets.
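
The memory saving can be measured with the `itemsize` (bytes per element) and `nbytes` (total bytes) attributes. Here is a minimal sketch with an arbitrarily chosen array of one million zeros:

```python
import numpy as np

values64 = np.zeros(1_000_000, dtype=np.float64)
values32 = values64.astype(np.float32)

print(values64.itemsize, values64.nbytes)   # 8 8000000
print(values32.itemsize, values32.nbytes)   # 4 4000000
```

The trade-off is precision and range: `float32` keeps about 7 significant decimal digits, and `int8` can only hold values from -128 to 127.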

---

#### 🔹 3. Common NumPy Data Types

| **Data Type**                         | **Description**                  | **Example**                              |
| ------------------------------------- | -------------------------------- | ---------------------------------------- |
| `int8`, `int16`, `int32`, `int64`     | Signed integers (various sizes)  | `np.array([1, 2, 3], dtype=np.int16)`    |
| `uint8`, `uint16`, `uint32`, `uint64` | Unsigned integers (no negatives) | `np.array([1, 2, 3], dtype=np.uint8)`    |
| `float16`, `float32`, `float64`       | Floating-point numbers           | `np.array([1.2, 3.4], dtype=np.float32)` |
| `complex64`, `complex128`             | Complex numbers                  | `np.array([1+2j, 3+4j])`                 |
| `bool_`                               | Boolean (True/False)             | `np.array([True, False])`                |
| `str_`                                | Strings (the `unicode_` alias was removed in NumPy 2.0) | `np.array(["AI", "ML"])`  |

---

#### 🔹 4. Type Conversion (Casting)

You can convert an existing array to a different data type using `.astype()`:

```python
e = np.array([1.7, 2.3, 3.9])
f = e.astype(int)
print(f)         # Output: [1 2 3]
print(f.dtype)   # Output: int64 (or int32 depending on your system)
```

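🧮 **Explanation:**
`astype()` creates a **new array** with the converted data type (it does not modify the original array). Converting floats to integers **truncates** the fractional part rather than rounding, which is why `3.9` becomes `3`.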
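
When rounded values are wanted instead of truncated ones, one option is to round before casting. The sketch below uses illustrative values, including a negative number and a tie, to show the difference. Note that `np.round` rounds ties to the nearest even number, so `2.5` becomes `2`.

```python
import numpy as np

e = np.array([1.7, -1.7, 2.5])

truncated = e.astype(int)            # drops the fractional part (toward zero)
rounded = np.round(e).astype(int)    # rounds first, then converts

print(truncated)   # [ 1 -1  2]
print(rounded)     # [ 2 -2  2]
print(e)           # the original array is unchanged
```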

---


## Vectorizing functions

As mentioned several times by now, to get good performance we should always try to avoid looping over elements in our vectors and matrices, and instead use vectorized algorithms. The first step in converting a scalar algorithm to a vectorized algorithm is to make sure that the functions we write work with vector inputs.

In [1]:
def Theta(x):
    """
    scalar implementation of the Heaviside step function.
    """
    if x >= 0:
        return 1
    else:
        return 0

In [3]:

v1 = np.array([-3,-2,-1,0,1,2,3])

Theta(v1)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

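That didn't work because `Theta` was not written to handle a vector input. To get a vectorized version of `Theta`, we can use the NumPy function `np.vectorize`, which in many cases can vectorize a function automatically. Keep in mind that `np.vectorize` is provided mainly for convenience: it still calls the Python function once per element, so it does not bring the speed of compiled code.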

In [4]:
Theta_vec = np.vectorize(Theta)
Theta_vec(v1)

array([0, 0, 0, 1, 1, 1, 1])

On the other hand, we can also implement the function to accept a vector input from the beginning (this requires more effort but usually gives better performance):

In [5]:
def Theta(x):
    """
    Vector-aware implementation of the Heaviside step function.
    """
    return 1 * (x >= 0)

In [6]:
Theta(v1)

array([0, 0, 0, 1, 1, 1, 1])

In [7]:
# it even works with scalar input
Theta(-1.2), Theta(2.6)

(0, 1)
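
Another common way to write a vector-aware version is `np.where`, which picks between two values element by element based on a condition. The function name `theta_where` below is a hypothetical helper, not something defined earlier in this lecture.

```python
import numpy as np

def theta_where(x):
    """Heaviside step function built on np.where (hypothetical helper name)."""
    # np.asarray lets the function accept scalars, lists, and arrays alike.
    return np.where(np.asarray(x) >= 0, 1, 0)

v1 = np.array([-3, -2, -1, 0, 1, 2, 3])
print(theta_where(v1))     # [0 0 0 1 1 1 1]
print(theta_where(-1.2))   # 0
```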

Numpy provides many more functions for manipulating arrays; you can see the full list [in the documentation](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html).

## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this:

In [8]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

# Now y is the following
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]
print(y)


[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


This works; however, when the matrix ``x`` is very large, computing an explicit loop in Python could be slow. Note that adding the vector ``v`` to each row of the matrix ``x`` is equivalent to forming a matrix ``vv`` by stacking multiple copies of ``v`` vertically, then performing elementwise summation of ``x`` and ``vv``. We could implement this approach like this:

In [9]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1))   # Stack 4 copies of v on top of each other
print(vv)                 # Prints "[[1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]]"
y = x + vv  # Add x and vv elementwise
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"


[[1 0 1]
 [1 0 1]
 [1 0 1]
 [1 0 1]]
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


Numpy broadcasting allows us to perform this computation without actually creating multiple copies of `v`. Consider this version, using broadcasting:

In [10]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"


[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


The line ``y = x + v`` works even though ``x`` has shape **(4, 3)** and ``v`` has shape **(3,)** due to broadcasting; this line works as if ``v`` actually had shape **(4, 3)**, where each row was a copy of ``v``, and the sum was performed elementwise.

* * * *

Broadcasting two arrays together follows these rules:


1. If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.

2. The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
    
3. The arrays can be broadcast together if they are compatible in all dimensions.
    
4. After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
    
5. In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.
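
The rules above can be tried out on shapes alone. The sketch below uses arrays of ones (the values do not matter here) to show a compatible pair, an incompatible pair that raises a `ValueError`, and a reshape that makes the second pair compatible.

```python
import numpy as np

x = np.ones((4, 3))
v = np.ones(3)    # shape (3,) is treated as (1, 3), then stretched to (4, 3)
w = np.ones(4)    # shape (4,) is treated as (1, 4): 4 vs 3 is incompatible

print((x + v).shape)                    # (4, 3)

try:
    x + w
    incompatible = False
except ValueError as error:
    incompatible = True
    print("Cannot broadcast:", error)

# Making w a column of shape (4, 1) gives compatible dimensions: (4, 1) vs (4, 3).
print((x + w.reshape(4, 1)).shape)      # (4, 3)
```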


If this explanation does not make sense, try reading the explanation from the [documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions [in the documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

Here are some applications of broadcasting:

In [12]:
import numpy as np

# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
print(np.reshape(v, (3, 1)) * w)

# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
#  [5 7 9]]
print(x + v)
print('----------------------------')

# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5  6  7]
#  [ 9 10 11]]
print((x.T + w).T)
print('----------------------------')
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))
print('----------------------------')

# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2  4  6]
#  [ 8 10 12]]
print(x * 2)


[[ 4  5]
 [ 8 10]
 [12 15]]
[[2 4 6]
 [5 7 9]]
----------------------------
[[ 5  6  7]
 [ 9 10 11]]
----------------------------
[[ 5  6  7]
 [ 9 10 11]]
----------------------------
[[ 2  4  6]
 [ 8 10 12]]


Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.