## NumPy

# NumPy Basics: Introduction, Arrays, and Key Concepts

Welcome to the **NumPy** tutorial! This notebook will guide you through the basics of NumPy, why it's useful, and introduce you to key functionalities, including arrays, linear algebra, broadcasting, and random number generation.

## Table of Contents:
1. Introduction to NumPy
2. Why Use NumPy?
3. Understanding NumPy Arrays
4. Arrays vs Python Lists
5. Common Array Functions
6. Linear Algebra with NumPy
7. Broadcasting
8. Random Number Generation

---

## 1. Introduction to NumPy

**NumPy** stands for **Numerical Python**. It is the core library for **scientific computing** in Python: it provides a high-performance **multidimensional** array object and tools for working with these arrays.

- It is fast and efficient because its array operations run in optimized, compiled code.
- It supports large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.

Let's start by importing NumPy:


In [1]:
import numpy as np

## Why Use NumPy?

### NumPy is preferred for the following reasons:

- Speed: Operations on NumPy arrays are much faster than equivalent Python list operations.
- Convenience: NumPy provides many useful functions that make working with data much simpler.
- Memory Efficiency: NumPy arrays are more memory efficient than lists.
- Interoperability: NumPy arrays are interoperable with other libraries like pandas, matplotlib, and scipy.
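
To make the memory-efficiency point concrete, here is a rough sketch that compares the footprint of a Python list with that of an equivalent array. The variable names and the size `n` are illustrative, and the list figure is only an approximation (CPython shares some small integer objects, and sizes vary by Python version).

```python
import sys
import numpy as np

n = 1_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores pointers to separate Python int objects, so its
# footprint is the container plus every element (approximate).
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# An int64 array stores the raw 8-byte values contiguously.
array_bytes = np_array.nbytes

print("list bytes (approx.):", list_bytes)
print("array bytes:", array_bytes)
```

The array needs exactly 8 bytes per element, while the list pays for a pointer and a full Python object per element.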

## Arrays vs Python Lists

### **How is a NumPy array different from a Python list?**

- Homogeneity: NumPy arrays store elements of the same data type, while lists can store different types of elements.
- Performance: Operations on NumPy arrays are more efficient than the same operations on Python lists because they run in optimized C code.
- Functionality: NumPy provides a vast collection of functions for performing mathematical and logical operations on arrays.


In [2]:
# Python List
list_example = [1, "hello", 2, 3, 4, 5]

# NumPy Array
array_example = np.array([1, 2, 3, 4, 5], dtype=np.int32)

In [3]:
array_example

array([1, 2, 3, 4, 5], dtype=int32)

In [4]:
array_example = np.array([1, "hello", 2, 3, 4, 5], dtype=np.int32)

ValueError: invalid literal for int() with base 10: 'hello'

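In [8]:
# Without an explicit dtype, NumPy coerces every element to a string
array_example = np.array([1, "hello", 2, 3, 4, 5])
array_example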

array(['1', 'hello', '2', '3', '4', '5'], dtype='<U21')
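
Because an array holds a single data type, NumPy picks one type that can represent every element when the inputs are mixed. The short sketch below (variable names are illustrative) shows that integers are upcast to floats when a float is present, and that everything becomes a string when a string is present.

```python
import numpy as np

mixed_numbers = np.array([1, 2.5, 3])      # ints and a float
mixed_strings = np.array([1, "hello", 2])  # ints and a string

print(mixed_numbers, mixed_numbers.dtype)  # integers are upcast to float64
print(mixed_strings, mixed_strings.dtype)  # every element becomes a Unicode string
```

A Python list would have kept each element's original type, which is the homogeneity difference described above.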

## Arrays and array construction

A numpy array is a **grid of values**, all of the **same type**, and is indexed by a **tuple of nonnegative integers**. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

We can create a `numpy` array by passing a Python list to `np.array()`.

In [18]:
a = np.array([1, 2, 3])  # Create a rank 1 array

This creates the array shown in the figure below:

![](http://jalammar.github.io/images/numpy/create-numpy-array-1.png)

In [21]:
print(type(a), "\n", "Shape: ", a.shape, "\n", "Indexed elements: ", a[0], a[1], a[2])
a[0] = 5                 # Change an element of the array
print(a)

<class 'numpy.ndarray'> 
 Shape:  (3,) 
 Indexed elements:  1 2 3
[5 2 3]


## More Dimensions in Array

To create a `numpy` array with more dimensions, we can pass nested lists, like this:

![](http://jalammar.github.io/images/numpy/numpy-array-create-2d.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array.png)

In [27]:
b = np.array([[1,2],[3,4]])   # Create a rank 2 array
print(b)

[[1 2]
 [3 4]]


In [28]:
print(b.shape)

(2, 2)


### What about this?


In [25]:
b = np.array([[1,2],[3,4, 5]])
print(b)

[list([1, 2]) list([3, 4, 5])]


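The two rows have different lengths, so NumPy cannot arrange them into a regular 2-D grid. Older NumPy versions fall back to a 1-D array of Python `list` objects and emit a deprecation warning, as in the output above; NumPy 1.24 and later raise a `ValueError` unless `dtype=object` is passed.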
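
If a ragged collection really is what you want, one option is to request an object array explicitly. This is a minimal sketch; the variable name `ragged` is illustrative.

```python
import numpy as np

# Rows of different lengths cannot form a regular grid, so ask for
# a 1-D array whose elements are ordinary Python lists.
ragged = np.array([[1, 2], [3, 4, 5]], dtype=object)

print(ragged)        # two list objects
print(ragged.shape)  # (2,): one dimension, two elements
print(ragged.dtype)  # object
```

Object arrays lose most of NumPy's speed advantages, so padding the rows to a common length is usually the better choice for numerical work.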


In [49]:
b = np.array([
    [[1,2],[3,4]],
    [[5,6], [7,8]]
])  # adding one more block gives a third dimension
print(b)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [50]:
print(b.shape)
print("Number of dimensions (rank):", b.ndim)


(2, 2, 2)
Number of dimensions (rank): 3


## Array indexing

NumPy offers several ways to index into arrays.

We can index and slice NumPy arrays in all the ways we can index and slice Python lists:

![](http://jalammar.github.io/images/numpy/numpy-array-slice.png)

We can also index and slice NumPy arrays in multiple dimensions. When slicing an array with more than one dimension, specify a slice for each dimension:

![](http://jalammar.github.io/images/numpy/numpy-matrix-indexing.png)

In [5]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


In [6]:
# [[ 5  6]
#  [ 9 10]]
b = a[1:3, :2]
print(b)

[[ 5  6]
 [ 9 10]]


In [7]:
# Create the following rank 2 array with shape (3, 4)
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [53]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
row_r3 = a[[1], :]  # Rank 2 copy of the second row of a (integer-array indexing copies)
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)
print(row_r3, row_r3.shape)

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[[5 6 7 8]] (1, 4)
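
The three results look alike, but they differ in whether they share memory with `a`. The sketch below (names `row_view` and `row_copy` are illustrative) shows that basic slicing returns a view, so writes reach the original array, while indexing with a list of integers returns an independent copy.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_view = a[1:2, :]  # basic slicing: a view into a
row_copy = a[[1], :]  # integer-array indexing: a copy

row_view[0, 0] = 50   # writes through to a
row_copy[0, 1] = 60   # leaves a untouched

print(a[1])                           # first element is now 50, second is still 6
print(np.shares_memory(a, row_view))  # True
print(np.shares_memory(a, row_copy))  # False
```

Call `.copy()` on a slice when you need to modify it without changing the original array.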


## Initializing arrays with NumPy functions

There are often cases when we want NumPy to initialize the values of the array for us. NumPy provides functions like `ones()`, `zeros()`, and `random.random()` for these cases. We just pass them the number of elements we want to generate:

![](http://jalammar.github.io/images/numpy/create-numpy-array-ones-zeros-random.png)

We can also use these functions to produce **multi-dimensional arrays** by passing a tuple describing the shape of the array we want to create:

![](http://jalammar.github.io/images/numpy/numpy-matrix-ones-zeros-random.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array-creation.png)

In [55]:
a = np.zeros((2,2))  # Create an array of all zeros
print(a)

[[0. 0.]
 [0. 0.]]


In [56]:
b = np.ones((1,2))   # Create an array of all ones
print(b)

[[1. 1.]]


In [57]:
c = np.full((2,2), 7) # Create a constant array
print(c)

[[7 7]
 [7 7]]


In [9]:
d = np.eye(2)        # Create a 2x2 identity matrix
print(d)
print(d.dtype)

[[1. 0.]
 [0. 1.]]
float64


In [59]:
e = np.random.random((2,2)) # Create an array filled with random values
print(e)

[[0.5607279  0.28773219]
 [0.2657614  0.84505912]]
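
The values above change on every run. When reproducible output matters, one option is NumPy's `Generator` interface with a fixed seed. This is a minimal sketch; the seed value 42 is arbitrary and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary
e = rng.random((2, 2))                # uniform floats in [0, 1)

# Re-creating the generator with the same seed reproduces the values.
e_again = np.random.default_rng(seed=42).random((2, 2))

print(e)
print(np.array_equal(e, e_again))
```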


### `vstack()` and `hstack()`

Sometimes we want to construct an array from existing arrays by “stacking” them, either vertically or horizontally. `vstack()` stacks arrays row-wise and `hstack()` stacks them column-wise. `row_stack` is an alias for `vstack` (deprecated since NumPy 2.0). `column_stack` behaves like `hstack` for 2-D inputs, but it turns 1-D inputs into columns instead of joining them end to end.


In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.vstack((a,b))

In [62]:
a = np.array([[7], [8], [9]])
b = np.array([[4], [5], [6]])
np.hstack((a,b))

array([[7, 4],
       [8, 5],
       [9, 6]])
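
The difference between the stacking functions is easiest to see with 1-D inputs. The following sketch (result names `v`, `h`, and `c` are illustrative) applies each function to the same pair of vectors and prints the resulting shapes.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))        # each input becomes a row
h = np.hstack((a, b))        # inputs are joined end to end
c = np.column_stack((a, b))  # each input becomes a column

print(v.shape, h.shape, c.shape)
```

Here `v` has shape `(2, 3)`, `h` has shape `(6,)`, and `c` has shape `(3, 2)`.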

### Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [64]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

int64 float64 int64


You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).
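
As a small sketch of the optional `dtype` argument mentioned above, the example below passes it to a constructor and also converts an existing array with `astype`. The variable names are illustrative. Note that converting floats to integers drops the fractional part instead of rounding.

```python
import numpy as np

zeros_int = np.zeros((2, 2), dtype=np.int32)  # constructors accept dtype
floats = np.array([1.7, -2.2, 3.9])
as_ints = floats.astype(np.int64)             # truncates toward zero

print(zeros_int.dtype)  # int32
print(as_ints)          # values 1, -2, 3
```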

## Array math

What makes working with `numpy` so powerful and convenient is that it comes with many *vectorized* math functions for computation over elements of an array. These functions are highly optimized and are *very* fast - much, much faster than using an explicit `for` loop.

For example, let’s create a large array of random values and then sum it both ways. We’ll use the `%%time` *cell magic* to time each approach.

**The timings show how well optimized NumPy is.**

In [66]:
a = np.random.random(100000000)

In [67]:
%%time
x = np.sum(a)

CPU times: user 34.4 ms, sys: 292 µs, total: 34.7 ms
Wall time: 33.4 ms


In [68]:
%%time
x = 0
for element in a:
  x = x + element

CPU times: user 9.94 s, sys: 0 ns, total: 9.94 s
Wall time: 9.94 s


Look at the “Wall time” in each output: the vectorized sum takes about 33 ms, while the explicit loop takes nearly 10 s, roughly 300 times longer. This type of fast computation is a major enabler of machine learning, which requires a *lot* of computation.

Whenever possible, we will try to use these vectorized operations.
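
The `%%time` magic only works inside Jupyter/IPython. Outside a notebook, a comparable measurement can be sketched with `time.perf_counter`. The array here is smaller than in the cells above so the loop finishes quickly; the variable names are illustrative, and the actual timings depend on your machine.

```python
import time
import numpy as np

a = np.random.random(1_000_000)

start = time.perf_counter()
vectorized_total = np.sum(a)
vectorized_seconds = time.perf_counter() - start

start = time.perf_counter()
loop_total = 0.0
for element in a:
    loop_total += element
loop_seconds = time.perf_counter() - start

print(f"np.sum: {vectorized_seconds:.4f} s, loop: {loop_seconds:.4f} s")
print(np.isclose(vectorized_total, loop_total))  # both approaches agree
```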

In [71]:
# Add

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the same array
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


In [72]:
# Elementwise difference; both produce the same array
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [73]:
# Elementwise product; both produce the same array
print(x * y)
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]


In [74]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [75]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]


![](http://jalammar.github.io/images/numpy/numpy-array-subtract-multiply-divide.png)

### Dot product

In [76]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

219
219


You can also use the `@` operator, which gives the same result as `dot` for 1-D and 2-D arrays.

In [77]:
# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)

[29 67]
[29 67]
[29 67]
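
The matrices `x` and `y` can also be multiplied with each other. As a short sketch, the example below contrasts the matrix product (`@`) with the elementwise product (`*`) shown earlier, since the two are easy to confuse.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print(x @ y)  # matrix product: rows of x combined with columns of y
print(x * y)  # elementwise product: matching entries multiplied
```

The matrix product is `[[19, 22], [43, 50]]`, while the elementwise product is `[[5, 12], [21, 32]]`.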


# Coming up next...

### Other array methods
### Transpose, Reshape
### Broadcasting
### Random Numbers with NumPy
### Linear Algebra with NumPy