# **Intro to NumPy**

## **Quick Overview**
NumPy (Numerical Python) is a fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently.

## **In-Depth Explanation**

### What is NumPy?
NumPy is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on them.

Key features of NumPy:
1. N-dimensional array object
2. Sophisticated (broadcasting) functions
3. Tools for integrating C/C++ and Fortran code
4. Useful linear algebra, Fourier transform, and random number capabilities

### **Why use NumPy?**
- Speed: NumPy uses optimized C code, making operations on large arrays much faster than pure Python.
- Memory efficiency: NumPy arrays use less memory and provide more convenient mechanisms for reading/writing items to disk.
- Convenience: NumPy offers many built-in functions and operations that simplify code and make it more readable.

In [83]:
# All imports
import numpy as np
import sys
import time

### **Python Lists vs NumPy Arrays**

In [84]:
# Python list
my_list = [1, 2, 3, 4]

# List element-wise addition requires explicit looping
result = [x + y for x, y in zip(my_list, my_list)]
print(result)  # Output: [2, 4, 6, 8]

[2, 4, 6, 8]


- **Dynamic and flexible:** Lists in Python are versatile and can store mixed data types (e.g., integers, strings, floats) in a single list. This makes lists highly flexible.
- **Slow for numerical computations:** Lists are slow when it comes to numerical computations because of the dynamic typing and memory overhead associated with each element in the list. Python lists are essentially arrays of pointers to objects, so accessing and manipulating these objects has significant overhead.
- **Element-wise operations:** Element-wise operations on lists require explicit loops. For example, if you want to add two lists element by element, you need a loop or list comprehension.

In [85]:
# Make sure NumPy is installed first, e.g. with `pip install numpy`
import numpy as np

# NumPy array
my_array = np.array([1, 2, 3, 4])

# Element-wise addition is performed without loops
result = my_array + my_array
print(result)  # Output: [2 4 6 8]

[2 4 6 8]


- **Homogeneous data types:** Unlike Python lists, NumPy arrays require that all elements have the same data type. This makes them more memory-efficient and faster for numerical computations.
- **Efficient memory usage:** NumPy arrays use contiguous blocks of memory, which allows for faster access and manipulation. Additionally, they support vectorized operations, meaning element-wise operations can be performed without explicit loops.
- **Optimized for numerical tasks:** NumPy arrays are specifically designed for numerical operations, such as matrix manipulations, linear algebra, and more.
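
To make the homogeneous-type point concrete, here is a minimal sketch of what happens when mixed numeric values are passed to `np.array`. The variable names are illustrative only.

```python
import numpy as np

# Mixing integers and a float: NumPy picks one common type for all elements
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)  # float64

# A Python list keeps each element's own type
mixed_list = [1, 2, 3.5]
print([type(x).__name__ for x in mixed_list])  # ['int', 'int', 'float']
```

The integers were converted to floats so that every element of the array shares a single data type.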

### **Data types in NumPy**

A more in-depth explanation and the full list of data types are available here: https://numpy.org/devdocs/user/basics.types.html

In [86]:
# integer types
print(np.int8(42)) # 8-bit integer (-128 to 127)
print(np.int16(42)) # 16-bit integer (-32768 to 32767)
print(np.int32(42)) # 32-bit integer (-2147483648 to 2147483647)
print(np.int64(42)) # 64-bit integer (-9223372036854775808 to 9223372036854775807); see also https://stackoverflow.com/questions/49762240/what-is-max-size-of-the-file-in-64-bit-system-using-numpy-memory-mapping

42
42
42
42


In [87]:
# unsigned integer types are basically the same as signed integer types, but they can only store non-negative values
print(np.uint8(42)) # 8-bit unsigned integer (0 to 255)
print(np.uint16(42)) # 16-bit unsigned integer (0 to 65535)
print(np.uint32(42)) # 32-bit unsigned integer (0 to 4294967295)
print(np.uint64(42)) # 64-bit unsigned integer (0 to 18446744073709551615)

42
42
42
42


In [88]:
# floating-point types store fractional values; more bits give more precision and a wider range
print(np.float16(42.0)) # 16-bit floating-point number
print(np.float32(42.0)) # 32-bit floating-point number
print(np.float64(42.0)) # 64-bit floating-point number

42.0
42.0
42.0


In [89]:
# complex types are complex numbers, which have a real part and an imaginary part
print(np.complex64(42 + 42j)) # 64-bit complex number (two 32-bit floats)
print(np.complex128(42 + 42j)) # 128-bit complex number (two 64-bit floats)

(42+42j)
(42+42j)


In [90]:
# boolean types are either True or False
print(np.bool_(True)) # Boolean

True
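
The ranges quoted in the comments above do not need to be memorized. As a small sketch, `np.iinfo` and `np.finfo` report the limits of a type, and the last lines show what happens when a value leaves the range. The variable names are illustrative.

```python
import numpy as np

# np.iinfo reports the representable range of an integer type
for int_type in (np.int8, np.int16, np.int32, np.int64, np.uint8):
    info = np.iinfo(int_type)
    print(int_type.__name__, info.min, info.max)

# np.finfo does the same for floating-point types
print(np.finfo(np.float32).max)
print(np.finfo(np.float64).eps)  # gap between 1.0 and the next larger float

# Array arithmetic that leaves the range wraps around silently
small = np.array([127], dtype=np.int8)
wrapped = small + np.int8(1)
print(wrapped)  # the single element is now -128
```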


In [91]:
# Choose an appropriate data type: it affects both memory usage and performance

# Python list multiplication
my_list = list(range(1000000))
start_time = time.time()
result = [x * 2 for x in my_list]
print("Python list time:", time.time() - start_time)

# NumPy array multiplication
my_array = np.arange(1000000)
start_time = time.time()
result = my_array * 2
print("NumPy array time:", time.time() - start_time)

# NumPy arrays are more memory efficient than Python lists

# Python list
# Note: sys.getsizeof reports only the list's own pointer array, not the int objects it points to
my_list = list(range(1000))
print(sys.getsizeof(my_list))

# NumPy array
my_array = np.arange(1000)
print(my_array.itemsize * my_array.size)

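# Comparing numpy data types
# Caveats: a single time.time() measurement on 1000 elements is dominated by noise,
# and int8 only holds -128 to 127, so it cannot represent 0..999 faithfully.

# numpy.int8
array_int8 = np.arange(1000, dtype=np.int8)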
start_time = time.time()
result = array_int8 * 2
print("numpy.int8 time:", time.time() - start_time)

# numpy.int64
array_int64 = np.arange(1000, dtype=np.int64)
start_time = time.time()
result = array_int64 * 2
print("numpy.int64 time:", time.time() - start_time)




Python list time: 0.01702404022216797
NumPy array time: 0.0049860477447509766
8056
8000
numpy.int8 time: 0.0002002716064453125
numpy.int64 time: 1.5735626220703125e-05
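
Single `time.time()` measurements on short operations are noisy, and `sys.getsizeof` on a list ignores the integer objects the list points to. The following is a rough sketch of a more careful comparison, assuming the standard-library `timeit` module and an arbitrary array size of 100,000; the absolute numbers will differ from machine to machine.

```python
import sys
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# timeit runs each statement several times and returns the total time
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
array_time = timeit.timeit(lambda: np_array * 2, number=20)
print("list:", list_time, "array:", array_time)

# Memory: the list's pointer array plus every int object it refers to
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_array.nbytes  # data buffer only: 8 bytes per int64 element
print("list bytes:", list_bytes, "array bytes:", array_bytes)
```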


### **Multidimensionality**

NumPy arrays can have multiple dimensions. For example, a 1D array is like a list, a 2D array is like a matrix, and a 3D array is like a cube.

- **1D Array** (Vector): A sequence of numbers.
  - **Example:** [1, 2, 3, 4] (shape: (4,))
- **2D Array** (Matrix): A grid of numbers, where each row is a list.
  - **Example:** [[1, 2], [3, 4]] (shape: (2, 2))
- **3D Array**: A cube of numbers (stack of matrices).
  - **Example:** [[[1, 2], [3, 4]], [[5, 6], [7, 8]]] (shape: (2, 2, 2))
- **N-Dimensional Array**: An array with any number of dimensions.

Each NumPy array has an attribute called `.shape`, which tells you the size of each dimension of the array. For example, a 2D array might have a shape (3, 4), indicating that the array has 3 rows and 4 columns.
- A 1D array with shape (n,) has n elements.
- A 2D array with shape (m, n) has m rows and n columns.
- A 3D array with shape (p, m, n) can be viewed as p matrices of shape (m, n).


In [92]:
# Creating a 1D array
arr_1d = np.array([1, 2, 3, 4])
print(arr_1d)
print("Shape:", arr_1d.shape)  # Output: (4,)

[1 2 3 4]
Shape: (4,)


In [93]:
# Creating a 2D array
arr_2d = np.array([[1, 2], [3, 4]])
print(arr_2d)
print("Shape:", arr_2d.shape)  # Output: (2, 2)

[[1 2]
 [3 4]]
Shape: (2, 2)


In [94]:
# Creating a 3D array
arr_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr_3d)
print("Shape:", arr_3d.shape)  # Output: (2, 2, 2)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Shape: (2, 2, 2)


In [95]:
# Playing with dimensions
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr)
print("Shape:", arr.shape)  # Output: (8,)
arr_2d = arr.reshape(2, 4)
print(arr_2d)
print("Shape:", arr_2d.shape)  # Output: (2, 4)
arr_3d = arr.reshape(2, 2, 2)
print(arr_3d)
print("Shape:", arr_3d.shape)  # Output: (2, 2, 2)

[1 2 3 4 5 6 7 8]
Shape: (8,)
[[1 2 3 4]
 [5 6 7 8]]
Shape: (2, 4)
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Shape: (2, 2, 2)
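
Two details of `reshape` are worth a short sketch: one dimension can be given as `-1` so NumPy infers it, and for a contiguous array like the one here the result is a view that shares data with the original. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(1, 9)  # eight elements: 1 through 8

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(2, -1)
print(inferred.shape)  # (2, 4)

# For this contiguous array, reshape returns a view on the same data
print(np.shares_memory(arr, inferred))  # True
inferred[0, 0] = 100
print(arr[0])  # 100

# A shape whose size does not match the element count raises ValueError
try:
    arr.reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```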


### **Array Creation in NumPy**

In [96]:
arr = np.arange(0, 10, 2) # Start at 0, stop at 10, step by 2
print(arr) # Output: [0 2 4 6 8]

[0 2 4 6 8]


In [97]:
arr = np.linspace(0, 1, 5) # Start at 0, stop at 1, with 5 elements
print(arr)  # Output: [0.   0.25 0.5  0.75 1.  ]

[0.   0.25 0.5  0.75 1.  ]


In [98]:
arr = np.zeros((2, 3)) # Create a 2x3 array of zeros
print(arr)
# Output:
# [[0. 0. 0.]
#  [0. 0. 0.]]

[[0. 0. 0.]
 [0. 0. 0.]]


In [99]:
arr = np.ones((3, 2)) # Create a 3x2 array of ones
print(arr)
# Output:
# [[1. 1.]
#  [1. 1.]
#  [1. 1.]]


[[1. 1.]
 [1. 1.]
 [1. 1.]]


In [100]:
arr = np.eye(3) # Create a 3x3 identity matrix
print(arr)
# Output:
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [101]:
arr = np.random.rand(2, 3) # Create a 2x3 array of random numbers between 0 and 1
print(arr)
# Example output (values differ on every run):
# [[0.61446471 0.78552799 0.14013453]
#  [0.9239703  0.30630796 0.01995603]]

[[0.61675904 0.11044084 0.08303648]
 [0.69879938 0.5603615  0.37850267]]


In [102]:
arr = np.random.randn(2, 3) # Create a 2x3 array of random numbers from a standard normal distribution (mean 0, variance 1)
print(arr)
# Example output (values differ on every run):
# [[-0.76953065 -0.21063996  0.01985051]
#  [ 0.40232155 -1.54520434 -1.02461677]]

[[ 0.26473824 -0.41598733  0.54976851]
 [-0.27537448 -0.08428783 -0.53081326]]
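
Because the random values change on every run, results are hard to reproduce. One option, sketched here, is NumPy's `Generator` interface created with `np.random.default_rng` and a fixed seed. The seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # 42 is an arbitrary illustrative seed

uniform = rng.random((2, 3))            # uniform on [0, 1)
normal = rng.standard_normal((2, 3))    # mean 0, variance 1
integers = rng.integers(0, 10, size=5)  # integers from 0 to 9

# Re-creating the generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((2, 3))))  # True
```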


### **Array Attributes**

In [103]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # Output: (2, 3) which represents 2 rows and 3 columns

(2, 3)


In [104]:
arr = np.array([1, 2, 3])
print(arr.ndim)  # Output: 1 which represents 1 dimension
arr_2d = np.array([[1, 2], [3, 4]])
print(arr_2d.ndim)  # Output: 2 which represents 2 dimensions

1
2


In [105]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.size)  # Output: 6 which represents the total number of elements in the array

6


In [106]:
arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int64 on most platforms (int32 on Windows with NumPy versions before 2.0); the data type of the elements

int64


In [107]:
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.itemsize)  # Output: 4 which represents the size in bytes of each element in the array

4


In [108]:
arr = np.array([1.5, 2.5, 3.5])
arr_int = arr.astype(int)
print(arr_int)  # Output: [1 2 3]; astype(int) drops the fractional part (truncates toward zero) rather than rounding

[1 2 3]
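
Since `astype(int)` truncates, a rounding step is needed first when nearest-integer values are wanted. A minimal sketch, using `np.rint` as one possible rounding function:

```python
import numpy as np

arr = np.array([1.5, 2.5, -1.5])

# astype(int) drops the fractional part: values become 1, 2, -1
truncated = arr.astype(int)
print(truncated)

# np.rint rounds to the nearest integer, with halves going to the even neighbour:
# values become 2, 2, -2 (still stored as floats until converted)
rounded = np.rint(arr).astype(int)
print(rounded)
```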


### **Slicing & Masking**

In [109]:
# NumPy array slicing
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr[0])  # Output: 1
print(arr[2:6])  # Output: [3 4 5 6]
print(arr[2:])  # Output: [3 4 5 6 7 8]
print(arr[:6])  # Output: [1 2 3 4 5 6]
print(arr[:])  # Output: [1 2 3 4 5 6 7 8]
print(arr[-1])  # Output: 8
print(arr[-3])  # Output: 6
print(arr[-3:])  # Output: [6 7 8]
print(arr[:-2])  # Output: [1 2 3 4 5 6]


1
[3 4 5 6]
[3 4 5 6 7 8]
[1 2 3 4 5 6]
[1 2 3 4 5 6 7 8]
8
6
[6 7 8]
[1 2 3 4 5 6]


In [110]:
# NumPy 2D array slicing
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr[0, 0])  # Output: 1
print(arr[0, 2])  # Output: 3
print(arr[1, 0])  # Output: 5
print(arr[1, 2])  # Output: 7
print(arr[:, 0])  # Output: [1 5]
print(arr[:, 2])  # Output: [3 7]
print(arr[:, :2])  # Output: [[1 2] [5 6]]
print(arr[:, 2:])  # Output: [[3 4] [7 8]]
print(arr[0, :])  # Output: [1 2 3 4]
print(arr[1, :])  # Output: [5 6 7 8]

1
3
5
7
[1 5]
[3 7]
[[1 2]
 [5 6]]
[[3 4]
 [7 8]]
[1 2 3 4]
[5 6 7 8]


In [111]:
# NumPy 3D array slicing
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr[0, 0, 0])  # Output: 1
print(arr[1, 0, 0])  # Output: 5
print(arr[:, 0, 0])  # Output: [1 5]
print(arr[:, 1, 0])  # Output: [3 7]
print(arr[:, 0, :])  # Output: [[1 2] [5 6]]
print(arr[:, 1, :])  # Output: [[3 4] [7 8]]
print(arr[0, :, :])  # Output: [[1 2] [3 4]]

1
5
[1 5]
[3 7]
[[1 2]
 [5 6]]
[[3 4]
 [7 8]]
[[1 2]
 [3 4]]
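
One behaviour that differs from Python lists: basic slicing returns a view on the original data, not a copy. The sketch below shows the effect and how `.copy()` avoids it. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Basic slicing returns a view, so writing to it changes the original
view = arr[2:6]
view[0] = 99
print(arr[2])  # 99

# .copy() gives independent data
independent = arr[2:6].copy()
independent[1] = -1
print(arr[3])  # 4
```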


In [112]:
# NumPy array masking
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
mask = arr > 4
print(mask)  # Output: [False False False False  True  True  True  True]
print(arr[mask])  # Output: [5 6 7 8]


[False False False False  True  True  True  True]
[5 6 7 8]


In [113]:
# NumPy 2D array masking
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
mask = arr > 4
print(mask)
# Output:
# [[False False False False]
#  [ True  True  True  True]]
print(arr[mask])  # Output: [5 6 7 8]

[[False False False False]
 [ True  True  True  True]]
[5 6 7 8]


In [114]:
# Using masking to modify values
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
mask = arr > 4
arr[mask] = 0
print(arr)  # Output: [1 2 3 4 0 0 0 0]

[1 2 3 4 0 0 0 0]
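
Masks can also be combined, and `np.where` offers a way to replace values without modifying the original array. A short sketch under the same example data:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required
even_and_large = (arr > 2) & (arr % 2 == 0)
print(arr[even_and_large])  # [4 6 8]

# np.where builds a new array and leaves arr unchanged
replaced = np.where(arr > 4, 0, arr)
print(replaced)  # [1 2 3 4 0 0 0 0]
print(arr)       # [1 2 3 4 5 6 7 8]
```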


### **Array Operations**

In [115]:
# Element-wise Arithmetic Operations
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Element-wise addition
result = arr1 + arr2
print(result)  # Output: [ 6  8 10 12]

# Element-wise subtraction
result = arr1 - arr2
print(result)  # Output: [-4 -4 -4 -4]

# Element-wise multiplication
result = arr1 * arr2
print(result)  # Output: [ 5 12 21 32]

# Element-wise division
result = arr1 / arr2
print(result)  # Output: [0.2        0.33333333 0.42857143 0.5       ]

# Element-wise exponentiation
result = arr1 ** arr2
print(result)  # Output: [    1    64  2187 65536]


[ 6  8 10 12]
[-4 -4 -4 -4]
[ 5 12 21 32]
[0.2        0.33333333 0.42857143 0.5       ]
[    1    64  2187 65536]


In [116]:
# Array-wide Arithmetic Operations
arr = np.array([[1, 2], [3, 4]])

# Sum of all elements
result = arr.sum()
print(result)  # Output: 10

# Sum down each column (axis=0)
result = arr.sum(axis=0)
print(result)  # Output: [4 6]

# Sum across each row (axis=1)
result = np.sum(arr, axis=1)
print(result)  # Output: [3 7]

# Minimum value
result = arr.min()
print(result)  # Output: 1

# Maximum value
result = arr.max()
print(result)  # Output: 4

# Mean
result = arr.mean()
print(result)  # Output: 2.5

# Mean of each column (axis=0)
result = arr.mean(axis=0)
print(result)  # Output: [2. 3.]

# Mean of each row (axis=1)
result = np.mean(arr, axis=1)
print(result)  # Output: [1.5 3.5]

# Standard deviation
result = arr.std()
print(result)  # Output: 1.118033988749895

10
[4 6]
[3 7]
1
4
2.5
[2. 3.]
[1.5 3.5]
1.118033988749895
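
Reducing along an axis removes that axis from the result. When the result should be combined with the original array again, the `keepdims=True` argument can help, as in this small sketch that turns each row into fractions of its row total:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

row_sums = arr.sum(axis=1)                      # shape (2,)
row_sums_kept = arr.sum(axis=1, keepdims=True)  # shape (2, 1)
print(row_sums.shape, row_sums_kept.shape)

# With keepdims=True the sums line up with the rows of arr
row_fractions = arr / row_sums_kept
print(row_fractions.sum(axis=1))  # each row now sums to 1 (up to rounding)
```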


In [117]:
# Universal Functions
arr = np.array([1, 2, 3, 4])

# Square root
result = np.sqrt(arr)
print(result)  # Output: [1.         1.41421356 1.73205081 2.        ]

# Exponential
result = np.exp(arr)
print(result)  # Output: [ 2.71828183  7.3890561  20.08553692 54.59815003]

# Logarithm
result = np.log(arr)
print(result)  # Output: [0.         0.69314718 1.09861229 1.38629436]

# Trigonometric functions
result = np.sin(arr)
print(result)  # Output: [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]

result = np.cos(arr)
print(result)  # Output: [ 0.54030231 -0.41614684 -0.9899925  -0.65364362]

result = np.tan(arr)
print(result)  # Output: [ 1.55740772 -2.18503986 -0.14254654  1.15782128]

# Linear Algebra
arr1 = np.array([[1, 2], [3, 4]])

# Transpose
result = arr1.T
print(result)
# Output:
# [[1 3]
#  [2 4]]

# Matrix multiplication
result = np.dot(arr1, arr1)
print(result)

# Output:
# [[ 7 10]
#  [15 22]]

# Inverse
result = np.linalg.inv(arr1)
print(result)

# Output:
# [[-2.   1. ]
#  [ 1.5 -0.5]]

# Determinant
result = np.linalg.det(arr1)
print(result)  # Output: -2.0000000000000004

[1.         1.41421356 1.73205081 2.        ]
[ 2.71828183  7.3890561  20.08553692 54.59815003]
[0.         0.69314718 1.09861229 1.38629436]
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362]
[ 1.55740772 -2.18503986 -0.14254654  1.15782128]
[[1 3]
 [2 4]]
[[ 7 10]
 [15 22]]
[[-2.   1. ]
 [ 1.5 -0.5]]
-2.0000000000000004
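
For 2D arrays, the `@` operator gives the same matrix product as `np.dot`, and `np.linalg.solve` solves a linear system without forming the inverse explicitly. A minimal sketch; the right-hand side `b` is made up for illustration:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([5, 6])  # illustrative right-hand side

# @ is the matrix-multiplication operator
product = A @ A
print(np.array_equal(product, np.dot(A, A)))  # True

# Multiplying a matrix by its inverse gives the identity, up to rounding
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))  # True

# Solve A x = b directly
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))  # True
```

Comparing floating-point results with `np.allclose` rather than `==` avoids false mismatches from rounding, such as the `-2.0000000000000004` determinant above.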


In [118]:
# Concatenation and Stacking
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Concatenate along rows
result = np.concatenate((arr1, arr2), axis=0)
print(result)

# Output:
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]

# Concatenate along columns
result = np.concatenate((arr1, arr2), axis=1)
print(result)

# Output:
# [[1 2 5 6]
#  [3 4 7 8]]

# Stack arrays vertically
result = np.vstack((arr1, arr2))
print(result)

# Output:
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]

# Stack arrays horizontally
result = np.hstack((arr1, arr2))
print(result)

# Output:
# [[1 2 5 6]
#  [3 4 7 8]]


[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
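
`np.concatenate`, `np.vstack`, and `np.hstack` join arrays along an existing axis. To join along a new axis instead, `np.stack` can be used, as sketched here:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Stack along a new leading axis: two (2, 2) matrices become one (2, 2, 2) array
stacked = np.stack((arr1, arr2))
print(stacked.shape)  # (2, 2, 2)
print(np.array_equal(stacked[0], arr1))  # True

# Stack along a new trailing axis: matching elements are paired up
paired = np.stack((arr1, arr2), axis=-1)
print(paired[0, 0])  # [1 5]
```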


### **Broadcasting**

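When operating on two arrays, NumPy compares their shapes dimension by dimension, starting from the rightmost (trailing) dimension and working left.
- Two dimensions are compatible for broadcasting if:
  - They are equal, or
  - One of them is 1 (that dimension is stretched to match the other).
- If one array has fewer dimensions, its shape is treated as if it were padded with 1s on the left.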
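
The compatibility rule can be checked without building any arrays. This sketch assumes `np.broadcast_shapes`, which is available in NumPy 1.20 and later; the shapes are arbitrary examples.

```python
import numpy as np

# Trailing dimensions match (3 and 3); the missing leading dimension is treated as 1
print(np.broadcast_shapes((2, 3), (3,)))  # (2, 3)

# Each size-1 dimension is stretched to match the other shape
print(np.broadcast_shapes((2, 1), (1, 3)))  # (2, 3)
print(np.broadcast_shapes((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)

# Trailing dimensions 3 and 2 differ and neither is 1: not compatible
try:
    np.broadcast_shapes((2, 3), (2,))
except ValueError as err:
    print("not compatible:", err)
```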

In [119]:
# Scalar and Array Broadcasting
arr = np.array([[1, 2], [3, 4]])
scalar = 2

# Add scalar to array
result = arr + scalar
print(result)

[[3 4]
 [5 6]]


In [120]:
# Array and Vector Broadcasting
arr = np.array([[1, 2], [3, 4]])
vector = np.array([5, 6])

# Add vector to array
result = arr + vector
print(result)

[[ 6  8]
 [ 8 10]]


In [121]:
# Broadcasting across multiple dimensions
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5], [6]])

# Add two arrays
result = arr1 + arr2
print(result)

# Multiplication
result = arr1 * arr2
print(result)

[[ 6  7]
 [ 9 10]]
[[ 5 10]
 [18 24]]


In [122]:
arr_1d = np.array([1, 2, 3])

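# Reshape the 1D array to shape (1, 3) to make its row orientation explicit
reshaped_arr = arr_1d.reshape((1, 3))

matrix_2d = np.array([[10, 20, 30], [40, 50, 60]])

# Broadcasting stretches the (1, 3) row across both rows of the (2, 3) matrix
# (the original shape (3,) would broadcast the same way)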
result = matrix_2d + reshaped_arr
print(result)


[[11 22 33]
 [41 52 63]]
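
Reshaping becomes necessary when the 1D array should line up with the rows instead of the columns. A minimal sketch, with made-up per-row offsets, of the failure and one way to fix it using `np.newaxis`:

```python
import numpy as np

matrix = np.array([[10, 20, 30], [40, 50, 60]])  # shape (2, 3)
per_row = np.array([100, 200])                   # shape (2,), one value per row

# Shapes (2, 3) and (2,) are not compatible: trailing dimensions are 3 and 2
try:
    matrix + per_row
except ValueError as err:
    print("cannot broadcast:", err)

# Adding a trailing axis gives shape (2, 1), which stretches across the columns
column = per_row[:, np.newaxis]
result = matrix + column
print(result)
```

Here the first row is shifted by 100 and the second by 200.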


In [123]:
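def moving_average(arr, n=3):
    """Calculate the moving average of an array over a window of n elements."""
    # Running total: ret[i] = arr[0] + ... + arr[i]
    ret = np.cumsum(arr, dtype=float)
    # Subtract the total from n positions earlier, leaving the sum of each n-element window
    ret[n:] = ret[n:] - ret[:-n]
    # The first complete window ends at index n - 1; divide the window sums by n
    return ret[n - 1:] / n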

signal = np.array([1, 2, 3, 4, 5, 6, 7])
print(moving_average(signal, n=3))

[2. 3. 4. 5. 6.]
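
For comparison, the same moving average can be sketched with `np.convolve`. This is an alternative formulation, not the method used above, and the function name `moving_average_convolve` is hypothetical.

```python
import numpy as np

def moving_average_convolve(arr, n=3):
    """Moving average via convolution with a uniform window of n weights."""
    window = np.ones(n) / n
    # mode="valid" keeps only positions where the window fits entirely inside arr
    return np.convolve(arr, window, mode="valid")

signal = np.array([1, 2, 3, 4, 5, 6, 7])
averaged = moving_average_convolve(signal, n=3)
print(averaged)  # approximately 2, 3, 4, 5, 6
```

The cumulative-sum version does a constant amount of work per element regardless of `n`, while the convolution version is arguably easier to read.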
