
# **NUMPY**

NumPy is a fundamental Python library for numerical computing and a staple of data science work. It provides large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions that operate on these arrays efficiently.



### Why is NumPy important?

*   **Efficiency**: NumPy's core is implemented largely in C, so array operations run in compiled code and are much faster than looping over standard Python lists.
*   **Powerful N-dimensional arrays**: It provides `ndarray` objects, which are fast and flexible containers for large datasets.
*   **Mathematical functions**: It comes with a vast collection of functions for performing mathematical operations on arrays, including linear algebra, Fourier transforms, and random number generation.
*   **Foundation for other libraries**: Many other data science libraries, like pandas and SciPy, are built on top of NumPy.

In [1]:
import numpy as np

# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5])
print("NumPy array:", arr)

# Perform a simple operation (e.g., add 10 to each element)
arr_plus_ten = arr + 10
print("Array + 10:", arr_plus_ten)

# Create a 2D array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D NumPy array (matrix):\n", matrix)

# Get the shape of the array
print("Shape of matrix:", matrix.shape)

# Perform element-wise multiplication
arr_multiplied = arr * 2
print("\nArray multiplied by 2:", arr_multiplied)

NumPy array: [1 2 3 4 5]
Array + 10: [11 12 13 14 15]

2D NumPy array (matrix):
 [[1 2 3]
 [4 5 6]]
Shape of matrix: (2, 3)

Array multiplied by 2: [ 2  4  6  8 10]


In [2]:
baseball = [100, 320, 500, 600, 400, 320, 930]

# A plain Python list to start with
print(type(baseball))

# Convert the list to a NumPy array
np_baseball = np.array(baseball)

print(np_baseball)

print(type(np_baseball))

<class 'list'>
[100 320 500 600 400 320 930]
<class 'numpy.ndarray'>


In [3]:
# Install NumPy into the current kernel's environment if it is missing
%pip install numpy



In [4]:
import numpy as np


In [5]:
print(np.__version__)

2.0.2


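For numerical work, a NumPy array is usually a better choice than a Python list: operations on it are faster, and it supports element-wise arithmetic and many mathematical functions that lists do not.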
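To make the difference concrete, here is a minimal sketch contrasting the `*` operator on a list and on an array. The names `values` and `values_arr` are made up for this illustration.

```python
import numpy as np

values = [1, 2, 3]               # hypothetical sample data
values_arr = np.array(values)

# On a list, * repeats the sequence
doubled_list = values * 2
print(doubled_list)              # [1, 2, 3, 1, 2, 3]

# On an array, * multiplies each element
doubled_arr = values_arr * 2
print(doubled_arr)               # [2 4 6]

# The list equivalent needs an explicit loop or comprehension
doubled_by_loop = [v * 2 for v in values]
print(doubled_by_loop)           # [2, 4, 6]
```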

In [6]:
array = np.array([1, 2, 3, 4, 5])
array = array * 2
print(array)

[ 2  4  6  8 10]


In [7]:
print(type(array))

<class 'numpy.ndarray'>


In [8]:
# Multi-dimensional Arrays using NumPy

array = np.array([[1, 2, 3], [4, 5, 6]]) # 2-dimensional array
print(array)
print(type(array))
print(array.ndim)
print(array.shape)

[[1 2 3]
 [4 5 6]]
<class 'numpy.ndarray'>
2
(2, 3)


In [9]:
# 2-dimensional array of strings

students = np.array([["Utibe", "Bassey", "Inyang", "Henry", "Effiong"],
                     ["Jeremiah", "Jerry", "Hopex", "Emmanuel", "John"]])
print(students)
print(type(students))
print(students.ndim)
print(students.shape)


[['Utibe' 'Bassey' 'Inyang' 'Henry' 'Effiong']
 ['Jeremiah' 'Jerry' 'Hopex' 'Emmanuel' 'John']]
<class 'numpy.ndarray'>
2
(2, 5)


## Slicing in NumPy

Slicing in NumPy is a way to extract a portion of an array, much like slicing a Python list, but with extensions for multi-dimensional arrays. It allows you to select elements, rows, or columns based on their indices.

The basic syntax for slicing is `array[start:end:step]`:
- `start`: The starting index (inclusive). If omitted, it defaults to `0`.
- `end`: The ending index (exclusive). If omitted, it defaults to the end of the dimension.
- `step`: The step size between elements. If omitted, it defaults to `1`.

### 1D Array Slicing
For one-dimensional arrays, slicing works very similarly to Python lists.

### 2D Array Slicing (Matrices)
For two-dimensional arrays (matrices), you can specify slices for both rows and columns. The syntax becomes `array[row_start:row_end:row_step, col_start:col_end:col_step]`.

- To select specific rows, you specify the row slice.
- To select specific columns, you specify the column slice.
- You can also mix integer indexing with slicing, for example, to select a specific row and a slice of its columns, or a slice of rows and a specific column.
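One consequence of mixing integer indexing with slicing is that an integer index drops a dimension while a slice keeps it. The sketch below shows this on a small made-up array; `grid` is a hypothetical name used only here.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # hypothetical 3x4 array holding 0..11

row_as_1d = grid[1, :]      # integer index drops the row dimension
row_as_2d = grid[1:2, :]    # a slice keeps it, giving a 1x4 array
col_as_1d = grid[:, 2]      # integer index drops the column dimension

print(row_as_1d.shape)      # (4,)
print(row_as_2d.shape)      # (1, 4)
print(col_as_1d.shape)      # (3,)

# Steps work per dimension: every other row, columns 1 and 2
every_other = grid[::2, 1:3]
print(every_other)
```

Whether you want the 1D or the 2D result usually depends on what the next operation expects, so checking `.shape` after slicing is a useful habit.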

In [10]:
# 1D Array Slicing Examples
import numpy as np

arr_1d = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Original 1D array:", arr_1d)

# Slice from index 2 up to, but not including, index 6
print("arr_1d[2:6]:", arr_1d[2:6])

# Slice from beginning to index 5 (exclusive)
print("arr_1d[:5]:", arr_1d[:5])

# Slice from index 5 to end
print("arr_1d[5:]:", arr_1d[5:])

# Slice with a step of 2
print("arr_1d[::2]:", arr_1d[::2])

# Reverse the array
print("arr_1d[::-1]:", arr_1d[::-1])

# Select a single element (not slicing, but related)
print("arr_1d[3]:", arr_1d[3])

Original 1D array: [0 1 2 3 4 5 6 7 8 9]
arr_1d[2:6]: [2 3 4 5]
arr_1d[:5]: [0 1 2 3 4]
arr_1d[5:]: [5 6 7 8 9]
arr_1d[::2]: [0 2 4 6 8]
arr_1d[::-1]: [9 8 7 6 5 4 3 2 1 0]
arr_1d[3]: 3


In [11]:
# 2D Array Slicing Examples
import numpy as np

arr_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43]
])
print("Original 2D array:\n", arr_2d)

# Select the first two rows and all columns
print("\narr_2d[:2, :]:\n", arr_2d[:2, :])

# Select all rows and the last two columns
print("\narr_2d[:, -2:]:\n", arr_2d[:, -2:])

# Select rows from index 1 to 3 (exclusive) and columns from index 1 to 3 (exclusive)
print("\narr_2d[1:3, 1:3]:\n", arr_2d[1:3, 1:3])

# Select a single row (index 0) and all its columns
print("\narr_2d[0, :]:", arr_2d[0, :])

# Select all rows and a single column (index 2)
print("\narr_2d[:, 2]:", arr_2d[:, 2])

# Select a specific element (row 1, column 0)
print("\narr_2d[1, 0]:", arr_2d[1, 0])

Original 2D array:
 [[10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]

arr_2d[:2, :]:
 [[10 11 12 13]
 [20 21 22 23]]

arr_2d[:, -2:]:
 [[12 13]
 [22 23]
 [32 33]
 [42 43]]

arr_2d[1:3, 1:3]:
 [[21 22]
 [31 32]]

arr_2d[0, :]: [10 11 12 13]

arr_2d[:, 2]: [12 22 32 42]

arr_2d[1, 0]: 20


## Arithmetic Operations in NumPy

NumPy provides efficient and flexible ways to perform arithmetic operations on arrays. These operations are typically *element-wise*, meaning they are applied to corresponding elements of arrays. NumPy also features a powerful mechanism called *broadcasting*, which allows arithmetic operations between arrays of different shapes under certain conditions.

### 1. Basic Element-wise Operations
Standard arithmetic operators (`+`, `-`, `*`, `/`, `**`, `%`) are applied element by element.

**Example:** Adding two arrays, or adding a scalar to an array.

### 2. Broadcasting
Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that they have compatible shapes.

**Rules for Broadcasting:**
1. If the arrays don't have the same number of dimensions, the shape of the array with fewer dimensions is padded with ones on its left side.
2. If the two shapes differ in a dimension, the array whose size is 1 in that dimension is stretched to match the other.
3. If the sizes differ in any dimension and neither is 1, an error is raised.
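The three rules can be checked directly. The following is a small sketch, assuming a NumPy version that provides `np.broadcast_shapes` (1.20 or later); the array names are made up for illustration.

```python
import numpy as np

# Rules 1 and 2: (3,) is treated as (1, 3), then stretched to (4, 3)
padded_result = np.broadcast_shapes((4, 3), (3,))
print(padded_result)                 # (4, 3)

# Rule 2 on both operands: (4, 1) and (1, 3) both stretch to (4, 3)
col = np.arange(4).reshape(4, 1)
row = np.arange(3).reshape(1, 3)
print((col + row).shape)             # (4, 3)

# Rule 3: trailing sizes 3 and 4 differ and neither is 1
try:
    np.ones((4, 3)) + np.ones(4)
    failed = False
except ValueError as exc:
    failed = True
    print("Broadcast error:", exc)
```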

### 3. Universal Functions (ufuncs)
NumPy also provides a suite of mathematical functions called "universal functions" (ufuncs) that operate element-wise on arrays. These functions are highly optimized and faster than performing the same operations using Python loops. Examples include `np.add`, `np.subtract`, `np.multiply`, `np.divide`, `np.sqrt`, `np.exp`, `np.sin`, `np.cos`, etc.
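As a brief sketch of what "element-wise" means here, the snippet below compares a ufunc call with the equivalent operator and with an explicit Python loop. The arrays `x` and `y` are made-up sample values.

```python
import math
import numpy as np

x = np.array([1.0, 4.0, 9.0])       # hypothetical sample values
y = np.array([10.0, 20.0, 30.0])

# For arrays, x + y gives the same result as np.add(x, y)
same_sum = np.array_equal(np.add(x, y), x + y)
print(same_sum)                     # True

# One ufunc call replaces an explicit Python loop
roots_loop = [math.sqrt(v) for v in x]
roots_ufunc = np.sqrt(x)
print(roots_ufunc)                  # [1. 2. 3.]
print(np.allclose(roots_loop, roots_ufunc))   # True
```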

Worked examples of all three appear in the arithmetic code cell that follows the aggregate functions section.

## Aggregate Functions in NumPy

Aggregate functions in NumPy are operations that compute a single summary value (or a smaller array of summary values) from an array. These functions are highly optimized and are much faster than performing similar operations with standard Python loops, especially for large datasets. They are fundamental for data analysis and statistical computations.

### How they work:
Aggregate functions typically take an array as input and return a single number representing a summary of the array's data (e.g., sum of all elements, maximum value, average). For multi-dimensional arrays, you can specify an `axis` along which to perform the aggregation.

### Common Aggregate Functions:
*   `np.sum()`: Computes the sum of all elements in the array.
*   `np.min()`: Finds the minimum value in the array.
*   `np.max()`: Finds the maximum value in the array.
*   `np.mean()`: Calculates the arithmetic mean (average) of the array elements.
*   `np.median()`: Computes the median of the array elements.
*   `np.std()`: Computes the standard deviation of the array elements.
*   `np.var()`: Computes the variance of the array elements.
*   `np.prod()`: Computes the product of all elements in the array.
*   `np.argmin()`: Returns the index of the minimum value.
*   `np.argmax()`: Returns the index of the maximum value.

### The `axis` parameter:
For multi-dimensional arrays, the `axis` parameter specifies along which dimension the aggregation should be performed:
*   `axis=0`: Aggregates down the rows, collapsing the row dimension and producing one result per column.
*   `axis=1`: Aggregates across the columns, collapsing the column dimension and producing one result per row.
*   If `axis` is not specified (or `None`), the operation is performed on the flattened array, resulting in a single scalar value.
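A non-square array makes the effect of `axis` easier to see, because the result shapes differ. This is a minimal sketch with made-up data (`scores` is a hypothetical name); it also exercises `np.median` and `np.var` from the list above.

```python
import numpy as np

scores = np.array([
    [1, 2, 3],
    [4, 5, 6],
])  # hypothetical data with shape (2, 3): 2 rows, 3 columns

# axis=0 collapses the rows: one result per column, shape (3,)
col_sums = np.sum(scores, axis=0)
print(col_sums)                    # [5 7 9]

# axis=1 collapses the columns: one result per row, shape (2,)
row_sums = np.sum(scores, axis=1)
print(row_sums)                    # [ 6 15]

# No axis: the whole array is reduced to a single value
overall_median = np.median(scores)
overall_var = np.var(scores)       # population variance (ddof=0 by default)
print(overall_median)              # 3.5
print(overall_var)
```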

In [None]:
import numpy as np

# --- 1D Array Aggregations ---
print("--- 1D Array Aggregations ---")
arr_1d = np.array([1, 5, 2, 8, 3, 9, 4, 6, 7])
print("Original 1D array:", arr_1d)

print("Sum:", np.sum(arr_1d))
print("Minimum:", np.min(arr_1d))
print("Maximum:", np.max(arr_1d))
print("Mean:", np.mean(arr_1d))
print("Standard Deviation:", np.std(arr_1d))
print("Product:", np.prod(arr_1d))
print("Index of Minimum (argmin):", np.argmin(arr_1d))
print("Index of Maximum (argmax):", np.argmax(arr_1d))

# You can also call these as methods on the array object
print("Sum (as method):", arr_1d.sum())


# --- 2D Array Aggregations with `axis` parameter ---
print("\n--- 2D Array Aggregations ---")
arr_2d = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print("Original 2D array:\n", arr_2d)

print("\nSum of all elements:", np.sum(arr_2d)) # default: flattened array

print("\nSum along axis=0 (sum of columns):", np.sum(arr_2d, axis=0))
# Output: [1+4+7, 2+5+8, 3+6+9] = [12, 15, 18]

print("Sum along axis=1 (sum of rows):", np.sum(arr_2d, axis=1))
# Output: [1+2+3, 4+5+6, 7+8+9] = [6, 15, 24]

print("\nMean along axis=0 (mean of columns):", np.mean(arr_2d, axis=0))
print("Mean along axis=1 (mean of rows):", np.mean(arr_2d, axis=1))

print("\nMax along axis=0 (max in columns):", np.max(arr_2d, axis=0))
print("Max along axis=1 (max in rows):", np.max(arr_2d, axis=1))

print("\nArgmax along axis=0 (index of max in columns):", np.argmax(arr_2d, axis=0))
print("Argmax along axis=1 (index of max in rows):", np.argmax(arr_2d, axis=1))


In [12]:
import numpy as np

# --- 1. Basic Element-wise Operations ---
print("--- Basic Element-wise Operations ---")
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Addition
print("Addition (arr1 + arr2):", arr1 + arr2)

# Subtraction
print("Subtraction (arr1 - arr2):", arr1 - arr2)

# Multiplication
print("Multiplication (arr1 * arr2):", arr1 * arr2)

# Division
print("Division (arr1 / arr2):", arr1 / arr2)

# Exponentiation
print("Exponentiation (arr1 ** 2):", arr1 ** 2)

# Modulus
print("Modulus (arr2 % arr1):", arr2 % arr1)

# Scalar operations are also element-wise
print("Add scalar (arr1 + 10):", arr1 + 10)
print("Multiply scalar (arr2 * 3):", arr2 * 3)


# --- 2. Broadcasting Examples ---
print("\n--- Broadcasting Examples ---")
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
vector = np.array([10, 20, 30])

print("Matrix:\n", matrix)
print("Vector:", vector)

# Adding a 1D array (vector) to a 2D array (matrix)
# The vector is broadcast across each row of the matrix
print("\nMatrix + Vector (broadcasting):\n", matrix + vector)

# Another broadcasting example: adding a column vector
column_vector = np.array([[100], [200], [300]])
print("\nColumn Vector:\n", column_vector)
print("Matrix + Column Vector (broadcasting):\n", matrix + column_vector)


# --- 3. Universal Functions (ufuncs) ---
print("\n--- Universal Functions (ufuncs) ---")
arr_ufunc = np.array([-1, 0, 1, 2, 3])

print("Original array for ufuncs:", arr_ufunc)
print("Absolute value (np.abs):", np.abs(arr_ufunc))
print("Square root (np.sqrt):", np.sqrt(arr_ufunc[2:])) # sqrt on positive values
print("Exponential (np.exp):", np.exp(arr_ufunc))
print("Sine (np.sin):", np.sin(arr_ufunc))
print("Ceiling (np.ceil):", np.ceil(np.array([1.1, 2.5, 3.9])))
print("Floor (np.floor):", np.floor(np.array([1.1, 2.5, 3.9])))

--- Basic Element-wise Operations ---
Addition (arr1 + arr2): [ 6  8 10 12]
Subtraction (arr1 - arr2): [-4 -4 -4 -4]
Multiplication (arr1 * arr2): [ 5 12 21 32]
Division (arr1 / arr2): [0.2        0.33333333 0.42857143 0.5       ]
Exponentiation (arr1 ** 2): [ 1  4  9 16]
Modulus (arr2 % arr1): [0 0 1 0]
Add scalar (arr1 + 10): [11 12 13 14]
Multiply scalar (arr2 * 3): [15 18 21 24]

--- Broadcasting Examples ---
Matrix:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Vector: [10 20 30]

Matrix + Vector (broadcasting):
 [[11 22 33]
 [14 25 36]
 [17 28 39]]

Column Vector:
 [[100]
 [200]
 [300]]
Matrix + Column Vector (broadcasting):
 [[101 102 103]
 [204 205 206]
 [307 308 309]]

--- Universal Functions (ufuncs) ---
Original array for ufuncs: [-1  0  1  2  3]
Absolute value (np.abs): [1 0 1 2 3]
Square root (np.sqrt): [1.         1.41421356 1.73205081]
Exponential (np.exp): [ 0.36787944  1.          2.71828183  7.3890561  20.08553692]
Sine (np.sin): [-0.84147098  0.          0.84147098  0.90929743  0.141