# ___

# [ Machine Learning in Geosciences ]

**Department of Applied Geoinformatics and Cartography, Charles University** 

*Shruti Pancholi  shruti.pancholi@natur.cuni.cz*
    
___

# NumPy
## NumPy Tutorial: Basics and Linear Algebra

NumPy (Numerical Python) is a fundamental library for numerical computing in Python.
It provides large, multi-dimensional array objects and matrices, along with
a collection of mathematical functions to operate on them efficiently.
NumPy arrays are more efficient than standard Python lists for numerical operations.
NumPy is also the basis of many high-level libraries used for data analysis, visualization and machine learning. It can perform tasks such as:

* Array Handling
* Mathematical Operations
* Linear Algebra

In [2]:
# Importing numpy as alias np
import numpy as np

print("NumPy Version:", np.__version__)

NumPy Version: 1.26.4


Let's start with arrays.

# NumPy Arrays

NumPy arrays essentially come in two flavors: **vectors and matrices**. Vectors are strictly 1-D arrays and matrices are 2-D (note that a matrix can still have only one row or one column). **An array is a fixed-size block of memory that contains data of a single type**, such as integers or floating-point values.

The data type supported by an array can be accessed via the dtype attribute on the array. The dimensions of an array can be accessed via the shape attribute that returns a tuple describing the length of each dimension.


## Creating NumPy Arrays

### From a Python List

A simple way to create an array from a Python data structure such as a list is to use the *array()* function.

In [16]:
# Creating Arrays from lists
arr1 = np.array([1, 2, 3, 4, 5])  # 1D array
arr2 = np.array([[1, 2, 3], [4, 5, 6]])  # 2D array
print("1D Array:", arr1)
print("2D Array:\n", arr2)

1D Array: [1 2 3 4 5]
2D Array:
 [[1 2 3]
 [4 5 6]]


In [7]:
# Array Properties
print("Shape:", arr2.shape)
print("Data Type:", arr2.dtype)
print("Size:", arr2.size)
print("Dimension:", arr2.ndim)

Shape: (2, 3)
Data Type: int32
Size: 6
Dimension: 2
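
Because every element of an array shares one data type, NumPy picks a common type when the input mixes kinds of numbers. The short sketch below illustrates this; the variable names (`mixed`, `as_float`, `as_int`) are made up for the example and are not used elsewhere in the tutorial.

```python
import numpy as np

# Mixing integers and a float: NumPy upcasts everything to a float type
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# The dtype can also be requested explicitly when creating the array
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float)           # [1. 2. 3.]

# astype() returns a converted copy; the original array is unchanged
as_int = mixed.astype(np.int64)   # fractional parts are truncated
print(as_int)             # [1 2 3]
```

Note that the default integer type depends on the platform, which is why the output above shows `int32` on some systems and `int64` on others.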


## Built-in Methods

There are many built-in ways to generate arrays.

### zeros and ones

Generate arrays of zeros or ones

In [17]:
# Use np.zeros to generate an array containing zeros
zero_arr = np.zeros(3) 
print(zero_arr)

[0. 0. 0.]


In [18]:
# Array of only ones
np.ones((3,2)) # It can be of n-dimensions

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [19]:
# A plain Python list (not a NumPy array) containing None values
no_arr = [None, None, None]
print(type(no_arr[2]))

<class 'NoneType'>


### linspace and arange

In [24]:
# np.linspace returns evenly spaced numbers in a given interval
np.linspace(2,10,5)

array([ 2.,  4.,  6.,  8., 10.])

In [25]:
np.array((np.linspace(0,10,20), np.linspace(10,20,20))) # 2D array containing 2 evenly spaced arrays

array([[ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
         2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
         5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
         7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ],
       [10.        , 10.52631579, 11.05263158, 11.57894737, 12.10526316,
        12.63157895, 13.15789474, 13.68421053, 14.21052632, 14.73684211,
        15.26315789, 15.78947368, 16.31578947, 16.84210526, 17.36842105,
        17.89473684, 18.42105263, 18.94736842, 19.47368421, 20.        ]])

In [36]:
np.arange(5, 20, 3) # returns values from 5 up to (but not including) 20, in steps of 3

array([ 5,  8, 11, 14, 17])
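
The two functions differ mainly in what you specify and in how the end value is treated. A minimal sketch of the contrast, with illustrative variable names `a`, `b` and `c`:

```python
import numpy as np

# linspace: you choose the NUMBER of points; the end value is included
a = np.linspace(0, 1, 5)
print(a)                  # [0.   0.25 0.5  0.75 1.  ]

# arange: you choose the STEP; the end value is excluded
b = np.arange(0, 10, 2)
print(b)                  # [0 2 4 6 8]

# endpoint=False makes linspace exclude the end value as well
c = np.linspace(0, 1, 5, endpoint=False)
print(c)                  # [0.  0.2 0.4 0.6 0.8]
```

As a rule of thumb, `linspace` is the safer choice for non-integer steps, because the number of points is fixed in advance and does not depend on floating-point rounding.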

## Random 

NumPy also has many ways to create arrays of random numbers:

### rand
Create an array of the given shape and populate it with
random samples from a **uniform distribution** 
over ``[0, 1)``.

In [28]:
np.random.rand(5)

array([0.86318194, 0.49343779, 0.87218402, 0.5723882 , 0.23354835])

In [29]:
np.random.rand(3,5)

array([[0.75888641, 0.92357591, 0.93724766, 0.61690049, 0.03326339],
       [0.25955475, 0.06546499, 0.21210225, 0.05922919, 0.56923579],
       [0.72170947, 0.01949822, 0.85852367, 0.50934889, 0.45490796]])

### randn

Return a sample (or samples) from the standard **normal distribution** (mean 0, standard deviation 1), unlike rand, which samples from a uniform distribution.


In [30]:
np.random.randn(2)

array([-0.17409872, -1.26135897])

### randint
Return random integers from `low` (inclusive) to `high` (exclusive).

In [31]:
np.random.randint(1,100)

52

In [32]:
np.random.randint(1,50,10)

array([ 6, 12, 22, 26, 37, 12, 47, 31, 24,  7])
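
The functions above draw from NumPy's global random state, so their output changes on every run. NumPy also offers a generator object created with `np.random.default_rng`, which can be seeded for reproducible results. The sketch below shows rough equivalents of `rand`, `randn` and `randint`; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator: the same seed reproduces the same numbers
rng = np.random.default_rng(seed=42)

uniform = rng.random(5)                 # uniform on [0, 1), like rand
normal = rng.standard_normal((2, 3))    # standard normal, like randn
ints = rng.integers(1, 50, size=10)     # low inclusive, high exclusive, like randint

# Re-creating the generator with the same seed repeats the sequence
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random(5)))   # True
```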

## Array Attributes and Methods

Let's discuss some useful attributes and methods of an array:

In [35]:
my_arr = np.arange(0,50,5)
my_arr

array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

In [37]:
# Get the shape or dimension of the array
my_arr.shape

(10,)

# Reshape

* Reshapes the array into the given dimensions
* The total number of elements must match before and after reshaping.
* reshape() does not modify the original array; it returns a new reshaped array.

In [38]:

my_arr.reshape(5,2)

array([[ 0,  5],
       [10, 15],
       [20, 25],
       [30, 35],
       [40, 45]])

In [46]:
new_arr = my_arr.reshape(5,2,1) # Any shape is allowed as long as the product of its dimensions equals the number of elements (5*2*1 = 10)
new_arr

array([[[ 0],
        [ 5]],

       [[10],
        [15]],

       [[20],
        [25]],

       [[30],
        [35]],

       [[40],
        [45]]])

In [47]:
new_arr.shape

(5, 2, 1)

In [49]:
# Flattens any n-dimensional array into a 1D array
my_arr.reshape(-1) # Alternatives: my_arr.flatten() or np.ravel(my_arr)

array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

In [61]:
my_arr.reshape(2, -1) # NumPy calculates the correct column size

array([[ 0,  5, 10, 15, 20],
       [25, 30, 35, 40, 45]])
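
The reshape rules listed earlier can be checked directly. The sketch below rebuilds `my_arr` so that it runs on its own; `reshaped` is a name introduced only for this example. It also shows that the reshaped result, while a separate array object, can share its data with the original.

```python
import numpy as np

my_arr = np.arange(0, 50, 5)            # 10 elements, as above

reshaped = my_arr.reshape(2, 5)
print(my_arr.shape)                     # (10,) - the original shape is unchanged

# For a contiguous array like this one, the result is a view that
# shares the same underlying data, so no values are copied
print(np.shares_memory(my_arr, reshaped))   # True

# A shape whose element count does not match raises ValueError
try:
    my_arr.reshape(3, 4)                # 3 * 4 = 12, but there are 10 elements
except ValueError as err:
    print("Cannot reshape:", err)
```

Because the data can be shared, assigning to an element of `reshaped` may also change `my_arr`; call `.copy()` on the result when an independent array is needed.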

### max, min, argmax, argmin

These are useful methods for finding the **max** or **min** values, or their **index locations** using argmax or argmin.

In [62]:
my_arr.min() # Returns the smallest value in the array

0

In [63]:
my_arr.max()

45

In [66]:
my_arr.argmax() # Returns the single index of the max value from the flattened array. 

9

In [77]:
my_arr.argmin() # Returns the single index of the min value from the flattened array.

0
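
For a 1-D array the flattened index is simply the position in the array. With more dimensions the distinction matters, as the following sketch shows. The array `grid` is a made-up example.

```python
import numpy as np

grid = np.array([[3, 9, 1],
                 [7, 2, 8]])

flat_index = grid.argmax()
print(flat_index)                              # 1 (position in the flattened array)

# Convert the flat index back to a (row, column) pair
row, col = np.unravel_index(flat_index, grid.shape)
print(row, col)                                # 0 1

# With axis=..., the search runs along that axis instead
print(grid.argmax(axis=0))                     # [1 0 1] (row index of the max in each column)
print(grid.max(axis=1))                        # [9 8]   (max of each row)
```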

## Array Operations

Each element of an array can be combined with the corresponding element of another array, provided the two arrays have the same shape. Let's take a look!

In [89]:
# Basic Operations
arr1 = np.array([1, 2, 3, 4, 5])
arr3 = np.array([10, 20, 30, 40, 50])

In [79]:
arr1 + arr3 # Array addition 

array([11, 22, 33, 44, 55])

In [99]:
arr3 - arr1 # Array subtraction

array([ 9, 18, 27, 36, 45])

In [96]:
arr1*arr3 # Array Multiplication

array([ 10,  40,  90, 160, 250])

In [97]:
arr3/arr1 # Array division

array([10., 10., 10., 10., 10.])

In [100]:
# Gets sum of all the elements of an array
np.sum(arr1)

15

In [101]:
# Gets mean of the array elements
np.mean(arr1)

3.0

# Scalar Operations

* These involve performing operations between a single number (scalar) and a NumPy array.
* The scalar is broadcast to every element, so the operation is applied element-wise across the whole array.

In [None]:
import numpy as np

In [None]:
arr1 = np.array([1, 2, 3, 4, 5])
arr3 = np.array([10, 20, 30, 40, 50])

In [None]:
# Array Multiplication with a scalar
scalar = 2
arr1 * scalar

array([ 2,  4,  6,  8, 10])

In [None]:
arr1 + scalar # Addition of a scalar to all the elements of the array

array([3, 4, 5, 6, 7])

In [None]:
# Division of all the array elements by a scalar
arr1 / scalar

array([0.5, 1. , 1.5, 2. , 2.5])
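
Broadcasting is not limited to scalars. As a small sketch, a 1-D row or a single column can be combined with a 2-D array when their shapes are compatible. The arrays `matrix`, `row` and `col` below are invented for the illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])          # shape (2, 3)
row = np.array([10, 20, 30])            # shape (3,)

# The row is broadcast across both rows of the matrix
print(matrix + row)
# [[11 22 33]
#  [14 25 36]]

# A column of shape (2, 1) is broadcast across the three columns
col = np.array([[100], [200]])
print(matrix + col)
# [[101 102 103]
#  [204 205 206]]
```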

# Vector Operations

In [None]:
# Vector Operations
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])

In [None]:
# Vector Addition
vec1 + vec2

array([5, 7, 9])

In [None]:
# Vector Subtraction
vec1 - vec2

array([-3, -3, -3])

In [None]:
# Dot product: 1*4 + 2*5 + 3*6
np.dot(vec1, vec2)

32

In [None]:
# Cross product: a vector perpendicular to both vec1 and vec2
np.cross(vec1, vec2)

array([-3,  6, -3])
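
To see what the dot and cross products compute, the results can be checked by hand. This is a small verification sketch; `manual_dot` and `cross` are names chosen for the example.

```python
import numpy as np

vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])

# Dot product: sum of element-wise products, 1*4 + 2*5 + 3*6
manual_dot = np.sum(vec1 * vec2)
print(manual_dot == np.dot(vec1, vec2))        # True

# The @ operator gives the same result for 1-D arrays
print(vec1 @ vec2)                             # 32

# The cross product is perpendicular to both inputs,
# so its dot product with each of them is zero
cross = np.cross(vec1, vec2)
print(np.dot(cross, vec1), np.dot(cross, vec2))   # 0 0
```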

## Linear Algebra



In [None]:
# Matrix Operations
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
print("Matrix Addition:\n", mat1 + mat2)
print("Matrix Multiplication:\n", np.dot(mat1, mat2))
print("Element-wise Multiplication:\n", mat1 * mat2)
print("Matrix Transpose:\n", mat1.T)

Matrix Addition:
 [[ 6  8]
 [10 12]]
Matrix Multiplication:
 [[19 22]
 [43 50]]
Element-wise Multiplication:
 [[ 5 12]
 [21 32]]
Matrix Transpose:
 [[1 3]
 [2 4]]


In [None]:
# Inverse of a matrix
inv_mat = np.linalg.inv(mat2)  # Raises LinAlgError if the matrix is singular (not invertible)
inv_mat

array([[-4. ,  3. ],
       [ 3.5, -2.5]])
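
A quick way to check an inverse is to multiply it by the original matrix and compare the product with the identity. The sketch below does this and also shows what happens with a singular matrix; `singular` is a made-up example whose second row is twice the first.

```python
import numpy as np

mat2 = np.array([[5, 6], [7, 8]])
inv_mat = np.linalg.inv(mat2)

# A matrix times its inverse should give the identity matrix.
# np.allclose allows for small floating-point rounding errors.
print(np.allclose(mat2 @ inv_mat, np.eye(2)))      # True

# A singular matrix (determinant 0) has no inverse
singular = np.array([[1, 2], [2, 4]])              # second row = 2 * first row
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    print("Not invertible:", err)
```

The `@` operator used here performs matrix multiplication and is equivalent to `np.dot` for 2-D arrays.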

In [None]:
# Determinant of a matrix
np.linalg.det(mat1)

-2.0000000000000004

# Eigenvalues and Eigenvectors

For a square matrix A, an eigenvector 𝑣 satisfies the equation:


                                𝐴𝑣 = 𝜆𝑣

where:

* 𝐴 is a square matrix.
* 𝜆 (the eigenvalue) is a scalar that tells how much the eigenvector is scaled.
* 𝑣 is the eigenvector, a non-zero vector that stays on the same line through the origin after the transformation (a negative 𝜆 reverses its direction).

In [None]:
# Eigenvalues describe how a matrix scales its eigenvectors.
# np.linalg.eig returns the eigenvalues and a matrix whose columns are the eigenvectors
values, vectors = np.linalg.eig(mat1)
values


array([-0.37228132,  5.37228132])

In [None]:
vectors

array([[-0.82456484, -0.41597356],
       [ 0.56576746, -0.90937671]])
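
The defining equation 𝐴𝑣 = 𝜆𝑣 can be verified numerically for each eigenpair. A minimal check, assuming the same `mat1` as above; the loop variables `v` and `lam` are introduced only for this example.

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
values, vectors = np.linalg.eig(mat1)

# Each COLUMN of `vectors` is one eigenvector; vectors[:, i] pairs with values[i]
for i in range(len(values)):
    v = vectors[:, i]
    lam = values[i]
    # A v and lambda v should agree up to floating-point rounding
    print(np.allclose(mat1 @ v, lam * v))          # True

# The eigenvectors returned by eig are normalised to unit length
print(np.allclose(np.linalg.norm(vectors, axis=0), 1.0))   # True
```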

# Solving Linear Equations (Ax = b)

In [None]:
A = np.array([[4, 2],
              [1, 3]])

In [None]:
b = np.array([5, 6])
x = np.linalg.solve(A, b)
print("Solution to Ax = b:", x)


Solution to Ax = b: [0.3 1.9]
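
The system solved here is 4x₁ + 2x₂ = 5 and x₁ + 3x₂ = 6. One way to check the result is to substitute it back into the equations, as sketched below; `x_via_inverse` is a name used only for this comparison.

```python
import numpy as np

A = np.array([[4, 2],
              [1, 3]])
b = np.array([5, 6])

x = np.linalg.solve(A, b)

# Substituting x back into the equations should reproduce b
print(np.allclose(A @ x, b))                       # True

# Multiplying by the inverse gives the same answer, but solve()
# is generally preferred because it avoids forming the inverse
x_via_inverse = np.linalg.inv(A) @ b
print(np.allclose(x, x_via_inverse))               # True
```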


# Statistics

Basic Statistics

In [None]:
# Generating a random dataset
np.random.seed(42)  # For reproducibility
data = np.random.randint(1, 101, size=1000)  # 1000 random integers from 1 to 100 (inclusive)

In [None]:
np.mean(data)  # Mean

50.128

In [None]:
np.median(data)  # Median

51.0

In [None]:
np.std(data)  # Standard Deviation (population form, ddof=0 by default)

29.558714721719547

In [None]:
np.var(data)  # Variance

873.7176159999999

In [None]:
# Percentiles
np.percentile(data, 25)  # 25th percentile


24.0

In [None]:
np.percentile(data, 50)  # 50th percentile (same as median)

51.0

In [None]:
np.percentile(data, 75)  # 75th percentile

75.0
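
By default `np.std` and `np.var` divide by the number of values n (the population form). Passing `ddof=1` divides by n - 1 instead, which gives the usual sample estimate. The sketch below uses a small hand-picked array, `sample`, so that the numbers are easy to verify by hand.

```python
import numpy as np

sample = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# Default (ddof=0): population standard deviation, divides by n
print(np.std(sample))             # 2.0

# ddof=1: sample standard deviation, divides by n - 1
sample_std = np.std(sample, ddof=1)
print(round(sample_std, 4))       # 2.1381

# The variance is the square of the standard deviation
print(np.isclose(np.var(sample), np.std(sample) ** 2))   # True

# The 50th percentile is the median
print(np.percentile(sample, 50) == np.median(sample))    # True
```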

# Norm

* In linear algebra, a norm is a function that assigns a non-negative length or size to a vector. 
* It provides a way to measure the magnitude of a vector in a given space. 
* Norms are widely used in optimization, machine learning, and numerical analysis.

In [3]:
# Norm Calculation
vector = np.array([3, 4])

In [4]:
np.linalg.norm(vector)  # Default is L2 norm (Euclidean)

5.0
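
Other norms are available through the `ord` argument of `np.linalg.norm`. The sketch below compares the L2, L1 and infinity norms of a small vector; the names `l2`, `l1` and `linf` are illustrative.

```python
import numpy as np

vector = np.array([3, -4])

l2 = np.linalg.norm(vector)              # sqrt(3**2 + (-4)**2)
l1 = np.linalg.norm(vector, ord=1)       # |3| + |-4|
linf = np.linalg.norm(vector, ord=np.inf)  # max(|3|, |-4|)

print(l2, l1, linf)                      # 5.0 7.0 4.0

# The L2 norm written out by hand
print(np.sqrt(np.sum(vector ** 2)))      # 5.0
```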