# Introduction to NumPy

## What is NumPy?
NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and many mathematical functions that are useful in data analysis. In the field of geomatics, NumPy is particularly valuable for handling large datasets, performing numerical calculations, and integrating with other Python libraries for geospatial data analysis.

## Importance of NumPy in Geomatics and Data Analysis
In geomatics, you often work with large datasets, such as raster images, point clouds, or tabular data with coordinates. NumPy makes it easy to perform efficient computations on these datasets, whether you're performing basic arithmetic operations, calculating distances between points, or transforming coordinates.

This tutorial will guide you through the basics of NumPy and demonstrate how it can be applied to common tasks in geomatics.

NumPy runs behind the scenes in pandas. pandas is built on top of NumPy and relies heavily on NumPy's data structures and functions to operate efficiently. For rasters, which are regular grids of values, NumPy is generally a better fit than pandas because the grid maps directly onto an n-dimensional array.

## Creating NumPy Arrays
Let's start by creating a simple NumPy array.

In [None]:
import numpy as np

In [None]:
array_1d = np.array([1, 2, 3, 4, 5])

In [None]:
print("1D Array:", array_1d)

There are several ways to select elements in an array. With square brackets, indexing works the same way as for a Python list, including negative indices that count from the end.

In [None]:
array_1d[0]

In [None]:
array_1d[1]

In [None]:
array_1d[-1]
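
As with lists, brackets also accept slices of the form start:stop:step. The following is a minimal sketch, using a hypothetical array named `values` with the same contents as array_1d:

```python
import numpy as np

# Hypothetical example array with the same contents as array_1d
values = np.array([1, 2, 3, 4, 5])

first_three = values[:3]        # elements at indices 0, 1 and 2
every_other = values[::2]       # every second element, starting at index 0
reversed_values = values[::-1]  # all elements in reverse order

print("First three:", first_three)      # [1 2 3]
print("Every other:", every_other)      # [1 3 5]
print("Reversed:", reversed_values)     # [5 4 3 2 1]
```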

Let's create a 2D array!

In [None]:
# Creating a 2D array (e.g., representing coordinates in a 2D space)
array_2d = np.array([[1, 2], [3, 4], [5, 6]])
print("2D Array:\n", array_2d)

In [None]:
# Array Data Types
# Checking the data type of an array
print("Data type of array_1d:", array_1d.dtype)


In [None]:
array_2d[0]

Let's simulate a raster image now.

In [None]:
# Define the dimensions of the raster (e.g., 100x100 pixels)
rows, cols = 100, 100

# Create an empty raster (initialized with zeros, representing a black image)
raster = np.zeros((rows, cols), dtype=np.uint8)
raster

Let's say we want to change the values in the first 3 rows and the first 70 columns. We can do that with slicing.

The slice raster[0:3, 0:70] selects the first three rows and the first 70 columns of the raster.



In [None]:
# Set a block of pixels to white (255): the first 3 rows and the first 70 columns
raster[0:3, 0:70] = 255

print(raster)

With indexing and slicing we can access a single value or a set of values in the array.

In [None]:

print("First element of raster:", raster[0,0])


In [None]:
print("First row of raster:", raster[0, :])

In [None]:
print("First column of raster:", raster[ :,0])

In [None]:

# Slicing the last row
print("Last two elements of the last row:", raster[-1, -2:])
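
Because the 100 x 100 raster is abbreviated when printed, the same slicing patterns are easier to follow on a small grid. This is only an illustrative sketch: the array `small` is hypothetical, and np.arange and reshape are used just to fill it with the numbers 0 to 11.

```python
import numpy as np

# Hypothetical 3x4 grid holding the values 0..11, row by row:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
small = np.arange(12).reshape(3, 4)

top_left = small[0:2, 0:2]  # first two rows, first two columns
last_row = small[-1, :]     # every column of the last row
last_col = small[:, -1]     # every row of the last column

print("Top-left block:\n", top_left)  # [[0 1] [4 5]]
print("Last row:", last_row)          # [ 8  9 10 11]
print("Last column:", last_col)       # [ 3  7 11]
```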

We can also do mathematical operations quickly.

In [None]:

# Basic Array Operations
# Performing element-wise operations
raster_minus_one = raster - 1
print("Array after subtracting 1:", raster_minus_one)



This seems bizarre, but there is an explanation. The uint8 data type in NumPy is an 8-bit unsigned integer, which can store values from 0 to 255. Since it is unsigned, it cannot represent negative numbers, so 0 - 1 wraps around to 255. This happens because we created the raster with dtype=np.uint8. To avoid it, we convert the array to a signed integer type.

In [None]:
raster_signed = raster.astype(np.int16)  # or np.int32

In [None]:

raster_minus_one = raster_signed - 1
print("Array after subtracting 1:", raster_minus_one)



Works now!

In [None]:
raster_product = raster_signed * 2
print("Array after multiplying by 2:", raster_product)
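
The wraparound is easier to see on a tiny array. The sketch below uses a hypothetical three-pixel array named `pixels`, and np.iinfo to look up the range that uint8 can hold:

```python
import numpy as np

# Hypothetical three-pixel array stored as unsigned 8-bit integers
pixels = np.array([0, 1, 255], dtype=np.uint8)

# np.iinfo reports the smallest and largest value an integer type can hold
limits = np.iinfo(np.uint8)
print("uint8 range:", limits.min, "to", limits.max)  # 0 to 255

# The result stays uint8, so 0 - 1 wraps around to 255
wrapped = pixels - np.uint8(1)
print("uint8 result:", wrapped)  # [255   0 254]

# Converting to a signed type first gives the expected values
safe = pixels.astype(np.int16) - 1
print("int16 result:", safe)     # [ -1   0 254]
```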

### Array Manipulations

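Reshaping an array in NumPy means changing its shape without changing its data. The total number of elements must stay the same: our 100 x 100 raster holds 10,000 values, so it can be reshaped to (5000, 2) or (2, 5000).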

In [None]:

# Reshaping an array
reshaped_raster = raster.reshape((5000, 2))
print("Reshaped raster:\n", reshaped_raster)


In [None]:
print("Raster:\n", raster)

In [None]:

# Reshaping an array
reshaped_raster = raster.reshape((2, 5000))
print("Reshaped raster:\n", reshaped_raster)
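
Here is a small sketch of the same idea on a hypothetical six-element array named `grid`. It also shows two conveniences: passing -1 lets NumPy infer one dimension, and a shape with the wrong number of elements raises a ValueError.

```python
import numpy as np

grid = np.arange(6)  # hypothetical array: [0 1 2 3 4 5]

as_2x3 = grid.reshape((2, 3))       # 2 rows, 3 columns
as_3_rows = grid.reshape((3, -1))   # -1 lets NumPy infer 2 columns
flat = as_2x3.ravel()               # back to one dimension

print(as_2x3.shape, as_3_rows.shape, flat.shape)  # (2, 3) (3, 2) (6,)

# 6 elements cannot fill a 4x2 array, so NumPy refuses
try:
    grid.reshape((4, 2))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```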


## Combining and Splitting Arrays

In [None]:
array_1d

In [None]:
# Concatenating two arrays
array_concat = np.concatenate((array_1d, array_1d))

In [None]:
array_concat

In [None]:
# Splitting an array into multiple arrays
array_split = np.split(array_1d, [2, 4])
print("Split Arrays:", array_split)

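The second argument [2, 4] specifies the indices where the array should be split, which gives three parts: the elements before index 2, the elements at indices 2 and 3, and the elements from index 4 onward. For instance, if we want to split the array after the first element, the code changes to: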

In [None]:
# Splitting an array into multiple arrays
array_split = np.split(array_1d, [1])
print("Split Arrays:", array_split)
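
For 2D arrays, both functions take an axis argument that controls the direction of the operation. The sketch below assumes two hypothetical coordinate arrays, `points_a` and `points_b`, with one (x, y) pair per row:

```python
import numpy as np

# Hypothetical coordinate arrays: one (x, y) pair per row
points_a = np.array([[1, 2], [3, 4]])
points_b = np.array([[5, 6], [7, 8]])

# axis=0 appends rows, axis=1 appends columns
stacked_rows = np.concatenate((points_a, points_b), axis=0)
stacked_cols = np.concatenate((points_a, points_b), axis=1)
print(stacked_rows.shape, stacked_cols.shape)  # (4, 2) (2, 4)

# An integer second argument splits into that many equal parts;
# along axis=1 this separates the x column from the y column
x_column, y_column = np.split(stacked_rows, 2, axis=1)
print("x values:", x_column.ravel())  # [1 3 5 7]
print("y values:", y_column.ravel())  # [2 4 6 8]
```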

We can also make an independent copy of an array with the .copy() method (np.copy does the same), so that changes to the copy do not affect the original.

In [None]:

# Creating a copy of an array
array_copy = array_1d.copy()
array_copy[0] = 100
print("Original Array:", array_1d)
print("Modified Copy:", array_copy)
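
An explicit copy matters because a slice of a NumPy array is a view on the same data, not a new array. A minimal sketch, using a hypothetical array named `original`:

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])  # hypothetical example array

# A slice is a view: it shares memory with the original array
view = original[:2]
view[0] = 100
print("Original after editing the view:", original)  # [100   2   3   4   5]
print("Shares memory:", np.shares_memory(original, view))  # True

# A copy owns its data, so the original is unaffected
independent = original.copy()
independent[1] = -1
print("Original after editing the copy:", original)  # [100   2   3   4   5]
print("Copy:", independent)                          # [100  -1   3   4   5]
```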

### Other arithmetic operations

#### Summing

In [None]:
total_sum = np.sum(array_1d)
print("Sum of elements in array_1d:", total_sum)


In [None]:
total_sum = np.sum(raster)
print("Sum of elements in raster:", total_sum)


#### Descriptive statistics

In [None]:
# Finding the mean of the array
mean_value = np.mean(array_1d)
print("Mean of array_1d:", mean_value)

In [None]:
# Finding the minimum and maximum values
min_value = np.min(array_1d)
max_value = np.max(array_1d)
print("Min:", min_value, "Max:", max_value)
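
On a 2D array, these functions also accept an axis argument, which is handy for per-column or per-row statistics. The sketch below assumes a hypothetical array `coords` with the same values as array_2d, read as one (x, y) pair per row:

```python
import numpy as np

# Hypothetical coordinates: one (x, y) pair per row (same values as array_2d)
coords = np.array([[1, 2], [3, 4], [5, 6]])

column_means = np.mean(coords, axis=0)  # one mean per column (x and y)
row_max = np.max(coords, axis=1)        # one maximum per row
overall_std = np.std(coords)            # standard deviation of all values

print("Mean x, mean y:", column_means)  # [3. 4.]
print("Max per row:", row_max)          # [2 4 6]
print("Standard deviation:", overall_std)
```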

#### Trigonometric Functions

In [None]:

array_sin = np.sin(array_1d)
print("Sine of array_1d:", array_sin)
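
Note that np.sin interprets its input as radians, so the values 1 to 5 above are treated as radians. Angles in geomatics, such as bearings, are often given in degrees; a minimal sketch of converting first with np.deg2rad, using a hypothetical array `angles_deg`:

```python
import numpy as np

angles_deg = np.array([0, 30, 90, 180])  # hypothetical angles in degrees

angles_rad = np.deg2rad(angles_deg)  # convert degrees to radians
sines = np.sin(angles_rad)

# Rounded for display; sin(180 degrees) is only approximately zero
# in floating-point arithmetic
print("Sine of angles:", np.round(sines, 3))
```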

### Thresholding

Let's create a NumPy array and use a value around the middle of the 0 to 255 range as a threshold.

In [None]:
import numpy as np

# Example array (imagine this as a small grayscale image)
image = np.array([[100, 150, 200],
                  [50, 130, 180],
                  [90, 170, 255]])

# Define the threshold
threshold = 128


Now we can manipulate the image values using the np.where function.

In [None]:

# Apply thresholding
thresholded_image = np.where(image < threshold, 0, 255)

print("Original Image:\n", image)
print("Thresholded Image:\n", thresholded_image)


This example works like an if statement, but it leverages NumPy's vectorized operations, which apply a condition to the entire array at once instead of using explicit loops. Written the traditional way, the task above needs nested loops and an if-else statement.

In [None]:
# if-else using loops
thresholded_image = np.zeros_like(image)  # Create a zero-filled array of the same shape

for i in range(image.shape[0]):  # Loop over rows
    for j in range(image.shape[1]):  # Loop over columns
        if image[i, j] < threshold:
            thresholded_image[i, j] = 0
        else:
            thresholded_image[i, j] = 255

print(thresholded_image)
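
The condition passed to np.where is itself an array of True and False values, often called a boolean mask, and it can be used on its own. The following sketch repeats the image and threshold from above so that it runs by itself; the names `mask`, `bright_count` and `binary` are only illustrative.

```python
import numpy as np

# Same example image and threshold as above
image = np.array([[100, 150, 200],
                  [50, 130, 180],
                  [90, 170, 255]])
threshold = 128

# Comparing an array with a number gives a boolean array of the same shape
mask = image >= threshold
print("Mask:\n", mask)

# True counts as 1, so summing the mask counts the bright pixels
bright_count = int(mask.sum())
print("Pixels at or above the threshold:", bright_count)  # 6

# Turning the mask into 0/255 values gives the same result as np.where
binary = mask.astype(np.uint8) * 255
print("Binary image:\n", binary)
```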



## References for Further Learning
- [NumPy Documentation](https://numpy.org/doc/stable/)
- [SciPy Lecture Notes on NumPy](http://scipy-lectures.org/intro/numpy/index.html)
