<img src="images/notebook9_header.png" width="1024" alt="Python for Geospatial Data Science" style="border-radius:10px"/>

**Dr Gunnar Mallon** (g.mallon@rug.nl), *Department of Cultural Geography (Faculty of Spatial Science)*, *University of Groningen*

---

NumPy is a fundamental Python library for scientific computing, which is crucial for geospatial data analysis. We will explore what NumPy is and why it is important for working with geospatial data.

The name stands for Numerical Python. NumPy is a library for performing numerical computations on large, multi-dimensional arrays and matrices, and it provides fast mathematical functions for working with numerical data.

### Why is NumPy important for geospatial data analysis?

NumPy is particularly important for geospatial data analysis due to its ability to handle and process large, multi-dimensional arrays efficiently. Geospatial data often consists of multiple dimensions such as latitude, longitude, elevation, and time. NumPy provides a convenient and efficient way to store, manipulate, and analyze these multi-dimensional arrays.

Some key reasons why NumPy is important for geospatial data analysis are:

1. **Efficient storage and processing**: NumPy arrays store elements of a single data type in one contiguous block of memory, so they typically occupy less memory than Python lists holding the same values and can be processed much faster.

2. **Vectorized operations**: NumPy allows you to perform mathematical operations on entire arrays instead of individual elements, significantly improving the performance of computations.

3. **Integration with other libraries**: NumPy is the foundation for many other scientific libraries in Python, such as Pandas, Matplotlib, and SciPy. Understanding NumPy is essential for effectively using these libraries in geospatial data analysis.
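To make the first two points concrete, here is a minimal sketch that converts a handful of elevation values from metres to feet, once with a plain Python loop and once with a single vectorized expression. The elevation values are made up for illustration.

```python
import numpy as np

# Hypothetical elevation samples in metres (illustrative values only)
elevations_m = [12.0, 48.5, 103.2, 7.9]

# Plain Python: convert each value in an explicit loop
feet_loop = [value * 3.28084 for value in elevations_m]

# NumPy: one expression applied to the whole array at once
elevations_arr = np.array(elevations_m)
feet_vectorized = elevations_arr * 3.28084

print(feet_vectorized)
print(elevations_arr.nbytes)  # bytes used by the array's data buffer
```

Both approaches give the same numbers. The difference in speed only becomes noticeable on large arrays, such as raster grids with millions of cells.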

### Installation and setup

Before we can start working with NumPy, we need to install it and set it up in our Python environment. Here are the steps to install NumPy:

1. **Install NumPy via conda**: Open your command prompt or terminal and run the following command:

```
conda install numpy
```

2. **Verify the installation**: Once the installation is complete, open a Jupyter Notebook and import the NumPy module using the following command:

```python
import numpy as np
```

If no errors occur, the installation was successful.
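As an optional extra check, you can print the installed version. This is a small sketch; the version number you see depends on your environment.

```python
import numpy as np

# The version string shows which NumPy release the notebook is using
print(np.__version__)
```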


Now that we have NumPy installed and set up, we can move on to learning about its core features and how it can be used in geospatial data analysis.

## NumPy Arrays

To work with geospatial data, we often need to store large amounts of numeric data efficiently. NumPy arrays provide an efficient way to store and manipulate numerical data.

#### np.array()

The most basic way to create a NumPy array is by using the `np.array()` function. It takes a Python list or nested lists as input and returns a NumPy array.

```python
import numpy as np

# Creating a 1D array
array1d = np.array( [1, 2, 3, 4, 5] )
print(array1d)
# Output: [1 2 3 4 5]

# Creating a 2D array
array2d = np.array( [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] )
print(array2d)
# Output:
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
```

#### np.zeros()

The `np.zeros()` function creates an array filled with zeros. It takes the shape of the desired array as a parameter. By default the values are floating-point numbers, which is why they print as `0.`.

```python
zeros_arr = np.zeros((3, 3))
print(zeros_arr)
# Output:
# [[0. 0. 0.]
#  [0. 0. 0.]
#  [0. 0. 0.]]
```


#### np.ones()

Similarly, the `np.ones()` function creates an array filled with ones.

```python
ones_arr = np.ones((2, 4))
print(ones_arr)
# Output:
# [[1. 1. 1. 1.]
#  [1. 1. 1. 1.]]
```

#### np.linspace()

The `np.linspace()` function creates an array of evenly spaced values over a specified range. It takes the start, end, and number of points as parameters, and by default the end value is included in the result.

```python
lin_arr = np.linspace(0, 10, 5)
print(lin_arr)
# Output: [ 0.   2.5  5.   7.5 10. ]
```

#### np.arange()

The `np.arange()` function creates an array with regularly spaced values between a start and end value, with a specified step size. As with Python's built-in `range()`, the end value itself is not included.

```python
range_arr = np.arange(0, 10, 2)
print(range_arr)
# Output: [0 2 4 6 8]
```
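The difference between the two functions matters when building coordinate axes. The sketch below uses made-up longitude and latitude ranges to show how each function treats the end value.

```python
import numpy as np

# linspace: you choose the number of points; the end value is included
lons = np.linspace(6.0, 7.0, 5)
print(lons)
# Output: [6.   6.25 6.5  6.75 7.  ]

# arange: you choose the step; the end value is excluded
lats = np.arange(53.0, 54.0, 0.25)
print(lats)
# Output: [53.   53.25 53.5  53.75]
```

With non-integer steps, `np.linspace()` is usually the safer choice, because rounding errors can change the number of elements that `np.arange()` returns.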

### Accessing and Modifying Elements

Once we have created NumPy arrays, we can access and modify their elements using various techniques.

#### Indexing and Slicing

NumPy arrays support multi-dimensional indexing and slicing. We can access individual elements by providing their indices or ranges of indices.

```python
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr[0, 1])
# Output: 2

print(arr[:2, 1:])
# Output:
# [[2 3]
#  [5 6]]
```

If you are having trouble figuring this one out, please just ask!
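The same index and slice notation can be used on the left-hand side of an assignment to modify elements. The sketch below also shows that a slice is a view on the original data rather than a copy.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Overwrite a single element (row 0, column 1)
arr[0, 1] = 20

# Overwrite a whole row with one value
arr[2, :] = 0

print(arr)
# Output:
# [[ 1 20  3]
#  [ 4  5  6]
#  [ 0  0  0]]

# A slice is a view: changing it also changes the original array
first_column = arr[:, 0]
first_column[0] = -1
print(arr[0, 0])
# Output: -1
```

If you need an independent copy of a slice, call `.copy()` on it.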

#### Broadcasting

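Broadcasting allows us to perform mathematical operations on arrays with different shapes and dimensions. When the shapes are compatible, NumPy automatically stretches the smaller array to match the shape of the larger one. Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1. The simplest case is combining an array with a single number: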

```python
arr = np.array([1, 2, 3, 4, 5])

print(arr + 2)
# Output: [3 4 5 6 7]
```
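Broadcasting also works between two arrays. The following sketch adds a 1D array to every row of a 2D array and shows what happens when the shapes are not compatible. The variable names are chosen for illustration.

```python
import numpy as np

# A 2x3 grid of values and one offset per column
grid = np.array([[1, 2, 3], [4, 5, 6]])
column_offsets = np.array([10, 20, 30])

# Shapes (2, 3) and (3,) are compatible: the 1D array is reused for every row
print(grid + column_offsets)
# Output:
# [[11 22 33]
#  [14 25 36]]

# Shapes (2, 3) and (2,) are not compatible, so NumPy raises a ValueError
try:
    grid + np.array([10, 20])
except ValueError as error:
    print("Broadcasting failed:", error)
```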

#### Boolean Indexing

We can use Boolean indexing to filter arrays based on specific conditions. We create a Boolean array by applying a condition to an existing array, and then use that Boolean array to index the original array.

```python
arr = np.array([10, 15, 20, 25, 30])

condition = (arr > 20)
filtered_arr = arr[condition]

print(filtered_arr)
# Output: [25 30]
```
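Conditions can be combined, and a Boolean mask can also be used to assign values. The sketch below uses made-up elevation values to select low-lying land and to replace values below sea level.

```python
import numpy as np

# Hypothetical elevation samples in metres (illustrative values only)
elevation = np.array([-2.0, 0.5, 3.1, 12.4, 25.0])

# Combine conditions with & (and) or | (or); each condition needs parentheses
low_land = elevation[(elevation >= 0) & (elevation < 10)]
print(low_land)
# Output: [0.5 3.1]

# A Boolean mask can also be used to assign values in place
flooded = elevation.copy()
flooded[flooded < 0] = 0.0
print(flooded)
# Output: [ 0.   0.5  3.1 12.4 25. ]
```

Note that Python's `and` and `or` keywords do not work on arrays; use `&` and `|` instead.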

---
## 🚀 Exercise

1. Create a NumPy array of integers from 1 to 20. Use boolean indexing to create a new array containing only the even numbers from the original array.

---
### Array Shapes and Dimensions

NumPy arrays provide attributes and methods for inspecting and changing their shape and dimensions.

#### Shape Attribute

The `shape` attribute returns a tuple with the size of the array along each dimension. For a 2D array this is `(rows, columns)`.

```python
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)
# Output: (2, 3)
```
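Alongside `shape`, a few related attributes are useful when inspecting an unfamiliar dataset. A short sketch:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.ndim)   # number of dimensions: 2
print(arr.size)   # total number of elements: 6
print(arr.dtype)  # element type, e.g. int64 (platform dependent)
```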


#### Reshaping Arrays

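We can reshape arrays using the `reshape()` method. It returns an array with the new shape and leaves the shape of the original array unchanged. The total number of elements must stay the same. Where possible, the result is a view that shares its data with the original array rather than a copy.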

```python
arr = np.array([1, 2, 3, 4, 5, 6])

reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)
# Output:
# [[1 2 3]
#  [4 5 6]]
```
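Two details of `reshape()` are worth seeing in action: one dimension can be given as `-1` so that NumPy infers it, and the reshaped array may share its data with the original. A minimal sketch:

```python
import numpy as np

arr = np.arange(6)

# -1 asks NumPy to work out that dimension from the array's size
reshaped = arr.reshape(3, -1)
print(reshaped.shape)
# Output: (3, 2)

# The original keeps its shape...
print(arr.shape)
# Output: (6,)

# ...but here the reshaped array is a view on the same data
reshaped[0, 0] = 99
print(arr[0])
# Output: 99
```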

#### Transposing Arrays

The `transpose()` method reverses the order of an array's dimensions. For a 2D array, this swaps rows and columns.

```python
arr = np.array([[1, 2, 3], [4, 5, 6]])

transposed_arr = arr.transpose()
print(transposed_arr)
# Output:
# [[1 4]
#  [2 5]
#  [3 6]]
```
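The `.T` attribute is a common shorthand for `transpose()`. The sketch below also shows how transposing can make two arrays compatible for element-wise operations; the second array is made up for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# .T is shorthand for transpose()
print(arr.T.shape)
# Output: (3, 2)

# Transposing makes the (2, 3) array compatible with a (3, 2) array
other = np.ones((3, 2))
print((arr.T * other).shape)
# Output: (3, 2)
```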

---
## 🚀 Exercises

1. Create a 1D NumPy array containing the numbers from 0 to 9.


2. Create a 2D NumPy array of shape (3, 3) filled with random integers between 0 and 100.


3. Access the element at index (1, 2) of the above array.

4. Reshape the array from exercise 2 to have shape (9, 1).

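5. Perform element-wise multiplication between two NumPy arrays of shapes (2, 3) and (3, 2). These shapes are not broadcast-compatible as they stand, so you will need to transpose one of the arrays first.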

---
## Basic Array Operations

NumPy provides a large collection of mathematical functions that operate on arrays. In this section, we will explore some basic operations that can be performed on NumPy arrays, with a focus on geospatial data.

### Element-wise operations

Element-wise operations are operations that are **performed on each element of an array individually**. Common element-wise operations in NumPy include arithmetic operations and mathematical functions.

#### Arithmetic operations

NumPy supports all basic arithmetic operations such as addition, subtraction, multiplication, and division. These operations can be performed between two arrays of compatible shapes or between an array and a scalar value.

Example:
```python
import numpy as np

# create two arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# perform arithmetic operations
c = a + b  # element-wise addition
d = a - b  # element-wise subtraction
e = a * b  # element-wise multiplication
f = a / b  # element-wise division

print(c)  # [5 7 9]
print(d)  # [-3 -3 -3]
print(e)  # [ 4 10 18]
print(f)  # [0.25 0.4  0.5]
```


#### Mathematical functions

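NumPy's mathematical functions are applied to every element of an array and return a new array of the same shape. Some common ones include `sqrt`, `exp`, `log`, `sin`, `cos`, and `tan`. The trigonometric functions expect angles in radians.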

Example:
```python
import numpy as np

# create an array
a = np.array([1, 2, 3])

# apply mathematical functions
b = np.sqrt(a)  # element-wise square root
c = np.exp(a)  # element-wise exponential
d = np.log(a)  # element-wise natural logarithm
e = np.sin(a)  # element-wise sine function

print(b)  # [1.         1.41421356 1.73205081]
print(c)  # [ 2.71828183  7.3890561  20.08553692]
print(d)  # [0.         0.69314718 1.09861229]
print(e)  # [0.84147098 0.90929743 0.14112001]
```
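As a geospatial illustration, these functions can be combined to estimate great-circle distances with the haversine formula. This is a sketch under stated assumptions: the function name `haversine_km` is hypothetical, the Earth is treated as a sphere with a mean radius of 6371 km, and the city coordinates are approximate.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    # Trigonometric functions expect radians, so convert first
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Approximate coordinates (illustrative): Groningen as the origin
origin_lat, origin_lon = 53.2194, 6.5665

# Destinations: Amsterdam and Groningen itself
dest_lats = np.array([52.3676, 53.2194])
dest_lons = np.array([4.9041, 6.5665])

# One call computes the distance to every destination at once
distances = haversine_km(origin_lat, origin_lon, dest_lats, dest_lons)
print(distances)
```

Because every step is element-wise, the same function works unchanged for two destinations or two million.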

#### Array operations with scalars

NumPy allows for easy element-wise operations between arrays and scalar values. When an arithmetic operation is performed between an array and a scalar, the scalar value is broadcast to match the shape of the array, and the operation is then performed element-wise.

Example:
```python
import numpy as np

# create an array
a = np.array([1, 2, 3])

# perform array operations with a scalar value
b = a + 5  # element-wise addition
c = a * 2  # element-wise multiplication
d = a / 3  # element-wise division

print(b)  # [6 7 8]
print(c)  # [2 4 6]
print(d)  # [0.33333333 0.66666667 1.        ]
```

#### Statistical functions

NumPy also provides a variety of statistical functions. Unlike the element-wise operations above, these summarize the values of an array, by default reducing all elements to a single number. Some common statistical functions include `sum`, `mean`, `max`, `min`, `std`, and `var`.

Example:
```python
import numpy as np

# create an array
a = np.array([[1, 2, 3], [4, 5, 6]])

# apply statistical functions
b = np.sum(a)  # sum of all elements
c = np.mean(a)  # mean of all elements
d = np.max(a)  # maximum value
e = np.min(a)  # minimum value
f = np.std(a)  # standard deviation
g = np.var(a)  # variance

print(b)  # 21
print(c)  # 3.5
print(d)  # 6
print(e)  # 1
print(f)  # 1.707825127659933
print(g)  # 2.9166666666666665
```
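These functions also accept an `axis` argument, which restricts the calculation to one dimension so that you get one result per column or per row. A short sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0 collapses the rows, giving one result per column
print(np.sum(a, axis=0))
# Output: [5 7 9]

# axis=1 collapses the columns, giving one result per row
print(np.mean(a, axis=1))
# Output: [2. 5.]
```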

---
## 🚀 Exercises
1. Import the NumPy library and create a 2D array of shape (3, 4) with random values.

2. Calculate the mean, median, and standard deviation of the array created in exercise 1.


3. Create a 1D array of shape (10,) with values ranging from 0 to 9 and reshape it to (5, 2).


4. Multiply each element of the array created in exercise 3 by 2.


5. Create a new 2D array of shape (5, 2) and perform element-wise multiplication with the array created in exercise 3.


6. Create a random array of shape (100, 2) using NumPy's random module and calculate the minimum and maximum values for each column.


## Conclusion and Next Steps

### Summary of NumPy's importance in geospatial data analysis:
- NumPy is a fundamental library for numerical computing in Python
- It provides efficient operations for array manipulation and mathematical computations
- NumPy's array object is the foundation for many other libraries used in geospatial data analysis, such as Pandas and Matplotlib
- It allows for faster processing and analysis of large geospatial datasets
- NumPy's broadcasting feature simplifies computations on arrays of different but compatible shapes

### Additional resources for further learning:
- NumPy official documentation: [https://numpy.org/doc/](https://numpy.org/doc/)
- NumPy tutorial on GeeksforGeeks: [https://www.geeksforgeeks.org/python-numpy-module-tutorial/](https://www.geeksforgeeks.org/python-numpy-module-tutorial/)
- NumPy tutorial on DataCamp: [https://www.datacamp.com/community/tutorials/python-numpy-tutorial](https://www.datacamp.com/community/tutorials/python-numpy-tutorial)
- "Python for Data Science For Dummies" by John Paul Mueller and Luca Massaron