# Extract raster coordinates

In this notebook we extract not only the elevation but also the **coordinates** of several points randomly selected over a digital elevation model (DEM).

<img src="util/raster_DEM_UTM_coord.png" style="width: 600px">

## **Steps**
### 0. Import the necessary libraries

In [None]:
# Import libraries
import numpy as np  # Import the NumPy library for numerical operations, particularly for arrays.
import rasterio  # Import the rasterio library for reading and writing geospatial raster data (like GeoTIFFs).
import matplotlib.pyplot as plt  # Import matplotlib's pyplot module for creating plots and visualizations.

### 1. Load the DEM Raster  
When you load a raster file (like a DEM) using Rasterio in Python, you get more than just the raw elevation data. Here's a breakdown of what you get:

a. **DEM (Digital Elevation Model)**

This is the core data – a 2D array (or sometimes a multi-dimensional array for multi-band rasters) of elevation values. Each cell in the array corresponds to a location on the ground, and the value in that cell represents the elevation at that point.   

In [None]:
# Open the Digital Elevation Model (DEM) raster file
raster = rasterio.open('datos/dem.tif')

# Read the first band of the DEM (assuming it's a single-band raster)
dem = raster.read(1)  # Extracts a 2D array representing elevation values

# Get the number of rows and columns in the DEM
nrows, ncols = dem.shape

b. **Metadata**

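Think of metadata as the descriptive information about your DEM. Depending on the file, it can include the items below; `raster.meta` itself returns only the core ones (driver, data type, NoData value, width, height, band count, CRS, and transform):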
- Spatial Reference System (SRS): Tells you how the DEM is georeferenced (e.g., latitude/longitude, UTM). This is crucial for knowing where your data is located on Earth.
- Data Type: The type of data used to store elevation values (e.g., integers, floating-point numbers). This affects precision and storage size.
- NoData Values: Values used to represent areas where elevation data is missing or invalid.   
- Number of Rows and Columns: The dimensions of the DEM grid.
- Units: The units of measurement for elevation (e.g., meters, feet).
- Creation Date: When the DEM was created or last modified.
- Source: Where the DEM data came from.

In [None]:
raster.meta

c. **Transform**

The transform (often called an affine transform) is a mathematical function that links the pixel coordinates in your DEM array to real-world coordinates in your chosen Spatial Reference System.
It essentially tells you how to go from a pixel in your array to a specific location on the ground (and vice versa).
The transform is usually represented as a 3x3 matrix, but Rasterio provides tools to work with it more easily.

**Why are Metadata and Transform Important?** Without metadata and the transform, your DEM is just a grid of numbers. You wouldn't know where those numbers correspond to on the Earth's surface.
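
As a minimal sketch of the pixel-to-ground mapping (and the reverse), the snippet below builds the 3x3 matrix by hand with NumPy. The numbers (30 m pixels and an upper-left corner at x = 500000, y = 4200000) are made up for illustration and are not read from `dem.tif`.

```python
import numpy as np

# Hypothetical north-up transform: 30 m pixels, upper-left corner at (500000, 4200000)
A = np.array([
    [30.0,   0.0,  500000.0],   # x = 30 * col + 0 * row + 500000
    [ 0.0, -30.0, 4200000.0],   # y = 0 * col - 30 * row + 4200000
    [ 0.0,   0.0,       1.0],
])

# Pixel (col=10, row=20) -> ground coordinates of its upper-left corner
x, y, _ = A @ np.array([10, 20, 1])
print(x, y)

# Ground coordinates -> pixel position, using the inverse matrix
col, row, _ = np.linalg.inv(A) @ np.array([x, y, 1])
print(col, row)
```

The two zero entries are the rotation terms; they vanish for a north-up raster, which is the case assumed here.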

In [None]:
raster.transform

### 2. Generate Random Points  
   - The shape of the DEM is obtained (`nrows, ncols`).
   - A total of `30` random points are generated within the raster’s row and column limits using `np.random.randint()`.

In [None]:
# Define the number of points to extract
n_points = 30

# Generate random row and column indices within the DEM dimensions
np.random.seed(42) # Set a random seed for reproducibility
row_ids = np.random.randint(0, nrows, n_points) # generate random integer values
col_ids = np.random.randint(0, ncols, n_points) # generate random integer values
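
Because rows and columns are drawn independently and with replacement, `np.random.randint` can return the same pixel more than once. If distinct pixels are required, one possible approach (a sketch, not the notebook's method) is to sample flat cell indices without replacement and convert them back to row/column pairs. The raster size below is hypothetical; in the notebook it would come from `dem.shape`.

```python
import numpy as np

# Hypothetical raster size; in the notebook these come from dem.shape
nrows, ncols = 200, 300
n_points = 30

rng = np.random.default_rng(42)  # seeded generator for reproducibility

# Draw distinct flat cell indices, then convert them to (row, col) pairs
flat_ids = rng.choice(nrows * ncols, size=n_points, replace=False)
row_ids, col_ids = np.unravel_index(flat_ids, (nrows, ncols))

print(row_ids[:5], col_ids[:5])
```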

### 3. Extract Elevation Values
   - Using the randomly generated row and column indices, elevation values are retrieved from the `dem` array.

In [None]:
# Extract elevation values at the randomly selected points
elevations = dem[row_ids, col_ids]
print(elevations)
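
Some of the sampled cells may hold the NoData value mentioned in the metadata section. The sketch below uses a tiny made-up array and an assumed NoData value of `-9999` to show how the indexing pairs rows with columns and how missing samples could be turned into `NaN`. In the notebook, the actual value would be available as `raster.nodata`.

```python
import numpy as np

# Tiny made-up DEM; -9999 marks missing cells (assumed NoData value)
dem = np.array([
    [  120.0, 125.0, -9999.0],
    [  118.0, 121.0,   130.0],
    [-9999.0, 119.0,   127.0],
])
nodata = -9999.0

row_ids = np.array([0, 0, 1, 2])
col_ids = np.array([1, 2, 2, 0])

# Fancy indexing pairs row_ids[i] with col_ids[i]
elevations = dem[row_ids, col_ids]

# Replace NoData samples with NaN so they do not distort statistics
elevations = np.where(elevations == nodata, np.nan, elevations)
print(elevations)
print(np.nanmean(elevations))
```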

### 4. Extract the UTM coordinates
To transform `row_ids` and `col_ids` into UTM coordinates, you can use the raster's affine transform, which Rasterio exposes as the `transform` attribute. It maps row and column indices (pixel positions in the raster) to geospatial coordinates. Because the DEM is in a UTM projection, the results are an easting (x) and a northing (y) in metres rather than true longitude and latitude; the names longitude and latitude used below stand for easting and northing, respectively.

For a north-up raster, the formulas to convert row and column indices into UTM coordinates are:

$
\text{longitude} = \text{transform}[0] \times \text{col} + \text{transform}[2]
$

$
\text{latitude} = \text{transform}[4] \times \text{row} + \text{transform}[5]
$

- `transform[0]` is the pixel width along the x axis (called longitude in the formulas above), and `transform[2]` is the x coordinate of the raster's upper-left corner.
- `transform[4]` is the pixel height (negative when rows run from north to south), and `transform[5]` is the y coordinate (called latitude above) of the raster's upper-left corner.
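
The formulas above return the upper-left corner of each pixel and assume that the rotation terms `transform[1]` and `transform[3]` are zero. The sketch below applies them to whole arrays at once and also shows the half-pixel shift that gives the pixel centre. The coefficients are hypothetical, not read from `dem.tif`.

```python
import numpy as np

# Hypothetical transform coefficients (not read from dem.tif)
a, c = 30.0, 500000.0     # pixel width, x of the upper-left corner
e, f = -30.0, 4200000.0   # pixel height (negative), y of the upper-left corner

row_ids = np.array([0, 1, 20])
col_ids = np.array([0, 1, 10])

# Upper-left corner of each pixel, as in the formulas above
x_corner = a * col_ids + c
y_corner = e * row_ids + f

# Centre of each pixel: shift by half a pixel in both directions
x_centre = a * (col_ids + 0.5) + c
y_centre = e * (row_ids + 0.5) + f

print(x_corner, y_corner)
print(x_centre, y_centre)
```

Whether the corner or the centre is the better reference depends on how the coordinates will be used; the centre is usually the more natural location for a sampled elevation value.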

In [None]:
transform = raster.transform

# Convert row/col indices to UTM coordinates using the affine transform.
# The names lats/lons are kept for the rest of the notebook; in a UTM raster
# they hold the northing (y) and easting (x) of each pixel's upper-left corner.
lons = transform[0] * col_ids + transform[2]
lats = transform[4] * row_ids + transform[5]
print(lats)
print(lons)

### 5. Plot the DEM and Sample Points
   - The DEM is displayed using a terrain colormap (`cmap='terrain'`).
   - The extracted elevation points are overlaid on the DEM using `plt.scatter()`, with colors representing their elevation values.
   - A color bar, legend, and title are added for clarity.

In [None]:
# Plot DEM with extracted points
plt.figure(figsize=(10, 5))
plt.imshow(dem, cmap='terrain', origin='upper')
plt.scatter(col_ids, row_ids, c=elevations, edgecolor='k', cmap='coolwarm', label='Sample Points')
plt.colorbar(label='Elevation (m)')
plt.title(f'Extracted Coordinates (n={n_points})')

# Add a coordinate label next to each point
for i in range(n_points):
    plt.text(col_ids[i], row_ids[i], f'({lats[i]:.0f}, {lons[i]:.0f})',
             fontsize=10, ha='right', color='white', weight='bold')

plt.legend()  # show the 'Sample Points' entry defined in plt.scatter
plt.show()
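
The plot above uses pixel indices on both axes. To label the axes in map coordinates instead, one option is to compute the raster's bounding box from the transform and pass it to `plt.imshow` through its `extent` argument. The helper `raster_extent` below is a hypothetical name introduced for this sketch, and the numbers are made up.

```python
def raster_extent(a, c, e, f, nrows, ncols):
    """Return (left, right, bottom, top) for a north-up raster."""
    left = c
    right = c + a * ncols
    top = f
    bottom = f + e * nrows   # e is negative, so bottom < top
    return (left, right, bottom, top)

# Hypothetical values: 30 m pixels, 100 rows x 200 columns
extent = raster_extent(30.0, 500000.0, -30.0, 4200000.0, nrows=100, ncols=200)
print(extent)
```

With the notebook's variables, this could be used as `plt.imshow(dem, cmap='terrain', extent=extent)` followed by `plt.scatter(lons, lats, ...)`, so that the points are placed by coordinate rather than by index.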

### 6. Save the results
Now, let's **save** the coordinates and elevation values as a text file, in this case a comma-separated values (**.csv**) file. For this purpose we can use NumPy.

In [None]:
# Save the three sequences as they are: each one becomes a row, separated by spaces
np.savetxt('lat_lon_elev.csv', (lats, lons, elevations))

Let's make some improvements: stack the values column-wise, add a header, and use commas as the delimiter.

In [None]:
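# Stack them column-wise: one row per point, one column per variable
data = np.column_stack((lats, lons, elevations))
# comments='' keeps np.savetxt from prefixing the header line with '# '
np.savetxt('lat_lon_elev.csv', data, header='lat,lon,elev', delimiter=',', comments='')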
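
A quick way to check the saved file is to read it back. By default `np.savetxt` prefixes the header with `# `; passing `comments=''` writes a plain header line, which then has to be skipped when loading. The sketch below uses made-up values and a temporary folder so that it does not touch the notebook's own output file.

```python
import os
import tempfile
import numpy as np

# Made-up sample values standing in for lats, lons and elevations
data = np.column_stack(([4199985.0, 4199955.0],
                        [500015.0, 500045.0],
                        [125.0, 130.0]))

path = os.path.join(tempfile.mkdtemp(), 'lat_lon_elev.csv')
np.savetxt(path, data, header='lat,lon,elev', delimiter=',', comments='')

# Skip the header row when reading the numbers back
loaded = np.loadtxt(path, delimiter=',', skiprows=1)
print(np.allclose(loaded, data))
```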

Or even better, let's use the **pandas** library and save the results as an Excel file.

In [None]:
import pandas as pd
df = pd.DataFrame(lats, columns = ['lat'])
df.head()

In [None]:
# Let's add two more columns, one for the longitude values and another one for the elevation values
df['lon'] = lons
df['elev'] = elevations
df.head()

Let's save the dataframe as an **Excel file** (.xlsx)

In [None]:
# Requires an Excel engine such as openpyxl; index=False omits the row-number column
df.to_excel('lat_lon_elev.xlsx', index=False)
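
As an alternative to adding the columns one at a time, the DataFrame can be built in a single step from a dictionary. The sketch below uses made-up values and writes CSV text to an in-memory buffer, so it runs without an Excel engine installed.

```python
import io
import pandas as pd

# Made-up sample values standing in for lats, lons and elevations
lats = [4199985.0, 4199955.0]
lons = [500015.0, 500045.0]
elevations = [125.0, 130.0]

# Build all three columns at once from a dictionary
df = pd.DataFrame({'lat': lats, 'lon': lons, 'elev': elevations})

# index=False leaves out the 0, 1, 2, ... row labels
buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue())
```

The same `index=False` argument is accepted by `df.to_excel`.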