<a href="https://colab.research.google.com/github/XdstruCTor/climate-change-data/blob/main/NASA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This cell imports the Python libraries needed for data processing, visualization, and geospatial analysis. Key libraries include:
- `matplotlib.pyplot` for plotting and visualization.
- `numpy` for numerical operations.
- `pandas` for data manipulation and analysis.
- `rasterio` for working with geospatial raster data.

The `rasterio` library is installed using `pip` to enable reading, writing, and manipulating geospatial raster data. This step ensures that all dependencies, such as `affine`, `click`, and `cligj`, are installed.

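Two functions are imported from `rasterio` submodules, along with the standard-library `os` module:
- `merge` (from `rasterio.merge`) for combining multiple raster files into a single mosaic.
- `show` (from `rasterio.plot`) for visualizing raster data.
- `os` for interacting with the file system.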

In [None]:
!pip install rasterio
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import rasterio
from rasterio.merge import merge
from rasterio.plot import show
import os

A ZIP file containing geospatial data is extracted to a specified directory. This step prepares the data for further processing by unpacking the compressed files.

In [None]:
import zipfile

In [None]:
zip_path = "path-to-data/TopDownEmissions_GOSAT_post_coal_GEOS_CHEM_2019.tif_undefined.zip"
extracted_dir = "path-to-extracted/extracted_data/"

In [None]:
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extracted_dir)

In [None]:
extracted_files = os.listdir(extracted_dir)
print(f"Extracted files: {extracted_files}")

Only TIFF files (geospatial raster data) are selected from the extracted files. These files will be used for further analysis and merging.

Each TIFF file is opened using `rasterio`, and the file objects are stored in a list. This prepares the files for the mosaicking process, where multiple raster files are combined into a single raster.

The `merge` function from `rasterio` is used to combine the individual raster files into a single mosaic. This creates a unified raster dataset that covers the entire area of interest.

The merged raster is saved as a new TIFF file. The metadata from the first input file is used to ensure consistency in the output file.
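
One point worth checking when reusing the first file's metadata: the merged array can be larger than any single input tile, so the copied `height`, `width`, and `transform` may no longer describe it. The sketch below shows one way to align the metadata with the mosaic before writing. The helper name `mosaic_metadata` is hypothetical, and the string transforms are placeholders standing in for the affine transform that `merge` returns.

```python
import numpy as np

def mosaic_metadata(first_meta, mosaic, out_transform):
    """Return a copy of first_meta adjusted to the merged array.

    mosaic is expected in (bands, height, width) order, which is the
    layout returned by rasterio.merge.merge.
    """
    meta = dict(first_meta)
    meta.update({
        "driver": "GTiff",
        "count": mosaic.shape[0],
        "height": mosaic.shape[1],
        "width": mosaic.shape[2],
        "transform": out_transform,
    })
    return meta

# Placeholder metadata for a single 10 x 10 tile (illustration only).
tile_meta = {"driver": "GTiff", "count": 1, "height": 10, "width": 10,
             "transform": "tile transform"}

# Two tiles merged side by side give a 10 x 20 mosaic.
demo_mosaic = np.zeros((1, 10, 20))
merged_meta = mosaic_metadata(tile_meta, demo_mosaic, "mosaic transform")
print(merged_meta["height"], merged_meta["width"])
```

In the notebook, the same update would be applied to `out_meta` using `mosaic` and `out_trans` before opening the output file for writing.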

The merged raster is visualized using `rasterio.plot.show`. The `viridis` colormap is applied to highlight the data values, providing a clear visual representation of the raster.

The Coordinate Reference System (CRS) and bounds of the merged raster are printed. This metadata is essential for understanding the spatial context of the data.

**Documentation for dataset 1 (GHCN-Daily, Version 3):**
https://catalog.data.gov/dataset/global-historical-climatology-network-daily-ghcn-daily-version-32

In [None]:
# Collect the extracted GeoTIFF files
tif_files = [os.path.join(extracted_dir, f) for f in os.listdir(extracted_dir) if f.endswith('.tif')]

# Open every tile so that merge() can read it
src_files_to_mosaic = [rasterio.open(tif_file) for tif_file in tif_files]
mosaic, out_trans = merge(src_files_to_mosaic)

# Start from the metadata of the first file, then match it to the mosaic,
# whose size and transform can differ from those of a single tile
out_meta = src_files_to_mosaic[0].meta.copy()
out_meta.update({
    "driver": "GTiff",
    "height": mosaic.shape[1],
    "width": mosaic.shape[2],
    "transform": out_trans,
})

# The tiles are no longer needed once the mosaic is in memory
for src in src_files_to_mosaic:
    src.close()

output_path = "merged_output.tif"
with rasterio.open(output_path, "w", **out_meta) as dest:
    dest.write(mosaic)

print(f"Merged file saved as: {output_path}")

# Visualize the merged raster
show(mosaic, cmap='viridis')
merged_tif = output_path

# Print the CRS and bounds of the merged raster
with rasterio.open(merged_tif) as dataset:
    print(f"CRS of the merged raster: {dataset.crs}")
    print(f"Bounds of the raster: {dataset.bounds}")


A bounding box is defined for the Democratic Republic of the Congo (DRC) to extract and visualize data for this specific region. The bounding box coordinates are specified in longitude and latitude.

Using the bounding box, a subset of the merged raster data is extracted and visualized. This allows for focused analysis of the DRC region.



A smaller bounding box is defined around Lubumbashi, a city in the DRC. Data for this specific location is extracted and stored in a DataFrame for further analysis.
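
As a rough illustration of the DataFrame step, the sketch below flattens a 2D window into one row per pixel with the longitude and latitude of each pixel centre. The helper `window_to_dataframe`, its parameters, and the demo values are assumptions made for this example, not part of the original notebook. It assumes a north-up raster in a geographic CRS with square or rectangular cells.

```python
import numpy as np
import pandas as pd

def window_to_dataframe(data, west, north, pixel_width, pixel_height):
    """Flatten a 2D raster window into rows of (lon, lat, value).

    west, north: coordinates of the upper-left corner of the window.
    pixel_width, pixel_height: positive cell sizes in degrees.
    """
    n_rows, n_cols = data.shape
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    # Pixel centres: half a cell in from the upper-left corner
    lon = west + (cols + 0.5) * pixel_width
    lat = north - (rows + 0.5) * pixel_height
    return pd.DataFrame({
        "lon": lon.ravel(),
        "lat": lat.ravel(),
        "value": data.ravel(),
    })

# Made-up 2 x 2 window with 0.5 degree cells (illustration only).
demo = np.array([[1.0, 2.0], [3.0, 4.0]])
df_demo = window_to_dataframe(demo, west=27.0, north=-11.0,
                              pixel_width=0.5, pixel_height=0.5)
print(df_demo)
```

In the notebook, the corner coordinates and cell sizes for the Lubumbashi window could be read from `dataset.window_transform(window)`, since the window returned by `from_bounds` does not necessarily align with the requested bounding box.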

A second dataset (methane emissions) is processed similarly to the first. The data is extracted, merged, and visualized, following the same workflow as before.

The metadata of the second merged raster is inspected, including the number of layers (bands), width, height, CRS, and bounds, to confirm that the data was processed correctly and is ready for analysis.

The raster data from the second dataset is converted into a Pandas DataFrame. This allows for further analysis and comparison with the first dataset.
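
One possible shape for this conversion is a long-format table with one row per valid pixel. The sketch below is an assumption about how that could look, not the notebook's own code: the helper `raster_to_long_frame`, the `-9999.0` nodata value, and the demo array are all made up for illustration.

```python
import numpy as np
import pandas as pd

def raster_to_long_frame(band, nodata=None):
    """Return one row per valid pixel: row index, column index, value."""
    rows, cols = np.indices(band.shape)
    frame = pd.DataFrame({
        "row": rows.ravel(),
        "col": cols.ravel(),
        "value": band.ravel(),
    })
    # Drop pixels flagged with the nodata value, if one is given
    if nodata is not None:
        frame = frame[frame["value"] != nodata]
    # Drop pixels stored as NaN
    return frame.dropna(subset=["value"]).reset_index(drop=True)

# Made-up 2 x 2 band with one nodata pixel and one NaN pixel.
demo_band = np.array([[0.1, -9999.0], [np.nan, 0.4]])
frame = raster_to_long_frame(demo_band, nodata=-9999.0)
print(frame)
```

Keeping the row and column indices makes it possible to join this table with one built the same way from the first dataset, provided both rasters share the same grid.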







In [None]:
from pyproj import Proj, transform
from rasterio.transform import rowcol
from rasterio.windows import from_bounds

merged_tif = "merged_output.tif"

# bounding box for DRC
drc_bounds = {
    "west": 12.0,
    "east": 31.0,
    "south": -13.0,
    "north": 5.0
}

with rasterio.open(merged_tif) as dataset:
    # Pixel window covering the DRC bounding box
    window = from_bounds(drc_bounds['west'], drc_bounds['south'],
                         drc_bounds['east'], drc_bounds['north'],
                         transform=dataset.transform)

    data = dataset.read(1, window=window)

    plt.imshow(data, cmap='viridis', extent=(drc_bounds['west'], drc_bounds['east'],
                                             drc_bounds['south'], drc_bounds['north']))
    plt.colorbar(label='Values')
    plt.title("Data for Democratic Republic of the Congo (DRC)")
    plt.show()


In [None]:
# bounding box around Lubumbashi
lubum_box = {
    "west": 27.1960,
    "east": 27.7247,
    "south": -11.7934,
    "north": -11.4894
}
with rasterio.open(merged_tif) as dataset:
    # Pixel window covering the Lubumbashi bounding box
    window = from_bounds(lubum_box['west'], lubum_box['south'],
                         lubum_box['east'], lubum_box['north'],
                         transform=dataset.transform)

    data_l = dataset.read(1, window=window)

# Paths for the second dataset (methane emissions)
zip_path1 = "/content/drive/MyDrive/codes/methane_emis_fossil_199901.tif_undefined (1).zip"
extracted_dir1 = "/content/drive/MyDrive/extracted_data1/"

**Documentation 2nd dataset (methane emissions):**
https://earth.gov/ghgcenter/data-catalog/tm54dvar-ch4flux-monthgrid-v1

In [None]:
with zipfile.ZipFile(zip_path1, 'r') as zip_ref:
    zip_ref.extractall(extracted_dir1)
extracted_files1 = os.listdir(extracted_dir1)
print(f"Extracted files: {extracted_files1}")
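tif_files1 = [os.path.join(extracted_dir1, f) for f in os.listdir(extracted_dir1) if f.endswith('.tif')]

# Open every tile, merge, and align the metadata with the mosaic
src_files_to_mosaic1 = [rasterio.open(tif_file) for tif_file in tif_files1]
mosaic1, out_trans1 = merge(src_files_to_mosaic1)
out_meta1 = src_files_to_mosaic1[0].meta.copy()
out_meta1.update({
    "driver": "GTiff",
    "height": mosaic1.shape[1],
    "width": mosaic1.shape[2],
    "transform": out_trans1,
})

# The tiles are no longer needed once the mosaic is in memory
for src in src_files_to_mosaic1:
    src.close()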

In [None]:
output_path1 = "merged_output1.tif"
with rasterio.open(output_path1, "w", **out_meta1) as dest:
    dest.write(mosaic1)

print(f"Merged file saved as: {output_path1}")

show(mosaic1, cmap='viridis')
merged_tif1 = output_path1

# Reuse the DRC bounding box (drc_bounds) defined earlier
with rasterio.open(merged_tif1) as dataset:
    # Pixel window covering the DRC bounding box
    window = from_bounds(drc_bounds['west'], drc_bounds['south'],
                         drc_bounds['east'], drc_bounds['north'],
                         transform=dataset.transform)

    data1 = dataset.read(1, window=window)
    plt.imshow(data1, cmap='viridis', extent=(drc_bounds['west'], drc_bounds['east'],
                                              drc_bounds['south'], drc_bounds['north']))
    plt.colorbar(label='Values')
    plt.title("Data for Democratic Republic of the Congo (DRC), second dataset")
    plt.show()