# Selection of CO2 Data in the Upper Midwest from Large Atmospheric Datasets

# Jake Kastenbauer

Satellite observations of atmospheric CO2 tend to come in files with thousands to hundreds of thousands of data points. To find data points that meet the criteria for a given study area, geographic filters must first be applied. Here, steps are shown to demonstrate how CO2 data can be obtained for a study area within Minnesota and Wisconsin.

# 1. Load Modules and Dataset

First, the modules needed to read netCDF4 files are imported and the dataset is opened.

In [1]:
# Import modules
import numpy as np
import netCDF4 as nc

In [2]:
# Open a NetCDF4 file
# This will be used to obtain sample CO2 data
nc_file = nc.Dataset('oco2.nc4', 'r')

# 2. Set Parameters to Filter Dataset

Next, arbitrary upper and lower bounds on latitude and longitude are defined to filter the dataset for data points within Minnesota and Wisconsin. Variables are created for the CO2 concentration, latitude, and longitude, and the indices of the data points that meet the criteria are saved.

In [3]:
# Create upper and lower bounds of acceptable areas
# First by latitude
lower_lat = 43.00000
upper_lat = 49.22000

# Then longitude
lower_long = -97.50000
upper_long = -89.00000

In [4]:
# Create variables for latitude, longitude, and fractional CO2 data
latitude_variable = nc_file.variables['latitude']
longitude_variable = nc_file.variables['longitude']
co2_variable = nc_file.variables['xco2']

In [5]:
# Read the coordinate arrays once, then filter on latitude and longitude criteria
latitudes = latitude_variable[:]
longitudes = longitude_variable[:]
selected_indices = np.where(
    (lower_lat <= latitudes) & (latitudes <= upper_lat) &
    (lower_long <= longitudes) & (longitudes <= upper_long)
)[0]
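
The bounding-box test can be checked on a few hand-made coordinates before it is run on a full file. The following is a minimal sketch that assumes only NumPy. The coordinate values and the `in_bounding_box` helper are hypothetical and are not part of the OCO-2 file; only the bounds are taken from the cells above.

```python
import numpy as np

# Bounds copied from the cells above
lower_lat, upper_lat = 43.0, 49.22
lower_long, upper_long = -97.5, -89.0


def in_bounding_box(lat, lon):
    """Return a boolean mask of points inside the box (hypothetical helper)."""
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    return (
        (lower_lat <= lat) & (lat <= upper_lat) &
        (lower_long <= lon) & (lon <= upper_long)
    )


# Made-up test coordinates, for illustration only
test_lat = np.array([46.8, 41.9, 44.9, 46.0, 50.1])
test_lon = np.array([-92.1, -87.6, -93.2, -80.0, -95.0])

mask = in_bounding_box(test_lat, test_lon)
test_indices = np.flatnonzero(mask)  # equivalent to np.where(mask)[0]
print(test_indices)  # [0 2]
```

Only the first and third points fall inside both the latitude and longitude ranges, so only their indices are returned.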

# 3. Point Selection

Finally, 10 data points are drawn at random, without replacement, from those that meet the criteria.

In [6]:
# Randomly select 10 data points that meet the criteria
if len(selected_indices) >= 10:
    random_indices = np.random.choice(selected_indices, 10, replace=False)
    
    # Use the selected indices to extract the desired data
    sampled_co2_data = co2_variable[random_indices]
    sampled_latitude_data = latitude_variable[random_indices]
    sampled_longitude_data = longitude_variable[random_indices]
    

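    # Display the data
    print("Sampled CO2 data:", sampled_co2_data)
    print("Sampled Latitude data:", sampled_latitude_data)
    print("Sampled Longitude data:", sampled_longitude_data)
else:
    # Report why nothing was sampled instead of silently doing nothing
    print("Fewer than 10 data points meet the criteria:", len(selected_indices))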

Sampled CO2 data: [418.26935 417.31525 417.67178 416.10205 418.47934 418.5734  418.49542
 417.5787  418.94144 418.73038]
Sampled Latitude data: [46.833374 46.08917  46.81354  46.84771  46.82212  46.83594  46.819393
 46.100273 46.844597 46.8161  ]
Sampled Longitude data: [-89.33213  -89.05609  -89.32468  -89.352776 -89.33152  -89.326195
 -89.33813  -89.057014 -89.33284  -89.318756]
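
Because `np.random.choice` draws a different sample on every run, the points printed above will generally not be reproduced exactly. One possible variation, sketched here on the assumption that a fixed seed is acceptable for the study, uses NumPy's `default_rng` generator and falls back to all matching points when fewer than the requested number are available. The `sample_indices` helper, the seed value, and the stand-in index array are hypothetical.

```python
import numpy as np


def sample_indices(indices, n_samples=10, seed=0):
    """Draw up to n_samples unique indices, reproducibly (hypothetical helper)."""
    indices = np.asarray(indices)
    rng = np.random.default_rng(seed)  # seeded generator gives the same draw each run
    n = min(n_samples, indices.size)   # never ask for more points than exist
    chosen = rng.choice(indices, size=n, replace=False)
    return np.sort(chosen)             # sorted indices keep the file order


# Stand-in for selected_indices, for illustration only
example_indices = np.arange(100, 150)

first_draw = sample_indices(example_indices)
second_draw = sample_indices(example_indices)
print(np.array_equal(first_draw, second_draw))  # True: same seed, same sample

few = sample_indices(np.array([3, 7, 11]))
print(few)  # [ 3  7 11]: all three points are returned
```

Sorting the result is a design choice in this sketch: it keeps the sampled points in the same order as they appear in the file, which can make the output easier to compare against the source data.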


The final output shows ten candidate data points that can be used in subsequent CO2 analyses. Here, CO2 concentration is shown in parts per million (ppm) along with the corresponding geographic coordinates in decimal degrees.