## Creating Weather Animation Part 2 - Weather Animation with HydroEstimator Data

In [None]:
#Your code goes here:

# Importing necessary libraries for handling file operations and data manipulation.

# The glob library is used for collecting file paths that match a specific pattern. This is useful in scenarios where multiple
# data files from a dataset, such as HydroEstimator satellite data, need to be accessed and processed together.


# xarray is a powerful Python library designed to work with labeled multi-dimensional arrays and datasets. It simplifies 
# working with complex data structures often found in satellite data, providing a more intuitive, pandas-like interface.
# This library is particularly useful for handling netCDF files commonly used in meteorology and climate science, which
# are formats that the HydroEstimator data might use.


In [None]:
#Your code goes here:

# Importing the os module to interact with the operating system. This module provides a portable way of using operating system
# dependent functionality like reading or writing to the file system.


# Obtain the current working directory where this notebook is running. This is useful for constructing paths that are relative
# to the notebook's location, ensuring that the code is portable and can be run on different machines or environments without
# modification.


# Append the subdirectory 'input_data' to the current directory. This is where we expect to find the HydroEstimator data files.
# Constructing paths in this manner allows for flexibility and easy configuration of file locations.
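
The path construction described in this cell can be sketched as follows. This is a minimal illustration, not the notebook's reference solution: the helper name `data_dir` is hypothetical, and the variable name `fllst` is borrowed from the comments in the next cell.

```python
import os

def data_dir(base=None, subdir="input_data"):
    """Hypothetical helper: return the path of `subdir` under `base`.

    When `base` is None, the current working directory is used.
    """
    base = os.getcwd() if base is None else base
    # os.path.join inserts the correct separator for the current platform
    return os.path.join(base, subdir)

fllst = data_dir()
print(fllst)
```

Using `os.path.join` rather than string concatenation avoids hard-coding `/` or `\` in the path.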


In [None]:
#Your code goes here:

# Generate a list of file paths for data files matching a specific pattern, using the glob function from the glob module.

# Concatenate the directory path 'fllst' with the specific pattern for HydroEstimator data files. The pattern
# 'NPR.GEO.GHE.v1.*.nc.gz' is designed to match all compressed netCDF files in the 'input_data' directory that start with the HydroEstimator
# naming convention. These files contain geospatial data typically used for estimating precipitation, crucial in weather analysis and forecasting.

# The resulting list 'glst' will contain all file paths that match this pattern, enabling bulk processing of data files without manually
# specifying each file's name. This approach is highly efficient for handling large datasets common in satellite data analysis.


# Display the list of file paths to verify correct retrieval and to provide a clear idea of the data files available for analysis.
# This step is essential for debugging and ensuring that the setup correctly captures the targeted data files.
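
A small, self-contained sketch of the pattern matching described above, run against empty stand-in files in a temporary directory. The timestamped file names are hypothetical examples, not real HydroEstimator files. Because `glob.glob` makes no promise about ordering, the sketch sorts the result, which matters later if the files are meant to be animated in time order.

```python
import glob
import os
import tempfile

# Stand-in file names (hypothetical); only the first two match the pattern
demo_names = [
    "NPR.GEO.GHE.v1.S202001010000.nc.gz",
    "NPR.GEO.GHE.v1.S202001010015.nc.gz",
    "NPR.GEO.GHE.v1.Navigation.netcdf.gz",
    "notes.txt",
]

with tempfile.TemporaryDirectory() as tmp:
    # Create empty files so glob has something to find
    for name in demo_names:
        open(os.path.join(tmp, name), "w").close()

    pattern = os.path.join(tmp, "NPR.GEO.GHE.v1.*.nc.gz")
    glst = sorted(glob.glob(pattern))  # sort so the files are in name order
    names = [os.path.basename(p) for p in glst]

print(names)
```

The navigation file ends in `.netcdf.gz`, so the `*.nc.gz` pattern leaves it out of the list.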


In [None]:
# Access the first element (index 0) from the list of file paths (glst).
# In Python, lists are zero-indexed, meaning the first element is accessed with index 0. This concept is fundamental in most programming
# languages and is crucial for iterating over and accessing elements within data structures efficiently.
fl = glst[0]

# Print the value of the 'fl' variable, which holds the path to the first HydroEstimator data file in the list.
# Printing this value serves as a check to ensure that we are retrieving the correct file path from our list of files.
# It's particularly useful for debugging and verifying that the file paths have been constructed and gathered correctly.
print(fl)

In [None]:
#Your code goes here:
# Importing necessary libraries for data handling


# Construct the path to the specific HydroEstimator data file you want to access.
# Here, the file is 'NPR.GEO.GHE.v1.Navigation.netcdf.gz' located in the 'input_data' subdirectory.
# First, obtain the current working directory.


# Append the 'input_data' directory and the specific file name to the current directory to create a full path.


# Open the dataset using xarray. The file is gzip-compressed: xarray can open a '.gz' path directly only when it holds
# netCDF3 data (it is then read through the scipy engine). Otherwise, decompress the file first, programmatically or manually.
# The engine parameter may vary based on file specifics and library versions.

# Display the dataset to verify that it has been loaded correctly.
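
If the compressed file cannot be opened directly, one option is to decompress it to a temporary file first. The sketch below uses only the standard library; the helper name `gunzip_to_temp` is hypothetical, and the demonstration compresses a few stand-in bytes rather than a real netCDF file.

```python
import gzip
import os
import shutil
import tempfile

def gunzip_to_temp(gz_path, suffix=".nc"):
    """Hypothetical helper: decompress `gz_path` into a new temporary file.

    Returns the path of the decompressed file; the caller should delete it
    when it is no longer needed.
    """
    fd, out_path = tempfile.mkstemp(suffix=suffix)
    with gzip.open(gz_path, "rb") as src, os.fdopen(fd, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the data, no full read into memory
    return out_path

# Demonstration with stand-in content (not a real netCDF file)
demo_dir = tempfile.mkdtemp()
demo_gz = os.path.join(demo_dir, "demo.nc.gz")
payload = b"stand-in bytes, not a real netCDF file"
with gzip.open(demo_gz, "wb") as f:
    f.write(payload)

out_path = gunzip_to_temp(demo_gz)
with open(out_path, "rb") as f:
    restored = f.read()
print(restored == payload)
```

With a real file, the returned path could then be passed to `xr.open_dataset`.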


In [None]:
# Access and retrieve a specific data point from the 'latitude' variable within the HydroEstimator dataset.
# This operation demonstrates how to extract individual values from a multi-dimensional array stored in an xarray dataset.

# The 'ncg' dataset contains various geospatial data, including latitude and longitude information. Data within this dataset
# is organized in dimensions that typically correspond to different axes in the data (e.g., time, latitude, longitude).

# Here, we are accessing the latitude value at the first row and first column of the dataset. Indexing in Python starts at 0,
# so '[0, 0]' refers to the first element in both the row and column dimensions. This is particularly useful when you need
# to retrieve a specific geographic coordinate for use in further calculations or when performing data verification.

# 'ncg['latitude']' accesses the latitude array within the dataset. Adding '[0, 0]' specifies the exact position in the array
# from which to extract the data. The '.data' at the end returns the underlying value as a zero-dimensional NumPy array
# rather than an xarray DataArray; wrap it in float() or call '.item()' instead if a plain Python number is needed.

# It's important to understand this indexing method when working with satellite data, as precise data extraction is often
# necessary for analyzing specific areas or phenomena.


# Print the extracted latitude value to verify the correct data retrieval and to demonstrate the output.
# This step is crucial for debugging and ensuring the accuracy of data manipulation tasks.
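
The indexing described above can be tried on a tiny synthetic dataset. The values and the dimension names `lines` and `elems` below are assumptions chosen to resemble the navigation file; they are not read from it.

```python
import numpy as np
import xarray as xr

# Synthetic 2x2 stand-in for the navigation dataset (values are made up)
ncg = xr.Dataset(
    {"latitude": (("lines", "elems"), np.array([[65.0, 65.0], [64.9, 64.9]]))}
)

# First row, first column; .data gives a zero-dimensional NumPy array
lat00 = ncg["latitude"][0, 0].data
print(lat00)

# .item() converts the same element to a plain Python float
lat00_py = ncg["latitude"][0, 0].item()
print(type(lat00_py))
```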


In [None]:
#Your code goes here:

# The variable 'fl' is assumed to contain the file path of the NetCDF dataset we intend to open.
# It's important to verify that 'fl' is correctly defined and points to a valid NetCDF file before this step.

# Open a NetCDF dataset using xarray. This library simplifies the process of loading, manipulating, and analyzing complex
# multi-dimensional scientific data. NetCDF (Network Common Data Form) is a widely used format for array-oriented scientific
# data, especially in meteorology and oceanography.

# 'xr.open_dataset()' is used here to load the NetCDF file specified by the path in 'fl'. This function automatically
# handles the dataset's dimensions, coordinates, and attributes, providing an accessible and manipulable representation
# of the data in Python.


# Display the dataset object. This will output a summary of the dataset's contents, including its dimensions, coordinates,
# and variables. Displaying the dataset immediately after loading is a good practice as it provides an overview of the data's
# structure, available variables, and metadata, which are crucial for planning further data analysis tasks.



In [None]:
#Your code goes here:

# Calculate the longitude and latitude values based on the spatial dimensions of the 'rain' variable in the dataset.
# This is important for correctly georeferencing the data in geographical space.

# Retrieve the shape of the 'rain' variable, which represents precipitation data.
# 'ly' corresponds to the number of latitude points, and 'lx' corresponds to the number of longitude points.


# Calculate the increment per step in longitude (dtx) across the total range.
# The range here is assumed to be from -180 to 180 degrees. This is calculated by dividing the total range
# by the number of points in longitude (lx).


# Generate a list of longitude values starting from -180 degrees, increasing by 'dtx' with each step.


# Similarly, calculate the increment per step in latitude (dty) from -65 to 65 degrees.
# The division by 'ly' distributes the latitude values evenly between these bounds.


# Generate a list of latitude values starting from -65 degrees, increasing by 'dty' with each step.

# Reverse the latitude array if needed, depending on the data orientation.

# Add longitude and latitude arrays as coordinates to the dataset. This enhances data accessibility and usability,
# enabling operations that require geographical referencing, such as plotting and spatial analysis.
#Your code goes here:



# Rename dimensions to be more intuitive. 'lines' are commonly used in image data but here it's more appropriate to use 'lat',
# and similarly, 'elems' is replaced with 'lon'. This renaming makes the dataset dimensions clearer and more standard,
# which is helpful for subsequent data handling and analysis.
#Your code goes here:



# Often in meteorological data, a zero value might represent missing or undefined data. Here, replace zero values in the 'rain'
# variable with NaN to properly represent missing data. This step is crucial for accurate statistical analysis and visualization,
# as it prevents zero values from skewing analyses and plots.
#Your code goes here:



# Print the updated dataset to verify changes.
#Your code goes here:
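
The coordinate arithmetic in this cell can be checked on a small grid before applying it to the real data. The sketch below uses plain Python lists and a hypothetical helper `axis_values`; the grid size (8 by 4) is made up so the numbers are easy to verify by hand.

```python
def axis_values(start, stop, n):
    """Hypothetical helper: n evenly spaced values beginning at `start`.

    The step is the total range divided by n, so the last value is one
    step short of `stop`.
    """
    step = (abs(start) + abs(stop)) / n
    return [start + step * i for i in range(n)]

lx, ly = 8, 4  # made-up grid size: 8 longitude points, 4 latitude points

lons = axis_values(-180, 180, lx)
lats = axis_values(-65, 65, ly)
lats.reverse()  # north-to-south ordering, as in the notebook

print(lons)  # [-180.0, -135.0, -90.0, -45.0, 0.0, 45.0, 90.0, 135.0]
print(lats)  # [32.5, 0.0, -32.5, -65.0]
```

Note that with this formula the values never reach the upper bound (180 or 65): each value marks the low edge of a grid cell rather than its center.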




In [None]:
# Plot the 'rain' variable from the dataset using xarray's built-in plotting capabilities.
# Xarray integrates with Matplotlib to provide a convenient way to visualize data directly from DataArrays.
# This is particularly useful for quick examinations of data and preliminary analysis.

# 'nc['rain'].plot()' automatically creates a plot of the 'rain' variable. This function utilizes xarray's ability to handle
# labelled data, automatically selecting the appropriate plot type (in this case, likely a 2D image plot if 'rain' is a two-dimensional variable).
# It labels axes based on the dimensions of the data and uses the variable's metadata to set titles and colorbar labels.

# This kind of plot is essential for initial data exploration and quality control, allowing researchers and students to quickly
# identify patterns, anomalies, or issues within the dataset. For example, visualizing precipitation data can help in identifying
# areas of heavy rainfall or comparing observed patterns against meteorological predictions.

#Your code goes here:



# Additional customizations can be added to enhance the plot, for example grid lines, axis limits,
# or a different color scheme. These can be set through extra arguments to the plot function or through
# the Matplotlib axes the plot is drawn on. For a 2D variable, .plot() returns the plotted artist
# (a QuadMesh), not the axes; the axes are available through the artist's '.axes' attribute.

# Example of customizing the plot:
# qm = nc['rain'].plot()
# ax = qm.axes
# ax.set_title('Rainfall Intensity')
# ax.set_xlabel('Longitude')
# ax.set_ylabel('Latitude')
# ax.grid(True)

In [None]:
# Select a subset of the dataset based on longitude and latitude slices
#Your code goes here:



In [None]:
# Selecting a specific geographic subset from the dataset based on longitude and latitude coordinates.
# This operation is crucial in spatial data analysis, especially when the focus is on a particular region or when 
# dealing with large datasets where reducing the area of interest can significantly enhance processing efficiency.

# The 'nc' dataset contains comprehensive global or regional weather data, but for specific analyses, 
# you might only need data from a particular area. Using xarray's .sel() method allows you to specify slices 
# for coordinates (in this case, longitude and latitude) to extract only the data relevant to your area of interest.

# Here, we define a longitude slice from -123 to -74.5 and a latitude slice from 37 to 10. The latitude slice runs
# from north to south because the 'lat' coordinate is stored in descending order. The selection narrows the dataset
# to the southern United States, Mexico, Central America, and the western Caribbean.

# The 'slice' function is used to define the start and end points for each dimension (longitude and latitude), 
# which xarray uses to select the corresponding range from the dataset.
#Your code goes here:



# Display the newly selected subset of the dataset.
# This output allows us to verify that the dimensions and variable sizes reflect the new geographic constraints,
# ensuring that the selection was performed correctly. It's an essential step for confirming the data's integrity
# and appropriateness for subsequent analysis.
#Your code goes here:



# Further operations can now be performed on this subset, such as detailed data analysis, visualization, 
# or exporting to a different format for use in other applications or reports.
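
The effect of slice direction on a descending coordinate can be seen on a small synthetic grid. The dataset below is made up for illustration; only the `.sel()` and `slice` usage mirrors the cell above.

```python
import numpy as np
import xarray as xr

# Synthetic 4x4 grid with latitude stored north to south (descending)
demo = xr.Dataset(
    {"rain": (("lat", "lon"), np.arange(16.0).reshape(4, 4))},
    coords={
        "lat": [30.0, 20.0, 10.0, 0.0],
        "lon": [-100.0, -90.0, -80.0, -70.0],
    },
)

# Slice bounds follow the coordinate order: high-to-low for a descending axis
ds = demo.sel(lon=slice(-95, -75), lat=slice(25, 5))
print(ds["rain"].shape)

# Low-to-high bounds on the descending latitude axis select nothing
empty = demo.sel(lon=slice(-95, -75), lat=slice(5, 25))
print(empty["rain"].shape)
```

An empty result after `.sel()` is often a sign that the slice bounds run against the direction of the coordinate.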

In [None]:
#Your code goes here:
# Import necessary libraries for plotting and mapping
# Matplotlib for plotting
# Matplotlib colors for the colormap
# Cartopy for geographic projections
# Cartopy coordinate reference systems

In [None]:
# Define a custom color map for precipitation data visualization.
# Each tuple in the list represents an RGB color.
cmap_data = [
    (1.0, 1.0, 1.0),  # white
    (0.3137255012989044, 0.8156862854957581, 0.8156862854957581),  # light cyan
    (0.0, 1.0, 1.0),  # cyan
    (0.0, 0.8784313797950745, 0.501960813999176),  # aquamarine
    (0.0, 0.7529411911964417, 0.0),  # green
    (0.501960813999176, 0.8784313797950745, 0.0),  # yellow-green
    (1.0, 1.0, 0.0),  # yellow
    (1.0, 0.6274510025978088, 0.0),  # orange
    (1.0, 0.0, 0.0),  # red
    (1.0, 0.125490203499794, 0.501960813999176),  # magenta
    (0.9411764740943909, 0.250980406999588, 1.0),  # purple
    (0.501960813999176, 0.125490203499794, 1.0),  # dark purple
    (0.250980406999588, 0.250980406999588, 1.0),  # indigo
    (0.125490203499794, 0.125490203499794, 0.501960813999176),  # dark blue
    (0.125490203499794, 0.125490203499794, 0.125490203499794),  # black
    (0.501960813999176, 0.501960813999176, 0.501960813999176),  # gray
    (0.8784313797950745, 0.8784313797950745, 0.8784313797950745),  # light gray
    (0.9333333373069763, 0.8313725590705872, 0.7372549176216125),  # beige
    (0.8549019694328308, 0.6509804129600525, 0.47058823704719543),  # brown
    (0.6274510025978088, 0.42352941632270813, 0.23529411852359772),  # dark brown
    (0.4000000059604645, 0.20000000298023224, 0.0)  # deep brown
]

# Define the levels of precipitation to classify the data into different ranges.
# These levels are used for normalizing color representation based on precipitation intensity.
#Your code goes here:

# Create a colormap object using the defined color map data and name it 'precipitation'.
#Your code goes here:

# Create a BoundaryNorm object for normalization.
# It maps the precipitation values (clevs) into discrete intervals for the colormap.
#Your code goes here:
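
How a listed colormap and a `BoundaryNorm` work together can be sketched with a shortened palette. The five colors and the levels in `clevs` below are hypothetical values chosen for illustration; the notebook's own palette has 21 colors and would need a matching set of levels.

```python
import matplotlib.colors as mcolors

# Shortened, made-up palette: white, cyan, green, yellow, red
colors = [
    (1.0, 1.0, 1.0),
    (0.0, 1.0, 1.0),
    (0.0, 0.75, 0.0),
    (1.0, 1.0, 0.0),
    (1.0, 0.0, 0.0),
]

# Hypothetical precipitation levels in mm/hr: 6 boundaries define 5 bins
clevs = [0, 1, 5, 10, 25, 50]

cmap = mcolors.ListedColormap(colors, name="precipitation")
norm = mcolors.BoundaryNorm(clevs, cmap.N)

# The norm turns a data value into the index of its bin
for value in (0.5, 7, 30):
    print(value, "-> bin", int(norm(value)))
```

With one color per bin, the number of boundaries is one more than the number of colors.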

In [None]:
# Create a figure with a specified size (15x15 inches) for plotting.
#Your code goes here:

# Define the map projection as 'PlateCarree' (geographical coordinates).
#Your code goes here:

# Add a subplot to the figure with the specified projection.
#Your code goes here:

# Display the 'rain' data from the dataset as an image on the axes.
# Setting the extent of the image to match the geographical coordinates in the dataset.
# The colormap and normalization are applied to represent the rain intensity.
#Your code goes here:





# The commented line below represents an alternative method using contour filling.
# ax.contourf(ds['lon'], ds['lat'], ds['rain'].data, clevs, cmap=cmap, norm=norm)

# Add coastlines to the map for better geographical reference.
# Setting resolution to '50m' and line color to black with a linewidth of 0.85.
#Your code goes here:



# Set the extent of the map in longitude and latitude.
# This defines the visible area of the map.
#Your code goes here:



# Add a colorbar to the figure for reference.
# 'shrink' controls the size of the colorbar.
# Set the title of the colorbar to 'mm/hr' to indicate the unit of rainfall intensity.
#Your code goes here:




In [None]:
#Your code goes here:

# Import the os module
# Import the datetime class from the datetime module

# Parse a specific string format into a datetime object
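
One way to turn a timestamp embedded in a file name into a `datetime` object is sketched below. The file name and the `S%Y%m%d%H%M` layout are assumptions made for the example; the actual HydroEstimator naming should be checked against the files in `input_data` before reusing the format string.

```python
import os
from datetime import datetime

# Hypothetical file path; the timestamp layout is an assumption
fl_demo = os.path.join("input_data", "NPR.GEO.GHE.v1.S202001011530.nc.gz")

# Keep only the file name, then take the fifth dot-separated field
stamp = os.path.basename(fl_demo).split(".")[4]

# The literal 'S' in the format string matches the leading 'S' in the field
ddte = datetime.strptime(stamp, "S%Y%m%d%H%M")

print(stamp)
print(ddte.strftime("%Y-%m-%d %H:%M"))
```

The formatted string from `strftime` is convenient for a plot title.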


In [None]:
# Set up a 15x15 inch figure for plotting
#Your code goes here:

# Use PlateCarree projection for geographic map plotting
#Your code goes here:

# Add a subplot to the figure with the specified map projection
#Your code goes here:

# Display 'rain' data as an image on the map with the specified color mapping and normalization
cf = ax.imshow(ds['rain'].data, 
               extent=(ds['lon'].min().data, ds['lon'].max().data, 
                       ds['lat'].min().data, ds['lat'].max().data), 
               cmap=cmap, norm=norm, transform=proj)

# Alternatively, uncomment to use contour filling for 'rain' data visualization
# cf = ax.contourf(ds['lon'], ds['lat'], ds['rain'].data, clevs, cmap=cmap, norm=norm)

# Add coastlines to the map for better reference, with specified resolution and styling
#Your code goes here:

# Define the geographical extent of the map
#Your code goes here:

# Set the title of the plot using the datetime object 'ddte'
#Your code goes here:

# Add a colorbar to the plot, with a title representing the unit of measurement
#Your code goes here:



In [None]:
#Your code goes here:



In [None]:
def read_hyest(fl):
    # Open the dataset from the file
    nc = xr.open_dataset(fl)

    # Get the shape of the 'rain' data to calculate latitudes and longitudes
    ly, lx = nc['rain'].shape

    # Calculate longitude values based on dataset shape
    dtx = (abs(-180) + abs(180)) / lx
    lons = [-180 + (dtx * x) for x in range(lx)]

    # Calculate latitude values based on dataset shape
    dty = (abs(-65) + abs(65)) / ly
    lats = [-65 + (dty * y) for y in range(ly)]
    lats.reverse()  # Reverse latitudes as they are typically from North to South

    # Assign calculated longitude and latitude values to the dataset
    nc['lon'] = lons
    nc['lat'] = lats

    # Rename dimensions for clarity
    nc = nc.rename({'lines': 'lat', 'elems': 'lon'})

    # Uncomment below to filter out zero values in 'rain' data
    # nc['rain'] = nc['rain'].where(nc['rain'].data != 0)
    
    return nc
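
The coordinate handling in `read_hyest` can be exercised without a data file by applying the same steps to a synthetic dataset. The sketch below is a variant, not the function above: it renames the dimensions first and attaches the coordinates with `assign_coords`, and the helper name `add_latlon` and the 2x2 test values are hypothetical.

```python
import numpy as np
import xarray as xr

def add_latlon(nc, lon_min=-180.0, lon_max=180.0, lat_min=-65.0, lat_max=65.0):
    """Hypothetical variant of the coordinate step in read_hyest."""
    ly, lx = nc["rain"].shape
    lons = lon_min + (lon_max - lon_min) / lx * np.arange(lx)
    lats = lat_min + (lat_max - lat_min) / ly * np.arange(ly)
    lats = lats[::-1]  # north-to-south ordering
    # Rename the dimensions, then attach the values as coordinates
    nc = nc.rename_dims({"lines": "lat", "elems": "lon"})
    return nc.assign_coords(lon=lons, lat=lats)

# Synthetic stand-in for a HydroEstimator file (values are made up)
demo = xr.Dataset(
    {"rain": (("lines", "elems"), np.array([[0.0, 1.5], [2.0, 0.0]]))}
)

out = add_latlon(demo)
masked = out["rain"].where(out["rain"] != 0)  # zeros become NaN

print(out["rain"].dims)
print(masked.values)
```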

In [None]:
#Your code goes here:



In [None]:
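# Imports needed for the animation (harmless if already imported in an earlier cell)
from matplotlib import animation
from IPython.display import HTML

# Set up a 10x10 inch figure for plotting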
fig = plt.figure(figsize=(10,10))

# Use PlateCarree projection for geographic map plotting
proj = ccrs.PlateCarree()

# Add a subplot to the figure with the specified map projection
ax = fig.add_subplot(1, 1, 1, projection=proj)

# Add coastlines to the map for better reference
ax.coastlines(resolution='50m', color='black', linewidth=0.85)

# Define the geographical extent of the map
ax.set_extent([-92.0, -78.0, 10.0, 20.0])

# Initialize a list to store image frames for animation
ims = []

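# Loop through each file in the list 'glst', sorted by name so the frames are in a consistent order
# (glob does not guarantee any particular ordering)
for n, fl in enumerate(sorted(glst)):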
    # Read dataset using the custom function 'read_hyest'
    nc = read_hyest(fl)

    # Subset the dataset for the desired longitude and latitude range
    ds = nc.sel(lon=slice(-123, -74.5), lat=slice(37, 10))

    # Process the 'rain' data for the current frame
    d0 = ds['rain']
    dd = d0 if n == 0 else dd + d0

    # Plot the data as an image and add to the frame list
    im = ax.imshow(dd.data, extent=(ds['lon'].min().data, ds['lon'].max().data, ds['lat'].min().data, ds['lat'].max().data), 
                   cmap=cmap, norm=norm, transform=proj)
    frame = [im]
    ims.append(frame)

# Prevent display of static plot
plt.close()

# Create the animation
ani = animation.ArtistAnimation(fig, ims)

# Render the animation in the notebook as an interactive JavaScript widget
HTML(ani.to_jshtml())
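
The frame-building pattern used in this cell can be tested without data files or a map projection. The sketch below is a simplified stand-in: it accumulates random arrays instead of rain fields, draws on ordinary axes, and selects the non-interactive `Agg` backend so it also runs outside a notebook. The array sizes, seed, and `interval` value are arbitrary choices.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for scripts
import matplotlib.pyplot as plt
from matplotlib import animation

# Three made-up 4x5 fields standing in for successive rain grids
rng = np.random.default_rng(0)
fields = [rng.random((4, 5)) for _ in range(3)]

fig, ax = plt.subplots()
ims = []
total = np.zeros_like(fields[0])

for d0 in fields:
    total = total + d0            # running accumulation, new array each step
    im = ax.imshow(total)         # one image artist per time step
    ims.append([im])              # each frame is its own list of artists

ani = animation.ArtistAnimation(fig, ims, interval=200)
plt.close(fig)                    # avoid showing a static figure

print(len(ims))
```

`ArtistAnimation` shows one inner list per frame, so every time step needs its own list; appending to a single shared list would make every frame draw every image.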

In [None]:
#Your code goes here:

# Initialize a figure with a specified size


# Set the map projection to PlateCarree (equidistant cylindrical projection)


# Add a subplot with the specified projection


# Draw coastlines for reference, with a specified resolution and style


# Set the geographic extent of the map


# Initialize a list to hold frames for animation


# Loop through each file in the list 'glst'
for n, fl in enumerate(glst):
    
    
    # Read the dataset from the file using the custom function
    

    # Subset the dataset for a specific longitude and latitude range
    

    # Extract the 'rain' data from the dataset
    

    # Accumulate the 'rain' data over iterations or start new for the first iteration
    if n == 0:
        dd = d0.copy()
    else:
        dd += d0
    
    # Create an image from the 'rain' data and add it to the current frame
    im = ax.imshow(dd.data, 
                   extent=(ds['lon'].min().data, ds['lon'].max().data, 
                           ds['lat'].min().data, ds['lat'].max().data), 
                   cmap=cmap, norm=norm, transform=proj)
    
    # Each frame is its own list of artists, so start a fresh list for this time step
    frame = [im]
    ims.append(frame)
    
# Close the figure to prevent display of a static plot


# Create an animation from the frames


# Display the animation as an interactive JavaScript widget in Jupyter Notebook
