# Accessing Real-Time NOAA GOES16 Data from the AWS S3 Bucket

This notebook demonstrates how to access and visualize near-real-time NOAA GOES16 satellite data from the public AWS S3 bucket. We use several Python libraries to read, process, and visualize the data.


In [12]:
# Make sure you have all the following libraries installed

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore', SyntaxWarning)
import boto3
from botocore import UNSIGNED
from botocore.client import Config
from satpy.scene import Scene
from satpy.utils import debug_on

from datetime import datetime
import time
from glob import glob
import os

In this section, we extract the year, month, day, and hour from the current UTC time. We also compute the day of the year (1 to 366), loosely called the "Julian day" here, and display it. The GOES16 bucket organizes its files by year, day of year, and hour, so these values are what we need to locate the most recent satellite imagery.

In [6]:
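# Get the current UTC time as a timezone-aware datetime
# (datetime.utcnow() is deprecated as of Python 3.12)
from datetime import timezone
now = datetime.now(timezone.utc)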

# Extract year, month, day, and hour
year = now.year
month = now.month
day = now.day
hour = now.hour

# Compute the day of the year (1-366), referred to here as the Julian day
julian_day = now.timetuple().tm_yday

# Print the day of the year
print(julian_day)

166
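
The same path components can be derived for any timestamp, not only the current one. The sketch below uses a hypothetical helper, `goes_prefix` (not part of any library), to build the `product/year/day-of-year/hour/` prefix that the bucket listing in this notebook relies on. The `hours_back` parameter is an assumption added for illustration: it steps back in time in case the folder for the current hour is still empty.

```python
from datetime import datetime, timedelta, timezone

def goes_prefix(when, product="ABI-L1b-RadF", hours_back=0):
    """Build the S3 key prefix for the hour containing `when`.

    `when` should be a timezone-aware datetime; it is converted to UTC.
    Hypothetical helper: the <product>/<year>/<day-of-year>/<hour>/ layout
    is taken from the keys listed in this notebook.
    """
    target = when.astimezone(timezone.utc) - timedelta(hours=hours_back)
    day_of_year = target.timetuple().tm_yday
    return f"{product}/{target.year}/{day_of_year:03d}/{target.hour:02d}/"

example = datetime(2024, 6, 14, 16, 5, tzinfo=timezone.utc)
print(goes_prefix(example))
print(goes_prefix(example, hours_back=1))
```

Subtracting a `timedelta` before extracting the components keeps the rollover across midnight and across New Year correct without any special cases.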


## Listing Files from NOAA GOES16 Bucket on AWS S3 Using Anonymous Access
This section demonstrates how to initialize a session with Amazon S3 without authentication (anonymous access) and lists files from the NOAA GOES16 data bucket. This method is used to retrieve publicly available satellite data, where no AWS credentials are required. The example lists objects within a specific prefix, which needs to be adjusted based on your specific date and time requirements.

In [7]:


# Initialize a session using anonymous access
s3_client = boto3.client('s3', config=Config(signature_version=UNSIGNED))

# Define the bucket name and prefix
bucket_name = 'noaa-goes16'
prefix = f'ABI-L1b-RadF/{year}/{julian_day:03d}/{hour:02d}/'  # Adjust based on your date and time requirements

# List objects within the specified prefix
response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=prefix)

# Check if there are any contents in the response
if 'Contents' in response:
    for obj in response['Contents']:
        print(obj['Key'])
else:
    print("No files found in the specified path.")


ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C01_G16_s20241661600211_e20241661609519_c20241661609561.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C01_G16_s20241661610211_e20241661619519_c20241661619572.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C01_G16_s20241661620211_e20241661629519_c20241661629561.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C01_G16_s20241661630211_e20241661639519_c20241661639554.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C02_G16_s20241661600211_e20241661609519_c20241661609557.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C02_G16_s20241661610211_e20241661619519_c20241661619550.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C02_G16_s20241661620211_e20241661629519_c20241661629545.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C02_G16_s20241661630211_e20241661639519_c20241661639546.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C03_G16_s20241661600211_e20241661609519_c20241661609570.nc
ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6C03_G16_s20241661610211_e20241661619519
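
A single `list_objects_v2` call returns at most 1,000 keys. One hour of Full Disk files fits within that limit, but a broader prefix (a whole day, for instance) may not. The sketch below, assuming a boto3 S3 client like the one created above, uses boto3's paginator to collect every key. The helper name `list_all_keys` is hypothetical.

```python
def list_all_keys(s3_client, bucket, prefix):
    """Return every object key under `prefix`, following pagination.

    `s3_client` is expected to be a boto3 S3 client, for example
    boto3.client('s3', config=Config(signature_version=UNSIGNED)).
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page for an empty prefix carries no 'Contents' entry
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

# Usage with the anonymous client from the cell above:
# keys = list_all_keys(s3_client, 'noaa-goes16', f'ABI-L1b-RadF/{year}/{julian_day:03d}/')
```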

## Counting the Available Channels Without Downloading the Data

To find out which channels are available for a given hour without downloading anything, list the objects under the prefix and parse their filenames. The GOES naming convention encodes the channel in each filename, so the object keys alone are enough to count channels. Composites are not stored in the bucket; they are computed from these channels later with Satpy.

In [9]:
# First, set up an S3 client with boto3 for anonymous access to the NOAA GOES16 S3 bucket.

import boto3
from botocore import UNSIGNED
from botocore.client import Config
from collections import defaultdict

# Initialize the S3 client with anonymous access
s3_client = boto3.client('s3', config=Config(signature_version=UNSIGNED))

In [10]:
# With the client created, specify which S3 bucket and prefix to access

# Define the bucket name and prefix
bucket_name = 'noaa-goes16'
prefix = f'ABI-L1b-RadF/{year}/{julian_day:03d}/{hour:02d}/' # Adjust based on your date and time requirements

# List objects within the specified prefix to get file names
response = s3_client.list_objects_v2(Bucket = bucket_name, Prefix = prefix)

# Initialize dictionaries to store channels and composites
channels = defaultdict(int)
composites = defaultdict(int)



In [11]:
# Process the list of files
if 'Contents' in response:
    for obj in response['Contents']:
        key = obj['Key']
        filename = key.split('/')[-1]
        parts = filename.split('_')
        
        if len(parts) > 2:
            channel = parts[1][-3:]  # Last three characters of e.g. 'ABI-L1b-RadF-M6C01' give the channel ('C01')
            channels[channel] += 1
        

# Print the available channels and their counts
print("Available channels and their counts:")
for channel, count in channels.items():
    print(f"{channel}: {count}")


Available channels and their counts:
C01: 4
C02: 4
C03: 4
C04: 4
C05: 4
C06: 4
C07: 4
C08: 4
C09: 4
C10: 4
C11: 4
C12: 4
C13: 4
C14: 4
C15: 4
C16: 4


**If you re-run the cells above a few minutes later, the counts change as new scans arrive in the bucket.**
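
The filenames carry more than the channel. As a minimal sketch, the regular expression below is inferred from the keys listed in this notebook (it is an assumption that other products follow the same pattern) and extracts the product, scan mode, channel, satellite, and the three timestamps for scan start, scan end, and file creation. The helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern inferred from the keys listed above; other products may need adjustments.
_ABI_L1B = re.compile(
    r"OR_ABI-L1b-(?P<product>\w+)-M(?P<mode>\d)C(?P<channel>\d{2})"
    r"_G(?P<satellite>\d{2})"
    r"_s(?P<start>\d{14})_e(?P<end>\d{14})_c(?P<created>\d{14})\.nc$"
)

def _parse_stamp(stamp):
    """Convert 'YYYYJJJHHMMSSt' (t = tenths of a second) to an aware UTC datetime."""
    base = datetime.strptime(stamp[:13], "%Y%j%H%M%S").replace(tzinfo=timezone.utc)
    return base + timedelta(milliseconds=100 * int(stamp[13]))

def parse_abi_key(key):
    """Return the fields encoded in an ABI L1b key, or None if it does not match."""
    match = _ABI_L1B.search(key)
    if match is None:
        return None
    info = match.groupdict()
    info["channel"] = "C" + info["channel"]
    info["satellite"] = "G" + info["satellite"]
    for field in ("start", "end", "created"):
        info[field] = _parse_stamp(info[field])
    return info

key = ("ABI-L1b-RadF/2024/166/16/"
       "OR_ABI-L1b-RadF-M6C01_G16_s20241661600211_e20241661609519_c20241661609561.nc")
info = parse_abi_key(key)
print(info["channel"], info["start"].isoformat())
print("Seconds from scan start to file creation:",
      (info["created"] - info["start"]).total_seconds())
```

Returning `None` for keys that do not match, such as a truncated filename, lets the caller skip them instead of failing partway through a listing.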

## Understanding Data Updates and the Importance of Real-Time Access

### Data Update Mechanisms

The NOAA GOES16 satellite transmits data continuously, covering a wide range of atmospheric and environmental conditions across the Western Hemisphere. This data is crucial for meteorology and climate monitoring. It includes high-resolution imagery and other meteorological parameters that are updated every few minutes; the Full Disk files listed above, for example, start at 10-minute intervals.

Data from the GOES16 satellite is captured and pushed to AWS in near real time, so users can access updated information without significant delay. Each new scan, whether it shows cloud patterns, storm development, or other atmospheric changes, is made available to users as quickly as possible.

### Why Real-Time Access Is Crucial

1. **Weather Forecasting and Alerts**: Real-time data access is critical for weather forecasting. Meteorologists rely on up-to-the-minute data to predict severe weather events, including hurricanes, tornadoes, and thunderstorms, which can save lives and property by providing timely warnings.

2. **Research and Environmental Monitoring**: Scientists studying climate patterns, atmospheric science, and environmental changes depend on the most current data to ensure the accuracy of their work. Delays in data access can affect the relevance and effectiveness of scientific research.

3. **Commercial Applications**: Industries such as agriculture, aviation, and maritime operations rely heavily on accurate weather forecasts which depend on real-time data. For instance, farmers need timely weather information to plan irrigation and harvesting, while airlines use real-time weather data to route flights safely and efficiently.

4. **Public Safety and Emergency Response**: During natural disasters, real-time satellite data can guide evacuation plans and emergency responses, providing updated information on disaster progression and affected areas.

### How Users Access This Data

To access this real-time data, users connect to AWS, where the data is stored in S3 buckets organized by product, date, and hour. The GOES16 bucket used in this notebook is public, so it can be read anonymously without AWS credentials, as the cells above do. This setup allows users around the world to download and use the data for a variety of applications.

The use of services like AWS ensures scalability and reliability in data delivery, which is vital for maintaining continuous access to satellite data for global users.
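
Because the bucket is public, an object can generally also be fetched over plain HTTPS, without boto3. The sketch below builds a virtual-hosted-style S3 URL from a bucket name and key. The helper name `public_url` is hypothetical, and the key is one of those listed earlier in this notebook.

```python
from urllib.parse import quote

def public_url(bucket, key):
    """Build the virtual-hosted-style HTTPS URL for an object in a public bucket."""
    # quote() leaves '/' untouched and escapes any characters unsafe in a URL path
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"

key = ("ABI-L1b-RadF/2024/166/16/"
       "OR_ABI-L1b-RadF-M6C01_G16_s20241661600211_e20241661609519_c20241661609561.nc")
print(public_url("noaa-goes16", key))
```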




## Running the Real-Time Data Updater

In this section, we discuss how we automate the process of fetching and updating real-time satellite data from NOAA GOES16 using a background process.

### Overview of `run_updater()`

The `run_updater()` function is a crucial part of our notebook's capability to access and process real-time data. This function is designed to run continuously in the background, checking for new satellite data updates and loading them into our analysis environment as soon as they become available.

### Implementation Details

The `run_updater()` function is part of a Python package called `real_time_package`, specifically within the `runner` module. Here’s a brief overview of how it works:

1. **Background Thread**: The function initiates a background thread that runs independently of the main notebook execution. This allows the notebook to remain responsive and capable of performing other tasks while data updates are being managed.

2. **Data Checking**: The background thread periodically checks for new data files in the NOAA GOES16 AWS S3 bucket. It uses predefined criteria to determine which files are new since the last check.

3. **Data Loading**: Once new files are detected, the function automatically loads them into the current session, ensuring that the dataset in use is always up to date.

4. **Error Handling**: It includes robust error handling mechanisms to manage issues like network interruptions or access errors, ensuring the process is reliable and the notebook environment remains stable.
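
The source of `real_time_package` is not shown in this notebook, so the following is only a minimal sketch of the four steps above, not the package's actual implementation. All names (`poll_once`, `start_updater`, `list_keys`, `on_new`) are hypothetical. `list_keys` stands for any callable that returns the current object keys, and `on_new` for whatever should happen when a new key appears.

```python
import threading

def poll_once(list_keys, seen, on_new):
    """Check for unseen keys once and return the list of new ones."""
    new_keys = [key for key in list_keys() if key not in seen]
    for key in new_keys:
        seen.add(key)
        on_new(key)
    return new_keys

def start_updater(list_keys, on_new, interval_s=60.0):
    """Poll in a daemon thread; returns (thread, stop_event)."""
    stop_event = threading.Event()
    seen = set()

    def loop():
        while not stop_event.is_set():
            try:
                poll_once(list_keys, seen, on_new)
            except Exception as exc:
                # Keep the thread alive through transient failures (e.g. network errors)
                print(f"Update check failed: {exc}")
            # Sleep until the next check, waking early if stop is requested
            stop_event.wait(interval_s)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread, stop_event
```

Calling `stop_event.set()` ends the loop cleanly. Using `Event.wait` instead of `time.sleep` means the thread stops promptly rather than finishing a full sleep interval first.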

### Usage

```python
from real_time_package.runner import run_updater

# Start the data updater in the background
run_updater()
```


In [14]:
from real_time_package.runner import run_updater

run_updater()

## Having updated the list of paths, let's now visualize this data directly without the need for downloading it.

In [15]:
import s3fs

# This time the file paths come from a text file that is kept up to date with the latest timestamps.

# Define the S3 bucket
bucket_name = 'noaa-goes16'

# Read the file paths from the local file, skipping blank lines
path_to_channel = 'channel_paths.txt'
with open(path_to_channel, 'r') as f:
    file_paths = [line.strip() for line in f if line.strip()]

# Prefix the file paths with 's3://' to use them directly from S3
s3_file_paths = [f's3://{bucket_name}/{path}' for path in file_paths]

# Initialize the S3 filesystem with anonymous access
s3 = s3fs.S3FileSystem(anon=True)
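
How `channel_paths.txt` is produced is not shown here. Assuming it should hold one object key per line, with the newest scan for each channel, the selection step could look like the sketch below. The helper name `latest_per_channel` is hypothetical, and it relies on the filename layout seen in the listings above.

```python
def latest_per_channel(keys):
    """Map each channel (e.g. 'C01') to the key with the most recent scan start."""
    latest = {}
    for key in keys:
        parts = key.split("/")[-1].split("_")
        if len(parts) < 4 or not parts[3].startswith("s"):
            continue  # skip keys that do not follow the expected naming
        channel = parts[1][-3:]
        start = parts[3]  # 's' + YYYYJJJHHMMSSt sorts chronologically as text
        if channel not in latest or start > latest[channel][0]:
            latest[channel] = (start, key)
    return {channel: key for channel, (start, key) in sorted(latest.items())}

base = "ABI-L1b-RadF/2024/166/16/OR_ABI-L1b-RadF-M6"
keys = [
    base + "C01_G16_s20241661600211_e20241661609519_c20241661609561.nc",
    base + "C01_G16_s20241661610211_e20241661619519_c20241661619572.nc",
    base + "C02_G16_s20241661600211_e20241661609519_c20241661609557.nc",
]
for channel, key in latest_per_channel(keys).items():
    print(channel, key)
```

Writing the returned values to a file, one per line, would give the format that the cell above reads.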



## Visualize

In [None]:
%%time

# Create a Satpy Scene directly from the S3 URLs; storage_options requests anonymous access
scn = Scene(filenames=s3_file_paths, reader='abi_l1b',
            reader_kwargs={'storage_options': {'anon': True}})


In [None]:
%%time

composite_variable = 'airmass'
scn.load([composite_variable], use_dask=True)

In [None]:
%%time

# Print the start and end times
print(f"Start time: {scn.start_time}")
print(f"End time: {scn.end_time}")

# Display the composite
# scn.show(composite_variable)

# Save the image to a file
# output_path = f'{composite_variable}.png'
# scn.save_dataset(composite_variable, filename=output_path)
# print(f"Airmass image saved at: {output_path}")



### Creation of Composite Variables

Composite variables are synthesized views created from multiple individual data channels, offering a more comprehensive understanding of the observed scenes. Here’s how Satpy creates these composites:

1. **Combining Channels**: Satpy combines data from multiple channels based on predefined or custom algorithms. For example, a true-color image composite uses red, green, and blue spectral data to create a representation similar to what the human eye would see.
2. **Enhancements**: After combining the channels, Satpy applies enhancements to improve the visual contrast and color of the composite images. These enhancements might include adjusting brightness, contrast, and color curves to make certain features more distinguishable.
3. **Mapping**: The data is often reprojected onto a standard map projection for easy integration and comparison with other geospatial data, facilitating uses in weather forecasting, research, and decision support systems.
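
To make the first two steps concrete, the sketch below combines four brightness temperatures into a single Airmass-style RGB pixel using channel differences and a linear stretch. It is not Satpy's implementation. The value ranges follow the commonly published Airmass RGB recipe and are an approximation, and the function names are hypothetical. With full images, the same arithmetic would be applied to arrays rather than single values.

```python
def stretch(value, low, high):
    """Linearly map [low, high] onto [0, 1] and clip; low > high inverts the scale."""
    scaled = (value - low) / (high - low)
    return min(max(scaled, 0.0), 1.0)

def airmass_like_rgb(bt_c08, bt_c10, bt_c12, bt_c13):
    """Combine four brightness temperatures (kelvin) into one RGB pixel.

    The ranges are approximate; Satpy's own configuration may use
    slightly different values.
    """
    red = stretch(bt_c08 - bt_c10, -25.0, 0.0)    # difference of two water vapor channels
    green = stretch(bt_c12 - bt_c13, -40.0, 5.0)  # ozone channel minus window channel
    blue = stretch(bt_c08, 243.0, 208.0)          # upper-level water vapor, inverted
    return red, green, blue

print(airmass_like_rgb(bt_c08=225.5, bt_c10=238.0, bt_c12=240.0, bt_c13=257.5))
```

Each color component encodes a physical difference between channels, which is why composites can reveal features that no single channel shows on its own.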

### Example: True Color Composite

A common composite created by Satpy is the "True Color" image, which approximates the natural coloration of the Earth as seen from space. ABI has no native green band, so Satpy's `true_color` recipe derives the image from the blue (C01), red (C02), and near-infrared "veggie" (C03) channels. Here’s a simple example of how to create this composite in Satpy:

```python
# Request the composite by name; Satpy loads the channels it needs (C01, C02, C03 for ABI)
scn.load(['true_color'])

# The input channels have different resolutions, so bring them onto a common grid first
resampled = scn.resample(scn.coarsest_area(), resampler='native')

# Show the composite
resampled.show('true_color')
```