## NDFD XML REST API

Exploring the NDFD XML REST API for weather data retrieval and processing using Python.

First retrieve grid information for a location using `https://api.weather.gov/points/{latitude},{longitude}`, then access forecasts using the format `https://api.weather.gov/gridpoints/{office}/{gridX},{gridY}/forecast`.

XML Schema: The schema for DWML
https://digital.weather.gov/xml/schema/DWML.xsd

XML Design:
https://digital.weather.gov/xml/mdl/XML/Design/MDL_XML_Design.htm


In [77]:
import requests
import xml.etree.ElementTree as ET
import pandas as pd
from datetime import date # , timedelta, datetime


lat =  42.70243
lon = -84.48565

DEFAULT_USER_AGENT = '(enviroweather.msu.edu, ewx@enviroweather.msu.edu)'



From NWS: A User Agent is required to identify your application. This string can be anything, and the more unique to your application the less likely it will be affected by a security event. If you include contact information (website or email), we can contact you if your string is associated to a security event. This will be replaced with an API key in the future.

User-Agent: (myweatherapp.com, contact@myweatherapp.com)

headers = {
    'User-Agent': '(enviroweather.msu.edu, ewx@enviroweather.msu.edu)'
}

## Summary Forecast - not used


This is the descriptive forecast, e.g. the one used for the map click display.

**This is not used**:
it does not contain relative humidity and other values that are available in the
unsummarized 'digital' forecast (see below), and you can't specify a date.

To use this, first get the grid ID and office for a given lat/lon, then use those to
get the summary forecast.

In [78]:
def get_ndfd_grid(lat, lon, user_agent=DEFAULT_USER_AGENT):
    """Retrieve NDFD grid information for given latitude and longitude.
    Args:
        lat (float): Latitude of the location.
        lon (float): Longitude of the location.
        user_agent (str): User-Agent string for the HTTP request to identify the application.
    Returns:
        tuple: A tuple containing relevant NWS Office code, gridX and gridY to use in NDFD API calls.
    """
    
    nws_grid_url = f"https://api.weather.gov/points/{lat},{lon}"
    
    headers = {"User-Agent": user_agent}
    
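    response = requests.get(nws_grid_url, headers=headers)
    # Fail early on HTTP errors instead of hitting a KeyError on the JSON below
    response.raise_for_status()
    grid_data = response.json()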
    grid_x = grid_data['properties']['gridX']
    grid_y = grid_data['properties']['gridY']
    nws_office = grid_data['properties']['gridId'] 

    return (nws_office, grid_x, grid_y)

NDFD_FORMATS = {
    "GeoJSON" : "application/geo+json",
    "JSON-LD" : "application/ld+json",
    "DWML" : "application/vnd.noaa.dwml+xml",
    "OXML" : "application/vnd.noaa.obs+xml",
    "CAP" : "application/cap+xml",
    "ATOM" : "application/atom+xml"
}

def get_current_ndfd_summary_forecast(lat, lon, output_format = None, raw=True, user_agent = DEFAULT_USER_AGENT):
    """Retrieve current forecast from NDFD API for given latitude and longitude.   
    Args:
        lat (float): Latitude of the location.
        lon (float): Longitude of the location.
        raw (bool, optional): Currently unused; the raw response object is always returned. Defaults to True
        output_format (str, optional): Desired output format. Options are keys from NDFD_FORMATS.
        user_agent (str, optional): User agent string for API requests.
    Returns:
        response object
    """

    # Pass the caller's user agent through to the grid lookup as well
    (wfo, grid_x, grid_y) = get_ndfd_grid(lat, lon, user_agent=user_agent)

    if output_format and output_format not in NDFD_FORMATS:
        raise ValueError(f"Invalid output format. Choose from {list(NDFD_FORMATS.keys())}")
    
    forecast_units="si"  # Use "us" for imperial units
    
    # if raw:
    #     forecast_url = f"https://api.weather.gov/gridpoints/{wfo}/{grid_x},{grid_y}?units={forecast_units}"
    # else:
    
    forecast_url = f"https://api.weather.gov/gridpoints/{wfo}/{grid_x},{grid_y}/forecast?units={forecast_units}"


    headers = {"User-Agent": user_agent}
    if output_format:
        headers["Accept"] = NDFD_FORMATS[output_format]
    
    forecast_response = requests.get(forecast_url, headers=headers)
    
    # An XML response has no JSON body, and downstream code may want to inspect
    # response codes, so return the full response object
    return forecast_response

Example usage of summary forecast:

In [79]:
lat =  42.70243
lon = -84.48565

(wfo, grid_x, grid_y) = get_ndfd_grid(lat, lon)

print(wfo, grid_x, grid_y)


GRR 82 38


In [80]:
# No output_format is given, so the API returns its default JSON; 'raw' currently has no effect
fcst_response = get_current_ndfd_summary_forecast(lat, lon, raw=True)

fcst = fcst_response.json()
list(fcst.keys())


['@context', 'type', 'geometry', 'properties']

JSON-format summary forecast: save to disk

In [81]:
import json
with open("ndfd_example_summary_forecast.json", "w") as f: 
    f.write(json.dumps(fcst, indent=4))
    

The summary forecast can be retrieved in XML format as well.

See the file ndfd_example_summary_forecast.xml for example output.

In [82]:
fcst_response = get_current_ndfd_summary_forecast(lat, lon, output_format = "DWML")

In [68]:
with open('ndfd_example_summary_forecast.xml', 'w') as f:
    f.write(fcst_response.text)


## Summarized data - not used

Example Usage: 

`https://digital.weather.gov/xml/sample_products/browser_interface/ndfdBrowserClientByDay.php?lat=38.99&lon=-77.01&format=24+hourly&numDays=7&XMLformat=TSML`

> NOTE: There is no choice by user of meteorological elements for these summarization functions. The following summarized elements are always returned by default: Maximum Temperature, Minimum Temperature, 12 Hourly Probability of Precipitation, Weather, Icons, and Hazards (Watches, Warnings, and Advisories).

That does not include relative humidity, wind speed, wind direction, or other elements that may be of interest. For more control over the elements returned, use the NDFD XML REST API to retrieve unsummarized data and perform your own summarization.

There is no code to get this type of NDFD output
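
Although the summarized product is not used here, its request URL can be assembled from the example above. The following is only a sketch: the helper name `build_by_day_url` and its defaults are hypothetical, and the only query parameters assumed are the ones visible in the example URL (`lat`, `lon`, `format`, `numDays`, `XMLformat`). It builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above
BY_DAY_URL = (
    "https://digital.weather.gov/xml/sample_products/"
    "browser_interface/ndfdBrowserClientByDay.php"
)


def build_by_day_url(lat, lon, num_days=7, summary_format="24 hourly", xml_format="TSML"):
    """Build a summarized (by-day) NDFD request URL (hypothetical helper)."""
    params = {
        "lat": lat,
        "lon": lon,
        "format": summary_format,  # urlencode turns the space into '+'
        "numDays": num_days,
        "XMLformat": xml_format,
    }
    return f"{BY_DAY_URL}?{urlencode(params)}"


url = build_by_day_url(38.99, -77.01)
print(url)
```

With the default arguments this reproduces the example URL shown above.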


## Unsummarized Weather (aka 'digital' or 'xml') - what we use

NDFD XML REST API documentation for unsummarized forecast data

Documentation for using this REST service is available at
`https://graphical.weather.gov/xml/rest.php/docs/docs/#use_it`

Element names: https://graphical.weather.gov/xml/docs/elementInputNames.php

Status of various NDFD products (for historical forecasts): https://vlab.noaa.gov/documents/6609493/7858379/NDFDStatus.pdf


In [None]:

# previous code, see below for more detailed version




# def ndfd_example():
#     working_example = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdBrowserClientByDay.php?lat=38.99&lon=-77.01&format=24+hourly&numDays=7&XMLformat=DWML"
    
#     # dwml by default, not summarized 
#     base_url = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?Unit=m&lat=32.5&lon=-81.5&product=time-series&begin=2025-12-08T00:00:00&end=2030-04-20T00:00:00&maxt=maxt&mint=mint&rh=rh&wspd=wspd&qpf=qpf"
   
#     forecast_url = f"{base_url}"
    
#     forecast_response = requests.get(forecast_url)
#     return(forecast_response)
    
    
# def ndfd_rest(user_agent = DEFAULT_USER_AGENT, output_format = None):
#     tomorrow = date.today()+timedelta(days = 1)
#     begin_datetime = datetime.combine(tomorrow, time = datetime.min.time())
#     end_datetime = begin_datetime + timedelta(days = 6)
#     begin = begin_datetime.isoformat()
#     end = end_datetime.isoformat()
    
#     lat = 38.99
#     lon = -77.01
#     forecast_elements = "maxt=maxt&mint=mint&rh=rh&wsp=wspd&qpf=qpf&appt=appt" 
#     base_url = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?"
#     forecast_url = f"{base_url}?lat={lat}&lon={lon}&product=time-series&{forecast_elements}"  #&begin={begin}&end={end}
#     print(forecast_url)
#     headers = {"User-Agent": user_agent}
#     if output_format:
#         headers["Accept"] = NDFD_FORMATS[output_format]
    
#    forecast_response = requests.get(forecast_url, headers=headers)
#    return(forecast_response)


# resp = ndfd_example()
# with open('ndfd_digital_weather.xml', 'w') as f:
#     f.write(resp.text)   
    

Functions to get unsummarized NDFD forecast data and combine into a pandas
data frame with daily summaries

In [69]:

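def construct_ndfd_digital_forecast_url(lat, lon, begin=None, end=None):
    """Build the NDFD XML client URL for an unsummarized (DWML) time-series forecast.

    begin and end are ISO-format datetime strings; begin defaults to midnight today.
    """
    # dwml by default, not summarized 
    base_url = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php"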
    
    if begin is None:
        date_today =  date.today().isoformat() + "T00:00:00"
    else:
        date_today = begin
        
    if end is None:
        date_future = '2030-04-20T00:00:00'
    else:
        date_future = end
        
    forecast_params = f"Unit=m&lat={lat}&lon={lon}&product=time-series&begin={date_today}&end={date_future}&maxt=maxt&mint=mint&rh=rh&wspd=wspd&qpf=qpf"
    
    forecast_url = f"{base_url}?{forecast_params}"
    
    return forecast_url

def request_ndfd_digital_forecast(lat, lon, user_agent = DEFAULT_USER_AGENT):
    
    # dwml by default, not summarized 
    # base_url = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?Unit=m"
    
    date_today =  date.today().isoformat() + "T00:00:00"
    date_future = '2030-04-20T00:00:00'
    
    
    # forecast_params = f"&lat={lat}&lon={lon}&product=time-series&begin={date_today}&end={date_future}&maxt=maxt&mint=mint&rh=rh&wspd=wspd&qpf=qpf"
    
    # forecast_elements = "maxt=maxt&mint=mint&rh=rh&wsp=wspd&qpf=qpf&appt=appt" 
    # base_url = "https://digital.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?"
    # forecast_url = f"{base_url}?lat={lat}&lon={lon}&product=time-series&{forecast_elements}"  #&begin={begin}&end={end}
     
    forecast_url = construct_ndfd_digital_forecast_url(lat, lon, begin=date_today, end=date_future)

    headers = {"User-Agent": user_agent}

    forecast_response = requests.get(forecast_url, headers=headers)
    return forecast_response
    
  

def get_start_times(root, time_layout_key):
    time_layouts = root.findall('.//time-layout')
    for tl in time_layouts:
        layout_key = tl.find('layout-key').text
        if layout_key == time_layout_key:
            start_times = [st.text for st in tl.findall('start-valid-time')]
            return start_times
    return []


def weather_metric_xml_to_df(root, metric_path):
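    weather_values = root.find(metric_path)
    if weather_values is None:
        # Without this check a missing element fails later with an opaque AttributeError
        raise ValueError(f"No element matching {metric_path!r} in the forecast XML")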
    time_layout_key = weather_values.get('time-layout')
    start_times = get_start_times(root, time_layout_key)
    values = [v.text for v in weather_values.findall('value')]
    
    df = pd.DataFrame(
        {
            'forecast_time': start_times,
            'value': values
        }
    )
    df['forecast_date'] = pd.to_datetime(df['forecast_time']).dt.date
    df['value'] = pd.to_numeric(df['value'], errors='coerce')
    return df

def weather_metric_name_from_xml(root, metric_path):
    weather_values = root.find(metric_path)
    unit_name = weather_values.get('units')
    value_name = weather_values.find('name').text
    return f"{value_name} ({unit_name})"
    
    
def daily_forecast_summary(lat, lon, hourly_weather = None):
    # TODO: merge hourly_weather (today's observed hourly data) into the data frames here.
    # How it is merged depends on the metric and how that metric is stored in the XML.
    resp = request_ndfd_digital_forecast(lat, lon)
    # The status code is always 200, even when the parameters are invalid,
    # so check the response body for an error message instead.
    if "ERROR" in resp.text.upper():
        # TODO: extract the error message from resp.text and include it in the exception
        print(resp.text)
        raise ValueError("Error retrieving NDFD digital weather data.")
        
        
    root = ET.fromstring(resp.text)
    
    
    metric_path = './/humidity'
    metric_name = weather_metric_name_from_xml(root, metric_path)
    humidity_df = weather_metric_xml_to_df(root, metric_path)
    # humidity must be summarized 
    humidity_daily = pd.DataFrame(
        { 
            f'Maximum {metric_name}': humidity_df.groupby('forecast_date')['value'].max(), 
            f'Minimum {metric_name}': humidity_df.groupby('forecast_date')['value'].min()
        }
    )
    
    
    metric_path = ".//temperature[@type='minimum']"
    metric_name = weather_metric_name_from_xml(root, metric_path)
    min_temperature_df = weather_metric_xml_to_df(root, metric_path)
    min_temperature_daily = pd.DataFrame(
        { 
            f'{metric_name}': min_temperature_df.groupby('forecast_date')['value'].first()
        }
    )
    
    metric_path = ".//temperature[@type='maximum']"
    metric_name = weather_metric_name_from_xml(root, metric_path)
    max_temperature_df = weather_metric_xml_to_df(root, metric_path)
    max_temperature_daily = pd.DataFrame(
        { 
            f'{metric_name}': max_temperature_df.groupby('forecast_date')['value'].first()  
        }
    )
    
    summary_df = pd.concat([humidity_daily, min_temperature_daily, max_temperature_daily], axis=1)
    
    return summary_df
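
The parsing functions above rely on how DWML links each metric to its timestamps: a metric element carries a `time-layout` attribute whose value matches the `layout-key` of one `time-layout` block. The sketch below shows that lookup on a small hand-written sample. The XML is a simplified illustration, not real NDFD output, and the helper name `pair_values_with_times` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified DWML-like sample (illustrative values only)
SAMPLE_DWML = """
<dwml>
  <data>
    <time-layout>
      <layout-key>k-p24h-n2-1</layout-key>
      <start-valid-time>2025-12-12T07:00:00-05:00</start-valid-time>
      <start-valid-time>2025-12-13T07:00:00-05:00</start-valid-time>
    </time-layout>
    <parameters>
      <temperature type="maximum" units="Celsius" time-layout="k-p24h-n2-1">
        <name>Daily Maximum Temperature</name>
        <value>-3</value>
        <value>-8</value>
      </temperature>
    </parameters>
  </data>
</dwml>
"""


def pair_values_with_times(root, metric_path):
    """Return (start_time, value) pairs for one metric, or [] if it is absent."""
    metric = root.find(metric_path)
    if metric is None:
        return []
    key = metric.get("time-layout")
    for layout in root.findall(".//time-layout"):
        if layout.findtext("layout-key") == key:
            times = [t.text for t in layout.findall("start-valid-time")]
            values = [v.text for v in metric.findall("value")]
            return list(zip(times, values))
    return []


root = ET.fromstring(SAMPLE_DWML.strip())
pairs = pair_values_with_times(root, ".//temperature[@type='maximum']")
print(pairs)
```

Returning an empty list for a missing metric is one possible design choice; raising an exception instead would make a malformed response more visible.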



### Get example data

This gets example XML data and saves it to a file.

In [73]:
lat = 42.73
lon = -84.55
api_response = request_ndfd_digital_forecast(lat, lon)

with open('ndfd_example_unsummarized_forecast.xml', 'w') as f:
    f.write(api_response.text)

Use the function to calculate daily values and combine them into a single data frame.

*The minimum temperature for the final day is not present (NaN), and neither is the maximum temperature for the first day.*


In [75]:
# Lansing (approximately)
lat = 42.73
lon = -84.55

daily_forecast_df = daily_forecast_summary(lat, lon)
daily_forecast_df

| forecast_date | Maximum Relative Humidity (percent) | Minimum Relative Humidity (percent) | Daily Minimum Temperature (Celsius) | Daily Maximum Temperature (Celsius) |
|---|---|---|---|---|
| 2025-12-11 | 81 | 80 | -11.0 | NaN |
| 2025-12-12 | 88 | 68 | -11.0 | -3.0 |
| 2025-12-13 | 96 | 61 | -17.0 | -8.0 |
| 2025-12-14 | 84 | 68 | -13.0 | -8.0 |
| 2025-12-15 | 92 | 70 | -10.0 | -6.0 |
| 2025-12-16 | 92 | 81 | -3.0 | 0.0 |
| 2025-12-17 | 93 | 86 | 0.0 | 4.0 |
| 2025-12-18 | 97 | 81 | NaN | 5.0 |


Relative humidity and other hourly forecast measurements will not cover
a complete day for the current date.  For example, if the current time is 3 PM,
the first day of a 7-day forecast will only include data from 3 PM to midnight.
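
One way to see which dates are only partially covered is to count the forecast values per date and look at the first and last timestamp of each day. This is a sketch on made-up timestamps (3-hourly steps starting at 3 PM); the column names follow the data frames built above, but the data itself is hypothetical.

```python
import pandas as pd

# Hypothetical forecast times: 3-hourly steps starting at 3 PM on the first day
times = pd.date_range("2025-12-11 15:00", periods=11, freq=pd.Timedelta(hours=3))
df = pd.DataFrame({"forecast_time": times, "value": range(len(times))})
df["forecast_date"] = df["forecast_time"].dt.date

# Number of values plus first and last timestamp for each forecast date
coverage = df.groupby("forecast_date")["forecast_time"].agg(["count", "min", "max"])
print(coverage)
```

Here the first date has three values (15:00, 18:00, 21:00) while the next has eight, so a daily statistic for the first date describes only part of the day.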


It's easy to calculate the daily average relative humidity from the hourly data for
subsequent days, since they will have a full day's worth of data.

For the forecasted daily average for today, we need to combine the forecast with the
hourly data already recorded for the location.   This would mean accessing the database.
However, we want to keep this library database-free so it stays general purpose.

Hence, create a method that accepts the previous hourly data for today, for all
metrics, and calculates a daily forecast for today by combining the two.

Minimum temperature is an issue because I believe it is only forecasted for the
first half of the day, even though the minimum temperature could occur in the evening or overnight.

In summary, to create daily forecasts:
 - get the hourly forecast data from the NDFD XML REST API
 - accept all of today's hourly data for all metrics as an input parameter
 - create a data frame that combines today's hourly data with the forecasted data for subsequent days
 - calculate daily summaries from the combined data frame and return them as the daily forecast

Why should this be done in this package instead of the QC or utils package that
has the database hourly values?  Otherwise we'd have to return the raw hourly data
from the forecast, and some days only have 3-hour averages.  If we returned the daily average,
we wouldn't know how many hours were used to calculate it, so we'd have to
return a data structure that had hourly data for today but averages for subsequent days,
which is inelegant.

I think it's better to:
 - get today's hourly data so far
 - call this library with that data and the lat/lon for the forecast
    - this library retrieves the hourly forecast data from the NDFD XML REST API
    - combines today's hourly data with the forecasted hourly data
    - returns daily summaries
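
The combine-and-summarize step could look roughly like the sketch below. It is not the library's implementation: the function name `combine_and_summarize`, the `time` and `value` column names, and the rule that an observation wins when a timestamp appears in both inputs are all assumptions made for illustration. It also assumes both inputs use the same timezone-naive local time.

```python
import pandas as pd


def combine_and_summarize(observed_today, forecast):
    """Combine observed and forecast values, then summarize by date.

    Both frames are assumed to have a datetime 'time' column and a numeric
    'value' column (hypothetical layout).
    """
    combined = pd.concat([observed_today, forecast], ignore_index=True)
    # When a timestamp appears in both, keep the observation (it comes first)
    combined = combined.drop_duplicates(subset="time", keep="first")
    combined = combined.assign(date=combined["time"].dt.date)
    # 'count' records how many values went into each daily statistic
    return combined.groupby("date")["value"].agg(["min", "max", "mean", "count"])


# Made-up relative humidity values for illustration
observed = pd.DataFrame({
    "time": pd.to_datetime(["2025-12-11 00:00", "2025-12-11 06:00", "2025-12-11 12:00"]),
    "value": [90.0, 80.0, 60.0],
})
forecast = pd.DataFrame({
    "time": pd.to_datetime(["2025-12-11 12:00", "2025-12-11 18:00",
                            "2025-12-12 00:00", "2025-12-12 06:00"]),
    "value": [65.0, 70.0, 85.0, 95.0],
})

daily = combine_and_summarize(observed, forecast)
print(daily)
```

Returning the count alongside each statistic addresses the concern above about not knowing how many hours went into a daily average.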

There can be methods for getting raw data as a group of data frames, with a different
data frame for each metric because they are on different time scales.

