rmcgranaghan/precipNet

precipNet

This repository contains code to create and interrogate 'precipNet', a machine learning model developed to specify magnetospheric particle precipitation into the ionosphere as observed by the SSJ 4/5 instruments on the Defense Meteorological Satellite Program (DMSP) spacecraft.

Development and details of precipNet are provided here (link to preprint).

Previous work by the International Space Sciences Institute (ISSI) team "Novel approaches to multiscale geospace particle transfer: Improved understanding and prediction through uncertainty quantification and machine learning" laid the foundation for this work and provides many useful resources.

Dependencies

Notebooks and Scripts

  • Precipitation_Model_Evaluation_Utilities.ipynb
    • Functions to calculate auroral boundaries and hemispheric powers given global high-latitude energy flux maps
  • standard_assessment_metrics_function.ipynb
    • Function to calculate the standard assessment metrics, the set of which follows guidance for geospace given by Liemohn et al., 2018
  • time_hist2.py
    • Function to calculate the time history of OMNI data (solar wind and geomagnetic indices) given a data frame
  • Final__Data_Read_And_Prepare.ipynb
    • Sample notebook showing how to read in the database and prepare it for machine learning investigation
  • Final__ML_model_load.ipynb
    • Notebook that shows how to read in the serialized model (located in 'ml_model' directory) and shows how to make predictions with it
    • Note that the steps below show how to construct data samples to be used for model predictions
  • Existing resources from the ISSI team: https://github.com/rmcgranaghan/ISSI_geospaceParticles
  • Text files describing the input features: 'inputfeature_labels.txt' (the full set of input features) and 'inputfeature_labels_reduced.txt' (the features after the feature importance process was conducted)
  • custom_tail_loss.ipynb
    • A custom tail loss function that can be used as a new objective function with these data
  • New resources will appear here as they are prepared
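
As an illustration of what "time history" can mean for an OMNI parameter, the following is a minimal sketch that attaches lagged copies of a series to each sample. It is not the code in time_hist2.py: the function name `add_time_history`, the lag values, and the `t-<lag>` key names are all hypothetical, and the actual script may instead compute, for example, averages over preceding windows.

```python
import math


def add_time_history(series, lags):
    """Return one dict per sample holding the current value and its lagged values.

    `series` is a list of evenly spaced samples and `lags` are offsets counted
    in samples. Lags that reach before the start of the series yield NaN.
    (Hypothetical helper, not the repository's implementation.)
    """
    out = []
    for i, value in enumerate(series):
        row = {"t0": value}
        for lag in lags:
            row[f"t-{lag}"] = series[i - lag] if i - lag >= 0 else math.nan
        out.append(row)
    return out


# Made-up values standing in for a solar wind parameter at a fixed cadence.
bz = [1.0, -2.0, -3.5, 0.5, 2.0]
history = add_time_history(bz, lags=[1, 3])
```

Rows whose history contains NaN could then be dropped before training, in the same spirit as the removal of rows with missing solar wind data described under database creation.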

Database creation:

  • The central data file used in the scripts is titled 'ML_DB_subsamp.csv' and is provided here (DOI to published dataset forthcoming). The steps used to create it are listed below (intermediate datasets are not provided):
    1. Access NASA-provided DMSP data at https://cdaweb.gsfc.nasa.gov/pub/data/dmsp/
    2. Read CDF files for a given satellite (e.g., F-16)
    3. Collect the following variables at one-second cadence: SC_AACGM_LAT, SC_AACGM_LTIME, ELE_TOTAL_ENERGY_FLUX, ELE_TOTAL_ENERGY_FLUX_STD, ELE_AVG_ENERGY, ELE_AVG_ENERGY_STD, ID_SC
    4. Sub-sample the variables to one-minute cadence and eliminate any rows for which ELE_TOTAL_ENERGY_FLUX is NaN
    5. Combine all individual satellites into single yearly files
    6. For each yearly file, use nasaomnireader to obtain solar wind and geomagnetic index data programmatically and time_hist2.py to calculate the time histories of each parameter. Collate with the DMSP observations and remove rows for which any solar wind or geomagnetic index data are missing.
    7. For each row, calculate cyclical time variables (e.g., local time -> sin(LT) and cos(LT))
    8. Merge all years
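
Steps 4 and 7 above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the code used to build the database: it assumes the sub-sampling keeps the first record in each calendar minute (the original method is not stated), that local time is expressed in hours on a 24-hour clock, and that each record is a dict with a hypothetical `time` key alongside the DMSP variable names.

```python
import math
from datetime import datetime, timedelta


def subsample_to_minute(rows):
    """Keep the first record of each calendar minute, then drop NaN flux rows.

    Assumption: "first record per minute" stands in for whatever
    sub-sampling rule was actually used to build the database.
    """
    first_in_minute = {}
    for row in rows:
        minute = row["time"].replace(second=0, microsecond=0)
        first_in_minute.setdefault(minute, row)
    return [
        row for row in first_in_minute.values()
        if not math.isnan(row["ELE_TOTAL_ENERGY_FLUX"])
    ]


def cyclical_local_time(lt_hours):
    """Map local time in hours (assumed 0-24) to (sin, cos) on the unit circle."""
    angle = 2.0 * math.pi * lt_hours / 24.0
    return math.sin(angle), math.cos(angle)


# Three minutes of synthetic one-second records; the first sample of the
# second minute has a NaN flux, so that minute is removed entirely.
start = datetime(2010, 1, 1, 0, 0, 0)
records = [
    {
        "time": start + timedelta(seconds=s),
        "SC_AACGM_LTIME": 6.0,
        "ELE_TOTAL_ENERGY_FLUX": math.nan if s == 60 else 1.0e9,
    }
    for s in range(180)
]
kept = subsample_to_minute(records)
sin_lt, cos_lt = cyclical_local_time(kept[0]["SC_AACGM_LTIME"])
```

The sine and cosine pair places 23:59 and 00:01 local time next to each other, which a raw hour value does not.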

The database for this work is considered 'Artificial Intelligence (AI)-ready' and can serve as a 'challenge data set' for further exploration. It has been published on Zenodo. If used, please cite:

Ryan M. McGranaghan; Téo Bloch; Jack Ziegler; Spencer Hatch; Enrico Camporeale; Mathew Owens; Kristina Lynch; Jesper Gjerloev; Binzheng Zhang; Susan Skone. (2020). DMSP Particle Precipitation AI-ready Data (Version 1.0.0-alpha) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4281122

Fruitful paths for ML investigation:

Based on our experience, we believe it is helpful to prioritize next steps in order to inspire active extension of this work. We envision that ML practitioners in particular may benefit from these recommendations. Below is a prioritized list of ML investigations, with accompanying justification for each recommendation:

  • Develop an understanding of the importance of auroral boundary information to the prediction of particle precipitation and to the specification of the entire magnetosphere-ionosphere-thermosphere system. Auroral boundaries organize the high latitudes and therefore the magnetosphere-ionosphere coupling. The regions they delineate are distinguished by different coupling to the magnetosphere and different behavior of the ionosphere-thermosphere, and they exhibit distinct particle precipitation characteristics. We have attempted here to understand the model's capability to specify the auroral boundaries, but the information content of auroral boundary data remains an open question. Hardy et al. [2008] found that the boundaries separate precipitation populations, so it stands to reason that they would also be important to improving precipitation models. Two investigations are recommended: determining the extent to which auroral boundary data are key to improved precipitation models, and quantifying the improvement to GCMs possible with improved boundaries.
  • Explore the transfer of knowledge from trained ML models to new applications for space weather. Transfer learning is the process of storing knowledge gained while solving one problem and applying it to a different but related problem. Many space weather applications share characteristics, so transfer learning may offer a framework to spread information more effectively. The question to answer in this investigation is how transferring knowledge can improve model capability. There is some precedent for this [Clausen and Nickisch, 2018].
