<table style="width:100%; background-color: #D9EDF7">
  <tr>
    <td style="border: 1px solid #CFCFCF">
      <b>Weather data: Main notebook</b>
      <ul>
        <li><a href="main.ipynb">Main Notebook</a></li>
        <li><a href="download.ipynb">Downloading Notebook</a></li>
        <li>Documentation</li>
      </ul>
      <br>This Notebook is part of the <a href="http://data.open-power-system-data.org/weather_data">Weather data Datapackage</a> of <a href="http://open-power-system-data.org">Open Power System Data</a>.
    </td>
  </tr>
</table>


# What is MERRA-2?
The MERRA-2 dataset provided by NASA Goddard Space Flight Center covers a wide range of reanalysis weather data for the whole globe.

>*The Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) provides data beginning in 1980. It was introduced to replace the original MERRA dataset because of the advances made in the assimilation system that enable assimilation of modern hyperspectral radiance and microwave observations, along with GPS-Radio Occultation datasets. It also uses NASA ozone observations after 2005. Additional advances in both the GEOS-5 model and the GSI assimilation system are included in MERRA-2. Spatial resolution remains about the same (about 50 km in the latitudinal direction) as in MERRA.*

>*Along with the enhancements in the meteorological assimilation, MERRA-2 takes some significant steps towards GMAO’s target of an Earth System reanalysis. MERRA-2 is the first long-term global reanalysis to assimilate space-based observations of aerosols and represent their interactions with other physical processes in the climate system. MERRA-2 includes a representation of ice sheets over (say) Greenland and Antarctica.*

>*(taken from http://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/)*

MERRA-2 offers **51 different datasets** with hundreds of weather parameters. They usually come in the **NetCDF data format**, an open binary format primarily used in climate and geosciences.
> _"NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data."_ ([Source](http://www.unidata.ucar.edu/software/netcdf/))

NetCDF files can be understood as a multidimensional list file, i.e. something like a collection of individual lists in one file. Due to the number of variables in one dataset and the vast number of geo points, a file of one dataset for a single day can be several hundred MB in size.
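
To get a feeling for these file sizes, the following back-of-the-envelope sketch estimates the uncompressed size of one day of hourly data on the global 0.625° by 0.5° grid. The 4-byte value size and the count of 20 variables are assumptions for illustration only and are not taken from the MERRA-2 file specification.

```python
# Rough, uncompressed size of one day of hourly data on a global grid.
# Assumptions (for illustration only): 32-bit floats, no compression,
# and a hypothetical dataset with 20 variables.
LON_STEP_DEG = 0.625
LAT_STEP_DEG = 0.5

n_lon = int(360 / LON_STEP_DEG)        # 576 longitude points
n_lat = int(180 / LAT_STEP_DEG) + 1    # 361 latitude points (both poles included)
n_time = 24                            # hourly values for one day
bytes_per_value = 4                    # assumed 32-bit float
n_variables = 20                       # hypothetical variable count

values_per_variable = n_lon * n_lat * n_time
size_mb = values_per_variable * bytes_per_value * n_variables / 1e6
print(f"{values_per_variable:,} values per variable, about {size_mb:.0f} MB per day")
```

Under these assumptions a single day already amounts to roughly 400 MB, which is in line with the "several hundred MB" mentioned above.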

## More Information
- [Overview page by the Global Modeling and Assimilation Office GMAO](http://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/)
- [List of all MERRA-2 datasets](http://goldsmr4.sci.gsfc.nasa.gov/dods/) (the ".info"-link leads to a metadata page)
- [Extensive MERRA-2-page of the GEOS–Chem Wiki](http://wiki.seas.harvard.edu/geos-chem/index.php/MERRA-2)
- [List of MERRA-2 datasets and their variables](http://gmao.gsfc.nasa.gov/projects/yotc/GMAO_YOTC_Product_Collections.pdf)
- [Detailed list of MERRA-2 file specifications and dataset contents](http://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf)

# What is OPeNDAP?

OPeNDAP stands for “Open-source Project for a Network Data Access Protocol”. It is an HTTP-based standard protocol for data transmission designed specifically for science data. OPeNDAP is based on the Data Access Protocol (DAP, currently Version 2.0, Aug 2011). The standard has been developed by NASA scientists (DAP is a "NASA Community standard"). OPeNDAP provides data types to accommodate gridded data, relational data, and time series and also allows users to define their own data types. Standards for encapsulating structured data, annotating the data with attributes and adding semantics that describe the data are also included. This includes subsetting capabilities that allow easy data access.

There is a host of programs and tools with built-in OPeNDAP libraries to obtain, view, edit and use data, e.g. [Panoply](http://www.giss.nasa.gov/tools/panoply/) or the [Pydap library](http://www.pydap.org/) for Python.

**Example data set:** http://test.opendap.org/opendap/data/nc/fnoc1.nc (not directly accessible)
- To view the original data -> `append .ascii` (http://test.opendap.org/opendap/data/nc/fnoc1.nc.ascii)
- Dataset Descriptor Structure (DDS): description of the "shape" of the data (using a vaguely C-like syntax) -> `append .dds` to the URL (http://test.opendap.org/opendap/data/nc/fnoc1.nc.dds)
- Data Attribute Structure (DAS): contains information about the data (e.g. units and the name of the variable) -> `append .das` to the URL (http://test.opendap.org/opendap/data/nc/fnoc1.nc.das)
- DDS+DAS -> for returning DDS and DAS in a single request `append .info` (http://test.opendap.org/opendap/data/nc/fnoc1.nc.info)
- Simple Dataset Access Form: Browser-friendly form to subset data manually -> `append .html` (http://test.opendap.org/opendap/data/nc/fnoc1.nc.html)
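
The suffix scheme listed above can be sketched in a few lines of Python. This is only an illustration of how the request URLs are composed; the helper name `opendap_urls` is hypothetical and nothing is actually requested from the server.

```python
# Base URL of the OPeNDAP example dataset mentioned above.
BASE_URL = "http://test.opendap.org/opendap/data/nc/fnoc1.nc"

# Suffix -> short description of what the server returns for it.
SUFFIXES = {
    ".ascii": "data as plain text",
    ".dds": "Dataset Descriptor Structure",
    ".das": "Data Attribute Structure",
    ".info": "DDS and DAS combined",
    ".html": "browser form for manual subsetting",
}


def opendap_urls(base_url):
    """Return a dict mapping each suffix to the full request URL."""
    return {suffix: base_url + suffix for suffix in SUFFIXES}


for suffix, url in opendap_urls(BASE_URL).items():
    print(f"{SUFFIXES[suffix]:<36} {url}")
```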

**More Information on OPeNDAP**
- [OPeNDAP Quick Start](http://docs.opendap.org/index.php/QuickStart)
- [OPeNDAP User Guide](http://docs.opendap.org/index.php/UserGuide)
- [DAP Specifications](https://earthdata.nasa.gov/files/ESE-RFC-004v1.1.pdf)
- [NASA Earthdata Webinar (Youtube): Simplified NASA Earth Science Data Access through OPeNDAP](https://www.youtube.com/watch?v=AJQ3m3E8SCY)
- [NASA Earthdata Webinar (Youtube): Improving Accessibility and Use of NASA Earth Science Data (including OPeNDAP)](https://www.youtube.com/watch?v=N_BC7ZrWUwY)
- [OPeNDAP Client Software](http://www.opendap.org/support?q=whatClients)
- [How to download a spatial and variable subset of Level 1B data using OPeNDAP](http://disc.sci.gsfc.nasa.gov/recipes/?q=recipes/How-to-download-a-spatial-and-variable-subset-of-Level-1B-data-using-OPeNDAP)
- [How to Obtain Data in NetCDF Format via OPeNDAP](http://disc.sci.gsfc.nasa.gov/recipes/?q=recipes/How-to-Obtain-Data-in-NetCDF-Format-via-OPeNDAP)
- [How to View Remote Data in OPeNDAP with Panoply](http://disc.sci.gsfc.nasa.gov/recipes/?q=recipes/How-to-View-Remote-Data-in-OPeNDAP-with-Panoply)

# Why OPSD uses MERRA-2
- It is freely available and accessible.
- It provides worldwide weather data.
- Its data goes back to 1980 and is constantly updated (monthly with a delay of approx. 3 weeks).
- It provides wind & temperature data at six different heights and thus allows for the creation of height profiles.
- It provides hourly values (even though they are derived from 3-hourly data).
- It has a worldwide resolution of 0.625° by 0.5°.
- It uses the OPeNDAP standard (see above), which makes data access easier.
- It has a huge international community.

# How to obtain original MERRA-2 data manually
There is a host of methods to obtain the original MERRA-2 datasets manually. Some of them are:

## Direct FTP
- [ftp://goldsmr4.sci.gsfc.nasa.gov/data/s4pa/](ftp://goldsmr4.sci.gsfc.nasa.gov/data/s4pa/)
- For sub-daily data click “MERRA2” 
- Choose the correct dataset (see [document](http://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf) for dataset names)
- Click yourself through the folders to find datasets for the desired days

## Simple Subset Wizard (SSW)
- http://disc.sci.gsfc.nasa.gov/SSW
- click "select Data Sets" (Button next to keyword box)
- expand "Goddard Earth Sciences Data and Information Services Center"
- expand "MERRA-2"
- choose dataset(s) (hovering over a dataset name reveals more detailed info; see [document](http://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf) for dataset names)
- Click "choose" to confirm selection
- Enter date range
- Enter South,West,North,East coordinates or use the map to define spatial boundaries
- Hit "Search for Datasets"
- Choose parameters from found datasets
- Hit "Subset Selected Data Sets" to confirm selection (bottom middle button)
- Click "View Subset Results" (bottom right button) to display results of all datasets or hit the green downward arrow behind each individual dataset
- Files have to be downloaded individually for each day and for each dataset!
- _Alternative: Download text file (*.inp) with compiled download links to individual files (e.g. for use with download managers like wget)_

## FTP Subsetter
- http://disc.sci.gsfc.nasa.gov/daac-bin/FTPSubset2.pl
- Selecting data works similarly to the Simple Subset Wizard
- Choose dataset from dropdown list (see [document](http://gmao.gsfc.nasa.gov/projects/yotc/GMAO_YOTC_Product_Collections.pdf))
- Define spatial boundaries with map or by entering coordinates
- Define timeframe
- Select boxes with parameters
- Choose additional options
- Hit "Start Search"
- Download individual daily files manually

## Mirador
http://mirador.gsfc.nasa.gov/

*Search only works with keywords! -> One has to know the exact filename to access the correct data instantaneously (i.e. without searching through all available data for "wind").*

- enter keyword (i.e. the name of the dataset, see [dataset list](http://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf) for dataset names)
- Enter timeframe
- choose spatial boundaries from map or enter in format `(minLat, minLon)(maxLat, maxLon)`
- ignore advanced search options if not looking for data from a specific event (e.g. storms, volcanic eruptions etc.)
- Hit "Search GES-DISC"
- Results Page:
  - View Files: List of downloadable daily files (including separate XML file with metadata)
  - Info: Documentation page for dataset with citation, description etc.
  - Data calendar: Opens page with access to complete available timeframe (1980-2015 currently) regardless of previous choice of timeframe
  - Button "List selected Files": Displays similar List to "View Files" (see above)
  - Button "Timeline view"
  - Button "Add selected Files to Cart": Self-explanatory
  - Clicking "Checkout" leads to the download page
    - Basic download options: Download link container for wget or curl
    - More download options:
      - Download as .jar
      - Text file (.shtml) for use with browser download manager add-ins
      - more options for wget and curl

## More information on obtaining the data
- [How to obtain/plot/analyze data](http://reanalyses.org/atmosphere/how-obtainplotanalyze-data)
- [Software for Manipulating or Displaying NetCDF Data](http://www.unidata.ucar.edu/software/netcdf/docs/software.html)
- [Data Cookbook: tons of descriptions how to obtain and view MERRA-2 data](http://disc.sci.gsfc.nasa.gov/recipes/?q=recipe-cookbook)

# OPSD's approach to weather data
**Weather data differ significantly from the other data types** used or provided by OPSD: the sheer size of the data packages greatly exceeds OPSD's capacity to host them in the same way as feed-in time series, power plant data etc. While the other data packages offer a complete one-click download of the bundled data, this is impossible for weather datasets like MERRA-2 due to their size (variety of variables, very long timespan, huge geographical coverage etc.). It would make no sense to mirror the data from the NASA servers. We only offer a sample dataset for Germany and the year 2016.

Instead we chose to provide "only" a **documented methodological script** (as a kind of tutorial) that allows users to download and filter specific datasets, parameters, geographical areas and timeframes and to export them in an easily readable format (CSV). The following method describes one way to automatically obtain the desired weather data from the MERRA-2 database and aims to simplify and unify the manual methods mentioned above in a single script. The use of MERRA-2 is only an example: through the OPeNDAP interface, the script can be adapted to other datasets using the same protocol.
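
To illustrate what filtering by geographical area means on the OPeNDAP side, the following sketch converts a bounding box into grid index ranges and builds a subset expression of the form `variable[time][lat][lon]`. It is not the OPSD script itself: the grid origin (-90°, -180°), the dimension order and the helper names are assumptions, and the bounding box is just an example roughly covering Germany.

```python
import math

LAT_STEP, LON_STEP = 0.5, 0.625   # grid resolution stated above
LAT_MIN, LON_MIN = -90.0, -180.0  # assumed origin of the grid


def index_range(low, high, origin, step):
    """Return (first, last) grid indices that fully cover [low, high]."""
    first = math.floor((low - origin) / step)
    last = math.ceil((high - origin) / step)
    return first, last


def constraint(variable, south, west, north, east, hours=(0, 23)):
    """Build a subset expression, assuming the order time, lat, lon."""
    lat = index_range(south, north, LAT_MIN, LAT_STEP)
    lon = index_range(west, east, LON_MIN, LON_STEP)
    return (f"{variable}[{hours[0]}:{hours[1]}]"
            f"[{lat[0]}:{lat[1]}][{lon[0]}:{lon[1]}]")


# Example bounding box (hypothetical values, roughly Germany).
print(constraint("SWGDN", south=47.0, west=5.0, north=55.5, east=15.5))
```

Rounding the lower bound down and the upper bound up makes sure the requested area is always fully covered, at the cost of at most one extra grid cell per side.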

The script is tailored to the needs of energy system modellers who a) do not want to download and wrestle with the original MERRA-2 data manually, and b) do not just want to adopt ready-made feed-in time series calculated by tools like [renewables.ninja](https://www.renewables.ninja/) but rather want to use their own feed-in tools with processed weather data.

Please note that the structure of NASA's use of the OPeNDAP standard (namely the system of filenames and locations etc.) has **changed frequently** in the past. Unfortunately, we are not always able to keep the script up to date with all changes.

# Solar Radiation
Unfortunately, the MERRA-2 dataset does not offer variables for direct and indirect radiation, which are necessary for the calculation of the PV feed-in. Thus users have to apply a method to calculate the direct and indirect radiation from the parameters given in MERRA-2 (SWGDN and SWTDN).

## Methods to divide global into (in)direct radiation
There are different approaches to the division of global radiation. A 2010 [Master Thesis](http://csold.unibe.ch/students/theses/msc/34.pdf) by F. Lanini (University of Bern) gives a good overview and evaluation of different decomposition models (p. 32 ff.). Most methods applied in models are based on the method by Reindl et al. (D.T. Reindl, W.A. Beckman, and J.A. Duffie. Diffuse fraction correlations. _Solar Energy_, 45(1):1 – 7, 1990.)

Implementations can be found for example in 
* the global solar energy estimator (gsee) as part of [renewables.ninja](https://www.renewables.ninja/) (see https://github.com/renewables-ninja/gsee)
* the Renewable Energy Atlas by Anders A. Sondergaard (within the Project "Optimization of the future power systems’ main RE sources – wind and sun" at Aarhus University in 2013, see https://github.com/FRESNA/atlite) and documented below.

Other resources/studies discussing the topic:
* Boland, Ridley, Brown (2007): Models of diffuse solar radiation (http://dx.doi.org/10.1016/j.renene.2007.04.012)
* Helbig (2009): Application of the radiosity approach to the radiation balance in complex terrain (http://dx.doi.org/10.5167/uzh-30798)
* Ridley, Boland, Lauret (2009): Modelling of diffuse solar fraction with multiple predictors (http://dx.doi.org/10.1016/j.renene.2009.07.018)
* Lauret, Boland, Ridley (2012): Bayesian statistical analysis applied to solar radiation modelling (http://dx.doi.org/10.1016/j.renene.2012.01.049)
* Pfenninger, Staffell (2016): Long-term patterns of European PV output using 30 years of validated hourly reanalysis and satellite data (http://dx.doi.org/10.1016/j.energy.2016.08.060)

## Documentation of possible method
Based on Reindl et al. (1990) and Sondergaard (2013) the following two parameters available from the MERRA-2 dataset can be used:
* SWGDN = Surface (ground-level) incident shortwave flux (i.e. Global Horizontal Irradiance, GHI)
* SWTDN = Total horizontal radiation at top of the atmosphere (TOA)

(Please be aware that one of the disadvantages of this and other methods is the poor accuracy at low solar angles.)

### Step 1: Solar Altitude Angle alpha
\begin{equation*}
\sin(\alpha)=\sin(L)\sin(\delta)+\cos(L)\cos(\delta)\cos(h)
\end{equation*}
with
* **L: Latitude** in degrees (°), positive = Northern Hemisphere
* **Delta: Declination**
\begin{equation*}
\delta=23.45° \cdot \sin\left(\frac{360°}{365} \cdot (284+N)\right)
\end{equation*}
where _N is the day number_ (January 1st = 0, January 2nd = 1, ...)


* **h: Hour Angle**
\begin{equation*}
h=(AST-12 hours)*15\frac{°}{hour}
\end{equation*}
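
As a minimal sketch, the altitude formula and the declination can be written in Python as follows. The hour angle is taken as an input here, the declination uses the common form with `284 + N`, and clamping the sine to [-1, 1] is an added safeguard against rounding errors; the function names are hypothetical.

```python
import math


def declination_deg(day_number):
    """Declination in degrees; day_number counts from 0 on January 1st."""
    return 23.45 * math.sin(math.radians(360.0 / 365.0 * (284 + day_number)))


def solar_altitude_deg(latitude_deg, day_number, hour_angle_deg):
    """Solar altitude angle alpha in degrees (negative = sun below horizon)."""
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg(day_number))
    h = math.radians(hour_angle_deg)
    sin_alpha = (math.sin(lat) * math.sin(dec)
                 + math.cos(lat) * math.cos(dec) * math.cos(h))
    # Guard against tiny floating-point overshoots before calling asin.
    sin_alpha = max(-1.0, min(1.0, sin_alpha))
    return math.degrees(math.asin(sin_alpha))


# Solar noon (h = 0) at 52° N around the June solstice (day number 171).
print(round(solar_altitude_deg(52.0, 171, 0.0), 1))
```

At solar noon the result should be close to 90° minus latitude plus declination, which is a quick plausibility check for the implementation.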

**AST: Apparent Solar Time**
\begin{equation*}
AST=LST+ET+4\frac{min}{°}*(SL-LL)
\end{equation*}
with
* **LST: Local Standard Time**
* **ET: Equation of time** (in minutes!)
\begin{equation*}
ET=9.87 \cdot \sin\left[(N-81) \cdot \frac{4\pi}{364}\right] - 7.53 \cdot \cos\left[(N-81) \cdot \frac{2\pi}{364}\right] - 1.5 \cdot \sin\left[(N-81) \cdot \frac{2\pi}{364}\right]
\end{equation*}
* **SL: Standard Longitude** (positive = Western Hemisphere)
* **LL: Local Longitude**
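
The time corrections can be sketched as follows. Note that LST is given in hours while ET and the longitude correction are in minutes, so the sketch converts the minutes to hours before adding them. The equation of time uses the common coefficients 9.87, 7.53 and 1.5; the function names and the example location are assumptions for illustration.

```python
import math


def equation_of_time_min(day_number):
    """Equation of time in minutes."""
    b = 2.0 * math.pi * (day_number - 81) / 364.0
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)


def apparent_solar_time_h(local_standard_time_h, day_number,
                          standard_longitude_deg, local_longitude_deg):
    """AST in hours; longitudes are positive towards the west."""
    correction_min = (equation_of_time_min(day_number)
                      + 4.0 * (standard_longitude_deg - local_longitude_deg))
    return local_standard_time_h + correction_min / 60.0


def hour_angle_deg(ast_h):
    """Hour angle in degrees: 0 at solar noon, 15 degrees per hour."""
    return (ast_h - 12.0) * 15.0


# Example: 13.4° E (-13.4 in the west-positive convention) with the
# standard meridian at 15° E, at 12:00 local standard time on day 171.
ast = apparent_solar_time_h(12.0, 171, -15.0, -13.4)
print(round(ast, 2), round(hour_angle_deg(ast), 1))
```

Because the example location lies west of its standard meridian, the apparent solar time is slightly behind the clock time and the hour angle at 12:00 is slightly negative.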

### Step 2: Clearness Index k
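\begin{equation*}
k=\frac{SWGDN}{SWTDN}
\end{equation*}
Both SWGDN and SWTDN are fluxes on a horizontal surface, so no additional projection with $\sin(\alpha)$ is needed.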
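
A minimal sketch of the clearness index in Python, treating SWTDN as a horizontal flux so that k is the plain ratio of the two parameters. Returning 0.0 at night (SWTDN = 0) is an arbitrary choice of this sketch to avoid a division by zero, not part of the method.

```python
def clearness_index(swgdn, swtdn):
    """Return k = SWGDN / SWTDN, or 0.0 when no radiation reaches the TOA."""
    if swtdn <= 0.0:
        return 0.0  # night: the ratio is undefined, 0.0 is a chosen placeholder
    return swgdn / swtdn


print(clearness_index(450.0, 900.0))  # 0.5
print(clearness_index(0.0, 0.0))      # 0.0
```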

### Step 3: Share of Indirect Radiation
$\frac{I_{diffuse}}{I} = \begin{cases}
1.02-0.254k+0.0123\sin(\alpha) & 0 < k \le 0.3 \\
1.4-1.749k+0.177\sin(\alpha) & 0.3 < k \le 0.78 \\
0.486k-0.182\sin(\alpha) & 0.78 < k \\ \end{cases}$
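
The piecewise correlation translates directly into a small function. The sketch below uses the coefficients as published by Reindl et al. (1990), i.e. 1.749 in the middle interval and a negative sign for the altitude term in the upper interval. Limiting the result to the range 0 to 1 is an added safeguard of this sketch, not part of the formula above.

```python
import math


def diffuse_fraction(k, alpha_deg):
    """Share of diffuse radiation for clearness index k and altitude alpha."""
    sin_alpha = math.sin(math.radians(alpha_deg))
    if k <= 0.3:
        fraction = 1.02 - 0.254 * k + 0.0123 * sin_alpha
    elif k <= 0.78:
        fraction = 1.4 - 1.749 * k + 0.177 * sin_alpha
    else:
        fraction = 0.486 * k - 0.182 * sin_alpha
    # Keep the share physically meaningful (assumption of this sketch).
    return min(1.0, max(0.0, fraction))


for k in (0.1, 0.5, 0.8):
    print(k, round(diffuse_fraction(k, 30.0), 3))
```

As expected, the diffuse share is close to 1 under overcast conditions (small k) and drops as the sky gets clearer.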

### Step 4: Direct & Indirect Radiation
\begin{equation*}
I_{diffuse}=SWGDN*\frac{I_{diffuse}}{I}
\end{equation*}

\begin{equation*}
I_{direct}=SWGDN*(1-\frac{I_{diffuse}}{I})
\end{equation*}
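
As a final sketch, the split itself is a one-liner per component. Both results are horizontal components, since they add up to SWGDN again. The function name and the input values are hypothetical.

```python
def split_radiation(swgdn, diffuse_share):
    """Split global horizontal radiation into (direct, diffuse) parts."""
    i_diffuse = swgdn * diffuse_share
    i_direct = swgdn * (1.0 - diffuse_share)
    return i_direct, i_diffuse


# Example: 450 W/m² global radiation with a diffuse share of 60 %.
direct, diffuse = split_radiation(450.0, 0.6)
print(direct, diffuse)
```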