Add functions to read and retrieve SolarAnywhere irradiance data #1497

Merged
merged 36 commits into from Dec 20, 2023
Changes from 3 commits
d7deb80
Add cams.get_cams_radiation function
Feb 22, 2021
510f08e
Revert "Add cams.get_cams_radiation function"
Feb 22, 2021
8743519
Add get, read, parse solaranywhere functions
AdamRJensen Jul 22, 2022
b5b2508
Add whatsnew
AdamRJensen Jul 22, 2022
0158719
Updates to get_solaranywhere
AdamRJensen Jul 22, 2022
2aeafb1
Minor doc updates
AdamRJensen Jul 22, 2022
43eca90
Updated default values & add POE
AdamRJensen Jul 23, 2022
8456aee
Properly raise start/end ValueError
AdamRJensen Jul 26, 2022
cc4135d
Merge branch 'master' into solaranywhere
AdamRJensen Aug 16, 2022
78a3edb
Add api_key to pytest-remote-data.yml
AdamRJensen Aug 16, 2022
2d52719
Add test coverage
AdamRJensen Aug 16, 2022
7b0fd25
Set encoding to iso-8859-1
AdamRJensen Aug 16, 2022
acff9e1
Remove solaranywhere_api_key
AdamRJensen Aug 18, 2022
192e1b3
Remove parse_solaranywhere
AdamRJensen Aug 18, 2022
11ad2f1
Merge branch 'master' into solaranywhere
AdamRJensen Aug 18, 2022
1917da1
Update tests
AdamRJensen Aug 18, 2022
2ec4627
Update error message handling
AdamRJensen Oct 6, 2022
b7db5b1
Merge branch 'main' into solaranywhere
AdamRJensen Dec 19, 2023
3457ab0
Merge remote-tracking branch 'upstream/main' into solaranywhere
AdamRJensen Dec 19, 2023
ece667f
Update iotools.rst
AdamRJensen Dec 19, 2023
2711135
Update v0.10.3.rst
AdamRJensen Dec 19, 2023
3f440a9
Update __init__.py
AdamRJensen Dec 19, 2023
40ed834
Address code review by kandersolar
AdamRJensen Dec 19, 2023
59216d2
Update v0.9.2.rst
AdamRJensen Dec 19, 2023
6c3ca61
Update v0.9.2.rst
AdamRJensen Dec 19, 2023
e0abc76
Update tests
AdamRJensen Dec 19, 2023
de116aa
Update flake8.yml
AdamRJensen Dec 19, 2023
4744615
Update v0.10.3.rst
AdamRJensen Dec 19, 2023
0bcb74d
Update solaranywhere documentation
AdamRJensen Dec 19, 2023
2ddd01f
Add additional solaranywhere tests
AdamRJensen Dec 19, 2023
0a7f9e2
Update .github/workflows/flake8.yml
AdamRJensen Dec 19, 2023
736b0cc
Implement review changes from kandersolar
AdamRJensen Dec 19, 2023
a009d05
Update test_solaranywhere.py
AdamRJensen Dec 19, 2023
927ea6e
Merge branch 'main' into solaranywhere
AdamRJensen Dec 20, 2023
62ba7d9
Apply suggestions from code review
AdamRJensen Dec 20, 2023
6db7fa5
Switch to isinstance
AdamRJensen Dec 20, 2023
3 changes: 3 additions & 0 deletions docs/sphinx/source/reference/iotools.rst
@@ -37,6 +37,9 @@ of sources and file formats relevant to solar energy modeling.
iotools.get_cams
iotools.read_cams
iotools.parse_cams
iotools.get_solaranywhere
iotools.read_solaranywhere
iotools.parse_solaranywhere

A :py:class:`~pvlib.location.Location` object may be created from metadata
in some files.
3 changes: 3 additions & 0 deletions docs/sphinx/source/whatsnew/v0.9.2.rst
Member

Revert this, or is it fixing a mistake?

Member Author

There's no change there so I can't remove the file... I think it's fine leaving it. @kandersolar leave a thumb if you have an opinion

@@ -11,6 +11,9 @@ Enhancements
* Add :py:func:`pvlib.tracking.calc_surface_orientation` for calculating
single-axis tracker ``surface_tilt`` and ``surface_azimuth`` from
rotation angles. (:issue:`1471`, :pull:`1480`)
* Add :py:func:`pvlib.iotools.read_solaranywhere` and
:py:func:`pvlib.iotools.get_solaranywhere` for reading and retrieving
SolarAnywhere solar irradiance data. (:pull:`1497`, :discussion:`1310`)

Bug fixes
~~~~~~~~~
1 change: 1 addition & 0 deletions pvlib/data/variables_style_rules.csv
@@ -4,6 +4,7 @@ latitude;latitude
longitude;longitude
dni;direct normal irradiance
dni_extra;direct normal irradiance at top of atmosphere (extraterrestrial)
dni_clear;clear sky direct normal irradiance
dhi;diffuse horizontal irradiance
bhi;beam/direct horizontal irradiance
ghi;global horizontal irradiance
3 changes: 3 additions & 0 deletions pvlib/iotools/__init__.py
@@ -21,3 +21,6 @@
from pvlib.iotools.sodapro import get_cams # noqa: F401
from pvlib.iotools.sodapro import read_cams # noqa: F401
from pvlib.iotools.sodapro import parse_cams # noqa: F401
from pvlib.iotools.solaranywhere import get_solaranywhere # noqa: F401
from pvlib.iotools.solaranywhere import read_solaranywhere # noqa: F401
from pvlib.iotools.solaranywhere import parse_solaranywhere # noqa: F401
308 changes: 308 additions & 0 deletions pvlib/iotools/solaranywhere.py
@@ -0,0 +1,308 @@
"""Functions to read and retrieve SolarAnywhere data."""

import requests
import pandas as pd
import time
import json

URL = 'https://service.solaranywhere.com/api/v2'

# Dictionary mapping SolarAnywhere names to standard pvlib names
# Names with spaces are used in SolarAnywhere files, and names without spaces
# are used by the SolarAnywhere API
VARIABLE_MAP = {
'Global Horizontal Irradiance (GHI) W/m2': 'ghi',
'GlobalHorizontalIrradiance_WattsPerMeterSquared': 'ghi',
'DirectNormalIrradiance_WattsPerMeterSquared': 'dni',
'Direct Normal Irradiance (DNI) W/m2': 'dni',
'Diffuse Horizontal Irradiance (DIF) W/m2': 'dhi',
'DiffuseHorizontalIrradiance_WattsPerMeterSquared': 'dhi',
'AmbientTemperature (deg C)': 'temp_air',
'AmbientTemperature_DegreesC': 'temp_air',
'WindSpeed (m/s)': 'wind_speed',
'WindSpeed_MetersPerSecond': 'wind_speed',
'Relative Humidity (%)': 'relative_humidity',
'RelativeHumidity_Percent': 'relative_humidity',
'Clear Sky GHI': 'ghi_clear',
'ClearSkyGHI_WattsPerMeterSquared': 'ghi_clear',
'Clear Sky DNI': 'dni_clear',
'ClearSkyDNI_WattsPerMeterSquared': 'dni_clear',
'Clear Sky DHI': 'dhi_clear',
'ClearSkyDHI_WattsPerMeterSquared': 'dhi_clear',
'Albedo': 'albedo',
'Albedo_Unitless': 'albedo',
}
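As a rough illustration of how a mapping like this is applied, the sketch below renames the columns of a small made-up DataFrame. The name `variable_map_subset` and the sample values are assumptions for illustration only; just the two dictionary entries come from `VARIABLE_MAP`.

```python
import pandas as pd

# Hypothetical two-entry subset of VARIABLE_MAP, for illustration only
variable_map_subset = {
    'GlobalHorizontalIrradiance_WattsPerMeterSquared': 'ghi',  # API name
    'Global Horizontal Irradiance (GHI) W/m2': 'ghi',          # file name
}

# Made-up frame using API-style column names
api_frame = pd.DataFrame({
    'GlobalHorizontalIrradiance_WattsPerMeterSquared': [0.0, 512.3],
    'DataVersion': ['3.6', '3.6'],
})

# rename() only changes columns found in the mapping; others pass through
renamed = api_frame.rename(columns=variable_map_subset)
renamed_columns = list(renamed.columns)
```

Keeping both spellings in one dictionary lets the same `rename` call serve API responses and files alike.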

DEFAULT_VARIABLES = [
'StartTime', 'ObservationTime', 'EndTime',
'GlobalHorizontalIrradiance_WattsPerMeterSquared',
'DirectNormalIrradiance_WattsPerMeterSquared',
'DiffuseHorizontalIrradiance_WattsPerMeterSquared',
'AmbientTemperature_DegreesC', 'WindSpeed_MetersPerSecond',
'Albedo_Unitless', 'DataVersion'
]


def get_solaranywhere(latitude, longitude, api_key, start=None, end=None,
time_resolution=60, spatial_resolution=0.1,
true_dynamics=False, source='SolarAnywhereLatest',
variables=DEFAULT_VARIABLES, missing_data='Omit',
url=URL, map_variables=True, max_response_time=300):
"""Retrieve historical time series irradiance data from SolarAnywhere.

The SolarAnywhere API is described in [1]_ and [2]_. A detailed list of
available options for the input parameters can be found in [3]_.

Parameters
----------
latitude: float
In decimal degrees, north is positive (ISO 19115).
longitude: float
In decimal degrees, east is positive (ISO 19115).
api_key: str
SolarAnywhere API key.
start: datetime like, optional
First timestamp of the requested period. If a timezone is not
specified, UTC is assumed. Not applicable for TMY data.
end: datetime like, optional
Last timestamp of the requested period. If a timezone is not
specified, UTC is assumed. Not applicable for TMY data.
time_resolution: {60, 30, 15, 5}, default: 60
Time resolution in minutes. For TMY data, time resolution has to be 60
min. (hourly).
spatial_resolution: {0.1, 0.01, 0.005}, default: 0.1
Spatial resolution in degrees.
true_dynamics: bool, default: False
Whether to apply SolarAnywhere TrueDynamics statistical processing.
Only available for the 5-min time resolution.
source: str, default: 'SolarAnywhereLatest'
Data source. Options include: 'SolarAnywhereLatest' (historical data),
'SolarAnywhereTGYLatest' (TMY for GHI), or 'SolarAnywhereTDYLatest'
(TMY for DNI). Specific dataset versions can also be specified, e.g.,
'SolarAnywhere3_2' (see [3]_ for a full list of options).
variables: list-like, default: :const:`DEFAULT_VARIABLES`
Variables to retrieve (see [4]_). Available variables depend on
whether historical or TMY data is requested.
missing_data: {'Omit', 'FillAverage'}, default: 'Omit'
Method for treating missing data.
url: str, default: :const:`pvlib.iotools.solaranywhere.URL`
Base url of SolarAnywhere API.
map_variables: bool, default: True
When true, renames columns of the DataFrame to pvlib variable names
where applicable. See variable :const:`VARIABLE_MAP`.
max_response_time: int, default: 300
Time in seconds to wait for requested data to become available.

Returns
-------
data: pandas.DataFrame
Timeseries data from SolarAnywhere. The index is the observation time
(middle of period) in UTC.
metadata: dict
Metadata available (includes site latitude, longitude, and altitude).

See Also
--------
pvlib.iotools.read_solaranywhere

Notes
-----
SolarAnywhere data requests are asynchronous, and it might take several
minutes for the data to become available.

Examples
--------
>>> # Retrieve one month of SolarAnywhere data for Atlanta, GA
>>> data, meta = pvlib.iotools.get_solaranywhere(
... latitude=33.765, longitude=-84.395, api_key='redacted',
... start=pd.Timestamp(2020,1,1), end=pd.Timestamp(2020,2,1)) # doctest: +SKIP

References
----------
.. [1] `SolarAnywhere API
<https://www.solaranywhere.com/support/using-solaranywhere/api/>`_
.. [2] `SolarAnywhere irradiance and weather API requests
<https://developers.cleanpower.com/irradiance-and-weather-data/irradiance-and-weather-requests/>`_
.. [3] `SolarAnywhere API options
<https://developers.cleanpower.com/irradiance-and-weather-data/complete-schema/createweatherdatarequest/options/>`_
.. [4] `SolarAnywhere variable definitions
<https://www.solaranywhere.com/support/data-fields/definitions/>`_
""" # noqa: E501
headers = {'content-type': "application/json; charset=utf-8",
'X-Api-Key': api_key,
'Accept': "application/json"}

payload = {
"Sites": [{
"Latitude": latitude,
"Longitude": longitude
}],
"Options": {
"OutputFields": variables,
"SummaryOutputFields": [], # Do not request summary/monthly data
"SpatialResolution_Degrees": spatial_resolution,
"TimeResolution_Minutes": time_resolution,
"WeatherDataSource": source,
"MissingDataHandling": missing_data,
}
}

if true_dynamics:
payload['Options']['ApplyTrueDynamics'] = True

# Add start/end time if requesting non-TMY data (SolarAnywhereLatest)
if source == 'SolarAnywhereLatest':
if (start is None) or (end is None):
raise ValueError('When requesting non-TMY data, specifying `start` and '
'`end` is required.')
# start/end are required to have an associated time zone
if start.tz is None:
start = start.tz_localize('UTC')
if end.tz is None:
end = end.tz_localize('UTC')
payload['Options']["StartTime"] = start.isoformat()
payload['Options']["EndTime"] = end.isoformat()

# Convert the payload dictionary to a JSON string (uses double quotes)
payload = json.dumps(payload)
# Make data request
request = requests.post(url+'/WeatherData', data=payload, headers=headers)
# Raise error if request is not OK
if request.ok is False:
raise ValueError(request.json()['Message'])
# Retrieve weather request ID
weather_request_id = request.json()["WeatherRequestId"]

# The SolarAnywhere API is asynchronous, hence a second request is
# necessary to retrieve the data (WeatherDataResult).
start_time = time.time() # Current time in seconds since the Epoch
# Attempt to retrieve results until the max response time has been exceeded
while True:
time.sleep(5) # Sleep for 5 seconds before each data retrieval attempt
results = requests.get(url+'/WeatherDataResult/'+weather_request_id, headers=headers) # noqa: E501
results_json = results.json()
if results_json.get('Status') == 'Done':
if results_json['WeatherDataResults'][0]['Status'] == 'Failure':
raise ValueError(results_json['WeatherDataResults'][0]['ErrorMessages']) # noqa: E501
break
elif results_json.get('StatusCode') == 'BadRequest':
raise ValueError(f"Bad request: {results_json['Message']}")
elif (time.time()-start_time) > max_response_time:
raise TimeoutError('Time exceeded the `max_response_time`.')

# Extract time series data
data = pd.DataFrame(results_json['WeatherDataResults'][0]['WeatherDataPeriods']['WeatherDataPeriods']) # noqa: E501
# Set index and convert to UTC time
data.index = pd.to_datetime(data['ObservationTime'])
data.index = data.index.tz_convert('UTC')
if map_variables:
data = data.rename(columns=VARIABLE_MAP)

# Parse metadata
meta = results_json['WeatherDataResults'][0]['WeatherSourceInformation']
meta['time_resolution'] = results_json['WeatherDataResults'][0]['WeatherDataPeriods']['TimeResolution_Minutes'] # noqa: E501
# Rename and convert applicable metadata parameters to floats
meta['latitude'] = float(meta.pop('Latitude'))
meta['longitude'] = float(meta.pop('Longitude'))
meta['altitude'] = float(meta.pop('Elevation_Meters'))
return data, meta
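
The submit-then-poll pattern used in `get_solaranywhere` can be sketched in isolation, without any network access. This is a minimal sketch, not the function's implementation: `poll_until_done`, its `fetch_status`, `sleep`, and `clock` parameters, and the `'Pending'` status value are all hypothetical names introduced here so the loop can be exercised with a stand-in for the HTTP request.

```python
import time


def poll_until_done(fetch_status, max_response_time=300, interval=5,
                    sleep=time.sleep, clock=time.time):
    """Call fetch_status() until it reports 'Done' or time runs out."""
    start_time = clock()
    while True:
        sleep(interval)  # wait before each retrieval attempt
        result = fetch_status()
        if result.get('Status') == 'Done':
            return result
        if (clock() - start_time) > max_response_time:
            raise TimeoutError('Time exceeded the `max_response_time`.')


# Stand-in for the second HTTP request: reports 'Done' on the third call.
# The 'Pending' status string is an assumption, not a documented API value.
responses = iter([{'Status': 'Pending'}, {'Status': 'Pending'},
                  {'Status': 'Done'}])
result = poll_until_done(lambda: next(responses), interval=0)
```

Passing the status fetcher in as a callable keeps the timeout logic testable without an API key.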


def read_solaranywhere(filename, map_variables=True):
"""
Read a SolarAnywhere formatted file into a pandas DataFrame.

The SolarAnywhere file format and the variables are described in [1]_. The
SolarAnywhere file format resembles the TMY3 file format but contains
additional variables and metadata.

Parameters
----------
filename: str
Filename of a file containing data to read.
map_variables: bool, default: True
When true, renames columns of the DataFrame to pvlib variable names
where applicable. See variable :const:`VARIABLE_MAP`.

Returns
-------
data: pandas.DataFrame
Timeseries data from SolarAnywhere. Index is localized to UTC.
metadata: dict
Metadata available in the file.

See Also
--------
pvlib.iotools.get_solaranywhere, pvlib.iotools.parse_solaranywhere

References
----------
.. [1] `SolarAnywhere historical data file formats
<https://www.solaranywhere.com/support/historical-data/file-formats/>`_
"""
with open(str(filename), 'r') as fbuf:
content = parse_solaranywhere(fbuf, map_variables=map_variables)
return content


def parse_solaranywhere(fbuf, map_variables=True):
"""
Parse a file-like buffer with data in the format of a SolarAnywhere file.

The SolarAnywhere file format and the variables are described in [1]_. The
SolarAnywhere file format resembles the TMY3 file format but contains
additional variables and metadata.

Parameters
----------
fbuf: file-like object
File-like object containing data to read.
map_variables: bool, default: True
When true, renames columns of the DataFrame to pvlib variable names
where applicable. See variable :const:`VARIABLE_MAP`.

Returns
-------
data: pandas.DataFrame
Timeseries data from SolarAnywhere. Index is localized to UTC.
metadata: dict
Metadata available in the file.

See Also
--------
pvlib.iotools.read_solaranywhere, pvlib.iotools.get_solaranywhere

References
----------
.. [1] `SolarAnywhere historical data file formats
<https://www.solaranywhere.com/support/historical-data/file-formats/>`_
"""
# Parse metadata contained within the first line
firstline = fbuf.readline().strip().split(',')
meta = {}
meta['USAF'] = int(firstline.pop(0))
meta['name'] = firstline.pop(0)
meta['state'] = firstline.pop(0)
meta['TZ'] = float(firstline.pop(0))
meta['latitude'] = float(firstline.pop(0))
meta['longitude'] = float(firstline.pop(0))
meta['altitude'] = float(firstline.pop(0))

# SolarAnywhere files contain more metadata than the TMY3 format.
# The additional metadata is specified as key-value pairs, where each entry
# is separated by a slash, and the key-value pairs are separated by a
# colon. E.g., 'Data Version: 3.4 / Type: Typical Year / ...'
for i in ','.join(firstline).split('/'):
if ':' in i:
k, v = i.split(':', 1)
meta[k.strip()] = v.strip()

# Read remaining part of file which contains the time series data
data = pd.read_csv(fbuf)
# Set index to UTC
data.index = pd.to_datetime(data['ObservationTime(GMT)'],
format='%m/%d/%Y %H:%M', utc=True)
Member

Suggested change
- data.index = pd.to_datetime(data['ObservationTime(GMT)'],
- format='%m/%d/%Y %H:%M', utc=True)
+ data.index = pd.to_datetime(data['ObservationTime(LST)'],
+ format='%m/%d/%Y %H:%M')

Maybe better to use the local standard time column?

Member

If so, I think the tests will need to be adjusted

Member Author

I think you are right that LST is preferred. My main point for saying this is that it's much easier for a user to convert to UTC than to convert to LST.
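
For reference, here is a minimal sketch of the conversion in both directions, assuming the file's `TZ` metadata field holds a fixed UTC offset in hours, as the `float` parsing of `meta['TZ']` suggests. The offset value and timestamps are made up for illustration.

```python
import datetime

import pandas as pd

utc_offset_hours = -5.0  # hypothetical value of meta['TZ']

# UTC index, parsed the same way as the ObservationTime(GMT) column
utc_index = pd.to_datetime(['01/01/2020 05:30', '01/01/2020 06:30'],
                           format='%m/%d/%Y %H:%M', utc=True)

# Local standard time is a fixed offset from UTC (no daylight saving)
lst_tz = datetime.timezone(datetime.timedelta(hours=utc_offset_hours))
lst_index = utc_index.tz_convert(lst_tz)

# Once the index is timezone-aware, going back to UTC is a single call
back_to_utc = lst_index.tz_convert('UTC')
```

The UTC-to-LST direction needs the offset from the metadata, whereas an aware LST index converts to UTC without any extra information.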

if map_variables:
data = data.rename(columns=VARIABLE_MAP)

return data, meta
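
The first-line metadata handling in `parse_solaranywhere` can be tried on an in-memory buffer. The sketch below mirrors the steps in the diff; the header values (station ID, site name, coordinates, altitude) are invented, and only the `Data Version: 3.4 / Type: Typical Year` fragment comes from the code comment above. It splits on the first colon only, which is an assumption about how values containing colons should be treated.

```python
import io

# Invented header line in the layout the parser expects
header = ('123456,Example Site,GA,-5.0,33.765,-84.395,320.0,'
          'Data Version: 3.4 / Type: Typical Year\n')
fbuf = io.StringIO(header)

firstline = fbuf.readline().strip().split(',')
meta = {}
# Fixed, TMY3-style fields come first, in this order
meta['USAF'] = int(firstline.pop(0))
meta['name'] = firstline.pop(0)
meta['state'] = firstline.pop(0)
meta['TZ'] = float(firstline.pop(0))
meta['latitude'] = float(firstline.pop(0))
meta['longitude'] = float(firstline.pop(0))
meta['altitude'] = float(firstline.pop(0))

# Remaining text holds 'key: value' pairs separated by slashes
for entry in ','.join(firstline).split('/'):
    if ':' in entry:
        key, value = entry.split(':', 1)  # split on the first colon only
        meta[key.strip()] = value.strip()
```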