# Solar Energy


Salient's historical and downscale data include the components needed to calculate solar energy output. This notebook shows how to obtain the source meteorological data and convert it to energy based on array characteristics.


In [None]:
import os
import sys

try:
    import salientsdk as sk
except ModuleNotFoundError:
    if os.path.exists("../salientsdk"):
        sys.path.append(os.path.abspath(".."))
        import salientsdk as sk
    else:
        raise ModuleNotFoundError("Install salient SDK with: pip install salientsdk")

# Analyze one year of data:
year = "2023"
(start_date, end_date) = (f"{year}-01-01", f"{year}-12-31")
# When we want to plot a timeseries, focus on a few days in July:
plt_time = slice(f"{year}-07-02", f"{year}-07-05")
# When we want to plot a single location, choose this one
plt_loc = "Roadrunner TX"

sk.set_file_destination("solar_example")
sk.login("SALIENT_USERNAME", "SALIENT_PASSWORD")

<requests.sessions.Session at 0x7f4f183c2b90>

## Get Solar Meteorological Data


### Set geographic bounds

The Salient SDK uses a `Location` object to specify the geographic bounds of a request. The cell below offers three options: a single point, a location file with three sites in Maine, Texas, and California (the default), or a gridded area defined by a polygon shapefile.


In [None]:
if False:  # Analyze solar at a single location
    loc = sk.Location(lat=31.2194, lon=-102.1922)
elif True:  # Analyze solar at 3 locations with a location_file
    loc_file = sk.upload_location_file(
        lats=[44.3327, 31.2194, 34.830556],
        lons=[-69.781, -102.1922, -118.398056],
        names=["3 Corners ME", "Roadrunner TX", "Solar Star CA"],
        rated_capacity=[109, 400, 579],  # MW
        geoname="solar_example",
        force=False,
    )
    loc = sk.Location(location_file=loc_file)

else:  # Analyze solar over the Western ERCOT region with a shapefile
    # fmt: off
    # Polygon vertices as (lon, lat) pairs
    coords = [
        (-103.25, 32.00),(-105.50, 32.00),(-105.50, 31.00),(-104.50, 29.50),(-103.25, 29.00),(-102.50, 29.75),
        (-101.25, 29.50),(-100.50, 29.00),(-98.000, 34.00),(-100.00, 34.50),(-100.00, 35.75),(-103.25, 35.75),
    ]
    # fmt: on

    shape_file = sk.upload_shapefile(coords, "ercot_west", force=False)
    loc = sk.Location(shapefile=shape_file)
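
Because the location file is built from parallel lists, a swapped latitude and longitude or a missing name is easy to overlook. The sketch below is a hypothetical pre-flight check in plain Python. `validate_sites` is not part of `salientsdk`, and the uniqueness rule for names is an assumption made here so that each site can be selected by name later.

```python
def validate_sites(lats, lons, names):
    """Check parallel site lists for consistency (hypothetical helper, not an SDK call)."""
    if not (len(lats) == len(lons) == len(names)):
        raise ValueError("lats, lons, and names must have the same length")
    # Assumption: names should be unique so a site can be selected by name.
    if len(set(names)) != len(names):
        raise ValueError("site names must be unique")
    for name, lat, lon in zip(names, lats, lons):
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"{name}: latitude {lat} is outside [-90, 90]")
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"{name}: longitude {lon} is outside [-180, 180]")
    return list(zip(names, lats, lons))


sites = validate_sites(
    lats=[44.3327, 31.2194, 34.830556],
    lons=[-69.781, -102.1922, -118.398056],
    names=["3 Corners ME", "Roadrunner TX", "Solar Star CA"],
)
print(sites[1])
```

A check like this catches a latitude such as `-102.1922` passed in the `lats` list before any request is made.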

### Get Historical Observed Data


The `data_timeseries_solar` function is a convenience function that calls `data_timeseries` to generate an hourly historical timeseries with the appropriate weather inputs for solar analysis.


In [None]:
hist = sk.solar.data_timeseries_solar(loc=loc, start=start_date, end=end_date)
print(hist)
hist["tsi"].sel(time=plt_time).plot.line(x="time");

<xarray.Dataset> Size: 1MB
Dimensions:    (time: 8737, location: 3)
Coordinates:
  * time       (time) datetime64[ns] 70kB 2023-01-01 ... 2023-12-31
  * location   (location) object 24B '3 Corners ME' ... 'Solar Star CA'
    lat        (location) float64 24B 44.33 31.22 34.83
    lon        (location) float64 24B -69.78 -102.2 -118.4
Data variables:
    temp       (time, location) float64 210kB 8.167 19.81 10.54 ... 17.8 11.02
    wspd       (time, location) float64 210kB 2.912 3.424 5.352 ... 1.917 3.013
    tsi        (time, location) float64 210kB 0.0 48.79 52.78 ... 0.0 40.2 163.8
    dhi        (time, location) float64 210kB 0.0 21.02 52.56 ... 31.27 56.02
    dni        (time, location) float64 210kB 0.0 400.4 0.9458 ... 133.0 483.5
    elevation  (location) float32 12B 92.69 780.5 975.8
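
The printed dataset has 8737 time steps rather than the 8760 hours in a non-leap year. That count is what you would get if `end` resolves to midnight at the start of December 31 and both endpoints are included. The sketch below reproduces the arithmetic with the standard library only. It is an interpretation of the printed output, not documented SDK behavior, and `hourly_steps` is a name introduced here for illustration.

```python
from datetime import datetime, timedelta


def hourly_steps(start, end):
    """Count hourly timestamps from start to end, both endpoints included."""
    span = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return int(span / timedelta(hours=1)) + 1


# 364 full days between the two midnights, plus the final midnight itself
n_steps = hourly_steps("2023-01-01", "2023-12-31")
print(n_steps)  # 8737
```

If the last day of the year matters for your analysis, check the final timestamp in `hist["time"]` before summing energy over the year.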


### Get Forecast Ensembles

`downscale_solar` is a specialized version of `downscale` that fetches all the hourly data variables needed to calculate solar energy. Hourly downscale is a compute- and data-intensive process, so the call may take a few minutes.


In [None]:
fcst = sk.solar.downscale_solar(loc=loc, date=start_date, members=11)
print(fcst)
fcst["tsi"].sel(time=plt_time, location=plt_loc).plot.line(
    x="time", color=(0.7, 0.7, 0.7, 0.2), add_legend=False
)
hist["tsi"].sel(time=plt_time, location=plt_loc).plot.line(
    x="time", color="orange", add_legend=False
);

<xarray.Dataset> Size: 7MB
Dimensions:        (time: 8783, ensemble: 11, location: 3)
Coordinates:
    analog         (time, ensemble) datetime64[ns] 773kB NaT ... 1996-12-31T2...
  * time           (time) datetime64[ns] 70kB 2023-01-01T01:00:00 ... 2024-01...
  * ensemble       (ensemble) int32 44B 0 1 2 3 4 5 6 7 8 9 10
  * location       (location) object 24B '3 Corners ME' ... 'Solar Star CA'
    lat            (location) float64 24B 44.33 31.22 34.83
    lon            (location) float64 24B -69.78 -102.2 -118.4
    forecast_date  datetime64[ns] 8B 2023-01-01
Data variables:
    wspd           (time, location, ensemble) float32 1MB nan nan ... 2.931
    temp           (time, location, ensemble) float32 1MB nan nan ... 13.64
    tsi            (time, location, ensemble) float32 1MB 0.0 0.0 ... 162.8
    dni            (time, location, ensemble) float32 1MB 0.0 0.0 ... 145.8
    dhi            (time, location, ensemble) float32 1MB 0.0 0.0 ... 109.1
    elevation      (location) flo

## Sun to Power


`run_pvlib_dataset` uses the open-source `pvlib` library to convert the meteorological dataset into a timeseries of `ac` and `dc` power.
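
To give a feel for what a sun-to-power conversion involves, here is a minimal sketch of a PVWatts-style calculation in plain Python. It is not the model chain that `dataset_to_modelchain` builds, which may use different component models. The function names, the NOCT cell-temperature approximation, the fixed inverter efficiency, and all default parameter values are assumptions chosen for illustration.

```python
def cell_temperature(poa, temp_air, noct=45.0):
    """Approximate cell temperature (deg C) with the NOCT model.

    poa is plane-of-array irradiance in W/m^2; noct=45 is an assumed typical value.
    """
    return temp_air + (noct - 20.0) / 800.0 * poa


def dc_power(poa, temp_cell, pdc0, gamma_pdc=-0.004):
    """PVWatts-style DC power: scales with irradiance, derates with cell temperature.

    pdc0 is DC power at 1000 W/m^2 and 25 deg C; gamma_pdc (1/deg C) is assumed.
    """
    return pdc0 * (poa / 1000.0) * (1.0 + gamma_pdc * (temp_cell - 25.0))


def ac_power(pdc, pac0, inverter_efficiency=0.96):
    """Simplified inverter: constant efficiency (assumed), clipped at the AC rating."""
    return min(max(pdc, 0.0) * inverter_efficiency, pac0)


t_cell = cell_temperature(poa=800.0, temp_air=20.0)
p_dc = dc_power(poa=800.0, temp_cell=t_cell, pdc0=100.0)
p_ac = ac_power(p_dc, pac0=100.0)
print(round(t_cell, 1), round(p_dc, 1), round(p_ac, 1))
```

The sketch shows why temperature and wind are requested alongside irradiance: hotter cells produce less power for the same sunlight. Wind affects cell temperature in more detailed models, although this simple NOCT form ignores it.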


In [None]:
# model_chain is a vector of pvlib model chains, which specify panels, inverters, etc.
model_chain = sk.solar.dataset_to_modelchain(hist)
pwr_hist = sk.solar.run_pvlib_dataset(hist, model_chain=model_chain)
print(pwr_hist)
pwr_hist["ac"].sel(time=plt_time).plot.line(x="time");

<xarray.Dataset> Size: 1MB
Dimensions:               (time: 8737, location: 3)
Coordinates:
  * time                  (time) datetime64[ns] 70kB 2023-01-01 ... 2023-12-31
  * location              (location) object 24B '3 Corners ME' ... 'Solar Sta...
    lat                   (location) float64 24B 44.33 31.22 34.83
    lon                   (location) float64 24B -69.78 -102.2 -118.4
Data variables:
    ac                    (location, time) float64 210kB 0.0 0.0 ... 0.2807
    dc                    (location, time) float64 210kB 0.0 0.0 ... 0.3043
    effective_irradiance  (location, time) float64 210kB 0.0 0.0 ... 286.8 297.2
    poa_direct            (location, time) float64 210kB 0.0 0.0 ... 152.6 216.5
    poa_sky_diffuse       (location, time) float64 210kB 0.0 0.0 ... 132.7 93.23
    poa_ground_diffuse    (location, time) float64 210kB 0.0 0.0 ... 4.612 3.669


`run_pvlib_dataset` also accepts a dataset with an `ensemble` dimension, like the one returned by `downscale_solar`. This lets you calculate a range of potential future power outputs.


In [None]:
# model_chain is a vector of pvlib model chains, which specify panels, inverters, etc.
model_chain = sk.solar.dataset_to_modelchain(fcst)
pwr_fcst = sk.solar.run_pvlib_dataset(fcst, model_chain=model_chain)

print(pwr_fcst)
pwr_fcst["ac"].sel(time=plt_time, location=plt_loc).plot.line(
    x="time", color=(0.7, 0.7, 0.7, 0.5), add_legend=False
)
pwr_hist["ac"].sel(time=plt_time, location=plt_loc).plot.line(
    x="time", color="orange", add_legend=False
);

<xarray.Dataset> Size: 14MB
Dimensions:               (time: 8783, ensemble: 11, location: 3)
Coordinates:
  * time                  (time) datetime64[ns] 70kB 2023-01-01T01:00:00 ... ...
    forecast_date         datetime64[ns] 8B 2023-01-01
    analog                (time, ensemble) datetime64[ns] 773kB NaT ... 1996-...
  * ensemble              (ensemble) int32 44B 0 1 2 3 4 5 6 7 8 9 10
  * location              (location) object 24B '3 Corners ME' ... 'Solar Sta...
    lat                   (location) float64 24B 44.33 31.22 34.83
    lon                   (location) float64 24B -69.78 -102.2 -118.4
Data variables:
    ac                    (location, ensemble, time) float64 2MB 0.0 ... 0.1916
    dc                    (location, ensemble, time) float64 2MB nan ... 0.2133
    effective_irradiance  (location, ensemble, time) float64 2MB nan ... 208.3
    poa_direct            (location, ensemble, time) float64 2MB nan ... 93.78
    poa_sky_diffuse       (location, ensemble, time) f
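
An ensemble of power traces is often easier to read as a band than as individual lines. The sketch below reduces toy ensemble members to a minimum, median, and maximum at each time step using the standard library only. `summarize_ensemble` and the sample values are made up for illustration. With the real dataset you would typically reduce over the `ensemble` dimension with xarray instead.

```python
import math
from statistics import median


def summarize_ensemble(members):
    """Reduce a list of equal-length member series to min/median/max per time step.

    NaN values are skipped; a step where every member is NaN yields None.
    """
    summary = []
    for values in zip(*members):  # one tuple of member values per time step
        valid = [v for v in values if not math.isnan(v)]
        if not valid:
            summary.append(None)
            continue
        summary.append({"min": min(valid), "median": median(valid), "max": max(valid)})
    return summary


# Three toy members over three time steps (invented numbers, not SDK output)
members = [
    [0.0, 0.2, 0.5],
    [0.0, 0.3, 0.4],
    [float("nan"), 0.1, 0.6],
]
summary = summarize_ensemble(members)
for step in summary:
    print(step)
```

Skipping NaN values matters here because the forecast output above starts with missing values for some variables.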

## Validate

Because we have historical observed and forecast meteorology and have run both through the same sun-to-power function, we can evaluate the forecast in the power domain.


In [None]:
# Cumulative forecast error as a percentage of cumulative observed AC power
diff = 100 * (pwr_fcst["ac"] - pwr_hist["ac"]).cumsum("time") / pwr_hist["ac"].cumsum("time")
diff.attrs["long_name"] = "Cumulative " + pwr_hist["ac"].attrs["long_name"] + " Error"
diff.attrs["units"] = "%"

diff_avg = diff.mean("ensemble", keep_attrs=True)

diff_slice = slice(f"{year}-11-01", f"{year}-12-31")
diff.sel(location=plt_loc, time=diff_slice).plot.line(
    x="time", color=(0.7, 0.7, 0.7, 0.5), add_legend=False
)
diff_avg.sel(location=plt_loc, time=diff_slice).plot.line(
    x="time", color="black", add_legend=False
);
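
The cumulative error metric can be checked by hand on a few numbers. The sketch below uses plain dictionaries keyed by timestamp. It first keeps only the timestamps present in both series, which mirrors how xarray arithmetic aligns on shared coordinates by default, and then divides cumulative error by cumulative observed power. As a simplification, both running sums use the shared timestamps. `cumulative_error_pct` and the sample values are invented for illustration.

```python
from itertools import accumulate


def cumulative_error_pct(forecast, observed):
    """Cumulative (forecast - observed) as a percentage of cumulative observed.

    Both inputs map ISO timestamp strings to power. Only shared timestamps are used.
    """
    shared = sorted(set(forecast) & set(observed))
    cum_err = accumulate(forecast[t] - observed[t] for t in shared)
    cum_obs = accumulate(observed[t] for t in shared)
    return {
        t: (100.0 * err / obs if obs else float("nan"))  # undefined until power is observed
        for t, err, obs in zip(shared, cum_err, cum_obs)
    }


# Toy series with partly overlapping hours (invented numbers)
observed = {"2023-07-02T10:00": 2.0, "2023-07-02T11:00": 4.0, "2023-07-02T12:00": 4.0}
forecast = {"2023-07-02T11:00": 5.0, "2023-07-02T12:00": 3.0, "2023-07-02T13:00": 1.0}

error = cumulative_error_pct(forecast, observed)
print(error)  # {'2023-07-02T11:00': 25.0, '2023-07-02T12:00': 0.0}
```

The over-forecast at 11:00 is cancelled by the under-forecast at 12:00, which is why a cumulative metric tends to settle toward zero when hourly errors are unbiased.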