This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Simplify loading NetCDF files in load_solar_pv_data #602

@JackKelly

Detailed Description

It used to be necessary to read the whole file into memory first in order to load NetCDF files quickly from a cloud storage bucket:

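  import io

  import fsspec
  import xarray as xr

  # Download the whole file into memory first.
  with fsspec.open(filename, mode="rb") as file:
      file_bytes = file.read()

  # Open the in-memory copy and select the requested time range.
  with io.BytesIO(file_bytes) as file:
      pv_power = xr.open_dataset(file, engine="h5netcdf")
      pv_power = pv_power.sel(datetime=slice(start_dt, end_dt))
      pv_power_df = pv_power.to_dataframe()

  # Save memory
  del file_bytes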

But it looks like the underlying libraries have since been optimised: the following, simpler version, which passes the open file straight to xarray, now appears to be at least as fast (I tested on Google Cloud today):

  with fsspec.open(filename, mode="rb") as file:
      pv_power = xr.open_dataset(file, engine="h5netcdf")
      pv_power = pv_power.sel(datetime=slice(start_dt, end_dt))
      pv_power_df = pv_power.to_dataframe()

As such, we can simplify this code in nowcasting_dataset.data_sources.pv.pv_data_source.load_solar_pv_data.
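
Before making the change, the "at least as fast" observation could be re-checked with a small timing harness. The following is only a minimal sketch: `time_loaders` is a hypothetical helper, not part of nowcasting_dataset, and it assumes each loading variant has been wrapped in a zero-argument callable.

```python
import time
from typing import Callable, Dict


def time_loaders(
    loaders: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Return the best wall-clock time, in seconds, for each named loader.

    `loaders` maps a label to a zero-argument callable that performs one
    complete load (hypothetical wrappers around the two variants above).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    best_times = {}
    for name, loader in loaders.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            loader()  # Run one full load; the result is discarded.
            timings.append(time.perf_counter() - start)
        # Keep the fastest run, which is the least affected by noise.
        best_times[name] = min(timings)
    return best_times
```

For example, with two hypothetical wrappers `load_via_bytesio` and `load_direct` around the snippets above, `time_loaders({"bytesio": load_via_bytesio, "direct": load_direct})` returns one timing per variant. Repeated reads of the same file may benefit from caching, so it is probably worth looking at the first run separately as well.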
