
A fix to read the time dimension in a (much) more efficient manner. #448

Merged
einola merged 2 commits into develop from issue434_time_dimension
Jul 3, 2020

Member

@einola einola commented Jul 3, 2020

This pull request addresses issue #434 by reading the time dimension much more efficiently. Instead of reading the entire time vector and doing a linear search for the right record every time, we now read only the first two time steps, infer the file's time step, and from that read the correct record directly. This gives a large speed-up when reading large datasets over a slow network (it halves the processing time on my local machine when using ERA5 read from johansen).
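As a rough illustration of the lookup described above (not the code in model/externaldata.cpp), the record index can be computed from the first two time values alone. The function name `recordIndex` and its parameters are hypothetical, and the sketch assumes all times are in the same units:

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, for illustration only.
// t0, t1    : first two values of the file's time variable
// target    : requested time, in the same units as t0 and t1
// n_records : length of the time dimension
// Returns the index of the last record at or before `target`.
std::size_t recordIndex(double t0, double t1, double target, std::size_t n_records)
{
    // Infer the time step from the first two records.
    const double dt = t1 - t0;
    if (!(dt > 0.))
        throw std::invalid_argument("time must be strictly increasing");
    if (target < t0)
        throw std::out_of_range("target is before the first record");

    // With evenly spaced time, the record position follows directly.
    const double pos = std::floor((target - t0) / dt);
    if (pos >= static_cast<double>(n_records))
        throw std::out_of_range("target is after the last record");

    return static_cast<std::size_t>(pos);
}
```

For example, with records every 6 hours starting at hour 10, `recordIndex(10., 16., 22., 5)` selects record 2, without touching the rest of the time vector.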

This assumes that time in the file is

  • Monotonically increasing
  • Evenly spaced

There are datasets in /Data/sim/data on johansen that don't adhere to this (in particular Sylvain's currents-from-altimeter dataset) - but we hardly use those anymore.
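A dataset can be checked against these two assumptions with something like the sketch below. The function `isEvenlySpaced` and its tolerance parameter are hypothetical and not part of this pull request. Since it needs the full time vector, it would suit a one-off check of a dataset rather than the read path itself:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only.
// Returns true if `time` is strictly increasing with a constant step,
// where each step may differ from the first by at most rel_tol * dt.
bool isEvenlySpaced(const std::vector<double>& time, double rel_tol = 1e-6)
{
    // At least two values are needed to infer a time step.
    if (time.size() < 2)
        return false;

    const double dt = time[1] - time[0];
    if (!(dt > 0.))
        return false;  // not increasing

    for (std::size_t i = 2; i < time.size(); ++i)
    {
        const double step = time[i] - time[i - 1];
        if (std::abs(step - dt) > rel_tol * dt)
            return false;  // uneven spacing (this also catches decreasing steps)
    }
    return true;
}
```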

@einola einola requested review from docguibou and tdcwilliams July 3, 2020 07:21
Contributor

@tdcwilliams tdcwilliams left a comment


seems good - the other dataset it could affect is the PIOMAS one - does that matter?

Comment thread model/externaldata.cpp Outdated
Comment thread model/externaldata.cpp
Co-authored-by: Timothy Williams <tdcwilliams@gmail.com>
@einola einola merged commit ef5b86f into develop Jul 3, 2020
@einola einola deleted the issue434_time_dimension branch July 3, 2020 13:58