Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the time_bnds #1246

Closed
wants to merge 9 commits into from
Closed

Remove the time_bnds #1246

wants to merge 9 commits into from

Conversation

zxdawn
Copy link
Member

@zxdawn zxdawn commented Jun 30, 2020

Currently, the start_time is the minimum start_times while end_time is the maximum end_times.

It's better to support the datatime array which is useful for time_series blending in MultiScene.

Delete the useless time_bnds for new version of xarray.

encoding['time_bnds'] = bounds_enc # FUTURE: Not required anymore with xarray-0.14+

@sfinkens
Copy link
Member

Thanks for looking into this! I tried it out with the following example, but it doesn't work yet:

In [1]: from glob import glob                                                                                                                                                                                                                
In [2]: from satpy import MultiScene                                                                                                                                                                                                         
In [3]: from satpy.multiscene import timeseries                                                                                                                                                                                                                                                                                                                                                                                                                             
In [4]: filenames = glob('/data/georing/ahi_hsd/*', recursive=True)                                                                                                                                                                          
In [5]: mscn = MultiScene.from_files(filenames, reader="ahi_hsd")                                                                                                                                                                            
In [6]: mscn.load(['B13'])
In [7]: blended_scene = mscn.blend(blend_function=timeseries)
In [8]: blended_scene.save_datasets(writer='cf', filename='test.nc')       
Traceback:
...                                                                                                                                                       
ValueError: different number of dimensions on data and dims: 3 vs 2

I think this is because the start/end times passed to make_time_bounds are not the coordinates added by timeseries.
You should also keep in mind that

  • There are use cases where the multiscene was blended with a different method (stack instead of timeseries for example). Then there will be no start_time or end_time coordinates.
  • By default the CF writer prepends the dataset name to non-dimensional coordinates: start_time -> B13_start_time. See the make_alt_coords_unique method.

If this turns out to be too much work, a quick fix could be to not add time bounds if the time dimension is larger than 1.

@ghost
Copy link

ghost commented Jan 12, 2021

Congratulations 🎉. DeepCode analyzed your code in 2.295 seconds and we found no issues. Enjoy a moment of no bugs ☀️.

👉 View analysis in DeepCode’s Dashboard | Configure the bot

@zxdawn
Copy link
Member Author

zxdawn commented Jan 12, 2021

The time_bnds is useless in the new version of xarray:

encoding['time_bnds'] = bounds_enc # FUTURE: Not required anymore with xarray-0.14+

I delete it instead of figuring how to make it work for many situations ...

@zxdawn zxdawn changed the title Support start_time and end_time coordinates (datetime array) Remove the time_bnds Jan 12, 2021
@mraspaud
Copy link
Member

@zxdawn I'm not sure I understand this PR fully: Passing the time bounds to the encoding is according to the comment not necessary after xarray 0.14. However that line is not removed. Moreover, that is removed is the creation and insertion of the time bounds variable, and I don't understand why that is?

@zxdawn
Copy link
Member Author

zxdawn commented Apr 27, 2021

@mraspaud Sorry, I forgot to delete the line of passing the time bounds to the encoding. Updated now.

@mraspaud
Copy link
Member

No problem. Did the tests not need updating?

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the only thing left I think is change the minimum version of xarray in setup.py

@mraspaud
Copy link
Member

yeah, you need to fix the tests, making sure xarray is doing its job too.

@zxdawn
Copy link
Member Author

zxdawn commented Apr 27, 2021

Updated the setup.py and deleted/modified the tests of time bounds now.

@codecov
Copy link

codecov bot commented Apr 27, 2021

Codecov Report

Merging #1246 (097a74b) into master (b564e0c) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1246      +/-   ##
==========================================
- Coverage   92.62%   92.61%   -0.02%     
==========================================
  Files         258      258              
  Lines       37847    37777      -70     
==========================================
- Hits        35057    34987      -70     
  Misses       2790     2790              
Flag Coverage Δ
behaviourtests 4.85% <0.00%> (+<0.01%) ⬆️
unittests 92.75% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/tests/writer_tests/test_cf.py 99.66% <100.00%> (-0.03%) ⬇️
satpy/writers/cf_writer.py 95.00% <100.00%> (-0.19%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b564e0c...097a74b. Read the comment docs.

@mraspaud
Copy link
Member

@zxdawn thanks! I'm wondering what the comment saying "the time bounds can be removed after xarray 0.14" actually means. Does it mean that there is a building mechanism in xarray for that now, and should we test that? @sfinkens do you know?

@sfinkens
Copy link
Member

@mraspaud @zxdawn I think there was a misunderstanding in the conversation here and on slack some time ago.

xarray does not automatically add time bounds when you save a dataset to netcdf. Although not required by the CF conventions, they provide very valuable information and in my opinion we should keep them.

That comment in _set_default_time_encoding was about the encoding of time bounds. Before version 0.14 xarray did not recognize that time bounds and time coordinates should be encoded in the same way. Instead time could be encoded as days since 1970 whereas time_bounds would be in microseconds since 2020-01-01 12:15:30.

If I remember correctly, the issue here is that make_time_bounds computes a single tuple (tstart, tend), even if there are multiple timestamps in the dataset. In that case you would probably want bounds for every time stamp.

@mraspaud
Copy link
Member

@zxdawn you had another reason for removing this, right?

@zxdawn
Copy link
Member Author

zxdawn commented May 17, 2021

@mraspaud As explained by @sfinkens, the time bounds for each timestamp are required. I don't have another reason for removing this. May I close this and make another PR, this one looks a little confused?

@mraspaud
Copy link
Member

Yes, absolutely! I'll just go ahead and close this, feel free to start another PR.

@mraspaud mraspaud closed this May 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dimension name of time in time_bounds is duplicated with the time dimension name of variable
3 participants