Allow kwargs to open_boutdataset #102

Merged
merged 1 commit into from
Jan 11, 2020

TomNicholas
Collaborator

Generalised open_boutdataset to accept arbitrary kwargs. The kwargs go to xarray.open_mfdataset first, and if they aren't recognised there they go down to xarray.open_dataset.
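The pass-through described above can be sketched with plain-Python stand-ins. The three functions below only borrow the names of the real ones; they are not the xBOUT or xarray implementations, and their signatures are simplified assumptions made for illustration.

```python
def open_dataset(path, drop_variables=None):
    """Stand-in for the single-file opener: reports which options reached it."""
    return {"path": path, "drop_variables": drop_variables}


def open_mfdataset(paths, compat="no_conflicts", **kwargs):
    """Stand-in for the multi-file opener: consumes `compat`, forwards the rest."""
    per_file = [open_dataset(path, **kwargs) for path in paths]
    return {"compat": compat, "files": per_file}


def open_boutdataset(paths, **kwargs):
    """Stand-in wrapper: lists no `drop_variables` itself, it just forwards kwargs."""
    return open_mfdataset(paths, **kwargs)


result = open_boutdataset(
    ["BOUT.dmp.0.nc", "BOUT.dmp.1.nc"],
    compat="override",                   # recognised by the multi-file layer
    drop_variables="problem_variable",   # falls through to the single-file layer
)
```

In this sketch a keyword that no layer recognises only fails at the bottom, as a `TypeError` raised by the stand-in `open_dataset`.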

This can speed up opening many files: pass data_vars='minimal', coords='minimal', compat='override', parallel=True, as described here.
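As a rough usage sketch, the options can be kept in one place and forwarded through the new kwargs. The helper name `open_fast` and the constant `FAST_OPEN_KWARGS` are hypothetical, and the call assumes xBOUT is installed and that the file path is accepted as the first positional argument of `open_boutdataset`.

```python
# Options quoted in the PR description; `parallel` is the boolean True, not a string.
FAST_OPEN_KWARGS = {
    "data_vars": "minimal",
    "coords": "minimal",
    "compat": "override",
    "parallel": True,
}


def open_fast(datapath, **extra_kwargs):
    """Hypothetical helper: open BOUT++ dump files with the options above."""
    # Imported lazily so this module can be loaded without xBOUT installed.
    from xbout import open_boutdataset

    # Caller-supplied options override the defaults above.
    options = {**FAST_OPEN_KWARGS, **extra_kwargs}
    return open_boutdataset(datapath, **options)
```

Whether these options actually help depends on the files being opened, so it is worth timing both variants on a real dataset.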

This also means drop_variables no longer needs to be an explicit argument, as it's covered by the kwargs being passed down to xarray.open_dataset.

@johnomotani johnomotani changed the base branch from Reload_squashed_files to master January 11, 2020 15:26
@johnomotani
Collaborator

How does this change eliminate the possible need for drop_variables? I guess we still need the special handling for _BOUT_PER_PROC_VARIABLES, and if so what happens if some user code happens to add some variable to the output that behaves in a similar way (e.g. a scalar with different values in each BOUT.dmp.*.nc file)? Wouldn't there be an error unless that variable is dropped explicitly?

@TomNicholas
Collaborator Author

> How does this change eliminate the possible need for drop_variables?

Because drop_variables is an argument to xarray.open_dataset, which the kwargs get passed down to. So it's still available as an option; it just doesn't need to be explicitly listed as an argument to open_boutdataset.

> what happens if some user code happens to add some variable to the output that behaves in a similar way

Because open_dataset is called on each file before preprocess or the combining step within open_mfdataset, passing drop_variables='problem_variable' to open_boutdataset should still work fine.

> Wouldn't there be an error unless that variable is dropped explicitly?

Yes, but we will still have the option to drop it explicitly.
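A toy model of that ordering, for illustration only: each "file" is a dict, variables are dropped per file, and only then are the files combined. The functions `open_one`, `combine` and `open_many` and the variable `my_scalar` are hypothetical stand-ins, not xarray or xBOUT code.

```python
def open_one(contents, drop_variables=None):
    """Open a single 'file', dropping the named variables before anything else."""
    if drop_variables is None:
        dropped = set()
    elif isinstance(drop_variables, str):
        dropped = {drop_variables}
    else:
        dropped = set(drop_variables)
    return {name: value for name, value in contents.items() if name not in dropped}


def combine(datasets):
    """Concatenate list variables; require scalars to agree across files."""
    combined = {}
    for dataset in datasets:
        for name, value in dataset.items():
            if isinstance(value, list):
                combined.setdefault(name, []).extend(value)
            elif combined.setdefault(name, value) != value:
                raise ValueError(f"conflicting values for {name!r}")
    return combined


def open_many(files, drop_variables=None):
    """Drop per file first, then combine, mirroring the order described above."""
    return combine([open_one(f, drop_variables) for f in files])


# A user-added scalar that differs between the per-processor files.
files = [
    {"n": [1.0, 2.0], "my_scalar": 0},
    {"n": [3.0, 4.0], "my_scalar": 1},
]

merged = open_many(files, drop_variables="my_scalar")
```

Calling `open_many(files)` without `drop_variables` raises the `ValueError` in this sketch, which is the situation the question describes.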

@johnomotani
Collaborator

Ah, I see. Sorry, I'd looked at open_mfdataset and saw it didn't have a drop_variables argument, but didn't realise that open_dataset does and that the kwargs fall through to there.

kwargs are great, but they do make the help() less helpful sometimes

@TomNicholas TomNicholas merged commit 41921a6 into master Jan 11, 2020
@johnomotani johnomotani deleted the open_boutdataset-kwargs branch March 16, 2020 08:31
2 participants