Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor grdc_data class #109

Merged
merged 20 commits into from Jun 23, 2021
Merged

Refactor grdc_data class #109

merged 20 commits into from Jun 23, 2021

Conversation

SarahAlidoost
Copy link
Contributor

@SarahAlidoost SarahAlidoost commented Jun 16, 2021

closes #70

@SarahAlidoost SarahAlidoost marked this pull request as ready for review June 17, 2021 10:57
Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the docstring of get_grdc_data be fixed?
Also I now tested manually, but it would be nice to have automated tests.

Works as expected

By putting 6335020_Q_Day.Cmd.txt in cwd.

with data_home arg

from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z', data_home='.')

with ewatercycle.yaml:grdc_location: $PWD

from ewatercycle.config import CFG
CFG.load_from_file('./ewaterycle.yaml')
from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z')

without data_home and config

from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z')
...
ValueError: Provide the grdc path using `data_home` argument or using `grdc_location` in ewatercycle configuration file.

ewatercycle/observation/grdc.py Show resolved Hide resolved
Comment on lines 21 to 32
station_id: The station id to get. The station id can be found in the
catalogues at
https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/212_prjctlgs/project_catalogue_node.html
start_time: Start time of model in UTC and ISO format string e.g.
'YYYY-MM-DDTHH:MM:SSZ'.
end_time: End time of model in UTC and ISO format string e.g.
'YYYY-MM-DDTHH:MM:SSZ'.
parameter: optional. The parameter code to get, e.g. ('Q') discharge,
cubic meters per second.
data_home : optional. The directory where the
daily grdc data is located. If left out will use the grdc_location in the
eWaterCycle configuration file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see the switch from Numpy to Google docstring. However needs some more formatting as now the args section is rendered as a plain long string, instead of a bullet list.

The args should be indented 4 spaces and multiline descriptions should have double indent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fixed. I noticed that in the rendered document, Args is renamed to Parameters. Is there a way to avoid this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not find it, napoleon. apidoc, autodoc and sphinx_rtd_theme dont have clear option to change that.

ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
start_time: str,
end_time: str,
parameter: str = 'Q',
data_home: str = None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are adding typings to argument. Can you also add typing for return

Suggested change
data_home: str = None):
data_home: str = None) -> Tuple[pd.DataFrame, Dict[str, Union[str, int, float]]]:

With at top of file

from typing import Tuple, Dict, Union

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is fixed. I found that pd.DataFrame is not a type, but pd.core.frame.DataFrame.

ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
@SarahAlidoost
Copy link
Contributor Author

Can the docstring of get_grdc_data be fixed?
Also I now tested manually, but it would be nice to have automated tests.

Works as expected

By putting 6335020_Q_Day.Cmd.txt in cwd.

with data_home arg

from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z', data_home='.')

with ewatercycle.yaml:grdc_location: $PWD

from ewatercycle.config import CFG
CFG.load_from_file('./ewaterycle.yaml')
from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z')

without data_home and config

from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z')
...
ValueError: Provide the grdc path using `data_home` argument or using `grdc_location` in ewatercycle configuration file.

@sverhoeven thank you for reviewing, great comments, I learned many things. Your comments are addressed and more tests are added too. Please have another look.

ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
ewatercycle/observation/grdc.py Outdated Show resolved Hide resolved
tests/observation/test_grdc.py Outdated Show resolved Hide resolved
tests/observation/test_grdc.py Outdated Show resolved Hide resolved
ewatercycle/observation/grdc.py Show resolved Hide resolved
Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs now looks OK.
Thanks for the additional tests and nice use of fixture for expected results.

I see the log message with

import logging
logging.basicConfig(level=logging.INFO)

Irom ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z', data_home='.')
INFO:grdc.py:GRDC station 6335020 is selected. The river name is: RHINE RIVER.The coordinates are: (51.756918, 6.395395).The catchment area in km2 is: 159300.0. There are 0 missing values during 2000-01-01T00:00Z_2001-01-01T00:00Z at this station. See the metadata for more information.
``

I have a couple of very minor suggestions, but overall ready to merge

SarahAlidoost and others added 4 commits June 23, 2021 16:56
Co-authored-by: Stefan Verhoeven <s.verhoeven@esciencecenter.nl>
Co-authored-by: Stefan Verhoeven <s.verhoeven@esciencecenter.nl>
Co-authored-by: Stefan Verhoeven <s.verhoeven@esciencecenter.nl>
Co-authored-by: Stefan Verhoeven <s.verhoeven@esciencecenter.nl>
@sonarcloud
Copy link

sonarcloud bot commented Jun 23, 2021

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

100.0% 100.0% Coverage
0.0% 0.0% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

get_grdc_data should use CFG['grdc_location'] internally
2 participants