Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add command line interface #55

Merged
merged 5 commits into from
Jun 16, 2020
Merged

Conversation

amotl
Copy link
Member

@amotl amotl commented Jun 11, 2020

Dear Benjamin and Daniel,

as outlined within #50, we wanted to start bringing the features of this library to the command line similar to what users have been accustomed with dwdweather2.

Thanks again for the ongoing efforts you are putting into this library. Being able to acquire data across historical+recent data sets using the new DWDStationRequest.collect_data() method without having to assemble them manually in any way is really powerful.

While we haven't tested any other combinations of parameters other than those listed within the examples section of dwd --help yet, you might want to merge these updates right away when pulling in the stationdata-request-definition branch to master. Saying that, there might well be some fixups required in this matter to remedy quirks with more parameter combinations.

Cheers,
Andreas.

Copy link
Member

@gutzbenj gutzbenj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look on the changes. Thanks for your effort to implement the client function. Please take a look at my comments, mostly covering some small things I've noticed.

python_dwd/parsing_data/parse_data_from_files.py Outdated Show resolved Hide resolved
python_dwd/cli.py Show resolved Hide resolved
DWDOrigColumns.UPM.value: DWDColumns.UPM.value,
DWDOrigColumns.TXK.value: DWDColumns.TXK.value,
DWDOrigColumns.TNK.value: DWDColumns.TNK.value,
DWDOrigColumns.TGK.value: DWDColumns.TGK.value,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! Can you take care of the other parameters (1_minute, 10_minutes,...) as well?

Copy link
Member Author

@amotl amotl Jun 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Maybe we can discuss the direction where this should go within #54 a bit more (I would also like to ask @wetterfrosch to join us there) and I will do that with a later contribution? I just wanted to give you an idea about that and to have at least something of value for the first steps into the CLI thing.

When diving back in here, I would like to add meaningful column names for all resolutions. However, there are some issues to resolve before, like #58 and a different one for ingesting hourly data I will add through a different PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's ok for me until we get the feature for every data at some later point. Would you mind adding on option to keep the original names? This way there'd still be an option to get the "original" data?

Copy link
Member Author

@amotl amotl Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding on option to keep the original names?

I've added an option humanize_column_names to the collect_dwd_data method as outlined within #54 (comment).

tests/parsing_data/test_fetch_data.py Outdated Show resolved Hide resolved
@amotl amotl force-pushed the cli-ng branch 2 times, most recently from 649a705 to 2d2cd77 Compare June 12, 2020 01:53
@amotl
Copy link
Member Author

amotl commented Jun 12, 2020

Dear Benjamin,

thanks for your suggestions, I've adjusted the PR accordingly at some spots. Please let me know whether this needs more work for the initial merge.

Also, please note that this is currently heading towards the stationdata-request-definition branch. After you have merged #49 and #57, do you believe we should direct it towards the master branch already?

Cheers,
Andreas.

@gutzbenj
Copy link
Member

Dear Benjamin,

thanks for your suggestions, I've adjusted the PR accordingly at some spots. Please let me know whether this needs more work for the initial merge.

Also, please note that this is currently heading towards the stationdata-request-definition branch. After you have merged #49 and #57, do you believe we should direct it towards the master branch already?

Cheers,
Andreas.

Please redirect the PR to master, as it as of now includes all the recent changes!

@amotl amotl changed the base branch from Feature/21/stationdata-request-definition to master June 12, 2020 16:16
@amotl amotl changed the base branch from master to Feature/21/stationdata-request-definition June 12, 2020 16:26
I don't know what is going on here, but these 
adjustments made it work for me on Python 3.8.0
@amotl amotl changed the base branch from Feature/21/stationdata-request-definition to master June 15, 2020 23:16
@amotl
Copy link
Member Author

amotl commented Jun 15, 2020

Hi Benjamin,

Please redirect the PR to master, as it as of now includes all the recent changes!

I have just done so. However, I recognized that the specific feature to acquire data across the {historical,recent} data sets like outlined through dwd --help:

# The real power horse: Acquire data across historical+recent data sets
dwd readings --station=44,1048 --parameter=kl --resolution=daily --period=historical,recent --date=1969-01-01/2020-06-11

does not work based on the current master branch. Saying that, I am 100% sure it worked on top of the stationdata-request-definition branch.

Currently, invoking that command croaks with

Error: 'start_date' must be smaller or equal to 'end_date'.
Traceback (most recent call last):
  File "/Users/amo/dev/panodata/sources/python_dwd/.venv3/bin/dwd", line 11, in <module>
    load_entry_point('python-dwd', 'console_scripts', 'dwd')()
  File "/Users/amo/dev/panodata/sources/python_dwd/python_dwd/cli.py", line 89, in run
    request = DWDStationRequest(
  File "/Users/amo/dev/panodata/sources/python_dwd/python_dwd/dwd_station_request.py", line 70, in __init__
    self.period_type = sorted(self.period_type)
TypeError: '<' not supported between instances of 'PeriodType' and 'PeriodType'

I've just investigated a few commits from the named branch and found 0f1faa7 to be probably the one which is missing in order to make PeriodType sortable.

With kind regards,
Andreas.

@amotl
Copy link
Member Author

amotl commented Jun 15, 2020

I've just investigated a few commits from the named branch and found 0f1faa7 to be probably the one which is missing in order to make PeriodType sortable.

Re-added on behalf of you through #62. When using that,

dwd readings --station=44,1048 --parameter=kl --resolution=daily --period=historical,recent --date=1969-01-01/2020-06-11

works flawlessly again.

@amotl amotl force-pushed the cli-ng branch 3 times, most recently from 8bd1f39 to d2ebabb Compare June 16, 2020 00:21
This avoids poisoning STDOUT with log messages which really
should go to STDERR. In turn, this makes room for real output
on STDOUT originating from the upcoming CLI module.
From "column_name_mapping.py" and "column_names_enumeration.py",
we are seeing the intention to map meteorological short identifiers
to more meaningful English long names.

This complements the current implementation by providing appropriate
mappings for daily climate summary data (kl) and also adds an
appropriate test case for that.

While being on it, we discovered that "_parse_dwd_data" as well as
"collect_dwd_data" somehow wouldn't actually account for column names
to be propagated, so we adjusted some spots on data frame handling.

Trivia:
- The test case has been marked as "remote" to be able to tell unit
  tests based on fixtures and full integration tests apart.
- When massaging the data frame after parsing data from CSV,
  the "EOR" column gets dropped right away as it actually
  has no real value on downstream processing.
This adds an entrypoint program "dwd" which can be used
to explore the features of this library on the command line.
@gutzbenj gutzbenj merged commit d6a8ab5 into earthobservations:master Jun 16, 2020
@amotl
Copy link
Member Author

amotl commented Jun 16, 2020

Thanks for merging this!

@amotl amotl mentioned this pull request Jun 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants