Using meaningful column/field names #54

Closed
amotl opened this issue Jun 10, 2020 · 9 comments · Fixed by #113
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@amotl
Member

amotl commented Jun 10, 2020

Dear Benjamin and Daniel,

To start a discussion about assigning meaningful English names to the short meteorological identifiers, I would like to humbly point to eaaf936 from our PR #55.

We had something similar in knowledge.py, the knowledge base module of dwdweather2, and noticed that you have already started mapping DWD-specific original parameters, names and the like to identifiers that are more suitable for human consumption.

So we would like to ask whether you would welcome extending this approach to all available field names.

With kind regards,
Andreas.

amotl changed the title from "Using meaningful field names" to "Using meaningful column/field names" on Jun 10, 2020
@gutzbenj
Member

Dear @amotl,

I had mainly left those "data column names" untouched because I'd have a hard time choosing the right names for the values. The names given by DWD are sometimes as long as a sentence, which is not really an option either.

Furthermore, longer column names make the data hard to print: for data with several parameters, the header would easily span the whole screen, and the DataFrame would hardly be readable anymore. I'd prefer an opt-in solution. Using the full names also wouldn't be a problem if we melted the DataFrame (which could also be opt-in).

What do you think?
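
To illustrate the melting idea, here is a minimal sketch in plain Python. The station ID, dates, values and long column names are made up for illustration and are not taken from the project. It reshapes wide rows into long rows with one parameter per row, which is the kind of reshaping pandas' `DataFrame.melt` performs, so a long name ends up in a cell instead of widening the header.

```python
# Hypothetical wide-format rows; names and values are illustrative only.
wide_rows = [
    {"station_id": 1048, "date": "2020-06-01",
     "temperature_air_max_200": 24.1, "precipitation_height": 0.0},
    {"station_id": 1048, "date": "2020-06-02",
     "temperature_air_max_200": 19.3, "precipitation_height": 5.2},
]

ID_COLUMNS = ("station_id", "date")


def melt(rows, id_columns=ID_COLUMNS):
    """Reshape wide rows into long rows: one (parameter, value) pair per row."""
    long_rows = []
    for row in rows:
        ids = {key: row[key] for key in id_columns}
        for name, value in row.items():
            if name not in id_columns:
                long_rows.append({**ids, "parameter": name, "value": value})
    return long_rows


long_rows = melt(wide_rows)
for row in long_rows:
    print(row)
```

The long table always has the same four columns, no matter how many parameters there are or how long their names get.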

@amotl
Member Author

amotl commented Jun 15, 2020

Dear Benjamin,

> I had mainly left those "data column names" untouched because I'd have a hard time choosing the right names for the values. The names given by DWD are sometimes as long as a sentence, which is not really an option either.

I see. Maybe @wetterfrosch and I can work on that aspect to establish reasonable names here.

> [In any case,] I'd prefer an opt-in solution.

Your wish is my command. I've added a humanize_column_names option to the collect_dwd_data method through an updated ebb922b from #55.
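
As a rough sketch of what such an opt-in switch could look like, consider the following. This is not the code from #55: `rename_columns` is a hypothetical helper, and the mapping is an illustrative assumption that pairs DWD identifiers with names in the style of dwdweather2.

```python
# Illustrative subset only; the project's actual mapping may differ.
HUMANIZED_NAMES = {
    "TXK": "temperature_max_200",
    "TNK": "temperature_min_200",
    "TGK": "temperature_min_005",
}


def rename_columns(row, humanize_column_names=False):
    """Return the row unchanged unless the caller opts in to humanized names."""
    if not humanize_column_names:
        return dict(row)
    # Identifiers without a humanized counterpart are passed through as-is.
    return {HUMANIZED_NAMES.get(name, name): value for name, value in row.items()}


raw = {"STATIONS_ID": 1048, "TXK": 24.1, "TNK": 12.7}
print(rename_columns(raw))
print(rename_columns(raw, humanize_column_names=True))
```

Defaulting the flag to `False` keeps the original DWD identifiers for everyone who does not ask for the longer names.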

May I also suggest using lowercase column names, in the context of #55?

With kind regards,
Andreas.

@gutzbenj
Member

Thank you, I merged the PR. Column names may also be lowercase as you wish.

@amotl
Member Author

amotl commented Jun 16, 2020

> Thank you, I merged the PR.

Thanks again!

> Column names may also be lowercase as you wish.

Shall we make everything on the right-hand side of column_names_enumeration.py lowercase then?
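
To make the question concrete, here is a small sketch. It assumes the file holds `Enum` members with the original DWD identifier on the left and the humanized name on the right; the class name and entries are hypothetical, not copied from column_names_enumeration.py.

```python
from enum import Enum


class HumanizedColumn(Enum):
    """Hypothetical excerpt: DWD identifier on the left, lowercase name on the right."""
    STATIONS_ID = "station_id"
    MESS_DATUM = "date"
    TXK = "temperature_max_200"


# A quick guard that keeps the right-hand side lowercase as the list grows.
assert all(member.value == member.value.lower() for member in HumanizedColumn)

print(HumanizedColumn["TXK"].value)
```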

@gutzbenj
Member

> > Thank you, I merged the PR.
>
> Thanks again!
>
> > Column names may also be lowercase as you wish.
>
> Shall we make everything on the right-hand side of column_names_enumeration.py lowercase then?

Yes, sure!

@amotl
Member Author

amotl commented Jul 5, 2020

After completing the list for the daily resolution, we should expand the list of column names for the 10_minutes and hourly resolutions next, see panodata/dwdweather2#13 (comment).

@gutzbenj
Member

gutzbenj commented Jul 6, 2020

I will try to set up the names for the whole set of parameters. I already know that some values are not that clear, and I'll come back to you about those!

@amotl
Member Author

amotl commented Jul 6, 2020

> I will try to set up the names for the whole set of parameters.

Thanks!

Borrowing from dwdweather2

You might want to take inspiration from the naming scheme we applied in knowledge.py, the knowledge base module of dwdweather2, to make the column names look and sound nice. Disclaimer: that said, there may well be room for improvement.

Examples: temperature_max_200, temperature_min_200, temperature_min_005, soil_temperature_002, soil_temperature_005, etc.

It would be cool to be able to give the same column an individual humanized name per parameter set, especially the quality level columns like daily_quality_level_3 vs. daily_quality_level_4 vs. soil_temperature_quality_level, so that data from different parameter sets can be merged into a single DataFrame without collisions. This relates to #106.
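
A minimal sketch of that idea follows. The parameter set keys, the `humanize` helper and the DWD identifiers on the left-hand side (in particular `QN_2` and `V_TE005` for soil temperature) are assumptions for illustration. Only the humanized names come from the examples above.

```python
# Hypothetical per-parameter-set mappings; left-hand identifiers are assumptions.
MAPPINGS = {
    "kl": {
        "QN_3": "daily_quality_level_3",
        "QN_4": "daily_quality_level_4",
        "TXK": "temperature_max_200",
    },
    "soil_temperature": {
        "QN_2": "soil_temperature_quality_level",
        "V_TE005": "soil_temperature_005",
    },
}


def humanize(parameter_set, row):
    """Rename the columns of one row using the mapping of its parameter set."""
    mapping = MAPPINGS[parameter_set]
    return {mapping.get(name, name): value for name, value in row.items()}


kl_row = humanize("kl", {"QN_3": 10, "QN_4": 3, "TXK": 24.1})
soil_row = humanize("soil_temperature", {"QN_2": 3, "V_TE005": 17.8})

# No shared keys, so merging cannot silently overwrite a quality column.
assert not set(kl_row) & set(soil_row)
merged = {**kl_row, **soil_row}
print(sorted(merged))
```

Because every quality column carries the name of its parameter set, the merged record keeps all three quality levels side by side.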

Borrowing from GribMagic

@meteoDaniel also added some humanized labels through unified_forecast_variables.py, thanks already! Wetterdienst might also draw some inspiration from this.

@gutzbenj
Member

gutzbenj commented Jul 12, 2020

Please check out the new branch https://github.com/earthobservations/wetterdienst/tree/more-meaningful-column-names

I added all column names and also added a function that creates an individual mapping to account for some anomalies in the "kl" parameter, which has two quality flags within one file.
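
The function on the branch is not reproduced here. As one possible reading of the idea, the sketch below layers parameter-specific entries on top of a general mapping. `create_mapping`, `GENERAL` and `SPECIAL` are hypothetical names, and the `QN_3`/`QN_4` identifiers for "kl" are an assumption.

```python
# General names shared by all parameters (illustrative assumption).
GENERAL = {"STATIONS_ID": "station_id", "MESS_DATUM": "date"}

# Parameter-specific additions, e.g. the two quality flags found in one "kl" file.
SPECIAL = {
    "kl": {
        "QN_3": "daily_quality_level_3",
        "QN_4": "daily_quality_level_4",
    },
}


def create_mapping(parameter):
    """Build the column mapping for one parameter: general names plus its overrides."""
    return {**GENERAL, **SPECIAL.get(parameter, {})}


print(create_mapping("kl"))
```

Parameters without special cases simply fall back to the general mapping.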

I still had some issues with these parameters:

RWS_IND_10
RS_IND_01
DD
CD_TER
DK_TER
FK_TER
NSH_TAG
JA_RR
JA_MX_RS

Maybe we can discuss those.
