Data type anomaly for specific fields within daily data #107

amotl · 2020-07-05T23:32:25Z

As outlined within panodata/dwdweather2#27, acquiring precipitation information from daily observations should yield data like

"precipitation_form": 4,
"precipitation_height": 8.8,

However, I just found out that

wetterdienst readings --resolution=daily --parameter=kl --period=recent --station=44

will yield data like

"precipitation_form":4.0,
"precipitation_height":1.5,

So, we should adjust the data type for precipitation_form to be an integer, as designated within the dwdweather2 knowledge base module, line 142.

cc @BenjaminMews

amotl · 2020-07-05T23:35:05Z

While we are at it, we might also want to adjust the data type for fields like daily_quality_level_4 accordingly, as outlined within the dwdweather2 knowledge base module, line 140.

gutzbenj · 2020-07-06T07:10:27Z

Do you think that we should reinvent the dtype mapping creation, or simply implement another if-else for, say, static data columns (-> int) and dynamic data columns (-> float)?

amotl · 2020-07-06T10:29:08Z

I believe doing it in a dynamic manner would be okay. Then we can say things like

if column_name in ['precipitation_form', 'any_others'] or 'quality_level' in column_name:
    value = int(value)

Note that this is just non-Pandas pseudocode and should probably be written in a more elaborate way. Also, it should be performed before humanizing column names, which is an optional feature.
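For illustration, the pseudocode above could be translated to Pandas roughly as follows. This is only a sketch under assumptions, not the library's implementation: the names INTEGER_FIELDS, is_integer_column and coerce_integer_columns are hypothetical, and the nullable "Int64" dtype is one possible choice so that missing values do not break the cast.

```python
import pandas as pd

# Hypothetical list; only "precipitation_form" is named in this thread.
INTEGER_FIELDS = ["precipitation_form"]


def is_integer_column(column_name: str) -> bool:
    """Apply the same rule as the pseudocode: explicit names or quality levels."""
    return column_name in INTEGER_FIELDS or "quality_level" in column_name


def coerce_integer_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which matching columns use the nullable Int64 dtype."""
    df = df.copy()
    for column_name in df.columns:
        if is_integer_column(column_name):
            # "Int64" (capital I) keeps missing values as <NA> instead of
            # raising, which a plain int cast would do on NaN.
            df[column_name] = df[column_name].astype("Int64")
    return df


# Sample frame using the original (non-humanized) column names.
df = pd.DataFrame(
    {
        "precipitation_form": [4.0, None],
        "precipitation_height": [1.5, 8.8],
        "daily_quality_level_4": [3.0, 3.0],
    }
)
result = coerce_integer_columns(df)
print(result.dtypes)
```

Because the rule matches on the original column names, it would have to run before any optional renaming step, as noted above.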

gutzbenj · 2020-07-06T12:46:47Z

Btw just noticed that if we want to successfully apply this to the library, we require the whole parameter names being typed for all resolutions. Otherwise we'd break the functionality for some time resolutions. In conclusion we should first fully name the whole set of parameters from 1_minute to annual...

amotl · 2020-07-06T16:38:23Z

> Btw just noticed that if we want to successfully apply this to the library, we require the whole parameter names being typed for all resolutions. Otherwise we'd break the functionality for some time resolutions.

I see. Thanks for looking at the nitty gritty details.

For now, we could also take a dynamic approach and use integer_field in df.columns as the condition for applying the coercion. I just quickly put together and submitted #108 to give us an idea of how things might be implemented that way.
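To make the idea concrete, here is a minimal sketch of such a guard. It is not the code from #108: the function name, the contents of INTEGER_FIELDS and the "temperature" column are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical field list; the real set of integer fields is not given here.
INTEGER_FIELDS = ["precipitation_form", "daily_quality_level_4"]


def coerce_known_integer_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Cast known integer fields, skipping those absent from this frame."""
    df = df.copy()
    for integer_field in INTEGER_FIELDS:
        # The membership test is the constraint: resolutions that lack a
        # field are simply left untouched instead of raising a KeyError.
        if integer_field in df.columns:
            df[integer_field] = df[integer_field].astype("Int64")
    return df


# A frame for daily data that contains one of the integer fields.
daily = pd.DataFrame(
    {"precipitation_form": [4.0, 0.0], "precipitation_height": [1.5, 0.0]}
)
daily_coerced = coerce_known_integer_fields(daily)

# A frame for some other resolution without any of the integer fields.
other = pd.DataFrame({"temperature": [12.3, 11.9]})
other_coerced = coerce_known_integer_fields(other)
```

With this shape, data from resolutions whose parameter names are not fully enumerated yet would pass through unchanged, at the cost of iterating over the field list for every frame.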

> In conclusion we should first fully name the whole set of parameters from 1_minute to annual...

I will not stop you from doing this. Thanks already! It might save some unnecessary cycles iterating through all special integer fields. However, if you feel the dynamic solution outlined in #108 is also okay, I will be happy to help get it out of draft mode.

amotl mentioned this issue Jul 6, 2020

Coerce some integer data types #108

Closed

amotl mentioned this issue Jul 6, 2020

Overhaul column type casting #109

Closed

amotl linked a pull request Jul 13, 2020 that will close this issue

Coerce some integer data types #108

Closed

gutzbenj linked a pull request Jul 22, 2020 that will close this issue

Dtype conversion is extended to integer fields and string fields #115

Merged

gutzbenj closed this as completed in #115 Jul 22, 2020
