fix: handle null values in data #636

Merged
merged 10 commits into from Jan 31, 2024

Conversation

@alespour alespour commented Jan 30, 2024

Closes #621

Proposed Changes

Handles missing values when query results are converted to data frames. The query_data_frame... functions have a new optional parameter, use_extension_dtypes.

def query_data_frame(self, query: str, org=None, data_frame_index: List[str] = None, params: dict = None,
                     use_extension_dtypes: bool = False):
    ...

def query_data_frame_stream(self, query: str, org=None, data_frame_index: List[str] = None, params: dict = None,
                            use_extension_dtypes: bool = False):
    ...

Example output (with data from #621):

use_extension_dtypes=True

<bound method NDFrame.head of     result  table                    _start                     _stop                            _time _measurement  test_double  test_long
0  _result      0 2023-12-15 13:19:54+00:00 2023-12-15 13:19:57+00:00 2023-12-15 13:19:55.372000+00:00         test          4.0       <NA>
1  _result      0 2023-12-15 13:19:54+00:00 2023-12-15 13:19:57+00:00        2023-12-15 13:19:56+00:00         test         <NA>          1>

use_extension_dtypes=False

<bound method NDFrame.head of     result  table                    _start                     _stop                            _time _measurement  test_double  test_long
0  _result      0 2023-12-15 13:19:54+00:00 2023-12-15 13:19:57+00:00 2023-12-15 13:19:55.372000+00:00         test          4.0        NaN
1  _result      0 2023-12-15 13:19:54+00:00 2023-12-15 13:19:57+00:00        2023-12-15 13:19:56+00:00         test          NaN        1.0>

Note: converting numeric values to extension dtypes works properly only with pandas>=2.0. In a Python 3.7 environment, where the latest available pandas is 1.3.5, columns with NA values get dtype 'object', i.e. the same as without extension dtypes. For Python 3.8+, pandas 2.x is available.
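
For readers unfamiliar with the difference between the two outputs above, here is a minimal sketch in plain pandas. It does not use the client or reproduce its internal conversion; the `raw` frame is a hypothetical stand-in for the two rows from #621, and the explicit `astype` call is only an assumption about how to illustrate nullable dtypes.

```python
import pandas as pd

# Hypothetical stand-in for the two rows shown above; not produced by the client.
raw = pd.DataFrame({
    "test_double": [4.0, None],
    "test_long": [None, 1],
})

# Default NumPy-backed dtypes: None becomes NaN, so test_long is stored as float64.
default = raw

# Nullable extension dtypes: missing values become pd.NA and test_long stays an integer column.
extended = raw.astype({"test_double": "Float64", "test_long": "Int64"})

print(default.dtypes)
print(extended.dtypes)
print(extended)
```

With the default dtypes the integer column is silently upcast to float, which is why 1 prints as 1.0 in the use_extension_dtypes=False output. The nullable Int64 dtype keeps the value as an integer and marks the gap as <NA>.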

Checklist

  • CHANGELOG.md updated
  • Rebased/mergeable
  • A test has been added if appropriate
  • pytest tests complete successfully
  • Commit messages are conventional

codecov-commenter commented Jan 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (27777d1) 90.19% compared to head (17ab3b1) 90.40%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #636      +/-   ##
==========================================
+ Coverage   90.19%   90.40%   +0.21%     
==========================================
  Files          39       39              
  Lines        3467     3503      +36     
==========================================
+ Hits         3127     3167      +40     
+ Misses        340      336       -4     

@alespour alespour marked this pull request as ready for review January 30, 2024 19:34

@bednar bednar left a comment

@alespour thanks for PR 👍

Please also add the use_extension_dtypes: bool = False parameter to the async query API:

async def query_data_frame_stream(self, query: str, org=None, data_frame_index: List[str] = None,

@alespour alespour requested a review from bednar January 31, 2024 12:11

@bednar bednar left a comment

LGTM 🚀

@bednar bednar merged commit 7a5f655 into master Jan 31, 2024
14 checks passed
@bednar bednar deleted the fix/issue-621 branch January 31, 2024 13:46
@bednar bednar added this to the 1.41.0 milestone Jan 31, 2024

Successfully merging this pull request may close these issues.

Pivotted query result with long and float type columns causes ValueError in _to_value()
3 participants