Conversation
- Updated `imednet/integrations/export.py` to explicitly include "str" dtype when selecting columns for sanitization, silencing a Pandas 3.0 deprecation warning.
- Deleted `tests/unit/test_csv_sanitization.py`.
- Created `tests/unit/test_export_sanitization.py` with expanded coverage:
  - Verifies CSV sanitization.
  - Verifies Excel sanitization (previously untested).
  - Verifies non-string columns are preserved.
  - Ensures no warnings are emitted during export.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.
When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!
For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me.
New to Jules? Learn more at jules.google/docs.
For security, I will only act on instructions from the user who triggered this task.
The failing job is due to a formatting issue: black reports that `tests/unit/test_export_sanitization.py` would be reformatted. To fix this, run black on the file to apply auto-formatting: `black tests/unit/test_export_sanitization.py`. Afterward, commit and push the changes. This will resolve the job failure caused by code style violations.
Thank you. I have run |
- Updated `imednet/integrations/export.py` to explicitly include "str" dtype when selecting columns for sanitization, silencing a Pandas 3.0 deprecation warning.
- Deleted `tests/unit/test_csv_sanitization.py`.
- Created `tests/unit/test_export_sanitization.py` with expanded coverage:
  - Verifies CSV sanitization.
  - Verifies Excel sanitization (previously untested).
  - Verifies non-string columns are preserved.
  - Ensures no warnings are emitted during export.
- Reformatted tests with black to pass CI.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Updated `imednet/integrations/export.py` to explicitly include "str" dtype when selecting columns for sanitization, silencing a Pandas 3.0 deprecation warning.
- Deleted `tests/unit/test_csv_sanitization.py`.
- Created `tests/unit/test_export_sanitization.py` with expanded coverage:
  - Verifies CSV sanitization.
  - Verifies Excel sanitization (previously untested).
  - Verifies non-string columns are preserved.
  - Ensures no warnings are emitted during export.
- Reformatted tests with black and fixed Ruff linting errors (unused variable, line length) to pass CI.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
The failing job is caused by a `TypeError: numpy string dtypes are not allowed, use 'str' or 'object' instead`. This error occurs in tests that involve export to CSV and Excel. The root cause is likely in your export code (e.g., `imednet/integrations/export.py`), where Pandas DataFrames contain columns with numpy string dtype (such as `numpy.str_` or `S` types). Pandas 2.0+ no longer allows these dtypes for I/O.
Solution: `import numpy as np`, `def sanitize_string_dtypes(df: pd.DataFrame) -> pd.DataFrame:`
In your `export_to_csv`/`export_to_excel` functions, add `df = sanitize_string_dtypes(df)`, then continue to write the DataFrame.
Alternatively, you can coerce just before `to_csv`/`to_excel`: `df = df.astype({col: "object" for col in df.select_dtypes(include=["S", "U", "string", np.str_]).columns})`
Update the export function(s) to ensure no DataFrame written to disk has numpy string dtypes.
References:
Once corrected, your tests should pass with Pandas 2.x and future versions.
Thank you for the detailed analysis. I've updated the |
- Updated `imednet/integrations/export.py` to use a `try-except TypeError` block when selecting string dtypes. This handles a `TypeError` in Pandas 2.x (seen in CI) while preserving the fix for Pandas 3.0 warnings.
- Deleted `tests/unit/test_csv_sanitization.py`.
- Created `tests/unit/test_export_sanitization.py` with expanded coverage:
  - Verifies CSV sanitization.
  - Verifies Excel sanitization (previously untested).
  - Verifies non-string columns are preserved.
  - Ensures no warnings are emitted during export.
- Reformatted tests with black and fixed Ruff linting errors.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
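For readers following the thread, the `try-except TypeError` fallback named in this commit could look roughly like the sketch below. It is not the code from `imednet/integrations/export.py`: the helper names `select_string_columns` and `sanitize_for_export`, the `FORMULA_PREFIXES` tuple, and the choice to neutralise formulas by prepending a single quote are all assumptions made for illustration.

```python
import pandas as pd

# Assumed set of leading characters a spreadsheet may treat as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def select_string_columns(df: pd.DataFrame) -> list:
    """Return the names of text columns, tolerating older and newer Pandas."""
    try:
        # Newer Pandas: name "str" explicitly instead of relying on it
        # being implicitly covered by the object dtype.
        selected = df.select_dtypes(include=["object", "str"])
    except TypeError:
        # Pandas 2.x rejects "str" in select_dtypes with a TypeError,
        # so fall back to object-only selection.
        selected = df.select_dtypes(include=["object"])
    return list(selected.columns)


def sanitize_for_export(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with formula-like strings prefixed by a single quote."""
    out = df.copy()
    for col in select_string_columns(out):
        out[col] = out[col].map(
            lambda v: "'" + v
            if isinstance(v, str) and v.startswith(FORMULA_PREFIXES)
            else v
        )
    return out


frame = pd.DataFrame({"name": ["=SUM(A1)", "ok"], "count": [1, 2]})
cleaned = sanitize_for_export(frame)
```

Catching only `TypeError` keeps the fallback narrow: any other failure in dtype selection still surfaces instead of being silently swallowed.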
This PR addresses a Pandas deprecation warning (`Pandas4Warning`) related to object dtype selection in `export.py` and significantly improves test coverage for the export functionality.
Changes:
- In `_sanitize_df`, `select_dtypes` now explicitly includes `"str"` alongside `object`. This prevents the warning about implicit string inclusion in object dtype, which is deprecated in Pandas 3.0+.
- Replaced `test_csv_sanitization.py` with a comprehensive `test_export_sanitization.py`.
- Added tests for `export_to_excel` to ensure formula injection protection works for Excel exports as well (mocking `to_excel` to verify the DataFrame state before write).
- Added warning checks (`warnings.catch_warnings(record=True)`) to ensure the export process is warning-free.
Impact:
PR created automatically by Jules for task 17288914380282949305 started by @fderuiter
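The Excel test strategy in the description (mock `to_excel`, inspect the DataFrame it receives, and record warnings) can be sketched as follows. This is an illustration only, not the contents of `tests/unit/test_export_sanitization.py`: `export_to_excel_stub` is a hypothetical stand-in for the real `export_to_excel`, and its quoting rule is assumed.

```python
import warnings
from unittest.mock import patch

import pandas as pd


def export_to_excel_stub(df: pd.DataFrame, path: str) -> None:
    """Hypothetical stand-in for the real export function."""
    cleaned = df.copy()
    cleaned["name"] = cleaned["name"].map(
        lambda v: "'" + v if isinstance(v, str) and v.startswith("=") else v
    )
    cleaned.to_excel(path, index=False)


def check_excel_export_is_sanitized_and_quiet() -> pd.DataFrame:
    df = pd.DataFrame({"name": ["=1+1", "plain"], "count": [1, 2]})
    # autospec=True makes the mock receive `self`, i.e. the DataFrame
    # that would have been written, and no file is created.
    with patch.object(pd.DataFrame, "to_excel", autospec=True) as fake_write:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # record every warning
            export_to_excel_stub(df, "unused.xlsx")
    written = fake_write.call_args.args[0]
    assert written["name"].tolist() == ["'=1+1", "plain"]  # formula neutralised
    assert written["count"].tolist() == [1, 2]  # non-string column preserved
    assert caught == []  # export emitted no warnings
    return written


written_frame = check_excel_export_is_sanitized_and_quiet()
```

Mocking the writer this way means the check does not depend on an Excel engine being installed, which is one plausible reason to verify the DataFrame state before the write rather than reading a file back.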