Skip to content

Conversation

Waschenbacher
Copy link
Contributor

Summary

Allow polars dataframe in CachedFoundryClient.save_dataset

Checklist

  • [x ] You agree with our CLA
  • [x ] Included tests (or is not applicable).
  • [x ] Updated documentation (or is not applicable).
  • [x ] Used pre-commit hooks to format and lint the code.

@nicornk nicornk requested a review from Copilot June 21, 2025 08:28
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for polars DataFrame in CachedFoundryClient.save_dataset, allowing users to pass polars DataFrame in addition to pandas and pyspark DataFrames.

  • Updated type hints and docstrings to incorporate polars support.
  • Modified test cases to validate functionality with both pandas and polars DataFrames.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
tests/unit/test_cached_foundry_client.py Added parameterized tests for both pandas and polars DataFrames.
libs/foundry-dev-tools/src/foundry_dev_tools/cached_foundry_client.py Extended type hints, error messages, and functionality to support polars DataFrame in save_dataset.
Comments suppressed due to low confidence (2)

tests/unit/test_cached_foundry_client.py:38

  • [nitpick] Consider renaming 'test_save_pandas' to 'test_save_dataframe' to reflect that the test now covers both pandas and polars DataFrames.
@pytest.mark.parametrize(

libs/foundry-dev-tools/src/foundry_dev_tools/cached_foundry_client.py:222

  • Update the error message to include polars support, for example: 'Please provide a spark, pandas, or polars dataframe object with parameter "df"'.
        if df is None:

@nicornk
Copy link
Contributor

nicornk commented Jun 21, 2025

Thanks for the contribution.

can you double check we don’t run into this issue again when polars is not available (because it’s an optional dependency)

#83

Co-authored-by: Nicolas Renkamp <nicornk@users.noreply.github.com>
@Waschenbacher
Copy link
Contributor Author

Sure. In an environment without polars, the current code works fine. The if-clause for polars is the same as pandas:

if not pd.__fake__ and isinstance(df, pd.DataFrame):
    ...
elif not pl.__fake__ and isinstance(df, pl.DataFrame):
    ...

When polars is no available in the env, not pl.__fake__ gives False and isinstance(df, pl.DataFrame) will be skipped, hence no issue.

@nicornk nicornk merged commit f325241 into emdgroup:main Jun 23, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants