-
Notifications
You must be signed in to change notification settings - Fork 87
Add validation of loader kwargs to dataset_load
#241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As discussed on #238: #238 (comment) http://b/388077145
| } | ||
|
|
||
| # Mapping of adapters to the valid kwargs to use for that adapter | ||
| _DATASET_LOAD_VALID_KWARGS_MAP = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This can become tedious as the methods get more complex. Instead, we can use the inspect module to get the function signature. See https://docs.python.org/3/library/inspect.html#introspecting-callables-with-the-signature-object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, very cool. When we do the next data loader, I'll make a point to refactor this then.
neshdev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to refactor to use inspect module if you can or feel free to save it for next time.
|
|
||
| def _load_polars_dataset_with_other_loader_kwargs_and_assert_warning(self) -> None: | ||
| output_stream = io.StringIO() | ||
| handler = logging.StreamHandler(output_stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the unittest works, but in our code base we use redirect_stdout when working with standard output in unit test.
https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I found this example in the code base and based my implementation off of that:
Lines 57 to 67 in 4fe7fd0
| def test_login_returns_403_for_bad_credentials(self) -> None: | |
| output_stream = io.StringIO() | |
| handler = logging.StreamHandler(output_stream) | |
| logger.addHandler(handler) | |
| login("invalid", "invalid") | |
| captured_output = output_stream.getvalue() | |
| self.assertEqual( | |
| captured_output, | |
| "Invalid Kaggle credentials. You can check your credentials on the [Kaggle settings page](https://www.kaggle.com/settings/account).\n", | |
| ) |
So something like this would be a better pattern to follow?:
kagglehub/tests/test_logger.py
Lines 45 to 55 in 4fe7fd0
| def test_kagglehub_console_filter_discards_logrecord(self) -> None: | |
| with TemporaryDirectory() as f: | |
| log_path = Path(f) / "test-log" | |
| logger = logging.getLogger("kagglehub") | |
| stream = StringIO() | |
| with redirect_stdout(stream): | |
| # reconfigure logger, otherwise streamhandler doesnt use the modified stderr | |
| _configure_logger(log_path) | |
| logger.info("HIDE", extra={**EXTRA_CONSOLE_BLOCK}) | |
| logger.info("SHOW") | |
| self.assertEqual(stream.getvalue(), "SHOW\n") |
|
@neshdev I'll merge this for now to keep things moving. LMK if I should address the logging unit test and I'm going to backburner the inspect module refactor for the next data loader. |
As discussed on #238: #238 (comment)
http://b/388077145