Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: Make read_json less stateful #59124

Merged
merged 6 commits into from
Jul 2, 2024

Conversation

mroeschke
Copy link
Member

Also adds some validation for the typ keyword which aligns with the docs/typing

@mroeschke mroeschke added Refactor Internal refactoring of code IO JSON read_json, to_json, json_normalize labels Jun 27, 2024
@mroeschke mroeschke added this to the 3.0 milestone Jun 27, 2024
if result:
self.obj = obj
def _try_convert_types(self, obj: Series) -> Series:
obj, _ = self._try_convert_data("data", obj, convert_dates=self.convert_dates)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing code is pretty strange, but wouldn't this return something different in the case where the result value was False?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think obj should be the same if result value was False i.e. the conversion was a no-op. I'm relying on that fact here that a no-op result should just be passed through here with _try_convert_data

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK - definitely not a fan of relying on the implementation of another function like that...but that can be cleaned up in a separate PR

return False

self._process_converter(lambda col, c: self._try_convert_to_date(c), filt=is_ok)
obj = self._process_converter(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make less stateless while maintaining these as separate functions? I find the new code harder to read, especially with the subtle difference in arguments to self._process_converter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing. I just pushed a change to remove _process_converter and make things simpler

@@ -792,7 +792,7 @@ def test_frame_from_json_precise_float(self):

def test_typ(self):
s = Series(range(6), index=["a", "b", "c", "d", "e", "f"], dtype="int64")
result = read_json(StringIO(s.to_json()), typ=None)
result = read_json(StringIO(s.to_json()), typ="series")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the tests need to change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test didn't align with the docs/typing of typ accepting None. I added validation in this PR to ensure this case fails now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm OK. That's super strange we allowed that to begin with...but hopefully harmless taking it away

return obj
return SeriesParser(json, **kwargs).parse()
else:
raise ValueError(f"{typ=} must be 'frame' or 'series'.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add a bugfix note for this at least to have it visibile. Something to the effect of "read_json now raises if the typ argument is neither frame nor series"

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice to have a whatsnew note, but otherwise lgtm

@mroeschke mroeschke merged commit 69fe98d into pandas-dev:main Jul 2, 2024
44 of 45 checks passed
@mroeschke mroeschke deleted the ref/json/lessstate branch July 2, 2024 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO JSON read_json, to_json, json_normalize Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants