Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate automatically parsing header timestamps #1014

Open
minrk opened this issue Mar 9, 2024 · 0 comments · May be fixed by #1015
Open

deprecate automatically parsing header timestamps #1014

minrk opened this issue Mar 9, 2024 · 0 comments · May be fixed by #1015

Comments

@minrk
Copy link
Member

minrk commented Mar 9, 2024

I meant to upstream this issue a long time ago, but forgot. It came up again in ipython/ipykernel#1210.

Parsing timestamps in message headers has turned out to be a major overhead in handling messages, especially when they are almost never used in their parsed form. As a result, ipyparallel patches extract_dates to be a no-op (at import time, which wasn't the right choice - I'll move that to instantiation of the appropriate objects). Ironically, ipyparallel probably uses timestamps the most of any jupyter project, but the intermediate de/re-serialization in schedulers had a massive cost in benchmarks. I believe the same is true in the notebook server, which relays many messages without needing to look inside them, but it's still doing tons of datetime parsing (#590).

I think we should just not do this parsing at all, and instead rely on consumers of messages to parse the timestamps they are interested in when they want to use them.

It's a hard transition to make, because it's technically a breaking API change to just stop parsing these, but it's very unlikely to actually break anything aside from tests themselves because they are so rarely used in code.

I don't have a great strategy for how to smoothly deprecate this behavior, so I'd love it if someone had a good idea for how to do it.

ipyparallel's monkeypatch could have ben avoided, too, if extract_dates were an instance method on Session instead of a module function called unconditionally.

One possible option:

  • define Session.extract_dates method. Default: call session.extract_dates. Could also have a Session.extract_dates attribute that's a boolean toggle.
  • environment variable, e.g. JUPYTER_CLIENT_EXTRACT_DATES to opt-in to future behavior, deprecation warning for default, old behavior
  • someday invert default

The problem with this approach is that it makes both dates and strings technically valid, which means fully compatible code needs to be updated to handle both types, rather than just updating to the new unparsed strings, because consumers of Session-created messages aren't always in the same code that created them. But maybe that's the best we can do.

more related issues:

@minrk minrk linked a pull request Mar 10, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant