Upgrade to Pyodide 0.23 #1347
Conversation
In a followup, we could switch to injecting the default URL (in `remote_interpreter.ts`) and the version in the tests from `package.json`. Ideally we should state the version:
- one time in the changelog
- one time in the docs
- one time in environment.yml
- and one time in package.json
so that there are only four edits needed to bump the version.
Thanks @JeffersGlass!
I love this idea - I've turned it into issue #1349 so we don't lose track of it when this gets merged.
Of the tests that failed in the most recent commit (…)
I plan to write an improved version of … Also, an unrelated note: we talked multiple times about declaring that each PyScript version supports one and only one Pyodide version. Maybe this is the right occasion to clean up the code and kill all the logic whose only goal is to support multiple ones?
ok, I did it in PR #1363 :)
#1363 has solved the repl_test issue! Huzzah! But now something weird is happening... tests are crashing both in CI and locally when run threaded. They seem to be the same collection of tests that were taking longer before, but perhaps not. I have not dug in very far yet.
maybe for some reason they use more memory and thus we get an OOM?
Still gathering data on this, please forgive the lengthy brain dump. @antocuni the issue doesn't appear to be memory related; at least locally, where I can monitor it, memory usage is nominal. I perhaps misspoke earlier - tests are not crashing, they are hanging, in that they never actually finish. I've run the integration tests 10 times, waiting at least 5 minutes after the last printed line (and in one case over 2.5 hours, when I walked away), and the tests do not complete. Tests do all successfully complete when run not-threaded, so I suspect it's some interaction with …

I've saved the test output for these runs and processed it below - test names on the left, results on the right; you may need to scroll right to see the results. For brevity (lol), I've only included the tests that had at least one non-success. The results key is:
Starting from run 6, I included the pytest-random-order plugin with …
And finally, just because this wasn't long enough: when I kill the tests via a keyboard interrupt, I've been saving the tracebacks. They all look generally like this, if it means anything to anyone:
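As an aside on diagnosing hangs like this without a keyboard interrupt: Python's stdlib `faulthandler` module can dump every thread's stack on demand (or after a timeout). A minimal sketch, not code from this PR:

```python
import faulthandler
import tempfile

# faulthandler writes at the C level, so it needs a real file
# descriptor (sys.stderr or a temp file), not an io.StringIO.
with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f)
    f.seek(0)
    dump = f.read()

# In a hung test session, faulthandler.dump_traceback_later(timeout)
# registered at session start produces the same dump automatically
# once the timeout elapses, showing where each worker is stuck.
print(dump.splitlines()[0])
```

pytest also exposes this via its built-in `faulthandler_timeout` ini option, which may be an easier way to capture where the workers are stuck.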
If anyone else wants to try their hand at this, feel free - I'd like to prioritize the event handlers PR (and to a lesser extent, …)
I'm not too surprised by this, to be honest. In my experience, in all the projects in which I tried to use … For all I know, we might also decide to just not use …

But I'm still confused, and I'm not 100% sure I have correctly understood what's happening. Please let me know whether the following sentences are correct or not:
If so, it would be interesting to see the tracebacks for those tests which FAIL in case (4).
@antocuni that is not quite the situation I have been seeing, and also my commits this morning have changed things a bit:
Watching memory usage more closely while the tests are running in parallel locally, it seems like there is indeed an out-of-memory issue - at least running locally, near the end of the tests, my machine runs out of memory, then swap rapidly fills up, then the tests crash/hang. The largest memory hogs are the tests in …
ok that's good. So this means that the problem is not CI vs local, it's xdist vs sequential.
If it's really the OOM-killer, I'm not surprised to see random tests hanging. For each test, there are at least three processes involved (possibly more):
If (2) or (3) gets killed, it's totally possible that the rest of the system becomes confused and hangs.
If you are on Linux, you can look at …

It's totally possible that running `test_zz_examples` in parallel takes too much memory, because some of them are probably very heavy. I suggest the following:
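On the OOM front, a quick way to log memory per run is the stdlib `resource` module (Unix-only), which reports a process's peak resident set size. A sketch, not part of this PR:

```python
import resource
import sys

def peak_rss_mib() -> float:
    """Peak resident set size of this process, in MiB.

    ru_maxrss is reported in kilobytes on Linux but in bytes on
    macOS, so normalize per platform.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return rss / divisor

print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

Printing this from a session-scoped fixture (or just before/after the heavy `test_zz_examples` group) would show whether the parallel run really approaches the machine's limits.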
I've tried out running the tests with … I will break out the integration tests into two separate runs as you say; I think that's probably the smoothest way forward. It does sound like our GitHub Actions runner is getting beefed up - I could still see a world where we run the examples tests in parallel in CI if that happens.
CI is green again and this is once again ready for review. @antocuni I refactored …
// for which the signature of `loadPackage` accepts the above params as args, i.e.
// the call uses `logger.info.bind(logger), logger.info.bind(logger)`.
const messageCallback = logger.info.bind(logger) as typeof logger.info;
- if (this.interpreter.version.startsWith('0.22')) {
+ if (Number(this.interpreter.version.split('.')[1]) >= 22) {
nit: a funny topic here ... JavaScript already understands semver in strings by design:
'0.23.2' > '0.22'; // true
'0.22' > '0.22'; // false
'0.22.1' > '0.22'; // true
'0.23' > '0.22.100001'; // true
accordingly ... do we really need to do this dance that obviously doesn't scale / work moving forward with major versions too?
What about 0.23.10 vs 0.23.9? This is generally what causes trouble when comparing versions lexicographically.
you are right, that fails, as 0.23.10 sorts before 0.23.9 as a string ... good call. But here we're targeting a minor, and any 0.23.XX will always be greater than 0.22.X, so when it's minor versions we're after, the current dance looks unnecessary to me ... if it was about patches, the patch that introduced breaking changes was bad, and yet we'd be fine as long as a minor or major is meant.
edit: in this case we should use "0.22" > "0.21", unless the 0.22.X patch broke the API or expectations, in which case my hint wouldn't work (but neither would the currently proposed change); but I think 0.22.X should not have changed the API there ... I guess that's another story, and here we're still looking for >= 0.22
The breaking change in the API that this accounts for is in 0.22.0 (the minor-version release), but it does feel odd to me to do string-wise comparison on what are ultimately string-represented numbers. That said, I don't mind doing this in a Javascript-y way. Will make the change.
the thing is that once we have a major version this code won't work anymore, while using "1.0" > "0.21" will, so there's nothing to change on our side in the future if we just compare the current version against "0.21" as a string.

it's quirky, but less maintenance and less prone to logic surprises.
But you do have the problem that "0.100.0" < "0.22"...
with our time to release I don't see that as a real-world issue to consider + I am not a big fan of 0.100 releases ... not even React will reach that 😅 ... also I believe whenever that happens this code won't be needed or exist anymore. I am just a bit pragmatic here, but no hard feelings with using the current logic.
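The tradeoff being debated here can be made concrete. Sketched in Python for brevity (the TypeScript change uses the same split-and-compare-numerically idea), here is where lexicographic string comparison breaks and how numeric tuples avoid it:

```python
def version_tuple(version: str) -> tuple:
    """Parse "0.23.10" into (0, 23, 10) for numeric comparison.

    Assumes purely numeric dot-separated components (no "0.23.0a1").
    """
    return tuple(int(part) for part in version.split("."))

# Plain string comparison is lexicographic, which works only until
# some component reaches two digits:
assert "0.23.2" > "0.22"            # fine
assert not ("0.23.10" > "0.23.9")   # wrong: "1" sorts before "9"
assert "0.100.0" < "0.22"           # the objection raised above

# Numeric tuples compare element-wise and avoid both traps:
assert version_tuple("0.23.10") > version_tuple("0.23.9")
assert version_tuple("0.100.0") > version_tuple("0.22")
assert version_tuple("1.0") > version_tuple("0.21")
```

So string comparison is safe exactly when every component stays single-digit (or when only the minor/major boundary matters), which is the pragmatic bet the thread settles on.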
self.goto("examples/panel_kmeans.html")
- self.wait_for_pyscript(timeout=90 * 1000)
+ self.wait_for_pyscript(timeout=120 * 1000)
Terrible. What kind of test requires a twelve second timeout? Scipy is the worst.
It'll be really nice once jspi is shipping in browsers and we don't have to preload scipy's 100+ .so files. In the meantime I think we should get rid of kmeans. If a single test doesn't test anything and requires a twelve second timeout it needs to be gotten rid of. If it does test something can anyone tell me what it tests and why it can't be done in a scipy free way?
to be honest it feels like we test too much 3rd party in our integration when the goal should be "if 3rd party can be loaded, we're good" so I am actually curious myself why we take responsibility for all these foreign modules ... I understand the will to ensure stuff works, but having these as part of our testing pipeline seems a bit off to me.
to clarify further ... I think these tests should run only before a release and not for every commit we make to this project ... that would save time (both ours and computational) and ensure we can tackle foreign-module gotchas before a release, not daily ... imho that would be great, so that code-freeze on release is meant to tackle possible gotchas and fix these if needed.
@hoodmane for what it's worth, that's a 120 second timeout 🤢
I'm more than happy to go an entirely different direction with testing and removing the examples from the testing workflow, or testing them only prior to release, or what have you. I was mostly looking for a straightforward path to get this long-standing PR merged and the project moved up to Pyodide 0.23/Python 3.11.
How would we feel about merging this as-is and following up with a separate PR/discussion about improving the examples testing process?
that would work for me
Rather than only before the release, I'd personally prefer to run them nightly or weekly or something like that. At least we'd be aware sooner that something broke (and we don't have the overhead of testing examples before merging a PR).

In terms of testing too much 3rd party, I agree that we should test less and mainly focus on "if they load, we are good". There are some caveats, though, that I'd like for us to consider: adding better support for specific 3rd party libraries. That's especially true for modules that are quite popular and don't just work by specifying them as dependencies. Take `panel` or `bokeh` as examples, for instance: I'd like for us to make the user experience better and "auto-patch" everything they need to "just work" when you specify them as packages (but that is a totally different discussion, and I think it probably belongs to PyScript core, or rather to a higher-level API or something like this).
> I'd personally prefer to run them nightly or weekly or something like that

that's a great idea too, as long as these are off our daily pipes around unrelated tasks
Wow that's impressive - my little address-comments commit caused every single examples test to fail... but only in CI. The same was happening in …
Oh it's not just me, it's now all of the examples tests that are failing, even in PRs with no change. See #1468. Will have to wait till whatever that is is resolved before testing this again.
@JeffersGlass I'm confused: the fact that you had to tweak the test in such a way looks like a code smell to me:
This seems to be the relevant code: `pyscript/pyscriptjs/src/main.ts`, lines 280 to 294 at 61b3154.
How is it possible that the page prints …
I'm not sure how it's possible, but it seems to be true, at least if I'm reading the output correctly. If JS errors are to be caught, they should appear in the console before … (relevant code: `pyscript/pyscriptjs/tests/integration/support.py`, lines 419 to 430 at 61b3154)
However, sometimes (but not always) the error does not appear in stdout until after it has completed and "Waited for X s" is printed. This is the relevant captured stdout from this test run. Note how …
This test failed because a JS error was expected by `wait_for_console`, but it did not appear until after `wait_for_console` terminated. So I agree, something smells, and I think something about our understanding of the timing of the `js_error` output or the test mechanism is incomplete. I returned to manually testing for JS errors at a later point, when some other asserts have verified that the banner already exists, and therefore so should the message in the console.
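The race described here can be illustrated with a tiny deadline-based poller (a hypothetical sketch, not the actual `wait_for_console` implementation): a message that lands in the console after the deadline has passed is simply never seen by the waiter, even though it is real and shows up in the captured output moments later.

```python
import time

def wait_for_line(get_lines, match, timeout=0.5, poll=0.05):
    """Poll a snapshot of console lines until `match` appears or the
    deadline passes. Returns False for a line that arrives too late,
    even if it is already on its way."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if any(match in line for line in get_lines()):
            return True
        time.sleep(poll)
    return False

# A line that is already present is found immediately...
assert wait_for_line(lambda: ["JsException: boom"], "JsException", timeout=0.2)
# ...but one that never shows up within the window reports failure,
# even though the real error may be printed right after we give up.
assert not wait_for_line(lambda: [], "JsException", timeout=0.2)
```

This is consistent with the workaround above: checking for the error only after other asserts prove the banner exists guarantees the message has already been flushed to the console.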
What I would propose is we merge this as-is, and then open a separate issue re-examining the issue around checking JS errors during … I know that's not ideal, but I'm trying (and failing) to prevent this PR from becoming a catch-all for discovered issues (js_errors, how/when we run examples, panel/bokeh API changes... 😬).
I'm fine with merging this PR as-is, but yes, the problems that it uncovered must be fixed.
@antocuni Agreed! I am going to hit merge here, and open a new issue rehashing what the issue is.
Description
This PR bumps the Pyodide version from 0.22.1 to 0.23.2. See the pyodide changelog for a detailed list of changes.
Closes #1346
Changes
Bumped pyodide version in npm, default Interpreter loading paths, tests, and documentation.
Comments
It would be nice to expose some of the reduced loading-size options to PyScript users. I suppose they can use `[interpreters]` to specify the smaller `pyc` version directly, though it might be nice to have that as a separate key in py-config. (Pyodide 3701)

Possibly exposing the new `stdLibURL` parameter of `loadPyodide` would be good as well. (Pyodide 3670)

@marimeireles there's been some updates to the `isCallable` interface (now I think the method is `instanceof pyodide.ffi.PyCallable`), which might help with some of the things you were discussing in #1336?

@FabioRosado & @marimeireles: `runPython` and `runPythonAsync` now take a locals argument, but I think with the new event work that's no longer a critical feature, yeah? (Pyodide 3618)

Checklist
- docs/changelog.md