Release 0.1.0 #247
Conversation
recipe/meta.yaml (outdated)
- cloudpickle ==2.1.0
- tornado ==6.1
- toolz ==0.11.2
- python-blosc ==1.10.2
- zict ==2.2.0
- xgboost ==1.6.1
- dask-ml ==2022.5.27
- openssl >1.1.0g
Note that we're removing this openssl pin here as it's no longer needed with newer versions of distributed that contain dask/distributed#6562.
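As a quick sanity check after dropping the pin, one could inspect which OpenSSL the built environment's Python actually links against (a small sketch only; the printed version string is just an example):

```python
import ssl

# Report the OpenSSL build Python is linked against in the resulting
# environment (the exact version string will vary by platform and solve).
print(ssl.OPENSSL_VERSION)                    # e.g. "OpenSSL 1.1.1q  5 Jul 2022"
assert ssl.OPENSSL_VERSION_INFO >= (1, 1, 0)  # mirrors the old >1.1.0g lower bound
```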
Looks good to me, pending CI
recipe/meta.yaml (outdated)
{% set version = "0.1.0" + environ.get("VERSION_SUFFIX", '') %}
{% set dask_version = environ.get("DASK_VERSION", "2022.6.0") %}
{% set distributed_version = environ.get("DISTRIBUTED_VERSION", "2022.6.0") %}
{% set version = "0.2.0" + environ.get("VERSION_SUFFIX", '') %}
What's this version number?
This goes into the release number for nightly builds for coiled-runtime on the coiled conda channel. Thinking about this more, this should probably be updated to be 0.1.1 instead of 0.2.0.
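For context, the Jinja expression above amounts to the following (a Python sketch of the templating logic; the example suffix value is hypothetical):

```python
import os

# Mirrors the Jinja logic in recipe/meta.yaml: the base version gets an
# optional suffix from the environment, which CI sets for nightly builds.
base_version = "0.1.0"
version = base_version + os.environ.get("VERSION_SUFFIX", "")

# With VERSION_SUFFIX unset this is just "0.1.0"; a nightly build might set
# something like VERSION_SUFFIX=".post1+abc123" (hypothetical value).
print(version)
```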
The one CI build that's failing is due to an HTTP error in one of our scheduler plugins:

Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]: Traceback (most recent call last):
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:   File "/opt/conda/envs/coiled/lib/python3.9/site-packages/distributed/core.py", line 481, in start
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:     await asyncio.wait_for(self.start_unsafe(), timeout=timeout)
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:   File "/opt/conda/envs/coiled/lib/python3.9/asyncio/tasks.py", line 442, in wait_for
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:     return await fut
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:   File "/opt/conda/envs/coiled/lib/python3.9/site-packages/distributed/scheduler.py", line 3420, in start_unsafe
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:     await asyncio.gather(
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]:   File "https://cloud.coiled.io/api/v2/cluster_facing/preload/scheduler", line 56, in start
Aug 15 20:50:30 ip-10-0-4-89 cloud-init[1267]: tornado.httpclient.HTTPClientError: HTTP 502: Bad Gateway

Not sure if this is a known issue, or some random transient error, cc @ntabris for visibility. Regardless, I'm going to rerun CI to see if it goes away. I've also opened dask/distributed#6890 upstream to log these errors instead of failing to bring the scheduler up.
Thanks, @jrbourbeau. I think the PATCH requests we make in scheduler preload aren't being retried on transient errors; I've made https://github.com/coiled/platform/issues/16.
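As a rough illustration of what retrying those requests on transient errors could look like (a minimal sketch using tornado's `AsyncHTTPClient`; the function name, retry parameters, and defaults are made up and this is not the actual preload code):

```python
import asyncio

from tornado.httpclient import AsyncHTTPClient, HTTPClientError


async def fetch_with_retries(url, method="PATCH", body=b"{}", retries=3, delay=1.0):
    """Retry a request a few times on 5xx responses instead of failing outright.

    Illustrative sketch only; not the actual scheduler preload code.
    """
    client = AsyncHTTPClient()
    for attempt in range(retries):
        try:
            return await client.fetch(url, method=method, body=body)
        except HTTPClientError as exc:
            # Give up on client errors (4xx) or once retries are exhausted;
            # otherwise back off briefly and try again (e.g. on a 502 Bad Gateway).
            if exc.code < 500 or attempt == retries - 1:
                raise
            await asyncio.sleep(delay * (attempt + 1))
```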
Okay, so a regression was detected here, which is great to see happening automatically. Based on the CI logs, our
Seeing what appears to be similar behavior on a different test after #250 was merged. @ian-r-rose any thoughts?
Ah, I see -- I think I've been misinterpreting the message displayed in the regression exception. Interestingly, the detected regressions haven't been consistent between CI runs (assuming all detected regressions are printed in CI). I'll rerun CI again to see how consistent things are. If this is a real regression, it's okay for us to revert all of the package version updates here, as this particular release can be just about getting something on PyPI.
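For a sense of why such detections can be noisy between runs, here is a toy threshold-style regression check (entirely hypothetical; the names and 10% threshold are made up and this is not the coiled-runtime benchmark machinery):

```python
import statistics


def check_regression(history, latest, threshold=1.10):
    """Raise if the latest measurement exceeds the historical mean by more than
    the threshold (10% here). With noisy CI timings, a run sitting near the
    threshold can flip between passing and failing, which is one way repeated
    runs can report different regressions."""
    baseline = statistics.mean(history)
    if latest > baseline * threshold:
        raise AssertionError(
            f"possible regression: {latest:.2f}s vs baseline {baseline:.2f}s"
        )


check_regression(history=[9.8, 10.1, 10.0], latest=9.9)    # passes
# check_regression(history=[9.8, 10.1, 10.0], latest=11.3) # would raise
```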
I may have some fixes I need to do to make re-running CI functional. I think the artifacts may be persisted between different runs, which can cause issues with how I've set things up: https://github.com/coiled/coiled-runtime/runs/7867069146?check_suite_focus=true
Yeah, the message can probably be improved
Okay, good to know. I'll stick with pushing an empty commit, which seems to work, instead of using the rerun CI button in the GitHub UI
Looks like it reproduced (hooray?)
Indeed! I've gone ahead and reverted all the package version pin updates. Assuming there are no regressions now, let's go ahead with this more minimal change so we can get
I wonder if it's malloc trim again, though I haven't fully tracked down the timing.
Interestingly, the regressions reported here are both with the
I'm inclined not to have this block a 0.1.0 release, but am curious what other folks think. @ian-r-rose thoughts?
Isn't this release basically a no-op as the branch is right now? I suspect that there was an actual regression in dask or distributed that we are catching by updating
That's correct, the current state of this PR doesn't update any package pins. That's why I was confused by a regression being reported for the
That said, I agree the regressions which were reported earlier, when package versions were updated, do seem legitimate.
LGTM
Updates package versions for releasing
xref #233