Updating workflow to use Ubuntu Latest Image #5671

Closed
Nirzak opened this issue Oct 8, 2022 · 12 comments · Fixed by #5675

Comments

@Nirzak
Contributor

Nirzak commented Oct 8, 2022

Summary

Hi guys, I noticed that all Scrapy CI workflows currently use ubuntu-18.04, which has become a rather old OS. Can we update them to use the ubuntu-latest stable image, as Parsel already does for its workflows?

So, can I open a pull request for that?
Thanks.

@wRAR
Member

wRAR commented Oct 9, 2022

I think we can? I'm not aware of any requirements to use that version (even if we would want to make sure Scrapy works on 18.04, the base image doesn't really help with that, as during GitHub tests even Python is installed separately).

@Nirzak
Contributor Author

Nirzak commented Oct 10, 2022

> I think we can? I'm not aware of any requirements to use that version (even if we would want to make sure Scrapy works on 18.04, the base image doesn't really help with that, as during GitHub tests even Python is installed separately).

I will create a PR then. Sooner or later we will have to move off Ubuntu 18.04 anyway, as it will soon reach end of life and be deprecated.

@sibprogrammer

I guess it's better to stick to a version number (ubuntu-20.04 or ubuntu-22.04) and avoid using "latest" to keep the environment more predictable.

@Nirzak
Contributor Author

Nirzak commented Oct 11, 2022

> I guess it's better to stick to a version number (ubuntu-20.04 or ubuntu-22.04) and avoid using "latest" to keep the environment more predictable.

Yeah, that's also fine. But by default ubuntu-latest is always using the latest stable version. So it is easy to guess which version it is actually using. Sticking to a particular version means changing it again and again every so often.

@sibprogrammer

sibprogrammer commented Oct 11, 2022

> But by default ubuntu-latest is always using the latest stable version

Not really. ubuntu-latest doesn't mean it points to the stable release. It points to "some" release that GitHub thinks it should point to. Today it points to ubuntu-20.04; after some time it will point to ubuntu-22.04. It's bad practice to use "latest" in CI pipelines. If you use "latest", the CI pipeline can break at any moment due to a base image change. It will be impossible to re-run tests on an old branch or version due to compatibility issues, etc.

> Sticking to a particular version means changing it again and again every so often

Yes, and it's a good thing because it's an explicit change made by the repo maintainer.
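
For illustration, the difference between the two options comes down to the `runs-on` label in the workflow file. This is a minimal sketch, not Scrapy's actual workflow: the job name `tests`, the Python version, and the `tox` step are assumptions made up for the example.

```yaml
# Hypothetical workflow fragment; job name and steps are illustrative only.
jobs:
  tests:
    # Pinned label: the base image changes only when a maintainer edits this line.
    runs-on: ubuntu-22.04
    # Floating label: GitHub decides which release this points to, and when it moves.
    # runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Python is installed separately from the base image.
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install tox && tox -e py
```

With the pinned label, moving to a newer Ubuntu release shows up as a one-line change in a reviewed commit rather than as an unannounced change in the runner.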

@Nirzak
Contributor Author

Nirzak commented Oct 11, 2022

> > But by default ubuntu-latest is always using the latest stable version
>
> Not really. ubuntu-latest doesn't mean it points to the stable release. It points to "some" release that GitHub thinks it should point to. Today it points to ubuntu-20.04; after some time it will point to ubuntu-22.04. It's bad practice to use "latest" in CI pipelines. If you use "latest", the CI pipeline can break at any moment due to a base image change. It will be impossible to re-run tests on an old branch or version due to compatibility issues, etc.
>
> > Sticking to a particular version means changing it again and again every so often
>
> Yes, and it's a good thing because it's an explicit change made by the repo maintainer.

Usually GitHub Actions moves the version that ubuntu-latest points to only after a reasonable period. You can see that ubuntu-latest still points to ubuntu-20.04, because ubuntu-22.04 is still going through some changes, even though it is almost stable and has had a stable release. The pipeline will never break if you always use the latest updated dependencies.
If you don't update, the pipeline can break because of deprecated libraries. That's what maintaining means: always keeping up with the latest patches and updates.

@kmike
Member

kmike commented Oct 11, 2022

> even if we would want to make sure Scrapy works on 18.04, the base image doesn't really help with that, as during GitHub tests even Python is installed separately

@wRAR could it help with ensuring Scrapy works with older OpenSSL versions?

@wRAR
Member

wRAR commented Oct 11, 2022

@kmike I believe OpenSSL in our tests comes from the cryptography binary wheel, not from the system (a system cryptography module installation would use the system OpenSSL but we don't use it). IIRC in the past we had problems detected only by build-time tests of the Debian package because those would use the system OpenSSL.

@kmike
Member

kmike commented Oct 11, 2022

Regarding pinning vs not pinning, I think both make sense.

It is good to get the failures on CI before users report them, so I think it makes sense to have at least some of the environments pointing to latest. At the same time, pinning looks useful if we want to support some minimum versions of dependencies. We should also pin packages which can't affect users, to avoid random breakages - e.g. packages which are needed for testing, but not for running Scrapy.

Here it looks like pinning to an older version doesn't really help us ensure that Scrapy works with older Ubuntu. Also, it doesn't look like something that affects tests only. So I think it's fine to use latest in our case.
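
One way to get both behaviours, sketched here under assumptions rather than taken from Scrapy's workflows, is a job matrix in which some environments are pinned and at least one floats. The job name `tests` and the chosen labels are hypothetical.

```yaml
# Hypothetical fragment: mixes a pinned runner with a floating one.
jobs:
  tests:
    runs-on: ${{ matrix.os }}
    strategy:
      # Let the other environments finish even if one of them fails.
      fail-fast: false
      matrix:
        # ubuntu-20.04 stays fixed; ubuntu-latest surfaces upcoming breakage early.
        os: [ubuntu-20.04, ubuntu-latest]
    steps:
      - uses: actions/checkout@v3
```

A failure that appears only in the `ubuntu-latest` entry then points at the base image change rather than at the code under test.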

@kmike
Member

kmike commented Oct 11, 2022

We do have an issue with running tests for older versions, e.g. when working on backporting fixes for older Scrapy versions. They sometimes break because of dependency updates.

But these are often real breakages, which show that Scrapy might break for the users in the same way. So, we're fixing these tests while doing a release - sometimes it requires backporting a fix from master, sometimes it's a matter of restricting/pinning dependencies - just for the older branch.

It's a pain, but I think it's better than relying on users to report incompatibilities with more recent dependencies.

@Gallaecio
Member

> We should also pin packages which can't affect users, to avoid random breakages - e.g. packages which are needed for testing, but not for running Scrapy.

I think the Ubuntu image here could be defined as part of this group of packages, and as such I would keep it pinned, i.e. do upgrade, but to a specific version, not to a latest that could potentially break tests at some point.

@Nirzak
Contributor Author

Nirzak commented Oct 12, 2022

> > We should also pin packages which can't affect users, to avoid random breakages - e.g. packages which are needed for testing, but not for running Scrapy.
>
> I think the Ubuntu image here could be defined as part of this group of packages, and as such I would keep it pinned, i.e. do upgrade, but to a specific version, not to a latest that could potentially break tests at some point.

So should I pin the Ubuntu version to 20.04 or 22.04? I will update the PR accordingly.


5 participants