[Deprecation] Properly deprecate CentOS7 as CI build image/Supported OS and switch to Almalinux8 #4379

Open
peterzhuamazon opened this issue Jan 29, 2024 · 45 comments
Assignees
Labels
Build Libraries & Interfaces dependencies Pull requests that update a dependency file deprecation docker documentation Improvements or additions to documentation release v2.16.0

Comments

@peterzhuamazon
Member

peterzhuamazon commented Jan 29, 2024

2024/04/25: We have since moved from Rockylinux8 to Almalinux8.
2024/04/30: We will switch back to CentOS7 for another release in 2.14.0.

[Deprecation] Properly deprecate CentOS7 as CI build image/Supported OS and switch to RockyLinux8

Back in #3743, we already planned to migrate all CentOS7 images to RockyLinux8 as part of the NodeJS18 (#1563) / Python3.9 (#3351) upgrade.

However, the k-NN team pushed back, because moving from CentOS7 to RockyLinux8 would raise the minimum glibc version from 2.17 to 2.28 during the lib build. Users whose servers have a glibc lower than 2.28 would then be unable to run k-NN, as the plugin would crash on API calls.
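A minimal sketch of how an operator might check a host against that baseline before upgrading. It assumes a Linux host with Python 3; `MIN_GLIBC`, `parse_version`, and `meets_minimum` are illustrative names and not part of OpenSearch or k-NN.

```python
import platform

# glibc baseline of the RockyLinux8 build image, per the paragraph above.
MIN_GLIBC = (2, 28)


def parse_version(text):
    """Turn a dotted version such as '2.17' into a tuple of ints: (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(glibc_version, minimum=MIN_GLIBC):
    """Return True when the given glibc version string is at least `minimum`."""
    return parse_version(glibc_version) >= minimum


if __name__ == "__main__":
    # platform.libc_ver() returns a (name, version) pair such as ('glibc', '2.28');
    # both fields are empty strings when the C library cannot be determined.
    name, version = platform.libc_ver()
    if name == "glibc" and version:
        verdict = "meets the baseline" if meets_minimum(version) else "is below the baseline"
        print(f"glibc {version} {verdict}")
    else:
        print("Could not detect glibc on this host")
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating 2.3 as newer than 2.28.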

CentOS7 will end its support by 2024/06/30 (https://endoflife.date/centos), meaning we need to migrate to RockyLinux8 by then, especially since CentOS7 is the only build image for OpenSearch. We also need to announce this properly on the website, because our compatibility chart explicitly lists CentOS7 as supported here: https://github.com/opensearch-project/documentation-website/blob/054dc56d5dcef66576cd4b15bef0ae5efa4f8f77/_install-and-configure/install-opensearch/index.md?plain=1#L26

We need a plan and must decide when to announce:

  1. Discontinuing CentOS7 support on Release
  2. Discontinuing CentOS7 support on CI Build
    2.1. In turn, k-NN dropping CentOS7 support for its lib as well

Thanks.



PRs:

@github-actions github-actions bot added the untriaged Issues that have not yet been triaged label Jan 29, 2024
@peterzhuamazon peterzhuamazon added release docker dependencies Pull requests that update a dependency file Build Libraries & Interfaces deprecation documentation Improvements or additions to documentation and removed untriaged Issues that have not yet been triaged labels Jan 29, 2024
@bbarani
Member

bbarani commented Jan 30, 2024

@vamshin @jmazanec15 @VijayanB We need to put out a deprecation notice before we stop supporting CentOS7 in an upcoming OpenSearch version. CC: @krisfreedain

@peterzhuamazon peterzhuamazon changed the title from "[Deprecation] Properly deprecate CentOS7 as CI build image and switch to RockyLinux8" to "[Deprecation] Properly deprecate CentOS7 as CI build image/Supported OS and switch to RockyLinux8" Jan 30, 2024
@krisfreedain
Member

@vamshin @jmazanec15 @VijayanB - Agree with @bbarani - I'd recommend including this in the Release Notes, and likely the launch blog @jhmcintyre

@peterzhuamazon peterzhuamazon self-assigned this Mar 26, 2024
@peterzhuamazon peterzhuamazon changed the title from "[Deprecation] Properly deprecate CentOS7 as CI build image/Supported OS and switch to RockyLinux8" to "[Deprecation] Properly deprecate CentOS7 as CI build image/Supported OS and switch to Almalinux8" Apr 8, 2024
@peterzhuamazon
Member Author

We plan to switch to Almalinux from Rockylinux soon.
#4525

@jmazanec15
Member

@peterzhuamazon What glibc version does almalinux have?

@peterzhuamazon
Member Author

Hi @jmazanec15 @naveentatikonda ,

  1. Almalinux8 provides the same packages and versions as Rockylinux8 in the majority of cases; glibc is 2.28.
  2. Because of the switch, GCC also changes, to 9.2.1 (from 9.3.1 on CentOS7 after enabling devtoolset-9). This should not pose any issue given this change: "Raise gcc version requirement to 9.0.0 for SIMD Neon support on ARM64" (k-NN#1517)

Could you help confirm these from KNN team perspective?

  • Customer impact on this from KNN team perspective
  • Any documentation for the knn part needs changes
  • Is there any other dependencies on CentOS7 apart from the glibc minimal version

If you can think of anything else, please also note.

cc: @vamshin @bbarani

Thanks.

@bbarani
Member

bbarani commented Apr 9, 2024

Tagging @elfisher @krisfreedain @dblock as we are nearing CentOS 7 EOL date (June 30, 2024).

@jmazanec15
Member

jmazanec15 commented Apr 9, 2024

Hey @peterzhuamazon

> Customer impact on this from KNN team perspective

Not sure about this. Do we have any breakdown of which distros open source users are running? @bbarani @krisfreedain

> Any documentation for the knn part needs changes

The only thing we should add is instructions on how to build and install the libs from source for users whose systems are not compatible.

> Is there any other dependencies on CentOS7 apart from the glibc minimal version

I don't think so. The big dependencies are openblas and openmp. I think we fetch openblas from the package manager; openmp comes with the gcc libs.

> If you can think of anything else, please also note.

The only thing is to call out the OSes we are dropping support for: basically anything with glibc < 2.28.
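To make "anything with glibc < 2.28" concrete, here is a small sketch that filters a distro table by glibc version. The CentOS 7, Amazon Linux 2, and AlmaLinux 8 values are the ones quoted in this thread; the Ubuntu 20.04 entry is added from general knowledge purely for illustration, and `DISTRO_GLIBC` and `unsupported_distros` are hypothetical names.

```python
# glibc version shipped by each distribution, as (major, minor).
# CentOS 7, Amazon Linux 2 and AlmaLinux 8 come from this thread;
# Ubuntu 20.04 is an illustrative addition, not taken from the thread.
DISTRO_GLIBC = {
    "CentOS 7": (2, 17),
    "Amazon Linux 2": (2, 26),
    "AlmaLinux 8": (2, 28),
    "Ubuntu 20.04": (2, 31),
}


def unsupported_distros(build_glibc, table=DISTRO_GLIBC):
    """List distros whose glibc is older than the glibc of the build image."""
    return sorted(name for name, version in table.items() if version < build_glibc)


if __name__ == "__main__":
    # Libs built against glibc 2.28 cannot load on hosts with an older glibc.
    print(unsupported_distros((2, 28)))
```

With a 2.28 build image the function reports Amazon Linux 2 and CentOS 7; with a 2.26 build image only CentOS 7 drops out.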

@peterzhuamazon
Member Author

peterzhuamazon commented Apr 18, 2024

After an offline discussion with the k-NN team, we will now switch the 2.14.0 manifest to use Almalinux8 for building OpenSearch.

cc: @vamshin @jmazanec15 @bbarani @elfisher @krisfreedain @dblock

Thanks.

@peterzhuamazon
Member Author

It seems there are some issues with almalinux8 running gcc 9.2.1 for the ARM64 build.
@jmazanec15 reported this to me and will add more details.
I will soon try to move from gcc 9.2.1 to 9.3.1, which is the version we used back on CentOS7.

@jmazanec15
Member

@peterzhuamazon right - issue here: opensearch-project/k-NN#1591 (comment).

@peterzhuamazon
Member Author

peterzhuamazon commented Apr 19, 2024

Based on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94052, it seems the bug was fixed in gcc 10 and then backported to gcc 9 and gcc 8 on 2020/03/24.

According to https://gcc.gnu.org/releases.html, 9.2 was released on 2019/08/12, so the 9.x line did not contain the fix until 9.3 in 2020/03.
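A build script could guard against picking up a too-old compiler by parsing the first line of `gcc --version`. The sketch below is illustrative only: `KNOWN_GOOD_GCC` uses 9.3.1 because that is the version this thread reports using on CentOS7, and the threshold, function names, and sample banners are all assumptions.

```python
import re

# Version the thread reports using on CentOS7 with devtoolset-9, taken here as
# the known-good floor. The threshold is an assumption of this sketch.
KNOWN_GOOD_GCC = (9, 3, 1)


def parse_gcc_version(banner):
    """Extract (major, minor, patch) from the first line of `gcc --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    return tuple(int(group) for group in match.groups())


def is_at_least(banner, minimum=KNOWN_GOOD_GCC):
    """True when the compiler described by `banner` is `minimum` or newer."""
    return parse_gcc_version(banner) >= minimum


if __name__ == "__main__":
    # Both banners are made-up examples in the usual `gcc --version` shape.
    for banner in ("gcc (GCC) 9.2.1", "gcc (GCC) 12.2.0"):
        print(banner, "->", "ok" if is_at_least(banner) else "too old")
```

In a real pipeline the banner would come from running the compiler, for example via `subprocess.run(["gcc", "--version"], capture_output=True, text=True)`.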

A few routes:

  • Get 9.3.1 source code and compile 9.3.1 just for rockylinux8.
  • Go to gcc10.
  • Go to gcc12, as Windows has been using that version for a long time (recommended).

Windows:

$ which gcc
/c/Users/ContainerAdministrator/scoop/apps/mingw/12.2.0-rt_v10-rev1/bin/gcc

$ gcc --version
gcc.exe (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

......

$ which g++
/c/Users/ContainerAdministrator/scoop/apps/mingw/12.2.0-rt_v10-rev1/bin/g++

$ g++ --version
g++.exe (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

After an offline discussion with @jmazanec15, we have no concerns about going to gcc12.

Will PR soon.

Thanks.

@junqiu-lei
Member

junqiu-lei commented May 30, 2024

With the build CI artifacts triggered by Peter, all the remote integ tests passed on X64 and Arm64 AL2 envs.

@peterzhuamazon
Member Author

Due to additional issues with the lib we are reverting back to CentOS7 for another release:

@Pallavi-AWS
Member

> Due to additional issues with the lib we are reverting back to CentOS7 for another release:
>
> * [[BUG] CI failing on Linux with Java runtime error k-NN#1737](https://github.com/opensearch-project/k-NN/issues/1737)

Thanks @peterzhuamazon. We would need to prioritize the deprecation in the next release though - @vamshin @navneet1v

@junqiu-lei
Member

junqiu-lei commented Jun 12, 2024

In the AL2 docker image, we build openblas from source (from here) through the build team's pipeline on an m5.4xlarge machine.

The additional issue above was found when we tested on different instance types of the x64 architecture. For example, it can be reproduced by running the k-NN tests on a c5a.4xlarge machine with the default docker image built on an m5.4xlarge.

After adding the DYNAMIC_ARCH=1 argument to the openblas make, the issue is resolved in the test example above. We still need to verify the change in other envs (arm64/x64 on AL2, AL2023, Ubuntu 20.04).
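The comment does not spell out the root cause. By default OpenBLAS tunes its kernels for the CPU it detects on the build machine, whereas `DYNAMIC_ARCH=1` builds kernels for several CPU targets and selects one at run time, so a difference in CPU features between the build host and the test host is one plausible explanation. The sketch below shows one way to compare the two hosts. It assumes x86 Linux, where `/proc/cpuinfo` has a `flags` line; the function names and sample text are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from /proc/cpuinfo-style text (x86 'flags' line)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_runtime(build_cpuinfo, runtime_cpuinfo):
    """Flags the build host advertises that the runtime host does not."""
    return sorted(cpu_flags(build_cpuinfo) - cpu_flags(runtime_cpuinfo))


if __name__ == "__main__":
    # Hypothetical, heavily trimmed samples; real files list many more flags.
    build_host = "model name : build host\nflags : fpu sse2 avx2 avx512f\n"
    runtime_host = "model name : runtime host\nflags : fpu sse2 avx2\n"
    print(missing_on_runtime(build_host, runtime_host))
```

A non-empty result means a library tuned for the build host may use instructions that the runtime host cannot execute.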

@naveentatikonda
Member

naveentatikonda commented Jun 14, 2024

Had a discussion with a couple of folks from the Intel team about adding AVX512 support. There are a couple of requirements that we have identified so far:

  • The gcc compiler should be at least version 12. Right now, we are using version 10.
  • The library needs to be built on a 4th Gen Intel® Xeon® Scalable processor (like the r7i).

One blocker is that yum only provides gcc versions 7 and 10 on AL2 (we are switching to AL2 starting with OpenSearch 2.16), so we might need to compile gcc manually. @peterzhuamazon please let me know if you think there are any other blockers from a build POV.
cc: @vamshin @navneet1v @jmazanec15 @junqiu-lei

@peterzhuamazon
Member Author

peterzhuamazon commented Jun 26, 2024

Restarting this process to get k-NN working in 2.16.0; tagging @prudhvigodithi as he will be the 2.16.0 release manager.

I will proceed with the next steps together with @junqiu-lei:

  1. Junqiu pushes the DYNAMIC_ARCH=1 changes to the AL2 image, for x64 only
  2. Peter rebuilds the AL2 image and publishes it to all registries
  3. Peter switches the docker image to AL2 again
  4. Junqiu monitors and confirms that the image works in the k-NN GitHub repo
  5. Peter updates the centos7 deprecation documentation PRs and highlights the text for 2.16.0

Thanks.

@peterzhuamazon
Member Author

Trying the switch again from centos7 to al2 with the dynamic_arch changes:

@junqiu-lei
Member

Verified on the k-NN repo that the build CI tests passed:
https://github.com/opensearch-project/k-NN/actions/runs/9735975558/job/26904055122

opensearch al2
public.ecr.aws/opensearchstaging/ci-runner:ci-runner-al2-opensearch-build-v1

@peterzhuamazon
Member Author

peterzhuamazon commented Jul 1, 2024

Thanks @junqiu-lei ,

We are moving this issue to the backlog as the AL2 part is completed.
We will revisit this when the Almalinux8 part comes up next year.

@dbwiddis
Member

dbwiddis commented Jul 7, 2024

> CentOS7 will end its support by 2024/06/30 (https://endoflife.date/centos), meaning we need to migrate to RockyLinux8 by then

Hello from 2024/07/06! :)

GitHub-based actions using this container no longer work. GitHub switched their runners to Node20 as of June 3 which have eventually rolled out to the runners for plugin CI.

TLDR: It looks like PR #4820 is intended to address this, but the issue also mentions NodeJS 18. Will that PR support NodeJS 20, which is now a minimal version for any GitHub Actions?

@peterzhuamazon
Member Author

peterzhuamazon commented Jul 8, 2024

> > CentOS7 will end its support by 2024/06/30 (https://endoflife.date/centos), meaning we need to migrate to RockyLinux8 by then
>
> Hello from 2024/07/06! :)
>
> GitHub-based actions using this container no longer work. GitHub switched their runners to Node20 as of June 3 which have eventually rolled out to the runners for plugin CI.
>
> TLDR: It looks like PR #4820 is intended to address this, but the issue also mentions NodeJS 18. Will that PR support NodeJS 20, which is now a minimal version for any GitHub Actions?

Sorry @dbwiddis, AL2 only supports older nodejs versions:

We need to keep AL2 as the baseline for another year, until 2025/06/30, before moving to almalinux8.

Thanks.

@peterzhuamazon
Member Author

Note that we are shipping nodejs18 on OSD.

This specifically concerns the OpenSearch plugin repos, which currently run their checks on AL2 build images.

The GitHub checkout action requires node to be installed in order to check out the code.

This should have nothing to do with your code compilation; node is needed only so that the action can run the git checkout.

OSD plugins are not affected by this at all.

We are supporting AL2 for only one more year, in order to support k-NN native code compilation on glibc 2.26.

Hope that can explain the confusion 😅 Thanks.

Projects
Status: ⌛ On Hold
Status: Action items ✍
8 participants