@ethanwee1 ethanwee1 commented Apr 22, 2025

[rocm6.5_internal_testing] Move the contents of the CentOS Stream Dockerfile into the main Dockerfile

Validation: http://rocm-ci.amd.com/job/mainline-framework-pytorch-ci/2448/

rocm-repo-management-api bot commented Apr 22, 2025

Jenkins build for 2dd188e54c25f121e400edf121ba0d22030f0e85 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during base docker image building:

#31 10.27 The following packages have unmet dependencies:
#31 10.33  rocm-dev : Depends: rocm-cmake (= 0.14.0.60304-76~22.04) but 5.0.0-1 is to be installed
#31 10.33             Depends: rocm-device-libs (= 1.0.0.60304-76~22.04) but 5.0.0-1 is to be installed
#31 10.33  rocm-utils : Depends: rocminfo (= 1.0.0.60304-76~22.04) but 5.0.0-1 is to be installed
#31 10.33               Depends: rocm-cmake (= 0.14.0.60304-76~22.04) but 5.0.0-1 is to be installed
#31 10.33 E: Unable to correct problems, you have held broken packages.
#31 ERROR: process "/bin/sh -c bash ./install_rocm.sh" did not complete successfully: exit code: 100
------
 > [stage-0 23/61] RUN bash ./install_rocm.sh:
10.27 distribution that some required packages have not yet been created
10.27 or been moved out of Incoming.
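
A log like the one above can be easier to triage once the unmet dependencies are pulled into a table of package, dependency, required version, and candidate version. The following is a minimal sketch under stated assumptions: `parse_unmet` and its regular expression are hypothetical helpers, not part of the ROCm CI tooling, and they only cover the `Depends: ... but ... is to be installed` line form seen in this log.

```python
import re

# Matches apt lines such as:
#   "rocm-dev : Depends: rocm-cmake (= 0.14.0.60304-76~22.04) but 5.0.0-1 is to be installed"
# The leading "pkg : " part is optional because continuation lines omit it.
UNMET = re.compile(
    r"(?:(?P<pkg>[\w.+-]+) : )?Depends: (?P<dep>[\w.+-]+) "
    r"\((?P<op>[<>=]+) (?P<want>[^)]+)\) but (?P<got>\S+) is to be installed"
)


def parse_unmet(log_text):
    """Return (package, dependency, required, candidate) tuples from apt output."""
    results = []
    current_pkg = None
    for line in log_text.splitlines():
        m = UNMET.search(line)
        if not m:
            continue
        # Continuation lines inherit the package named on an earlier line.
        current_pkg = m.group("pkg") or current_pkg
        results.append((current_pkg, m.group("dep"), m.group("want"), m.group("got")))
    return results


if __name__ == "__main__":
    # Two lines copied from the failing build log above.
    sample = (
        "#31 10.33  rocm-dev : Depends: rocm-cmake (= 0.14.0.60304-76~22.04) "
        "but 5.0.0-1 is to be installed\n"
        "#31 10.33             Depends: rocm-device-libs (= 1.0.0.60304-76~22.04) "
        "but 5.0.0-1 is to be installed\n"
    )
    for pkg, dep, want, got in parse_unmet(sample):
        print(f"{pkg}: needs {dep} {want}, candidate is {got}")
```

In this log every candidate version is `5.0.0-1` while every required version carries the `60304-76~22.04` suffix, so a summary like this makes it quick to see that the two sets of packages come from different builds.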

RUN bash ./install_amdsmi.sh
RUN rm install_amdsmi.sh

ENV ROCM_PATH /opt/rocm

I don't know why this is added only in the CentOS Dockerfiles, and I don't remember the history. Try building with it removed.

@jithunnair-amd jithunnair-amd Apr 26, 2025


@pruthvistony It's not only in the CentOS Dockerfile; it was actually ported from the Ubuntu Dockerfile to enable the triton build: https://github.com/pytorch/pytorch/blame/47d34261e06e2416e7a1e7d51a3d428e4ea51f9d/.ci/docker/ubuntu-rocm/Dockerfile#L63

rocm-repo-management-api bot commented Apr 25, 2025

Jenkins build for 2dd188e54c25f121e400edf121ba0d22030f0e85 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@jithunnair-amd jithunnair-amd force-pushed the rocm6.5_internal_testing_dckerfileswap branch from 2dd188e to b72033a Compare April 25, 2025 23:03
@rocm-repo-management-api

Jenkins build for 81ae6254f808a4cb43ceabd310ff7f2672d8e6c6 commit is in progress
Links: Blue Ocean view / Build artifacts

@jithunnair-amd jithunnair-amd marked this pull request as ready for review April 28, 2025 18:33
@jithunnair-amd jithunnair-amd merged commit 7886773 into rocm6.5_internal_testing Apr 28, 2025
1 of 6 checks passed
@jithunnair-amd jithunnair-amd deleted the rocm6.5_internal_testing_dckerfileswap branch April 28, 2025 18:37
ethanwee1 added a commit to ethanwee1/pytorch that referenced this pull request Apr 28, 2025
pruthvistony pushed a commit that referenced this pull request May 8, 2025
…ontents into dockerfile (#2044)

rocm6.5_internal_testing move contents of centos stream dockerfile into
dockerfile

Validation:
http://rocm-ci.amd.com/job/mainline-framework-pytorch-ci/2448/

---------

Co-authored-by: Jithun Nair <jithun.nair@amd.com>
ethanwee1 added a commit to ethanwee1/pytorch that referenced this pull request May 12, 2025
pragupta pushed a commit that referenced this pull request Oct 29, 2025
====================================================

[SOW MS3] Centos stream9 PyTorch image support (#1090)

* changes to build Centos stream 9 images

* Added scripts for centos and centos stream images

* Added an extra line

* Add ninja installation

* Optimized code

* Fixes

* Add comment

* Optimized code

* Added AMDGPU mapping for ROCm 5.2 and invalid-url for rocm_baseurl

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>

Updated to latest conda for CentOS stream 9

[CS9] Updates to CentOS stream 9 build (#1326)

- Add missing common_utils.sh
- Update the install vision part
- Move to amdgpu rhel 9.3 builds
- Update to pick python from conda path
- Add a missing package
- Add ROCM_PATH and magma
- Updated repo radeon path

(cherry picked from commit 51ce1cc)

[rocm6.4_internal_testing] Update missing changes for CentOS9 (#1813)

To fix, https://ontrack-internal.amd.com/browse/SWDEV-505385 and
https://ontrack-internal.amd.com/browse/SWDEV-507301

(cherry picked from commit 956c145)

delete .ci/docker/common/install_db.sh

(cherry picked from commit 8a7fd64)

CONSOLIDATED COMMITS: Updates to build on Jammy and CentOS7

===========================================================

Updates to build on Jammy
- Fortran package installation moved after gcc
- Update libtinfo search code in cmake1
- Install libstdc++.so

[UB22.04] Updates to support latest scipy

Build required version of libpng for CentOS7

Updated condition for libstdc++ for Jammy

Set ROCM_PATH in env for centOS docker container

Changes to support docker v23

Reversed the condition as required

temporarily ignore certificate check for Miniconda

(cherry picked from commit 9848db1)

[release/2.1] Skip certificate check for CentOS7 since certificate expired (#1399)

* Skip certificate check only for CentOS7 since certificate expired

* Naming

Remove the installation of rocm-llvm-dev package

- Causing regression - SWDEV-463083

fix install_centos() function

[rocm6.3_internal_testing] skip pytorch-nightly installation (#1557)

This PR skips pytorch-nightly installation in docker images

Installation of pytorch-nightly is needed to prefetch the mobilenet_v2 and
v3 models for some tests (this came from 85bd6bc).
Models are downloaded on first use to the folder /root/.cache/torch/hub.
However, the pytorch-nightly installation also overrides the
.ci/docker/requirements-ci.txt settings and upgrades some Python
packages (sympy from 1.12.0 to 1.13.0), which causes several
'dynamic_shapes' tests to fail.
Skipping the model prefetch affects these tests without causing any errors (but
**internet access is required**):

- python test/mobile/model_test/gen_test_model.py mobilenet_v2
- python test/quantization/eager/test_numeric_suite_eager.py -k test_mobilenet_v3

Issue ROCm/frameworks-internal#8772

Also, if issues arise, these models can be prefetched after
building PyTorch and before testing.
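
The pin override described in this commit message (sympy moving from 1.12.0 to 1.13.0) is the kind of drift that a small check could flag before tests run. The following is a minimal sketch, not the project's actual tooling: `parse_pins` and `find_drift` are hypothetical helpers, only exact `name==version` pins are handled, and the `numpy` pin in the usage lines is invented for illustration.

```python
import re

# Exact pins only, e.g. "sympy==1.12.0"; markers and trailing comments are cut off.
PIN = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*==\s*([^\s;#]+)")


def parse_pins(requirements_text):
    """Collect exact 'name==version' pins, skipping comments and other specifiers."""
    pins = {}
    for line in requirements_text.splitlines():
        m = PIN.match(line)
        if m:
            pins[m.group(1).lower()] = m.group(2)
    return pins


def find_drift(pins, installed):
    """Return {name: (pinned, installed)} for packages whose versions differ."""
    return {
        name: (want, installed[name])
        for name, want in pins.items()
        if name in installed and installed[name] != want
    }


if __name__ == "__main__":
    # The sympy versions come from the commit message; the numpy pin is hypothetical.
    pins = parse_pins("sympy==1.12.0\nnumpy==1.26.0  # hypothetical pin\n")
    print(find_drift(pins, {"sympy": "1.13.0", "numpy": "1.26.0"}))
    # prints {'sympy': ('1.12.0', '1.13.0')}
```

In a real image the `installed` mapping could be filled from `importlib.metadata.version(name)` for each pinned name, and the requirements text read from `.ci/docker/requirements-ci.txt`; both are left out here so the sketch runs without any particular environment.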

(cherry picked from commit b92b34d)

Fixes #ISSUE_NUMBER

(cherry picked from commit ec70f7e)

[rocm6.4_internal_testing] Changes to support UB 24.04 build (#1817)

Changes applied from #1816

Test PyTorch build:
http://rocm-ci.amd.com/job/mainline-framework-pytorch-ub24.04-py312-internal/5/

(cherry picked from commit 74e1e9e)
(cherry picked from commit e7cb7cc)

Update Centos 9 build

(cherry picked from commit 3d6ba22)

[rocm6.5_internal_testing] remove centos.stream dockerfile and move contents into dockerfile (#2044)

rocm6.5_internal_testing move contents of centos stream dockerfile into
dockerfile

Validation:
http://rocm-ci.amd.com/job/mainline-framework-pytorch-ci/2448/

---------

Co-authored-by: Jithun Nair <jithun.nair@amd.com>
(cherry picked from commit 7886773)