@jbusche jbusche commented Nov 13, 2023

Closes #324

What changes have been made

Swapping out the docker.io busybox image for the quay.io one.

Reason for the Change:

At least at IBM, on our Fyre clusters, the docker.io pulls are hitting a rate-limit error:

Failed to pull image "busybox:1.28": rpc error: code = Unknown desc = reading manifest 1.28 in docker.io/library/busybox: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

But quay.io doesn't have a rate-limit issue, so we've moved the docker.io busybox over to quay.io and are using it in the tests and guided demos so that people don't see this docker pull issue in the future.
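
For anyone auditing other manifests for the same problem, here is a minimal Python sketch of how an image reference maps to a registry. It assumes Docker's usual naming convention, where an unqualified name such as `busybox:1.28` resolves to docker.io (as the error above shows). The helper name `registry_of` and the `quay.io/example-org/...` path are hypothetical and only for illustration; they are not part of codeflare-sdk.

```python
def registry_of(image: str) -> str:
    """Return the registry host an image reference would be pulled from."""
    first, sep, _rest = image.partition("/")
    # Docker's convention: the first path component is a registry host only
    # if it looks like a hostname (contains "." or ":") or is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # Otherwise the reference is unqualified and defaults to Docker Hub.
    return "docker.io"


# Hypothetical references; only "busybox:1.28" comes from this PR.
for ref in ("busybox:1.28", "quay.io/example-org/busybox:latest"):
    if registry_of(ref) == "docker.io":
        print(f"{ref}: pulled from docker.io, may hit the pull rate limit")
    else:
        print(f"{ref}: pulled from {registry_of(ref)}")
```

Running a check like this over the image fields in the test and demo YAML would flag any remaining docker.io pulls.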

Verification steps

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

Signed-off-by: James Busche <jbusche@us.ibm.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 13, 2023

jbusche commented Nov 13, 2023

Hey, unit tests looked good. Thank you @tedhtchang for the help getting poetry working on my Mac...

pytest -v tests/unit_test.py
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.16, pytest-7.4.0, pluggy-1.3.0 -- /root/.cache/pypoetry/virtualenvs/codeflare-sdk--BEgL2cU-py3.9/bin/python
cachedir: .pytest_cache
rootdir: /root/codeflare-sdk
configfile: pyproject.toml
plugins: mock-3.11.1
collected 58 items   
....
=========================================================================================== 58 passed in 10.60s ===========================================================================================

jbusche commented Nov 13, 2023

Testing now in my cluster...

jbusche commented Nov 14, 2023

Testing in my cluster looked good too. Here's what I did:

  1. Installed the normal way, using @KPostOffice's quickstart DSC method from Quick-Start.md
  2. The only two things I did differently were:
    2.1 Instead of using the Codeflare notebook image, I used the Jupyter Data Science image
    2.2 Instead of cloning the main branch of codeflare-sdk, I cloned my own branch and installed that version of codeflare with these steps:
git clone https://github.com/jbusche/codeflare-sdk.git -b jimb-pr324-2

cd codeflare-sdk/

pip list |grep codeflare
    codeflare-sdk                   0.10.1 
    codeflare-torchx                0.6.0.dev1

pip install -e .

pip list |grep codeflare
    codeflare-sdk                   0.0.0        /opt/app-root/src/JIM/codeflare-sdk
    codeflare-torchx                0.6.0.dev1

Now when I run 0_basic_ray.ipynb or 2_basic_jobs.ipynb, they use the newer quay.io busybox image instead of the older docker.io one.
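
The same check as `pip list | grep codeflare` can also be done from inside the notebook. This is a small sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is made up for illustration, and it only reports the version string, not the editable install path.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# After `pip install -e .` from the branch, the output above suggests this
# would report 0.0.0 rather than the released 0.10.1.
print(installed_version("codeflare-sdk"))
```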

jbusche commented Nov 14, 2023

job.status()

AppStatus:
  msg: !!python/object/apply:ray.dashboard.modules.job.common.JobStatus
  - SUCCEEDED
  num_restarts: -1
  roles:
  - replicas:
    - hostname: <NONE>
      id: 0
      role: ray
      state: !!python/object/apply:torchx.specs.api.AppState
      - 4
      structured_error_msg: <NONE>
    role: ray
  state: SUCCEEDED (4)
  structured_error_msg: <NONE>
  ui_url: null

@jbusche jbusche marked this pull request as ready for review November 14, 2023 00:46
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 14, 2023

@tedhtchang tedhtchang left a comment

/LGTM

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 14, 2023

@Fiona-Waters Fiona-Waters left a comment

Tested changes on an OSD cluster - all works as expected using the quay image.
Screenshot from 2023-11-14 11-32-23
/lgtm

openshift-ci bot commented Nov 17, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KPostOffice, tedhtchang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 17, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit b0f4671 into project-codeflare:main Nov 17, 2023
@jbusche jbusche deleted the jimb-pr324-2 branch December 5, 2023 19:09