Add support for Python 3.10 and 3.11 #1937

simonzhaoms · 2023-06-06T11:54:11Z

Description

This PR adds support for Python 3.10 and 3.11 with the following changes:

Remove direct dependencies that are already pulled in by other dependencies.
Upgrade dependency versions to include those that support Python 3.10 and 3.11.
Replace out-of-date dependencies with their new alternatives, such as nvidia-ml-py3 -> nvidia-ml-py.
Resolve errors introduced by the upgrade, such as those caused by moving from fastai to fastai2.

However, this upgrade doesn't fix the following issues:

Related Issues

[FEATURE] Support Python 3.10 #1919

Checklist:

I have followed the contribution guidelines and code style for this project.
I have added tests covering my contributions.
I have updated the documentation accordingly.
This PR is being made to staging branch and not to main branch.

miguelgfierro · 2023-06-06T14:21:45Z

hey @simonzhaoms the tests are broken right now, see #1934. The last thing I've been trying to do is to build a docker file and inject it into the RunConfiguration.

I think we should first fix the tests, and then we can do the upgrade to python 3.10 and 3.11. Do you think you could help me with it?

simonzhaoms · 2023-06-06T21:56:23Z

> hey @simonzhaoms the tests are broken right now, see #1934. The last thing I've been trying to do is to build a docker file and inject it into the RunConfiguration.
>
> I think we should first fix the tests, and then we can do the upgrade to python 3.10 and 3.11. Do you think you could help me with it?

Sure, I can take a look.

miguelgfierro · 2023-06-08T10:49:11Z

hey @simonzhaoms, after the tests are fixed, we should do a PR to main. Do you want to merge this PR before doing the PR to main or after?

simonzhaoms · 2023-06-09T04:34:53Z

> hey @simonzhaoms, after the tests are fixed, we should do a PR to main. Do you want to merge this PR before doing the PR to main or after?

after, because there are still some problems with this PR to be solved.

simonzhaoms · 2023-06-09T04:36:16Z

@miguelgfierro this PR is still a draft, not ready for review.

simonzhaoms · 2023-06-11T02:05:12Z

After upgrading the dependent packages and the docker images in the commit b71c4ed, it failed because AzureML SDK tried to compile and install an old version of ruamel.yaml (0.15.89) (See the log).

It looks like an issue with AzureML SDK and Conda (See Azure Machine Learning SDK installation failing with an exception). It also seems that AzureML SDK caps ruamel.yaml at version 0.15.89; I'm not sure why.

I tried the setup.py of the commit b71c4ed on my local machine with Python 3.8, 3.9, 3.10 and 3.11. All of them work because there is no dependency on the old version of ruamel.yaml.

Possible solutions:

Upgrade AzureML SDK from v1 to v2 (not sure whether this would work).
Wait until the issue with AzureML and Conda is resolved.
Try installing everything in the docker file without using Conda.
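
One way to see which installed distribution caps ruamel.yaml is to read each package's declared requirements with the standard-library importlib.metadata. This is only a minimal sketch: the helper name find_requirement_pins is hypothetical, the package names in the usage part are examples, and nothing here is specific to AzureML or Conda.

```python
from importlib import metadata


def find_requirement_pins(dist_name, dep_prefix):
    """Return the requirement strings of `dist_name` that start with `dep_prefix`.

    Returns an empty list if `dist_name` is not installed or declares no
    requirements.
    """
    try:
        requirements = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # metadata.requires() returns None when no requirements are declared.
    requirements = requirements or []
    prefix = dep_prefix.lower()
    return [req for req in requirements if req.lower().startswith(prefix)]


if __name__ == "__main__":
    # Example: look for a ruamel.yaml constraint declared by azureml-core.
    for req in find_requirement_pins("azureml-core", "ruamel"):
        print(req)
```

Running this inside the failing image and on the local machine should show whether the two environments resolve different constraints.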

miguelgfierro · 2023-06-11T08:12:28Z

tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py

@@ -181,7 +178,20 @@ def create_run_config(
    run_azuremlcompute = RunConfiguration()
    run_azuremlcompute.target = cpu_cluster
    run_azuremlcompute.environment.docker.enabled = True
-    run_azuremlcompute.environment.docker.base_image = docker_proc_type
+    # See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-with-custom-image?view=azureml-api-1#use-a-custom-dockerfile-optional


@simonzhaoms right now in the actions we are installing with run: pip install --quiet "azureml-core>1,<2" "azure-cli>2,<3". Which version of the AzureML SDK are we installing? The latest one I see here is 1.51: https://pypi.org/project/azureml-sdk/#history.

Of the 3 options, I think an interesting one to explore would be installing everything in the docker file without using Conda, if we are reducing dependencies. I think 80% of our problems come from dependencies: #1936. So maybe something to reflect on is how we can reduce dependencies and use more standardized and robust software.

@miguelgfierro

> @simonzhaoms right now in the actions we are installing run: pip install --quiet "azureml-core>1,<2" "azure-cli>2,<3". What is the azureml sdk that we are installing? Here I see the latest one as 1.51 https://pypi.org/project/azureml-sdk/#history.

I think this azureml-core is only used for launching the script submit_groupwise_azureml_pytest.py.

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/.github/actions/azureml-test/action.yml#L75-L77

And the AzureML SDK I mentioned is used inside the docker image launched inside submit_groupwise_azureml_pytest.py

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/.github/actions/azureml-test/action.yml#L85-L94

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py#L178-L194

I think when we use CondaDependencies.add_pip_package("xxx"), AzureML adds the item xxx in a Conda env yaml file maintained by itself, and the AzureML SDK is also an item added by AzureML by default implicitly.

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py#L202-L223

What I don't know is why the AzureML SDK triggers the error now, when I try to add support for Python 3.10 and upgrade all dependencies.

> Of the 3 options, I think one that is interesting to explore would be Try install everything in the docker file without using Conda. if we are reducing dependencies. I think 80% of our problems come from dependencies: #1936 So maybe something to reflect on is how can we reduce dependencies and use more standardized and robust software?

I agree. In addition, if possible, I'd use what GitHub actions and workflows can provide to build the testing pipeline rather than use the AzureML service, because AzureML service is not transparent.

loomlike · 2024-03-12T00:58:22Z

folks, when I run this manually on my AzureML workspace with tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py, it works without issues. Can someone paste the actual AzureML job's logs (error message) here?

SimonYansenZhao · 2024-03-12T07:28:20Z

> folks, when I run this manually on my AzureML workspace with tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py, it works without issues. Can someone paste the actual AzureML job's logs (error message) here?

@loomlike The errors can be found at the nightly build on the branch simony-dep-upgrade-20230606: https://github.com/recommenders-team/recommenders/actions/runs/8036183347

But the newest error is the one Miguel pasted above. It comes from the notebook examples/06_benchmarks/movielens.ipynb, which is run by tests/functional/examples/test_notebooks_gpu.py::test_benchmark_movielens_gpu().

SimonYansenZhao · 2024-03-16T13:17:51Z

Hi @miguelgfierro the gpu machine seems to be in a bad status. Several runs of tests have the same error saying

"code": "UserError",
"message": "AzureMLCompute cluster gpu-cluster has an error with code UnhealthyGPU that prevents it from scaling to 1 nodes. Message: ECC was disabled on the node. Node is unhealthy and will reboot.",
"ClusterName": "gpu-cluster",
"Code": "UnhealthyGPU",
"NodeCount": "1",
"code": "ClusterProvisionedWithErrors"

Could you help to check when you have time?
I have already fixed all errors found. I think all tests should pass if no further errors occur.

miguelgfierro · 2024-03-16T20:40:11Z

@SimonYansenZhao It seems that error is solved. Now we have some TF errors: https://github.com/recommenders-team/recommenders/actions/runs/8310055489/job/22742112981?pr=1937

miguelgfierro · 2024-03-19T06:35:35Z

@SimonYansenZhao see this comment: #2071 (comment)

miguelgfierro · 2024-03-19T09:43:22Z

@SimonYansenZhao if the unit tests pass, what do you think of merging this PR, and then figure out how to fix the issues we have in the nightly?

miguelgfierro

Unit tests pass

daviddavo · 2024-03-19T10:07:25Z

recommenders/models/rlrmc/RLRMCdataset.py

+        df = train if validation is None else pd.concat([train, validation])
+        df = df if test is None else pd.concat([df, test])


You can avoid having two concats with this one-liner:

df = pd.concat(filter(None, [train, validation, test]))
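
A note on the one-liner above: filter(None, ...) keeps items by truthiness, and a pandas DataFrame raises ValueError when its truth value is tested, so that form would fail on real frames. Below is a minimal sketch of filtering on "is not None" instead. The names drop_missing and Frame are hypothetical; Frame only imitates the DataFrame truth-value behaviour so the example runs without pandas.

```python
class Frame:
    """Stand-in for a DataFrame: testing its truth value raises ValueError."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __bool__(self):
        raise ValueError("The truth value of a Frame is ambiguous.")


def drop_missing(*frames):
    """Return the given frames as a list, skipping any that are None."""
    return [frame for frame in frames if frame is not None]


train, test = Frame([1, 2]), Frame([3])

# Identity comparison never calls __bool__, so this is safe.
kept = drop_missing(train, None, test)

# Truthiness-based filtering raises ValueError as soon as it is consumed.
try:
    list(filter(None, [train, None, test]))
    truthiness_filter_failed = False
except ValueError:
    truthiness_filter_failed = True
```

With pandas this would read pd.concat(drop_missing(train, validation, test)). The pandas documentation also states that pd.concat silently drops None entries unless all of them are None, so pd.concat([train, validation, test]) may be enough on its own.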

miguelgfierro · 2024-03-19

hey @daviddavo, can you please create a PR to staging with these changes? I gave you write permissions.

    git checkout staging
    git checkout -b new_branch
    # ... code changes ...
    git commit -asm "your message (don't forget the -s to sign the commits)"
    git push origin new_branch

And then open the PR.

SimonYansenZhao · 2024-03-19T10:34:13Z

> @SimonYansenZhao if the unit tests pass, what do you think of merging this PR, and then figure out how to fix the issues we have in the nightly?

sounds good @miguelgfierro

Add support for Python 3.10 and 3.11

ffac856

simonzhaoms self-assigned this Jun 6, 2023

simonzhaoms added 3 commits June 7, 2023 09:57

Correct upper bound version for category_encoders

ffd8b9e

Add tests for Python 3.10 and 3.11

644aa5d

Remove dependencies that others require

9003450

Merge in staging

a157403

simonzhaoms changed the title ~~Add support for Python 3.10 and 3.11~~ Add support for Python 3.10 and 3.11 and drop for 3.7 Jun 9, 2023

simonzhaoms added 3 commits June 9, 2023 11:44

Update nvidia-ml-py and tensorflow version

793ec87

Install system level dependencies for scipy

1556eb4

Merge in staging

34aee1d

simonzhaoms added 9 commits June 9, 2023 14:58

Support from Python 3.8 to 3.11

890a3fe

Remove unused system deps

c7bb846

Drop python 3.11 because some packages do not support 3.11

7f2e298

Install dependencies for scipy in docker image

09069f7

Change docker image

643fed6

Add pip==20.1.1

db4c9c3

Correct conda package format

9e05e4f

Remove pip downgrade code

e1d6acf

Use docker images for ubuntu 22.04

b71c4ed

miguelgfierro reviewed Jun 11, 2023

anargyri reviewed Jun 21, 2023 (setup.py, outdated, resolved)

anargyri reviewed Jun 21, 2023 (setup.py, outdated, resolved)

daviddavo reviewed Jul 17, 2023 (.github/workflows/azureml-cpu-nightly.yml, outdated, resolved)

wutaomsft reviewed Jul 17, 2023 (setup.py, outdated, resolved)

Upgrade GitHub Action azure/login

ac90e54

SimonYansenZhao and others added 3 commits March 15, 2024 11:03

Update fastai usage in utils

1d0fe7d

change deprecated azureml option (#2069)

0740b16

Signed-off-by: Jun Ki Min <42475935+loomlike@users.noreply.github.com>

Update SP creation doc

89cc985

Signed-off-by: Jun Ki Min <42475935+loomlike@users.noreply.github.com>

📝

55433c5

Signed-off-by: miguelgfierro <miguelgfierro@users.noreply.github.com>

miguelgfierro and others added 7 commits March 18, 2024 20:16

Fixing TF to < 2.16

730a5e9

🐛

657531a

Signed-off-by: miguelgfierro <miguelgfierro@users.noreply.github.com>

model to CUDA as well as data

e99b8d0

Signed-off-by: miguelgfierro <miguelgfierro@users.noreply.github.com>

Set tensorflow <= 2.15.0

03554de

Signed-off-by: Simon Zhao <simonyansenzhao@gmail.com>

Add missing colon

47281c8

Signed-off-by: Simon Zhao <simonyansenzhao@gmail.com>

Merge branch 'simonz-dep-upgrade-20230606' into miguel/fix_tf

19bcf1a

📝

b255fae

Signed-off-by: miguelgfierro <miguelgfierro@users.noreply.github.com>

miguelgfierro and others added 4 commits March 19, 2024 09:22

Reducing DKN batch size to 200

85899cf

Signed-off-by: miguelgfierro <miguelgfierro@users.noreply.github.com>

Move learner.model to cuda if cuda is available

d8e8ac3

Signed-off-by: Simon Zhao <simonyansenzhao@gmail.com>

Merge branch 'simonz-dep-upgrade-20230606' into miguel/fix_tf

21492c9

Merge pull request #2071 from recommenders-team/miguel/fix_tf

14c5c93

Fixing TF to < 2.16

miguelgfierro approved these changes Mar 19, 2024

anargyri approved these changes Mar 19, 2024

daviddavo reviewed Mar 19, 2024

miguelgfierro merged commit 3821655 into staging Mar 19, 2024
38 checks passed

This was referenced Mar 20, 2024

Daviddavo/less concats #2074

Closed

Merged two concats into one #2075

Merged

SimonYansenZhao mentioned this pull request Mar 24, 2024

[BUG] Upgrade GitHub Action azure/login #2065

Closed
