Update and align numpy versions in TensorFlow #48918

bhack · 2021-05-05T19:42:08Z

This PR is just to explore CI errors when we align and upgrade numpy in TF.

When finalized it will fix #47691

google-ml-butler · 2021-05-05T19:42:11Z

Thanks for contributing to TensorFlow Lite Micro.

To keep this process moving along, we'd like to make sure that you have completed the items on this list:

- Read the contributing guidelines for TensorFlow Lite Micro
- Created a TF Lite Micro Github issue
- Linked to the issue from the PR description

We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.

review-notebook-app · 2021-05-05T19:42:12Z

Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.

bhack · 2021-05-05T20:29:23Z

Many of the Docker images with Python < 3.7 will fail because numpy v1.20 is not available for older Python versions.
The Python upgrade for the devel and devel-derived images is at #48371.
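
To make the constraint concrete, here is a minimal sketch of the compatibility check behind these failures. The minimum Python versions are taken from the NumPy release notes rather than from this PR, and the helper name `can_install` is made up for illustration.

```python
# Minimum Python version required by each NumPy minor release.
# Values come from the NumPy release notes, not from this PR.
NUMPY_MIN_PYTHON = {
    (1, 19): (3, 6),
    (1, 20): (3, 7),
}


def can_install(numpy_minor, python_version):
    """Return True if the given NumPy minor release supports this Python."""
    return python_version >= NUMPY_MIN_PYTHON[numpy_minor]


# Check which interpreters could take a forced numpy 1.20.x.
for py in [(3, 6), (3, 7), (3, 8), (3, 9)]:
    status = "ok" if can_install((1, 20), py) else "no numpy 1.20 release"
    print(f"python {py[0]}.{py[1]}: {status}")
```

Any image still on Python 3.6 falls into the failing branch, which is why the Python upgrade in #48371 matters here.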

NeilGirdhar · 2021-05-05T21:00:40Z

I don't know what the TensorFlow team is planning, but according to NEP 29, "all projects across the Scientific Python ecosystem" were supposed to drop Python 3.6 last June.
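
For reference, a rough sketch of how the NEP 29 date works out. It assumes the 42-month support window is approximated as a fixed number of days, and it takes the Python 3.6.0 release date from python.org; neither detail comes from this thread.

```python
from datetime import date, timedelta

# NEP 29 keeps a Python minor version supported for 42 months after release.
# Assumption: approximate 42 months as a fixed number of days.
SUPPORT_WINDOW = timedelta(days=int(365 * 3.5 + 1))

# Release date of Python 3.6.0 (from python.org, not from this thread).
PY36_RELEASE = date(2016, 12, 23)


def nep29_drop_date(release_date):
    """Date after which NEP 29 allows dropping a Python minor version."""
    return release_date + SUPPORT_WINDOW


drop = nep29_drop_date(PY36_RELEASE)
print(drop)  # 2020-06-23
```

That date is almost a year before this PR was opened.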

bnavigator · 2021-05-05T21:04:31Z

This is wrong. You don't have to specify NumPy 1.20 as a minimum. Just don't require a maximum NumPy version, and don't rely on features that were deprecated long ago and finally removed in 1.19 or 1.20.
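
The difference between the two kinds of constraint can be sketched as follows. The base version `1.19.2` and the helper names are only illustrative, and the comparison is a simplified stand-in for what a real resolver does with `~=` and `>=` specifiers.

```python
def parse(version):
    """Turn '1.20.3' into (1, 20, 3) for simple tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies_pinned(version, base="1.19.2"):
    """Compatible-release pin, like '~=1.19.2': >= base and same major.minor."""
    v, b = parse(version), parse(base)
    return v >= b and v[:2] == b[:2]


def satisfies_lower_bound(version, base="1.19.2"):
    """Lower bound only, like '>=1.19.2': any newer release is accepted."""
    return parse(version) >= parse(base)


for candidate in ["1.19.5", "1.20.3"]:
    print(candidate, satisfies_pinned(candidate), satisfies_lower_bound(candidate))
```

With only a lower bound, a resolver on Python 3.6 can still pick a 1.19.x release, while newer interpreters are free to take 1.20.x.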

mihaimaruseac · 2021-05-05T21:50:49Z

After the next release, we can drop py3.6 and then revisit this.

bhack · 2021-05-05T22:00:46Z

@bnavigator That is not the logic of this PR. This is just a CI exploration to expose public logs for all the users who are pushing for 1.20 in the ticket. I am forcing 1.20.x to see the effect on our VMs, the community VM, Docker images, and TF tests on all platforms.

More generally, though @mihaimaruseac can probably comment better than me, I suppose that since we are currently in the middle of a release, we don't have the bandwidth to check the whole Python environment matrix against every numpy version, and it will probably be easier to evaluate removing Python 3.6 and making Python 2.7 the minimal version so that we can work with an updated numpy.
Then we could probably fine-tune the minimal numpy version that we want to support, also considering NEP 29.

bnavigator · 2021-05-05T22:43:27Z

I disagree. Comments like #48918 (comment) and #48918 (comment) show that this diverts energy in the wrong direction. Supporting NumPy 1.20 has nothing to do with dropping 1.19 or Python 3.6. Python 3.6, along with a compatible older NumPy (and SciPy and many more packages), is still the mainline on some major LTS Linux distros. Just relaxing the upper bound on NumPy should suffice to let any reasonable resolver take the newest available version.

And why would you want to drop python 3.6 support but still talk about python 2.7? That one is really EOL.

bhack · 2021-05-05T23:09:56Z

2.7 was just a typo; as it is EOL, I meant 3.7, of course.

If you think you have a better solution for your scope and know the best min/max numpy ranges, feel free to open a new pull request and we will review it without any problem.

Here I just want to verify the CI VMs and CI results with 1.20 and aligned numpy versions (they have diverged in many places), as stated in the PR description:

> This PR is just to explore CI errors when we align and upgrade numpy in TF.

Then I want to point out that for many of us, activities like this are only voluntary activities as the TF internal teams commit directly from the internal system and generally they don't use PRs.

The code is available for everyone and I believe it is appropriate to have a more collaborative approach when we comment on tickets as it is defined in our community code of conduct.

mihaimaruseac · 2021-05-05T23:13:40Z

@bnavigator: Given how deeply TF integrates with numpy, having a wide range on the dependency results in broken CI in the current build system. The easiest fix at the time of the numpy 1.20 release was to force TF to use an older numpy. @bhack is currently trying to unblock this and see what the remaining breakages are.

Imported from GitHub PR #49008
Related #48918
Copybara import of the project:
-- e9924bf by fsx950223 <fsx950223@outlook.com>: fix numpy 1.20
-- 509a10c by fsx950223 <fsx950223@outlook.com>: add test case
PiperOrigin-RevId: 373618749
Change-Id: I925da021db2bdb5a1b35f0d187d0007b7802ff1c

tupui · 2021-05-25T14:32:52Z

Wanting to keep supporting Python 3.6 does not mean that your CI must run everything on 3.6 and therefore cannot install the latest NumPy.

Have a look at our CI for SciPy. We test a bunch of configurations and test the development version of NumPy in only one particular workflow. This way we can ensure forward compatibility while still being able to test on older systems in other workflows. You could be fully compatible with NEP 29 if you wanted... Right now you're just breaking everyone's workflows.
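
As a rough illustration of that layout (not SciPy's or TensorFlow's actual configuration), the job list could be built like this. The version strings, the `"dev"` label, and the `allow_failure` field are all hypothetical.

```python
def build_jobs(python_versions, pinned_numpy, latest_python):
    """Build a CI job list: one pinned job per Python, plus one forward-compat job."""
    # Baseline jobs: every supported Python with the same, known-good NumPy.
    jobs = [{"python": py, "numpy": pinned_numpy} for py in python_versions]
    # A single extra job tracks the newest Python and NumPy's development version.
    # Marking it non-blocking (hypothetical field) keeps upstream breakage from
    # stopping unrelated PRs while still surfacing it early.
    jobs.append({"python": latest_python, "numpy": "dev", "allow_failure": True})
    return jobs


jobs = build_jobs(["3.6", "3.7", "3.8", "3.9"], pinned_numpy="1.19", latest_python="3.9")
for job in jobs:
    print(job)
```

The forward-compatibility check costs one additional job rather than doubling the matrix.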

shoyer · 2021-06-11T20:42:58Z

I'll just note that it's also possible to test against the development version of upstream dependencies like NumPy. This is a pretty common practice for core projects in the scientific Python ecosystem and makes a huge difference, often identifying upstream bugs even before there's a release.

tupui · 2021-06-12T11:34:53Z

> Most of Google's CI is still on Python 3.6. Since there's no numpy 1.20 for that version, we cannot use it.

Of course you can: you could have one workflow dedicated to forward-compatibility checks. You would test the latest Python, NumPy, etc. Nothing complicated here, just the willingness to do so... Again, cf. SciPy's CI for example.

bhack · 2021-06-12T12:32:50Z

> > Most of Google's CI is still on Python 3.6. Since there's no numpy 1.20 for that version, we cannot use it.

> Of course you can: you could have one workflow dedicated to forward-compatibility checks. You would test the latest Python, NumPy, etc. Nothing complicated here, just the willingness to do so... Again, cf. SciPy's CI for example.

If you are interested in contributing to the general discussion, I've started a new thread in our forum: https://discuss.tensorflow.org/t/extend-the-collaboration-on-the-ci/1917

bnavigator · 2021-06-12T17:49:49Z

> If you are interested in contributing to the general discussion, I've started a new thread in our forum:

Thanks for the initiative. I would comment there, but somehow the device 2FA for the Google login does not succeed at the moment. 🤷‍♂️ (Plus, I actually don't want to use my personal Gmail address for a public forum)

> Most of Google's CI is still on Python 3.6. Since there's no numpy 1.20 for that version, we cannot use it.

I must say I am a bit confused. In the release notes about TF 2.5.0 the first bullet point is that you now support Python 3.9. How do you do that if you don't have a CI for it?

bhack · 2021-06-12T22:52:11Z

> Thanks for the initiative. I would comment there, but somehow the device 2FA for the Google login does not succeed at the moment. 🤷‍♂️ (Plus, I actually don't want to use my personal Gmail address for a public forum)

If you need support with the 2FA issue, you could send a message to forum-help@tensorflow.org
Also, your Gmail address will not be disclosed.

/cc @theadactyl

mihaimaruseac · 2021-06-13T17:01:43Z

> I'll just note that it's also possible to test against the development version of upstream dependencies like NumPy. This is a pretty common practice for core projects in the scientific Python ecosystem and makes a huge difference, often identifying upstream bugs even before there's a release.

Yes, it is possible, but very costly. Our support matrix is already not manageable, unfortunately.

mihaimaruseac · 2021-06-13T17:04:19Z

> I must say I am a bit confused. In the release notes about TF 2.5.0 the first bullet point is that you now support Python 3.9. How do you do that if you don't have a CI for it?

Internal CI is locked to py3.6. Google has a one version policy: all tools, libraries and binaries used anywhere inside Google are pinned to the exact same version, regardless of team and product.

Externally, we are building py3.6-py3.9, as you can see here

tupui · 2021-06-14T06:52:28Z

> > I must say I am a bit confused. In the release notes about TF 2.5.0 the first bullet point is that you now support Python 3.9. How do you do that if you don't have a CI for it?

> Internal CI is locked to py3.6. Google has a one version policy: all tools, libraries and binaries used anywhere inside Google are pinned to the exact same version, regardless of team and product.

> Externally, we are building py3.6-py3.9, as you can see here

Building for something other than 3.6 is not the same as having a CI for something other than 3.6 🤔
Sounds like a weird policy if it prevents you from having a better CI... Again, matrices are for that.

bnavigator · 2021-06-14T15:51:54Z

> Again, matrices are for that.

Granted, if a single matrix entry takes half a day to build, you want a small matrix. On the other hand, having "internal" and "public" CIs doing duplicate work is a disservice to the planet's climate, too.

tupui · 2021-06-14T16:34:06Z

> > Again, matrices are for that.

> Granted, if a single matrix entry takes half a day to build, you want a small matrix. On the other hand, having "internal" and "public" CIs doing duplicate work is a disservice to the planet's climate, too.

This is why there are mechanisms for conditional CI, or even for skipping some workflows (things like [skip ci] or [skip actions] in the commit message). The classic example: if you only push docs, you don't run the tests.

If there is no discipline with CI, of course it's intractable. But when well executed, you can add a lot of exotic platforms or setups at no extra cost.
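
A minimal sketch of that kind of gating logic, written as a standalone function rather than any real CI system's configuration. The function name and the list of documentation suffixes are assumptions for illustration; only the two commit-message markers come from the comment above.

```python
# Commit-message markers mentioned above.
SKIP_MARKERS = ("[skip ci]", "[skip actions]")
# Hypothetical definition of a "docs-only" file.
DOC_SUFFIXES = (".md", ".rst")


def should_run_tests(commit_message, changed_files):
    """Decide whether the expensive test workflow needs to run."""
    # An explicit marker in the commit message skips the workflow entirely.
    if any(marker in commit_message for marker in SKIP_MARKERS):
        return False
    # A push that only touches documentation does not need the test suite.
    if changed_files and all(path.endswith(DOC_SUFFIXES) for path in changed_files):
        return False
    return True


print(should_run_tests("Fix typo [skip ci]", ["tensorflow/core/ops.cc"]))
print(should_run_tests("Update install guide", ["docs/install.md"]))
print(should_run_tests("Fix numpy 1.20 cast", ["tensorflow/python/ops.py"]))
```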

adriangb · 2021-06-14T16:37:18Z

Also, caching of the C++ compilation, which I think @bhack was working on, would help a lot. In my experience, running the Python tests takes a fraction of the time of compilation (and even that could be sped up, e.g. by using pytest-xdist).

mihaimaruseac · 2021-06-14T16:56:43Z

> > > I must say I am a bit confused. In the release notes about TF 2.5.0 the first bullet point is that you now support Python 3.9. How do you do that if you don't have a CI for it?

> > Internal CI is locked to py3.6. Google has a one version policy: all tools, libraries and binaries used anywhere inside Google are pinned to the exact same version, regardless of team and product.
> > Externally, we are building py3.6-py3.9, as you can see here

> Building for something other than 3.6 is not the same as having a CI for something other than 3.6 🤔
> Sounds like a weird policy if it prevents you from having a better CI... Again, matrices are for that.

Internal CI only supports 3.6
External CI runs on py3.6..py3.9

Matrices work but also result in combinatorial explosion.

Internal CI is needed due to differences between internal TF and OSS TF. Also because Google monorepo.
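
To put a number on the combinatorial explosion, here is a small sketch that enumerates a full matrix. The operating-system list is hypothetical; the Python range and the pip/nonpip split are the ones mentioned in this thread.

```python
from itertools import product

python_versions = ["3.6", "3.7", "3.8", "3.9"]
operating_systems = ["linux", "macos", "windows"]  # hypothetical axis values
numpy_versions = ["1.19", "1.20"]
build_kinds = ["nonpip", "pip"]

# Every combination of the four axes.
full_matrix = list(
    product(python_versions, operating_systems, numpy_versions, build_kinds)
)


def is_installable(python, numpy):
    """numpy 1.20 has no release for Python 3.6, so that pair is impossible."""
    return not (python == "3.6" and numpy == "1.20")


valid_matrix = [job for job in full_matrix if is_installable(job[0], job[2])]
print(len(full_matrix), len(valid_matrix))  # 48 42
```

Adding a second numpy version alone roughly doubles the job count, which matters if a single entry takes hours to build.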

tupui · 2021-06-14T17:38:28Z

> External CI runs on py3.6..py3.9

I thought you said this is just for building. Which we all seem to interpret (thumbs up) as not running unit tests.

> Matrices work but also result in combinatorial explosion.

Sure but nothing you cannot manage if conditionally skipped on some particular events. But I agree this adds some complexity.

As such a prominent package, I guess we all expect more rigour here. I mean, here we are just asking that you support latest NumPy and maybe check these in advance in your CI. Nothing esoteric which is not already done by all big packages.

mihaimaruseac · 2021-06-14T17:51:21Z

> > External CI runs on py3.6..py3.9

> I thought you said this is just for building. Which we all seem to interpret (thumbs up) as not running unit tests.

My bad, I was not clear (though the linked CI scripts would have shown that we run tests). We have multiple CI types:

1. PR presubmits: one single version of Python, just spread across multiple OSes and configs (although not all of them are blocking)
2. Continuous builds: whenever a new change lands and there is no continuous build running, a new one is triggered at HEAD. This again covers only one version of Python and is spread across multiple OSes, but it is supposed to have buildcops monitoring it and immediately surfacing breakages.
3. Nightly builds: 2 versions, all supported Python versions and OSes. Supposed to have buildcops to surface breakages too. The 2 versions are:
   - nonpip: just bazel test of the code base, just like all the builds above
   - pip: bazel build for the pip package, install the pip package, run tests against it
4. Release builds: same as nightly builds but run during the release process, after branch cut (multiple times, at least ~2 times for each RC + final)

Between 1 and 2, during the PR import process, there are presubmit builds inside Google that test the internal version of TF on a single version of Python and a single OS (Linux).

Step 2 is actually duplicated: we build from HEAD of OSS and we also build from HEAD of internal code exposing that to a local OSS repo.

Steps 3 and 4 have a few builds that are outside the pip/nonpip separation, but they are not relevant here. See the build scripts linked before for all details.
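
The four tiers described above can be modelled roughly as follows, to show where a breakage that only appears on a non-default Python version would first be noticed. The class, field, and function names are invented for this sketch, and the trigger strings are paraphrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CITier:
    name: str
    all_python_versions: bool  # False means a single Python version
    trigger: str


# Ordered from earliest to latest feedback, following the list above.
TIERS = [
    CITier("PR presubmit", False, "pull request"),
    CITier("continuous", False, "new change at HEAD"),
    CITier("nightly", True, "nightly schedule"),
    CITier("release", True, "release branch cut"),
]


def first_tier_catching(needs_non_default_python):
    """Return the earliest tier that would exercise the failing configuration."""
    for tier in TIERS:
        if tier.all_python_versions or not needs_non_default_python:
            return tier.name
    return None


print(first_tier_catching(needs_non_default_python=False))  # PR presubmit
print(first_tier_catching(needs_non_default_python=True))   # nightly
```

Under this model, an incompatibility that only shows up on a newer Python is not seen until the nightly builds, after the change has already landed.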

> > Matrices work but also result in combinatorial explosion.

> Sure but nothing you cannot manage if conditionally skipped on some particular events. But I agree this adds some complexity.

Some of these don't depend on our team. Changing the Google behemoth is very slow, but we're working on this.

> As such a prominent package, I guess we all expect more rigour here. I mean, here we are just asking that you support latest NumPy and maybe check these in advance in your CI. Nothing esoteric which is not already done by all big packages.

This is understood and we're working on making things easier for OSS contributors. But these are processes that take months given the whole CI structure and org setup

adriangb · 2021-06-14T19:51:27Z

We understand that Google is a large complex machine, and it is great that TF is even FOSS in the first place. I hope I speak for everyone in saying that we appreciate the work you do and just want to help however we can. As long as we keep the discussions going and things moving forward, we'll all benefit.

gbaned · 2021-06-25T17:48:47Z

@bhack Can you please resolve conflicts? Thanks!

bhack · 2021-06-25T17:51:18Z

By now this is just a monitoring PR. I am waiting for @mihaimaruseac's feedback on when I need to resync.

google-ml-butler bot added the size:M CL Change Size: Medium label May 5, 2021

google-cla bot added the cla: yes label May 5, 2021

Update and align numpy version in tensorflow

6645f31

bhack force-pushed the numpy_1.20 branch from 0acefbc to 6645f31 Compare May 5, 2021 19:50

Change space format

348d5aa

bhack marked this pull request as ready for review May 5, 2021 20:22

bhack mentioned this pull request May 5, 2021

Numpy v1.20+ compatibility #47691

Closed

gbaned self-assigned this May 6, 2021

gbaned added the comp:micro Related to TensorFlow Lite Microcontrollers label May 6, 2021

gbaned added this to Assigned Reviewer in PR Queue via automation May 6, 2021

gbaned requested a review from advaitjain May 6, 2021 06:53

bnavigator mentioned this pull request May 6, 2021

Compatibility with NumPy 1.20+ #48935

Closed

fsx950223 mentioned this pull request May 8, 2021

fix numpy 1.20 #49008

Merged

gbaned added the awaiting review Pull request awaiting review label May 20, 2021

advaitjain removed the comp:micro Related to TensorFlow Lite Microcontrollers label May 20, 2021

advaitjain removed their request for review May 25, 2021 15:14

This was referenced Jun 10, 2021

tensorflow removed numpy 1.20.3 #50204

Closed

Tensorflow 2.5 wheels depend on an unreleased keras version and restrict numpy to 1.19 #50042

Closed

gbaned requested a review from mihaimaruseac June 25, 2021 17:47

gbaned added stat:awaiting response Status - Awaiting response from author and removed awaiting review Pull request awaiting review labels Jun 25, 2021

gbaned added stat:awaiting tensorflower Status - Awaiting response from tensorflower and removed stat:awaiting response Status - Awaiting response from author labels Jul 8, 2021

bhack mentioned this pull request Jul 12, 2021

Fix numpy 1.20 deprecation warnings #50527

Merged

tupui mentioned this pull request Aug 5, 2021

Raspberry Pi 4 aarch64: ModuleNotFoundError: No module named 'numpy.random.bit_generator' scipy/scipy#14541

Closed

gbaned added awaiting review Pull request awaiting review and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Aug 10, 2021

jeylau mentioned this pull request Aug 31, 2021

"pip install --upgrade deeplabcut" reverts tensorflow to v2.2.0 (from previously installed v2.6.0) DeepLabCut/DeepLabCut#1486

Closed


bhack closed this Oct 4, 2021

PR Queue automation moved this from Assigned Reviewer to Closed/Rejected Oct 4, 2021

google-ml-butler bot removed the awaiting review Pull request awaiting review label Oct 4, 2021
