Update and align numpy versions in TensorFlow #48918
Thanks for contributing to TensorFlow Lite Micro. To keep this process moving along, we'd like to make sure that you have completed the items on this list:
We would like to have a discussion on the GitHub issue first to determine the best path forward, and then proceed to the PR review. |
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Many of the Docker images with Python < 3.7 will fail because NumPy v1.20 is not available for older Python versions. |
I don't know what the TensorFlow team is planning, but according to NEP 29, "all projects across the Scientific Python ecosystem" were supposed to drop Python 3.6 last June. |
This is wrong. You don't have to specify a minimum Numpy 1.20. Just don't require a maximum numpy, by not relying on features deprecated long ago and finally removed in 1.19 or 1.20. |
After the next release, we can drop py3.6 and then revisit this. |
@bnavigator That is not the logic of this PR. This is just a CI exploration to expose public logs for all the users who are pushing for 1.20 in the ticket. I am forcing 1.20.x to see the effect on our VMs, Community VM, Docker images, and TF tests on all platforms. More generally, though @mihaimaruseac can probably comment better than me, I suppose that since we are currently under release we don't have the bandwidth to control the whole Python env matrix for all the NumPy versions, and it will probably be easier to evaluate Python 3.6 removal and Python 2.7 minimal version to work with an updated version of NumPy. |
I disagree. Comments like #48918 (comment) and #48918 (comment) show that this diverts energy in the wrong direction. Supporting NumPy 1.20 has nothing to do with dropping 1.19 or Python 3.6. Python 3.6, along with a compatible older NumPy (and SciPy and many more packages), is still the mainline on some major LTS Linux distros. Just relaxing the upper bounds on NumPy should suffice to let any reasonable resolver take the newest available version. And why would you want to drop Python 3.6 support but still talk about Python 2.7? That one is really EOL. |
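To make the point about upper bounds concrete, here is a minimal sketch of how a resolver's choice changes with and without a cap. It is not TensorFlow's packaging code or pip's resolver: the `AVAILABLE` table and the helper names (`parse`, `pick_newest`, `resolve_all`) are hypothetical, and the release lists are illustrative. The only fact relied on is that NumPy 1.20 requires Python 3.7 or newer.

```python
# Hypothetical view of which NumPy releases a resolver can see per Python
# version. NumPy 1.20 requires Python >= 3.7; the lists are illustrative.
AVAILABLE = {
    "3.6": ["1.18.5", "1.19.2", "1.19.5"],
    "3.9": ["1.19.5", "1.20.0", "1.20.3"],
}


def parse(version):
    """Turn '1.19.5' into (1, 19, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_newest(candidates, minimum, below=None):
    """Return the newest candidate with minimum <= v (and v < below, if given)."""
    lo = parse(minimum)
    hi = parse(below) if below is not None else None
    allowed = [
        v for v in candidates
        if parse(v) >= lo and (hi is None or parse(v) < hi)
    ]
    return max(allowed, key=parse) if allowed else None


def resolve_all(minimum, below=None):
    """Apply the same requirement to every Python version in AVAILABLE."""
    return {
        py: pick_newest(versions, minimum, below)
        for py, versions in AVAILABLE.items()
    }


if __name__ == "__main__":
    print("lower bound only:", resolve_all("1.19.2"))
    print("capped below 1.20:", resolve_all("1.19.2", below="1.20"))
    print("minimum 1.20:", resolve_all("1.20"))
```

Under these assumptions, a lower bound alone lets each interpreter get the newest release it can install, a cap holds every interpreter back, and a hard minimum of 1.20 leaves Python 3.6 with nothing to install.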
If you think that you have a better solution for your scope and you have the best min/max NumPy ranges, feel free to open a new pull request and we will review it without any problems. Here I just want to verify the CI VM and CI results with 1.20 and aligned NumPy versions (they diverged in many places), as in the PR description:
Then I want to point out that for many of us, activities like this are only voluntary activities as the TF internal teams commit directly from the internal system and generally they don't use PRs. The code is available for everyone and I believe it is appropriate to have a more collaborative approach when we comment on tickets as it is defined in our community code of conduct. |
@bnavigator: Given how deep TF integrates with numpy, having a wide range on the dependency results in broken CI in the current build system. Easiest fix at the numpy 1.20 release time has been to force TF to use older numpy. @bhack is currently trying to unblock this / see what are the remaining breakages. |
Just because you still want to support Python 3.6 does not mean your CI must run everything on 3.6, leaving you unable to install NumPy's latest. Have a look at our CI for SciPy. We test a bunch of configurations and test the development version of NumPy only in one particular workflow. This way we can ensure forward compatibility while still being able to test on older systems in other workflows. You could be fully compatible with NEP 29 if you wanted... Now you're just breaking everyone's workflows. |
I'll just note that it's also possible to test against the development version of upstream dependencies like NumPy. This is a pretty common practice for core projects in the scientific Python ecosystem and makes a huge difference, often identifying upstream bugs even before there's a release. |
Of course you can, as you could have one workflow dedicated to forward-compatibility checks. You would test the latest Python, NumPy, etc. Nothing complicated here, just willingness to do so... Again, cf. SciPy's CI for example. |
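The difference between testing every combination and adding one dedicated forward-compatibility job can be sketched in a few lines. This is an illustration only, not SciPy's or TensorFlow's actual CI configuration: the version labels, the platform list, and the `allow_failure` key are assumptions made for the example.

```python
from itertools import product

PYTHONS = ["3.6", "3.7", "3.8", "3.9"]       # interpreter versions named in the thread
NUMPYS = ["1.19", "1.20", "dev"]             # hypothetical labels for NumPy variants
PLATFORMS = ["linux", "macos", "windows"]    # hypothetical platform list


def full_matrix():
    """Every combination: the job count is the product of the axis sizes."""
    return [
        {"python": py, "numpy": np_ver, "os": os_name}
        for py, np_ver, os_name in product(PYTHONS, NUMPYS, PLATFORMS)
    ]


def curated_matrix():
    """One job per Python with the pinned NumPy, plus one forward-compat job."""
    jobs = [{"python": py, "numpy": "pinned", "os": "linux"} for py in PYTHONS]
    # A single extra job tracks the newest Python and the development NumPy.
    # Marking it as allowed to fail keeps upstream breakage from blocking merges.
    jobs.append(
        {"python": PYTHONS[-1], "numpy": "dev", "os": "linux", "allow_failure": True}
    )
    return jobs


if __name__ == "__main__":
    print(len(full_matrix()), "jobs in the full matrix")
    print(len(curated_matrix()), "jobs in the curated matrix")
```

With these made-up axes the full product has 36 jobs, some of which could not even be installed, while the curated list has 5 and still exercises the development NumPy once.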
If you are interested to contribute to the general discussion I've started a new thread in our forum: https://discuss.tensorflow.org/t/extend-the-collaboration-on-the-ci/1917 |
Thanks for the initiative. I would comment there, but somehow the device 2FA for the Google login does not succeed at the moment. 🤷‍♂️ (Plus, I actually don't want to use my personal Gmail address for a public forum.)
I must say I am a bit confused. In the release notes about TF 2.5.0 the first bullet point is that you now support Python 3.9. How do you do that if you don't have a CI for it? |
If you need support with the 2FA issue, you could send a message to forum-help@tensorflow.org /cc @theadactyl |
Yes, it is possible, but very costly. Our support matrix is already not manageable, unfortunately. |
Internal CI is locked to py3.6. Google has a one version policy: all tools, libraries and binaries used anywhere inside Google are pinned to the exact same policy, regardless of team and product. Externally, we are building py3.6-py3.9, as you can see here |
Building for something else than 3.6 is not the same as having a CI for something else than 3.6 🤔 |
Granted, if a single matrix entry takes half a day to build, you want a small matrix. On the other hand, having "internal" and "public" CIs doing duplicate work is a disservice to the planet's climate, too. |
This is why there are mechanisms to have conditional CI or even skip some workflows (things like If there is no discipline with CI, of course it's intractable. But well executed, you can add a lot of exotic platforms or setups at no extra cost. |
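As a rough illustration of conditional CI, the sketch below decides which workflows to run from the list of changed files and an opt-out marker in the commit message. Everything in it is hypothetical: the workflow names, the path patterns, and the marker string are assumptions for the example (several CI services honor a similar marker), not a description of TensorFlow's setup.

```python
from fnmatch import fnmatchcase

# Hypothetical opt-out marker looked for in the commit message.
SKIP_MARKER = "[skip ci]"

# Hypothetical mapping from workflow name to the file patterns that trigger it.
WORKFLOW_PATHS = {
    "python-tests": ["*.py", "requirements*.txt"],
    "docs": ["docs/*", "*.md"],
}


def workflows_to_run(changed_files, commit_message=""):
    """Return the workflows triggered by a change, in declaration order."""
    if SKIP_MARKER in commit_message:
        return []
    selected = []
    for name, patterns in WORKFLOW_PATHS.items():
        # fnmatchcase's "*" also matches "/" so "*.py" covers nested paths.
        if any(
            fnmatchcase(path, pattern)
            for path in changed_files
            for pattern in patterns
        ):
            selected.append(name)
    return selected


if __name__ == "__main__":
    print(workflows_to_run(["tensorflow/python/ops/math_ops.py"]))
    print(workflows_to_run(["docs/index.md"]))
    print(workflows_to_run(["setup.py"], commit_message="fix typo [skip ci]"))
```

A documentation-only change then never pays for the expensive build, which is the kind of discipline the comment above argues keeps a large matrix affordable.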
Also, caching of the C++ compilation, which I think @bhack was working on, would help a lot. In my experience, running the Python tests takes a fraction of the time of compilation (and even that could be sped up, e.g. by using pytest-xdist). |
Internal CI only supports 3.6. Matrices work but also result in combinatorial explosion. Internal CI is needed due to differences between internal TF and OSS TF, and also because of the Google monorepo. |
I thought you said this is just for building. Which we all seem to interpret (thumbs up) as not running unit tests.
Sure but nothing you cannot manage if conditionally skipped on some particular events. But I agree this adds some complexity. As such a prominent package, I guess we all expect more rigour here. I mean, here we are just asking that you support latest NumPy and maybe check these in advance in your CI. Nothing esoteric which is not already done by all big packages. |
My bad, I was not clear (though the linked CI scripts would have shown that we run tests). We have multiple CI types:
Between 1 and 2, during the PR import process, there are presubmit builds inside Google that test the internal version of TF on a single version of Python and a single OS (Linux). Step 2 is actually duplicated: we build from HEAD of OSS and we also build from HEAD of internal code, exposing that to a local OSS repo. Steps 3 and 4 have a few builds that are outside the
Some of these don't depend on our team. Changing the Google behemoth is very slow, but we're working on this.
This is understood and we're working on making things easier for OSS contributors. But these are processes that take months given the whole CI structure and org setup |
We understand that Google is a large complex machine, and it is great that TF is even FOSS in the first place. I hope I speak for everyone in saying that we appreciate the work you do and just want to help however we can. As long as we keep the discussions going and things moving forward, we'll all benefit. |
@bhack Can you please resolve conflicts? Thanks! |
By now this is just a monitor PR. I am waiting for @mihaimaruseac's feedback on when I need to resync. |
This PR is just to explore CI errors when we align and upgrade NumPy in TF.
When finalized it will fix #47691