Improve error message when incompatible torchvision / PyTorch are used #2148

Closed
fmassa opened this issue Apr 27, 2020 · 18 comments · Fixed by #2467

Comments

@fmassa
Member

fmassa commented Apr 27, 2020

A number of users report errors on the C++ operators when they update either PyTorch or torchvision, see for example #1916, with errors such as

RuntimeError: No such operator torchvision::nms

We should improve the error message and instead point out that their pytorch / torchvision versions are incompatible.
This will give a better user experience.

@pmeier
Collaborator

pmeier commented Apr 27, 2020

Out of curiosity: does a list of compatible versions exist somewhere?

@fmassa
Member Author

fmassa commented Apr 27, 2020

Compatibility between PyTorch and torchvision versions is currently as follows:

  • latest torchvision stable is compatible with latest pytorch stable
  • latest torchvision nightly is compatible with latest pytorch nightly

So for nightlies, we could use the nightly date as a criterion for compatibility, although it should also work for users who are compiling torchvision from source (in which case the compatibility constraints are much less restrictive).

I agree that this is not ideal and might be a bit hard to track, but until we find a way to make the PyTorch ABI fixed / stable (probably not going to happen), this may be the only way.

I would love to hear other ideas though.

@anibali
Contributor

anibali commented May 18, 2020

@fmassa I've found that it is currently very difficult to determine which pairings of PyTorch and Torchvision are compatible. For example, a question that I've had a few times in the past is "given that I have some old version of PyTorch #.#.# installed (that I can't upgrade), what is the newest Torchvision that I can install with it?". Currently this is not easy to find out. This is particularly important for a project like PyTorch, since there are numerous reasons why you may not be able to upgrade (e.g., API changes, Python 2 support, CUDA version support).

latest torchvision stable is compatible with latest pytorch stable

This answer is somewhat unhelpful for two reasons:

  1. It's hard to track for old versions. The best I've been able to do is look up historical release dates from both projects and try to link them up, which is clearly not ideal. It doesn't even seem like the release notes indicate which PyTorch version is required for each release. A compatibility table in the README or something similar would address this point.
  2. It still doesn't indicate what the minimum PyTorch version required for a particular Torchvision release is. For example, the current README seems to suggest that Torchvision 0.6.0 is compatible with PyTorch 1.4 ("TorchVision requires PyTorch 1.4 or newer"). But at the same time, setup.py seems to pin a particular version (although this can't be determined by looking at that file...):

    vision/setup.py

    Lines 65 to 67 in 222a599

    pytorch_dep = 'torch'
    if os.getenv('PYTORCH_VERSION'):
        pytorch_dep += "==" + os.getenv('PYTORCH_VERSION')
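The quoted setup.py lines only pin torch when an environment variable is set at build time, which is why the pin cannot be read off the file itself. A minimal sketch of that behavior follows; the helper name `resolve_pytorch_dep` and its `environ` parameter are hypothetical and exist only so the logic can be run without touching the real environment.

```python
import os


def resolve_pytorch_dep(environ=None):
    """Mirror the quoted setup.py logic: pin torch only if PYTORCH_VERSION is set.

    `environ` is a hypothetical parameter (not in setup.py) used for testing.
    """
    environ = os.environ if environ is None else environ
    pytorch_dep = "torch"
    version = environ.get("PYTORCH_VERSION")
    if version:
        # The build environment decides the exact pin, not the source file.
        pytorch_dep += "==" + version
    return pytorch_dep


# Without the variable the requirement is unpinned; with it, it is exact.
print(resolve_pytorch_dep({}))
print(resolve_pytorch_dep({"PYTORCH_VERSION": "1.5.0"}))
```

So the same source tree can produce a package that requires plain `torch` or an exact `torch==X.Y.Z`, depending on how it was built.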

@pmeier
Collaborator

pmeier commented May 19, 2020

For example, a question that I've had a few times in the past is "given that I have some old version of PyTorch #.#.# installed (that I can't upgrade), what is the newest Torchvision that I can install with it?". Currently this is not easy to find out.

I second this. Had to do this multiple times myself.

If I'm not mistaken, there is a one-to-one relation for the stable releases. If that is the case, couldn't we put a simple table or something similar somewhere?

@fmassa
Member Author

fmassa commented May 19, 2020

I agree that having a compatibility table would be a good start. But it would also help to raise informative error messages at runtime if the versions are incompatible.
It could even be as simple as grepping for "symbol not found" in the error message raised when loading the .so library, as that would probably cover most cases, I think.
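A rough sketch of the "grep the loader error" idea, not torchvision's actual implementation. The function names and the pattern list are assumptions; the two substrings are the wording typically printed by the Linux (glibc) and macOS (dyld) dynamic loaders for a missing symbol, and other platforms may word it differently.

```python
import re

# Assumption: typical loader wording for a missing symbol.
_SYMBOL_ERROR_PATTERNS = (
    r"undefined symbol",  # glibc dlopen on Linux
    r"symbol not found",  # dyld on macOS
)


def looks_like_abi_mismatch(exc):
    """Return True if the loader error message mentions a missing symbol."""
    message = str(exc).lower()
    return any(re.search(pattern, message) for pattern in _SYMBOL_ERROR_PATTERNS)


def explain_load_failure(exc):
    """Turn a raw loader error into a message that names the likely cause."""
    if looks_like_abi_mismatch(exc):
        return (
            "Could not load the torchvision C++ extension. The installed "
            "PyTorch and torchvision versions are probably incompatible. "
            "Original error: {}".format(exc)
        )
    return "Could not load the torchvision C++ extension: {}".format(exc)


print(explain_load_failure(OSError("_C.so: undefined symbol: _ZN3c10example")))
```

Errors that do not match any pattern are passed through with their original text, so unrelated failures are not relabeled as version mismatches.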

@vincentqb
Contributor

vincentqb commented May 19, 2020

Compatibility between PyTorch and torchvision versions is currently as follows:

  • latest torchvision stable is compatible with latest pytorch stable
  • latest torchvision nightly is compatible with latest pytorch nightly

How about we simply look at the version of pytorch, and throw a warning when importing torchvision with an incompatible version of pytorch?

@fmassa
Member Author

fmassa commented May 20, 2020

This is a good start, but things get a bit more complicated with nightlies and when compiling from master.
I think we should try loading the library first and, if it fails, identify from the error message that the versions are incompatible, maybe redirecting users to the torchvision README, which should contain a compatibility matrix of PyTorch and torchvision versions, like torchtext does.

@vincentqb
Contributor

Compatibility between PyTorch and torchvision versions is currently as follows:

  • latest torchvision stable is compatible with latest pytorch stable
  • latest torchvision nightly is compatible with latest pytorch nightly

The official statement is fairly straightforward for releases, and we could make a table to keep track of which torchvision version was targeting which pytorch version.

For releases, as I suggested above, we could also explicitly throw a warning when a different version is detected.

For nightlies or master, I see those as "at your own risk" with minimal guarantees. I would not say more than what the statement says above in the readme.

  1. It's hard to track for old versions.

I agree. I believe a table would help with that.

  2. It still doesn't indicate what the minimum PyTorch version required for a particular Torchvision release is.

The official statement implicitly talks of exact version matches, and we can make that clearer. Making torchvision support a minimum PyTorch version is a much bigger burden, and outside the scope of what I understand this issue to be.

@fmassa -- are you planning to extend the support of torchvision outside the official statement? AFAIK torchaudio and torchtext are not planning this.

This is a good start, but things get a bit more complicated with nightlies and when compiling from master.

Were users here having issues with nightlies or master? Or when using conda/pip to install torchvision? Again, I see nightlies and master as "at your own risk" with minimal guarantees. torchvision should simply be compatible with the latest master at all times.

I think we should try loading the library first and, if it fails, identify from the error message that the versions are incompatible, maybe redirecting users to the torchvision README, which should contain a compatibility matrix of PyTorch and torchvision versions

Incompatible versions can lead to failures in many unexpected ways, and detecting all of them can be tricky. Once we detect incompatible versions, what do we do? Do we stop the import? Do we return a warning? Or maybe we selectively disable features that are not supported?

Catching errors and throwing other ones could also obscure legitimate issues. Thoughts?

like torchtext does.

What does torchtext do?

@anibali
Contributor

anibali commented May 20, 2020

Were users here having issues with nightlies or master? Or when using conda/pip to install torchvision? Again, I see nightlies and master as "at your own risk" with minimal guarantees. torchvision should simply be compatible with the latest master at all times.

Speaking for myself, I'm only really concerned with the release versions available via conda/pip. I don't foresee a situation in which I'd personally need to figure out the PyTorch version compatibilities for old Torchvision nightly builds.

@fmassa
Member Author

fmassa commented May 21, 2020

are you planning to extend the support of torchvision outside the official statement? AFAIK torchaudio and torchtext are not planning this.

No, we will only enforce compatibility with the latest release. If it ends up working with older releases (when compiled from source), that's a bonus, not a requirement.

Incompatible versions can lead to failures in many unexpected ways, and detecting all of them can be tricky. Once we detect incompatible versions, what do we do? Do we stop the import? Do we return a warning? Or maybe we selectively disable features that are not supported?

My thinking was: import should not error. Users might not want to use C++ extensions, and the code should still hopefully work fine in most cases. But if the user calls into a C++ extension, then they should ideally get a good error message.

Catching errors and throwing other ones could also obscure legitimate issues. Thoughts?

Fair point. One of the issues we currently have is that in order to support torchhub, we need to catch an error at import and not hard-fail. But this makes error messages down the road a bit worse IMO.

What does torchtext do?

I was referring to a table like https://github.com/pytorch/text#installation

Speaking for myself, I'm only really concerned with the release versions available via conda/pip. I don't foresee a situation in which I'd personally need to figure out the PyTorch version compatibilities for old Torchvision nightly builds.

I agree, let's not overcomplicate the problem. A compatibility matrix in the README, plus a better error message that points to that matrix when a C++ op is called after the .so failed to load, should be enough, I think.

@pmeier
Collaborator

pmeier commented May 22, 2020

I went through the blog posts and release notes and compiled the following table:

| torch | torchvision | python |
|---|---|---|
| master / nightly | master / nightly | >=3.6 |
| 1.5.0 | 0.6.0 | >=3.5 |
| 1.4.0 | 0.5.0 | ==2.7, >=3.5, <=3.8 |
| 1.3.0 | 0.4.2 | ==2.7, >=3.5, <=3.7 |
| 1.2.0 | 0.4.0 | ==2.7, >=3.5, <=3.7 |
| 1.1.0 | 0.3.0 | ==2.7, >=3.5, <=3.7 |
| 1.0.0 | | ==2.7, >=3.5, <=3.7 |
| <=0.4.1 | | ==2.7, >=3.5, <=3.7 |

  • Can someone help me fill in the blanks? I couldn't find a note about torchvision in the release notes of torch==1.0.0 and torch==0.4.1. Furthermore, I couldn't find a release of torchvision that matched the release date of these torch versions.
  • How do we handle the patch versions, i.e. 0.4.X? Per the release notes, torchvision==0.4.1 is a compatibility release for torch==1.3.0, and torchvision==0.5.0 corresponds to torch==1.4.0. How does torchvision==0.4.2 fit in here?
  • Related to the above comment: How do we want to handle the patch versions of torch, i.e. 1.3.1, 1.4.1, and so on? Do we add all compatible versions, only the latest release or even only the first release in the table?

@fmassa
Member Author

fmassa commented May 22, 2020

Thanks a lot for compiling this list @pmeier !

Can someone help me fill in the blanks? I couldn't find a note about torchvision in the release notes of torch==1.0.0 and torch==0.4.1. Furthermore, I couldn't find a release of torchvision that matched the release date of these torch versions.

For torchvision < 0.3.0, the compatibility matrix was much simpler, because we didn't have compiled C++ bits in torchvision. So I would say that for PyTorch <= 1.0.0, the corresponding version of torchvision doesn't really matter, but we could pin it to 0.2.2

How do we handle the patch versions

Patch versions also are compatible with the latest PyTorch stable release at the time of the release. So 0.4.2 is compatible with PyTorch 1.3.1, while both 0.4.0 and 0.4.1 are compatible with PyTorch 1.3.0

do we want to handle the patch versions of torch, i.e. 1.3.1, 1.4.1, and so on? Do we add all compatible versions, only the latest release or even only the first release in the table?

Normally, whenever there was a new PyTorch release, we have also released a new (potentially minor) release of torchvision so that it is compatible. That was the case for 1.3.1, and PyTorch 1.4.1 is not really a release (probably a mistake while creating the release candidate version)

@pmeier
Collaborator

pmeier commented May 22, 2020

while both 0.4.0 and 0.4.1 are compatible with PyTorch 1.3.0

Are you sure about that? This blog post suggests that torch==1.2.0 was released together with torchvision==0.4.0. If you are right, this would break the rule that "the latest stable release of torchvision is only compatible with the latest stable release of torch".


With your comments the table now looks like this:

| torch | torchvision | python |
|---|---|---|
| master / nightly | master / nightly | >=3.6 |
| 1.5.0 | 0.6.0 | >=3.5 |
| 1.4.2 | 0.5.0 | ==2.7, >=3.5, <=3.8 |
| 1.3.1 | 0.4.2 | ==2.7, >=3.5, <=3.7 |
| 1.3.0 | 0.4.0, 0.4.1 | ==2.7, >=3.5, <=3.7 |
| 1.2.0 | 0.4.0 | ==2.7, >=3.5, <=3.7 |
| 1.1.0 | 0.3.0 | ==2.7, >=3.5, <=3.7 |
| <=1.0.1 | 0.2.2 | ==2.7, >=3.5, <=3.7 |

Is this correct now? If that is the case I'll send a PR adding it to the README.

@mhsmith

mhsmith commented May 25, 2020

Thanks for gathering this information. At the very least, the "or newer" should be removed from the README, as the released packages on PyPI all have a == dependency, not a >= one, and it sounds like that is deliberate.
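To check what a given package actually pins, one can inspect its declared requirements. The sketch below is an illustration only: `pinned_torch_version` is a hypothetical helper, and the two requirement spellings it accepts (`torch==1.4.0` and `torch (==1.4.0)`) are assumptions about how the metadata may be written.

```python
import re


def pinned_torch_version(requirements):
    """Return the torch version pinned with '==', or None if there is no exact pin."""
    for req in requirements or []:
        # Match "torch==X" or "torch (==X)" but not "torchvision==X" or "torch>=X".
        match = re.match(r"^\s*torch\s*\(?\s*==\s*([0-9][\w.+]*)", req)
        if match:
            return match.group(1)
    return None


print(pinned_torch_version(["numpy", "torch==1.4.0", "pillow>=4.1.1"]))
print(pinned_torch_version(["numpy", "torch>=1.4"]))
```

On Python 3.8 or newer, `importlib.metadata.requires("torchvision")` returns the requirement strings of an installed torchvision (or `None`), which can be passed straight to this helper.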

@mhsmith

mhsmith commented May 26, 2020

A correction for the table: the torchvision 0.5.0 package on PyPI requires torch==1.4.0, and it looks like torch 1.4.1 and 1.4.2 don't exist.

@fmassa
Member Author

fmassa commented May 26, 2020

@pmeier

Are you sure about that? This blog post suggests that torch==1.2.0 was released together with torchvision==0.4.0. If you are right, this would break the rule that "the latest stable release of torchvision is only compatible with the latest stable release of torch".

My bad, this is what I think the compatibility matrix should look like:

| torch | torchvision | python |
|---|---|---|
| master / nightly | master / nightly | >=3.6 |
| 1.5.0 | 0.6.0 | >=3.5 |
| 1.4.0 | 0.5.0 | ==2.7, >=3.5, <=3.8 |
| 1.3.1 | 0.4.2 | ==2.7, >=3.5, <=3.7 |
| 1.3.0 | 0.4.1 | ==2.7, >=3.5, <=3.7 |
| 1.2.0 | 0.4.0 | ==2.7, >=3.5, <=3.7 |
| 1.1.0 | 0.3.0 | ==2.7, >=3.5, <=3.7 |
| <=1.0.1 | 0.2.2 | ==2.7, >=3.5, <=3.7 |
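The matrix above answers the earlier question "which torchvision goes with my torch?" and could be encoded as a small lookup. This is only a sketch: the release pairs are copied from the table, while the names `TORCH_TO_TORCHVISION`, `LEGACY_TORCHVISION`, and `expected_torchvision` are hypothetical. Nightly and source builds are deliberately left unmapped.

```python
import re

# Release pairs copied from the compatibility matrix in this thread.
TORCH_TO_TORCHVISION = {
    "1.5.0": "0.6.0",
    "1.4.0": "0.5.0",
    "1.3.1": "0.4.2",
    "1.3.0": "0.4.1",
    "1.2.0": "0.4.0",
    "1.1.0": "0.3.0",
}
LEGACY_TORCHVISION = "0.2.2"  # suggested pairing for torch <= 1.0.1

_RELEASE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def expected_torchvision(torch_version):
    """Return the torchvision release paired with a torch release, or None if unknown."""
    core = torch_version.split("+")[0]  # drop local suffixes such as "+cu101"
    if core in TORCH_TO_TORCHVISION:
        return TORCH_TO_TORCHVISION[core]
    match = _RELEASE.match(core)
    if match and tuple(int(part) for part in match.groups()) <= (1, 0, 1):
        return LEGACY_TORCHVISION
    return None  # nightly, source build, or a release not listed in the table


print(expected_torchvision("1.4.0"))
print(expected_torchvision("1.5.0+cu101"))
```

Returning `None` for anything outside the table keeps the lookup from guessing about nightlies, in line with the "at your own risk" position taken earlier in the thread.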

@ppwwyyxx
Contributor

ppwwyyxx commented Jul 12, 2020

Another related issue: currently, if the torchvision extension fails to load for whatever reason (including an incompatible PyTorch version), the exception is silently ignored:

try:
    _register_extensions()
    _HAS_OPS = True
except (ImportError, OSError):
    pass

and later, if a user attempts to use operators in the extension, an uninformative error like RuntimeError: No such operator torchvision::nms will appear. This can be improved by using the exception that's ignored.

@fmassa
Member Author

fmassa commented Jul 12, 2020

@ppwwyyxx yes, that would be a good thing to do, but unfortunately IIRC torchscript doesn't support global values, so I'm not sure how we would be able to do it properly.

Actually, thinking about it a bit more, maybe here is one possible solution:

try:
    _register_extensions()
    def _has_extension():
        return True
except (ImportError, OSError):
    def _has_extension():
        return False

and then in the python implementations

def nms(...):
    if not _has_extension():
        raise RuntimeError("torchvision not compiled with .....")
    return torch.ops.torchvision.nms(...)

That could work with torchscript I think.
