Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loosen versions for fsspec and pygtrie #6242

Merged
merged 1 commit into from
Jun 28, 2021

Conversation

Erotemic
Copy link
Contributor

Resolves #6102

In general modules with pinned requirements are difficult to integrate into existing systems.

Of course there is a benefit to pinning versions, you are sure your code works, but you also prevent someone from installing your package alongside other code that may require slightly newer versions.

Specifying a minimum version is always good practice, but specifying a maximum version should be done very sparingly and only when needed.

My other projects are failing to integrate with dvc because of incompatibilities in the pygtrie and fsspec packages. I'm not sure if there is a good reason why they are pinned to a maximum version. Submitting this PR to loosen them and see if the CI passes.

@Erotemic Erotemic requested a review from a team as a code owner June 28, 2021 16:32
@Erotemic Erotemic requested a review from pared June 28, 2021 16:32
@efiop
Copy link
Member

efiop commented Jun 28, 2021

Hi @Erotemic !

Are you using dvc python API or only CLI?

@Erotemic
Copy link
Contributor Author

Currently only using the CLI, but I may use the python API in the future.

@efiop
Copy link
Member

efiop commented Jun 28, 2021

@Erotemic For CLI it is better to either install dvc using pipx or install something that comes with an isolated env, like our deb/rpm/homebrew/choco/snap/etc packages. For API these conflicts are, of course, relevant, but we are currently actively working on fsspec-related fixes/features/etc, which forces us to always use the latest version, which will be problematic for some environments. Once things are a bit more stable, we'll definitely relax fsspec requirements in dvc itself, but depending on a remote type that you want to use with dvc (e.g. s3), the corresponding driver might still require the latest fsspec version.

Thank you for this PR! 🙏 Let's merge, but fsspec will likely jump to the newer version very soon, so depending on the rest of your env, you might get stuck to this particular dvc version for a little bit. Thanks again for the feedback! 🙂

@efiop efiop enabled auto-merge (squash) June 28, 2021 20:57
@efiop efiop merged commit e72354e into iterative:master Jun 28, 2021
@Erotemic
Copy link
Contributor Author

I'm always happy to upgrade libraries to their latest version. If an upgrade to another package to its latest version breaks my code, I feel that it's my responsibility as a dev to fix that.

For API these conflicts are, of course, relevant, but we are currently actively working on fsspec-related fixes/features/etc, which forces us to always use the latest version, which will be problematic for some environments.

Are changes going to be backwards incompatible? For instance, would you expect a fsspec == 2021.7.1 to break code that was developed with fsspec == 2021.6.1? If not, then using ">=" seems safe. But if changes are expected to be backwards incompatible, then it may be important to add a maximum version.


On a related note, a CI scheme recently tried out, and have been very happy with, on the CI for my projects is creating a "loose" and a "strict" test environment.

In the "loose" environment, everything is kept as-is, all requirements only specify a minimum version for each dependency unless there is pathological incompatibility with the latest version (e.g. I had to pin to torch < 1.9 at one point because of an issue with an external dep). In this "loose" test, the CI will install the latest versions of all deps, so the tests will generally catch if there is something new breaks the code.

In the "strict" environment I replace all the ">=" in my requirements with "==", which pins all requirements to the minimum version. This test ensures that all of my code is will working based on the assumptions I'm used to. More than once I've submitted a patch, and the "strict" version passes, but the "loose" version fails. This has made it really easy to narrow down where the bug was introduced and has saved me a lot of development time.

@efiop
Copy link
Member

efiop commented Jun 28, 2021

I'm always happy to upgrade libraries to their latest version. If an upgrade to another package to its latest version breaks my code, I feel that it's my responsibility as a dev to fix that.

Thank you for doing that, we really appreciate it! 🙏

Are changes going to be backwards incompatible? For instance, would you expect a fsspec == 2021.7.1 to break code that was developed with fsspec == 2021.6.1?

Unfortunately, that is a possibility 🙁 To emphasize that, fsspec switched to date-based versions to avoid giving people an impression of having semantic versions. Closely related projects like s3fs/gcsfs are trying to follow that as well , though some (like adlfs) take the risk and keep a loose requirement for fsspec.

The fsspec requirement put in our setup.py is really only needed because of a recent LocalFileSystem bug, but @skshetry suggested that we could try to keep Local/MemoryFileSystem implementations in-house to avoid this and have an ability to change it quicker during the upcoming localfs optimizations. I am hesitant to duplicate the code, but it might be worth it as a temporary measure.

On a related note, a CI scheme recently tried out, and have been very happy with, on the CI for my projects is creating a "loose" and a "strict" test environment.

In the "loose" environment, everything is kept as-is, all requirements only specify a minimum version for each dependency unless there is pathological incompatibility with the latest version (e.g. I had to pin to torch < 1.9 at one point because of an issue with an external dep). In this "loose" test, the CI will install the latest versions of all deps, so the tests will generally catch if there is something new breaks the code.

In the "strict" environment I replace all the ">=" in my requirements with "==", which pins all requirements to the minimum version. This test ensures that all of my code is will working based on the assumptions I'm used to. More than once I've submitted a patch, and the "strict" version passes, but the "loose" version fails. This has made it really easy to narrow down where the bug was introduced and has saved me a lot of development time.
In the "strict" environment I replace all the ">=" in my requirements with "==", which pins all requirements to the minimum version. This test ensures that all of my code is will working based on the assumptions I'm used to. More than once I've submitted a patch, and the "strict" version passes, but the "loose" version fails. This has made it really easy to narrow down where the bug was introduced and has saved me a lot of development time.

Oh, this is a very nice trick! Thanks for the idea! WDYT, @iterative/dvc ?

Btw, @Erotemic we would love to have a chance to meet you and talk about your experience with dvc (no marketing bs or anything like that, just dvc and getting to know each other a little bit 🙂 ). Please let us know if you would be interested in having a hangouts/zoom call with us some time and I'll shoot you an email to figure out the details 🙂

@Erotemic
Copy link
Contributor Author

Sure, let's work out a time over email

@skshetry
Copy link
Member

Oh, this is a very nice trick! Thanks for the idea! WDYT, @iterative/dvc ?

We have a checkbox for running tests on lower bounds of requirements on #4468 already.

@efiop
Copy link
Member

efiop commented Jun 29, 2021

@Erotemic Great! Sent you an email 😉

@skshetry Ah, right, forgot about that. Thanks! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update pygtrie to 2.4.2?
3 participants