Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for filesystem discovery via entry points #515

Merged
merged 3 commits into from
Jan 15, 2021

Conversation

bstrdsmkr
Copy link

Fixes #206

Uses importlib.metadata when available in stdlib (or falls back to the backport otherwise) to scan for installed packages with the fsspec.specs entry point defined and add all such entry points to known_implementations upon import.

Example

setup.py

from setuptools import setup

setup(
    name="myfs",
    version='1.0.0',
    packages=['myfs'],
    entry_points=[
        'fsspec.specs': [
            'myfs=myfs:MyFs',
        ]
    ],
)

myfs/init.py

from fsspec import AbstractFileSystem

class MyFs(AbstractFileSystem):
    pass

Future Work:

  • Maybe break existing implementations out into their own packages
  • Documentation entry

@martindurant
Copy link
Member

Thanks for having a go at this - I was meaning to eventually.

I hope to see some comments from others on how this looks before going through it myself.

@martindurant
Copy link
Member

Do you think that the implementations within this repo ought to be expressed as entry points too, or is that redundant, since we know that they will be there?

We could to warn other repos, upon merger, that they should move to declaring entrypoints - or allow the existing dict registry to grandfather those that are already there.

@bstrdsmkr
Copy link
Author

No worries, I have a highly site specific implementation I'm working on and having this support would save me some effort so I wanted to give back where I could :)

I think the main benefit to moving the "internal" implementations to entry points would be in providing examples for others wanting to contribute implementations. Would that goal be better served through extracting a few of them totally into external packages? I feel like taking say gdrive and its dependencies into its own package would both meet that goal and clear up some dependency issues like you mentioned here: #366 (comment)

@martindurant
Copy link
Member

Perhaps it'd be nice to extract the ones that don't correctly test well (or at all) on CI and drag down the coverage :)

try:
from importlib.metadata import entry_points
except ImportError: # python < 3.8
from importlib_metadata import entry_points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, we should cope with this not being present, since it is not an absolutely required dependency - which I agree with. This, if py<3.8, and importlib_metadata is not present, then setting of the registry via entrypoints should be skipped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add test to the error you get if a backend is not found and importlib_metadata is not present, to say that installing it may expose some implementations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registry via entrypoints can still be done without this backport package in 3.6 and 3.7 but I felt since this was the method going forward that it would be easier to remove this shim once 3.6 and 3.7 are no longer supported than to implement the "old" entrypoint method, then update to this method later.

How do you feel about just raising a warning if the second import attempt fails? That way it's still possible to operate without it, but is a more conscious choice

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not clutter the output with warnings, but silently ignoring is OK, I think, since new backends are surely being developed for py>=3.8. If people raise issues around backends failing to register, we first point to this discussion, then maybe change the behaviour if needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can do this, if you don't have time)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it might be better if you do... I was toying with using the old setup tools iter_entrypoints() method as a final fallback but I can seem to form it in a way that's readable. I can always open another PR if that comes together

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use entrypoints for known implementations
2 participants