Use poetry for dependency management #111
I've published to PyPI from CloudBuild (https://pypi.org/project/squirrel-core/0.18.4.dev776/). This is the diff of the package metadata against the latest version (mostly dev dependencies removed from the published package): {
- "filename":"squirrel_core-0.18.4.dev25985-py3-none-any.whl",
+ "filename":"squirrel_core-0.18.4.dev776-py3-none-any.whl",
"metadata_version":"2.1",
"name":"squirrel-core",
- "version":"0.18.4.dev25985",
- "summary":"Squirrel is a Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.",
+ "version":"0.18.4.dev776",
+ "summary":"Squirrel is a Python library that enables ML teams to share, load, and transform data in a collaborative, flexible and efficient way.",
+ "home_page":"https://merantix-momentum.com/technology/squirrel/",
"author":"Merantix Momentum",
"license":"Apache 2.0",
"classifiers":[
"Development Status :: 5 - Production/Stable",
"License :: OSI Approved :: Apache Software License",
- "Programming Language :: Python :: 3.8",
+ "License :: Other/Proprietary License",
+ "Programming Language :: Python :: 3",
+ "Programming Language :: Python :: 3.9",
+ "Programming Language :: Python :: 3.10",
+ "Programming Language :: Python :: 3.11",
+ "Programming Language :: Python :: 3.9",
"Typing :: Typed"
],
+ "requires_python":">=3.9,<4.0",
"requires_dist":[
- "fsspec (>=0.8.7)",
- "msgpack",
- "msgpack-numpy",
- "more-itertools",
- "pluggy",
- "random-name",
- "ruamel.yaml",
- "tqdm",
- "numba",
- "numpy",
- "pyjwt (>=2.4.0)",
- "mako (>=1.2.2)",
- "oauthlib (>=3.2.1)",
- "aiohttp (>=3.7.4)",
- "twine ; extra == 'all'",
- "wheel ; extra == 'all'",
- "pytest (>=6.2.1) ; extra == 'all'",
- "pytest-timeout ; extra == 'all'",
- "pytest-cov ; extra == 'all'",
- "pytest-xdist ; extra == 'all'",
- "wandb ; extra == 'all'",
- "mlflow ; extra == 'all'",
- "pre-commit (==2.16.0) ; extra == 'all'",
- "pip-tools (>=6.6.2) ; extra == 'all'",
- "sphinx ; extra == 'all'",
- "sphinx-autoapi ; extra == 'all'",
- "sphinxcontrib-mermaid ; extra == 'all'",
- "sphinx-rtd-theme ; extra == 'all'",
- "gcsfs (>=2021.06.0) ; extra == 'all'",
- "adlfs (<2021.10) ; extra == 'all'",
- "s3fs ; extra == 'all'",
- "zarr (==2.10.3) ; extra == 'all'",
- "pyarrow ; extra == 'all'",
- "dask[dataframe,distributed] ; extra == 'all'",
- "torch (>=1.13.1) ; extra == 'all'",
- "odfpy ; extra == 'all'",
- "openpyxl ; extra == 'all'",
- "pyxlsb ; extra == 'all'",
- "xlrd ; extra == 'all'",
- "adlfs (<2021.10) ; extra == 'azure'",
- "dask[dataframe,distributed] ; extra == 'dask'",
- "twine ; extra == 'dev'",
- "wheel ; extra == 'dev'",
- "pytest (>=6.2.1) ; extra == 'dev'",
- "pytest-timeout ; extra == 'dev'",
- "pytest-cov ; extra == 'dev'",
- "pytest-xdist ; extra == 'dev'",
- "wandb ; extra == 'dev'",
- "mlflow ; extra == 'dev'",
- "pre-commit (==2.16.0) ; extra == 'dev'",
- "pip-tools (>=6.6.2) ; extra == 'dev'",
- "sphinx ; extra == 'dev'",
- "sphinx-autoapi ; extra == 'dev'",
- "sphinxcontrib-mermaid ; extra == 'dev'",
- "sphinx-rtd-theme ; extra == 'dev'",
- "odfpy ; extra == 'excel'",
- "openpyxl ; extra == 'excel'",
- "pyxlsb ; extra == 'excel'",
- "xlrd ; extra == 'excel'",
- "pyarrow ; extra == 'feather'",
- "gcsfs (>=2021.06.0) ; extra == 'gcp'",
- "pyarrow ; extra == 'parquet'",
- "s3fs ; extra == 's3'",
- "torch (>=1.13.1) ; extra == 'torch'",
- "zarr (==2.10.3) ; extra == 'zarr'"
+ "adlfs (<2021.10) ; extra == \"azure\" or extra == \"all\"",
+ "aiohttp (>=3.7.4,<4.0.0)",
+ "dask[dataframe,distributed] (>=2021.7.0) ; extra == \"dask\" or extra == \"all\"",
+ "fsspec (>=2021.7.0)",
+ "gcsfs (>=2021.7.0) ; extra == \"gcp\" or extra == \"all\"",
+ "mako (>=1.2.2,<2.0.0)",
+ "more-itertools (>=9.0.0,<10.0.0)",
+ "msgpack (>=1.0.4,<2.0.0)",
+ "msgpack-numpy (>=0.4.8,<0.5.0)",
+ "numba (>=0.56.4,<0.57.0)",
+ "numpy (>=1.23.5,<2.0.0)",
+ "oauthlib (>=3.2.1,<4.0.0)",
+ "odfpy (>=1.4.1,<2.0.0) ; extra == \"excel\" or extra == \"all\"",
+ "openpyxl (>=3.1.1,<4.0.0) ; extra == \"excel\" or extra == \"all\"",
+ "pluggy (>=1.0.0,<2.0.0)",
+ "pyarrow (>=10.0.1,<11.0.0) ; extra == \"feather\" or extra == \"parquet\" or extra == \"all\"",
+ "pyjwt (>=2.4.0,<3.0.0)",
+ "pyxlsb (>=1.0.10,<2.0.0) ; extra == \"excel\" or extra == \"all\"",
+ "random-name (>=0.1.1,<0.2.0)",
+ "ruamel-yaml (>=0.17.21,<0.18.0)",
+ "s3fs (>=2021.7.0) ; extra == \"s3\" or extra == \"all\"",
+ "torch (>=1.13.1,<2.0.0) ; extra == \"torch\" or extra == \"all\"",
+ "tqdm (>=4.64.1,<5.0.0)",
+ "xlrd (>=2.0.1,<3.0.0) ; extra == \"excel\" or extra == \"all\"",
+ "zarr (>=2.10.3,<3.0.0) ; extra == \"zarr\" or extra == \"all\""
+ ],
+ "project_urls":[
+ "Documentation, https://squirrel-core.readthedocs.io/en/latest/",
+ "Repository, https://github.com/merantix-momentum/squirrel-core"
],
"provides_extras":[
"all",
"azure",
"dask",
- "dev",
"excel",
"feather",
"gcp",
"zarr"
],
"description_content_type":"text/markdown",
- "description":"<div align=\"center\">\n\n# <img src=\"https://raw.githubusercontent.com/merantix-momentum/squirrel-core/main/docs/_static/logo.png\" width=\"150px\"> Squirrel Core\n\n**Share, load, and transform data in a collaborative, flexible, and efficient way**\n\n[![Python](https://img.shields.io/pypi/pyversions/squirrel-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-core)\n[![PyPI](https://img.shields.io/pypi/v/squirrel-core?label=pypi%20package)](https://pypi.org/project/squirrel-core/)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-core)](https://anaconda.org/conda-forge/squirrel-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-core/badge/?version=latest)](https://squirrel-core.readthedocs.io/en/latest)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/458099869.svg)](https://zenodo.org/badge/latestdoi/458099869)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n\n## What is Squirrel?\n\nSquirrel is a Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.\n\n1. **SPEED:** Avoid data stall, i.e. the expensive GPU will not be idle while waiting for the data.\n\n2. **COSTS:** First, avoid GPU stalling, and second allow to shard & cluster your data and store & load it in bundles, decreasing the cost for your data bucket cloud storage.\n\n3. 
**FLEXIBILITY:** Work with a flexible standard data scheme which is adaptable to any setting, including multimodal data.\n\n4. **COLLABORATION:** Make it easier to share data & code between teams and projects in a self-service model.\n\nStream data from anywhere to your machine learning model as easy as:\n```python\nit = (\n Catalog.from_plugins()[\"imagenet\"]\n .get_driver()\n .get_iter(\"train\")\n .map(lambda r: (augment(r[\"image\"]), r[\"label\"]))\n .batched(100)\n)\n```\n\nCheck out our full [getting started](https://github.com/merantix-momentum/squirrel-datasets-core/blob/main/examples/01.Getting_Started.ipynb) tutorial notebook. If you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw).\n\n## Installation\nYou can install `squirrel-core` by\n```shell\npip install squirrel-core\n```\n\nTo install all features and functionalities:\n\n```shell\npip install \"squirrel-core[all]\"\n```\n\nOr select the dependencies you need:\n\n```shell\npip install \"squirrel-core[gcs,torch]\"\n```\n\nPlease refer to the [installation](https://squirrel-core.readthedocs.io/en/latest/getting_started/installation.html) \nsection of the documentation for a complete list of supported dependencies.\n\n## Documentation\n\nRead our documentation at [ReadTheDocs](https://squirrel-core.readthedocs.io/en/latest)\n\n## Squirrel Datasets\n\n[Squirrel-datasets-core](https://github.com/merantix-momentum/squirrel-datasets-core) is an accompanying Python package that does three things.\n1. It extends the Squirrel platform for data transform, access, and discovery through custom drivers for public datasets. \n2. 
It also allows you to tap into the vast amounts of open-source datasets from [Huggingface](https://huggingface.co/), [Activeloop Hub](https://www.activeloop.ai/) and [Torchvision](https://pytorch.org/vision/stable/datasets.html), and you\\'ll get all of Squirrel\\'s functionality on top!\n3. It provides open-source and community-contributed [tutorials and example notebooks](https://github.com/merantix-momentum/squirrel-datasets-core/tree/main/examples) for using Squirrel.\n\n## Contributing\nSquirrel is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-core.readthedocs.io/en/latest/developer/contribute.html) to learn how to get involved.\n\n## The Humans Behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel in your research, please cite it using:\n```bibtex\n@article{2022squirrelcore,\n title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n author={Squirrel Developer Team},\n journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n doi={10.5281/zenodo.6418280},\n year={2022}\n}\n```\n"
+ "description":"<div align=\"center\">\n\n# <img src=\"https://raw.githubusercontent.com/merantix-momentum/squirrel-core/main/docs/_static/logo.png\" width=\"150px\"> Squirrel Core\n\n**Share, load, and transform data in a collaborative, flexible, and efficient way**\n\n[![Python](https://img.shields.io/pypi/pyversions/squirrel-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-core)\n[![PyPI](https://img.shields.io/pypi/v/squirrel-core?label=pypi%20package)](https://pypi.org/project/squirrel-core/)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-core)](https://anaconda.org/conda-forge/squirrel-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-core/badge/?version=latest)](https://squirrel-core.readthedocs.io/en/latest)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/458099869.svg)](https://zenodo.org/badge/latestdoi/458099869)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n\n## What is Squirrel?\n\nSquirrel is a Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.\n\n1. **SPEED:** Avoid data stall, i.e. the expensive GPU will not be idle while waiting for the data.\n\n2. **COSTS:** First, avoid GPU stalling, and second allow to shard & cluster your data and store & load it in bundles, decreasing the cost for your data bucket cloud storage.\n\n3. 
**FLEXIBILITY:** Work with a flexible standard data scheme which is adaptable to any setting, including multimodal data.\n\n4. **COLLABORATION:** Make it easier to share data & code between teams and projects in a self-service model.\n\nStream data from anywhere to your machine learning model as easy as:\n```python\nit = (\n Catalog.from_plugins()[\"imagenet\"]\n .get_driver()\n .get_iter(\"train\")\n .map(lambda r: (augment(r[\"image\"]), r[\"label\"]))\n .batched(100)\n)\n```\n\nCheck out our full [getting started](https://github.com/merantix-momentum/squirrel-datasets-core/blob/main/examples/01.Getting_Started.ipynb) tutorial notebook. If you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw).\n\n## Installation\nYou can install `squirrel-core` by\n```shell\npip install squirrel-core\n```\n\nTo install all features and functionalities:\n\n```shell\npip install \"squirrel-core[all]\"\n```\n\nOr select the dependencies you need:\n\n```shell\npip install \"squirrel-core[gcs,torch]\"\n```\n\nPlease refer to the [installation](https://squirrel-core.readthedocs.io/en/latest/getting_started/installation.html) \nsection of the documentation for a complete list of supported dependencies.\n\n## Documentation\n\nRead our documentation at [ReadTheDocs](https://squirrel-core.readthedocs.io/en/latest)\n\n## Squirrel Datasets\n\n[Squirrel-datasets-core](https://github.com/merantix-momentum/squirrel-datasets-core) is an accompanying Python package that does three things.\n1. It extends the Squirrel platform for data transform, access, and discovery through custom drivers for public datasets. \n2. 
It also allows you to tap into the vast amounts of open-source datasets from [Huggingface](https://huggingface.co/), [Activeloop Hub](https://www.activeloop.ai/) and [Torchvision](https://pytorch.org/vision/stable/datasets.html), and you\\'ll get all of Squirrel\\'s functionality on top!\n3. It provides open-source and community-contributed [tutorials and example notebooks](https://github.com/merantix-momentum/squirrel-datasets-core/tree/main/examples) for using Squirrel.\n\n## Contributing\nSquirrel is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-core.readthedocs.io/en/latest/developer/contribute.html) to learn how to get involved.\n\n## The Humans Behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel in your research, please cite it using:\n```bibtex\n@article{2022squirrelcore,\n title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n author={Squirrel Developer Team},\n journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n doi={10.5281/zenodo.6418280},\n year={2022}\n}\n```\n\n"
}
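To check that the extras still resolve to the same packages after the switch, the old and new `requires_dist` entries can be grouped by extra and compared. This is only a rough sketch: `group_by_extra` is a hypothetical helper, the regular expressions are a simplification of PEP 508 marker parsing, and the sample entries are copied from the diff above. For anything beyond a quick check, a real requirement parser such as the `packaging` library would be the safer choice.

```python
import re
from collections import defaultdict

# Simplified patterns (assumption: good enough for the entries in this diff,
# not a full PEP 508 parser).
_EXTRA = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")
_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def group_by_extra(requires_dist):
    """Map each extra to the sorted distribution names it pulls in.

    Core dependencies (no extra marker) are stored under the key None.
    """
    groups = defaultdict(set)
    for req in requires_dist:
        name = _NAME.match(req.strip()).group(0)
        _, _, marker = req.partition(";")
        extras = _EXTRA.findall(marker) or [None]
        for extra in extras:
            groups[extra].add(name)
    return {extra: sorted(names) for extra, names in groups.items()}


# Old style: one entry per (package, extra) pair.
old = [
    "aiohttp (>=3.7.4)",
    "adlfs (<2021.10) ; extra == 'all'",
    "adlfs (<2021.10) ; extra == 'azure'",
    "dask[dataframe,distributed] ; extra == 'all'",
    "dask[dataframe,distributed] ; extra == 'dask'",
]

# New style: one entry per package, extras joined with "or".
new = [
    "aiohttp (>=3.7.4,<4.0.0)",
    'adlfs (<2021.10) ; extra == "azure" or extra == "all"',
    'dask[dataframe,distributed] (>=2021.7.0) ; extra == "dask" or extra == "all"',
]

# Both encodings select the same packages per extra.
assert group_by_extra(old) == group_by_extra(new)
```

The grouping ignores version bounds on purpose, so it only confirms which packages each extra selects, not that the tightened ranges are compatible.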
This reverts commit a7fdb62.
LGTM 👍
I tested this PR in conda and it worked. Steps:
    conda env create -f .../sandbox.yaml
    conda activate sandbox
    poetry install --all-extras
My sandbox.yaml:
    name: sandbox
    dependencies:
      - python=3.9
      - anaconda
      - pip
      - pip:
        - keyrings.google-artifactregistry-auth==1.1.1
        - poetry
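After `poetry install --all-extras`, a quick way to confirm that the optional dependencies actually landed in the environment is to look them up on the import path. The sketch below is an assumption-laden helper, not part of this PR: `missing_modules` is a hypothetical name, and the module list is my guess at the top-level import names for the extras listed in the metadata diff.

```python
from importlib.util import find_spec

# Assumption: top-level module names of the optional dependencies
# (azure, dask, gcp, feather/parquet, s3, torch, zarr extras).
OPTIONAL_MODULES = ["adlfs", "dask", "gcsfs", "pyarrow", "s3fs", "torch", "zarr"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run inside the activated `sandbox` environment; an empty result only shows that the modules are discoverable, not that their versions match the lock file.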
Description
This PR introduces poetry for dependency and venv management.
Because there were some version conflicts, we also made the following changes:
To build the Python package, we switch from setuptools to poetry-core. We still build the package through our setup.py wrapper to inject our custom versioning logic.
This PR contains minimal changes to stay compatible with the current Docker and CloudBuild workflows.
[...] (.venv/) and the dependencies (pyproject.toml, poetry.lock)
[...] requirements.txt from the lock file in a pre-commit hook. No poetry is required from this moment.
[...] python -m build (uses poetry-core)
As a next step, we can look into using poetry also for publishing the package to PyPI. This would require setting up poetry in Docker.