Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add windows machine for CI #1112

Merged
merged 73 commits into from
Jan 19, 2023
Merged
Show file tree
Hide file tree
Changes from 72 commits
Commits
Show all changes
73 commits
Select commit Hold shift + click to select a range
0543f0f
Add windows machine for CI
Wauplin Oct 13, 2022
cdd6a30
add simple windows test
Wauplin Oct 13, 2022
d9ad62a
run only windows tests in CI
Wauplin Oct 13, 2022
727ead0
remove apt get in windows ci
Wauplin Oct 13, 2022
5606ff0
disable symlinks on windows machine in CI
Wauplin Oct 13, 2022
d244536
env variale
Wauplin Oct 13, 2022
bfbdf2d
fix disable_symlinks_on_windows_ci fixture
Wauplin Oct 13, 2022
8cac046
fix fixture ?
Wauplin Oct 13, 2022
280dbc0
try to fix permission issues on windows ci
Wauplin Oct 14, 2022
95addd7
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Oct 14, 2022
4d4cf6c
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Dec 20, 2022
e72d553
refacto Offline repository tests
Wauplin Dec 21, 2022
7bbed11
Update use_tmp_repo behavior
Wauplin Dec 21, 2022
836e14f
tmp work
Wauplin Dec 21, 2022
66d908d
split between shared vs unique repo tests
Wauplin Dec 22, 2022
3c13351
working repository tests
Wauplin Dec 22, 2022
51d5827
use credential for repo tests
Wauplin Dec 22, 2022
70e1f33
Merge branch 'main' into refacto-offline-repository-tests
Wauplin Dec 22, 2022
3b8358b
refacto dataset repository tests
Wauplin Dec 22, 2022
447e347
run repo test concurrently
Wauplin Dec 22, 2022
6a3d4ca
fix github workflow
Wauplin Dec 22, 2022
2c9b7a4
fix ci
Wauplin Dec 22, 2022
c746d0c
fix ci
Wauplin Dec 22, 2022
676f45c
echo ci
Wauplin Dec 22, 2022
ab9e05a
fix ci?
Wauplin Dec 22, 2022
e1df826
finally fix ci?g
Wauplin Dec 22, 2022
57ea6e0
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Dec 22, 2022
374aa18
Merge branch 'refacto-offline-repository-tests' into 1074-run-ci-on-w…
Wauplin Dec 22, 2022
ca38d78
make test_correct_helper more robust
Wauplin Dec 22, 2022
fe32b10
delete test_correct_helper
Wauplin Dec 22, 2022
2c6c5f2
fix commit context manager
Wauplin Dec 22, 2022
8b8961d
robust tmp dir deletiong
Wauplin Dec 22, 2022
90dee33
Implement more robust TemporaryDirectory
Wauplin Dec 22, 2022
03a47e0
adapt existing codebase
Wauplin Dec 22, 2022
043c46e
cleanup at last
Wauplin Dec 22, 2022
15f8be3
Merge branch '1278-more-robust-temporary-directory' into 1074-run-ci-…
Wauplin Dec 22, 2022
34970c0
code quality
Wauplin Dec 22, 2022
8a32735
Merge branch '1278-more-robust-temporary-directory' into 1074-run-ci-…
Wauplin Dec 22, 2022
8fbb352
Rename to softTemporaryDirectory
Wauplin Dec 23, 2022
990ec80
Merge branch '1278-more-robust-temporary-directory' into 1074-run-ci-…
Wauplin Dec 23, 2022
6f806e6
Skip test on windows + force str path in Repository
Wauplin Dec 23, 2022
f0fce96
typing
Wauplin Dec 23, 2022
faadefe
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Dec 23, 2022
6fd84c4
skip symlink test on windows
Wauplin Dec 23, 2022
efff0a3
always disable symlink on windows during tests
Wauplin Dec 23, 2022
578b496
fix pytest conf
Wauplin Dec 23, 2022
a9a2930
xfail_on_windows decorator
Wauplin Dec 23, 2022
380520b
cleaner xfail_on_windows
Wauplin Dec 23, 2022
a449b87
deactivate more tests
Wauplin Dec 23, 2022
5ee4126
another xfail
Wauplin Dec 23, 2022
6280b05
skip umask test
Wauplin Dec 23, 2022
38b6985
add pending questions
Wauplin Dec 23, 2022
7c055e7
other xfail
Wauplin Dec 23, 2022
6a8f200
longer but more robust test_backoff_sleep_time test
Wauplin Dec 23, 2022
3e0bb56
another xfail
Wauplin Dec 23, 2022
b645430
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Jan 12, 2023
744d837
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Jan 17, 2023
aaaafec
Make windows-specific version of test_scan_cache_on_valid_cache
Wauplin Jan 17, 2023
8a98205
fix file size
Wauplin Jan 17, 2023
0d2b82f
2 cache tests should fail
Wauplin Jan 17, 2023
7100f65
resolve path in test missing cache
Wauplin Jan 17, 2023
3140ae0
test_scan_cache_last_modified_and_last_accessed should fail on Windows
Wauplin Jan 17, 2023
65885ee
string instead of path in test_init_from_existing_local_clone
Wauplin Jan 17, 2023
3ad5ce9
fix \r\n issue in modelcards on windows
Wauplin Jan 17, 2023
ba03a44
preserve newlines when saving Repocard
Wauplin Jan 17, 2023
558ebbf
fix linebreak issue in test_updating_text_updates_content
Wauplin Jan 17, 2023
9f99606
add repocard test to ensure linebreaks are kept
Wauplin Jan 17, 2023
7665b8b
Merge brancth 'main' into 1074-run-ci-on-windows
Wauplin Jan 17, 2023
722c50d
fix for 3.7
Wauplin Jan 17, 2023
fa85197
fix flag when saving
Wauplin Jan 17, 2023
06592ca
Merge branch 'main' into 1074-run-ci-on-windows
Wauplin Jan 18, 2023
e342c79
document known issues on windows
Wauplin Jan 18, 2023
309b559
requested change
Wauplin Jan 19, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 39 additions & 1 deletion .github/workflows/python-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ env:
HUGGINGFACE_PRODUCTION_USER_TOKEN: ${{ secrets.HUGGINGFACE_PRODUCTION_USER_TOKEN }}

jobs:
build:
build-ubuntu:
runs-on: ubuntu-latest
strategy:
fail-fast: false
Expand Down Expand Up @@ -123,3 +123,41 @@ jobs:
with:
files: ./coverage.xml
verbose: true

build-windows:
# (almost) Duplicate config compared to `build-ubuntu` but running on Windows.
# Please make sure to keep it updated as well.
# Lighter version of the tests with only 1 test suite running on Python3.7.
runs-on: windows-latest
env:
DISABLE_SYMLINKS_IN_WINDOWS_TESTS: 1
strategy:
fail-fast: false
matrix:
python-version: ["3.7"]

steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}

# Install dependencies
- name: Configure and install dependencies
run: |
pip install --upgrade pip
pip install .[testing]
# sudo apt install -y libsndfile1-dev

# Run tests
- name: Run tests
working-directory: ./src # For code coverage to work
run: python -m pytest --cov=./huggingface_hub --cov-report=xml:../coverage.xml ../tests

# Upload code coverage
- name: Upload coverage reports to Codecov with GitHub Action
uses: codecov/codecov-action@v3
with:
files: ./coverage.xml
verbose: true
2 changes: 1 addition & 1 deletion codecov.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ comment:
# https://docs.codecov.com/docs/pull-request-comments#requiring-changes
require_changes: true
# https://docs.codecov.com/docs/pull-request-comments#after_n_builds
after_n_builds: 11
after_n_builds: 12

github_checks:
annotations: false
18 changes: 18 additions & 0 deletions docs/source/installation.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,24 @@ Tags: ['pytorch', 'tf', 'jax', 'tflite', 'rust', 'safetensors', 'gpt2', 'text-ge
Task: text-generation
```

## Windows limitations

With our goal of democratizing good ML everywhere, we built `huggingface_hub` to be a
cross-platform library and in particular to work correctly on both Unix-based and Windows
systems. However, there are a few cases where `huggingface_hub` has some limitations when
run on Windows. Here is an exhaustive list of known issues. Please let us know if you
encounter any undocumented problem by opening [an issue on Github](https://github.com/huggingface/huggingface_hub/issues/new/choose).

- `huggingface_hub`'s cache system relies on symlinks to efficiently cache files downloaded
from the Hub. On Windows, you must activate developer mode or run your script as admin to
enable symlinks. Please read [this page](./how-to-cache#limitations) for more details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention here that if this isn't activated, it will still work, just in a non-optimized manner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea !

- Filepaths on the Hub can have special characters (e.g. `"path/to?/my/file"`). Windows is
more restrictive on [special characters](https://learn.microsoft.com/en-us/windows/win32/intl/character-sets-used-in-file-names)
which makes it impossible to download those files on Windows. Hopefully this is a rare case.
Please reach out to the repo owner if you think this is a mistake or to us to figure out
a solution.


## Next steps

Once `huggingface_hub` is properly installed on your machine, you might want
Expand Down
3 changes: 2 additions & 1 deletion setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -60,4 +60,5 @@ max-line-length = 88
# --durations=0 -> print execution time of each test
addopts = -Werror::FutureWarning --log-cli-level=INFO -sv --durations=0
env =
HUGGINGFACE_CO_STAGING=1
HUGGINGFACE_CO_STAGING=1
DISABLE_SYMLINKS_IN_WINDOWS_TESTS=1
13 changes: 8 additions & 5 deletions src/huggingface_hub/repocard.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,8 @@
)

# exact same regex as in the Hub server. Please keep in sync.
REGEX_YAML_BLOCK = re.compile(r"^(\s*---[\n\r]+)([\S\s]*?)([\n\r]+---([\n\r]|$))")
# See https://github.com/huggingface/moon-landing/blob/main/server/lib/ViewMarkdown.ts#L18
REGEX_YAML_BLOCK = re.compile(r"^(\s*---[\r\n]+)([\S\s]*?)([\r\n]+---(\r\n|\n|$))")

logger = get_logger(__name__)

Expand Down Expand Up @@ -127,7 +128,9 @@ def save(self, filepath: Union[Path, str]):
"""
filepath = Path(filepath)
filepath.parent.mkdir(parents=True, exist_ok=True)
filepath.write_text(str(self), encoding="utf-8")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, this didn't work as expected on Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was "working" but encoding a newline was different on Windows ("\n\r") than Linux/Macos ("\n"). That's what the newline="" is for otherwise you get a big diff if you edit a README file from different platforms. And pathlib.Path.write_text doesn't have this parameter in Python 3.7 so I had to use the standard open(...).

(And yes, I discovered/learnt about all this while working on this PR :D)

# Preserve newlines as in the existing file.
with open(filepath, mode="w", newline="", encoding="utf-8") as f:
f.write(str(self))

@classmethod
def load(
Expand Down Expand Up @@ -516,11 +519,11 @@ def metadata_save(local_path: Union[str, Path], data: Dict) -> None:
# try to detect existing newline character
if os.path.exists(local_path):
with open(local_path, "r", newline="") as readme:
if type(readme.newlines) is tuple:
content = readme.read()
if isinstance(readme.newlines, tuple):
line_break = readme.newlines[0]
if type(readme.newlines) is str:
elif isinstance(readme.newlines, str):
line_break = readme.newlines
content = readme.read()

# creates a new file if it not
with open(local_path, "w", newline="") as readme:
Expand Down
7 changes: 4 additions & 3 deletions src/huggingface_hub/repository.py
Original file line number Diff line number Diff line change
Expand Up @@ -438,7 +438,7 @@ class Repository:
@validate_hf_hub_args
def __init__(
self,
local_dir: str,
local_dir: Union[str, Path],
clone_from: Optional[str] = None,
repo_type: Optional[str] = None,
token: Union[bool, str] = True,
Expand All @@ -461,7 +461,7 @@ def __init__(
explicitly specified.

Args:
local_dir (`str`):
local_dir (`str` or `Path`):
path (e.g. `'my_trained_model/'`) to the local directory, where
the `Repository` will be initialized.
clone_from (`str`, *optional*):
Expand Down Expand Up @@ -492,7 +492,8 @@ def __init__(
Instance of HfApi to use when calling the HF Hub API. A new
instance will be created if this is left to `None`.
"""

if isinstance(local_dir, Path):
local_dir = str(local_dir)
os.makedirs(local_dir, exist_ok=True)
self.local_dir = os.path.join(os.getcwd(), local_dir)
self._repo_type = repo_type
Expand Down
25 changes: 24 additions & 1 deletion tests/conftest.py
Original file line number Diff line number Diff line change
@@ -1,14 +1,17 @@
import os
import shutil
from pathlib import Path
from typing import Generator

import pytest

import huggingface_hub
from _pytest.fixtures import SubRequest
from huggingface_hub import HfApi, HfFolder
from huggingface_hub.utils import SoftTemporaryDirectory

from .testing_constants import ENDPOINT_PRODUCTION, PRODUCTION_TOKEN
from .testing_utils import repo_name
from .testing_utils import repo_name, set_write_permission_and_retry


@pytest.fixture
Expand All @@ -28,6 +31,9 @@ def test_cache_dir(self) -> None:
with SoftTemporaryDirectory() as cache_dir:
request.cls.cache_dir = Path(cache_dir).resolve()
yield
# TemporaryDirectory is not super robust on Windows when a git repository is
# cloned in it. See https://www.scivision.dev/python-tempfile-permission-error-windows/.
shutil.rmtree(cache_dir, onerror=set_write_permission_and_retry)


@pytest.fixture(autouse=True, scope="session")
Expand All @@ -47,6 +53,23 @@ def clean_hf_folder_token_for_tests() -> Generator:
HfFolder().save_token(token)


@pytest.fixture(autouse=True)
def disable_symlinks_on_windows_ci(monkeypatch: pytest.MonkeyPatch) -> None:
class FakeSymlinkDict(dict):
def __contains__(self, __o: object) -> bool:
return True # consider any `cache_dir` to be already checked

def __getitem__(self, __key: str) -> bool:
return False # symlinks are never supported

if os.name == "nt" and os.environ.get("DISABLE_SYMLINKS_IN_WINDOWS_TESTS"):
monkeypatch.setattr(
huggingface_hub.file_download,
"_are_symlinks_supported_in_dir",
FakeSymlinkDict(),
)


@pytest.fixture
def fx_production_space(request: SubRequest) -> Generator[None, None, None]:
"""Add a `repo_id` attribute referencing a Space repo on the production Hub.
Expand Down
12 changes: 11 additions & 1 deletion tests/test_cache_layout.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,12 @@
from huggingface_hub.utils._errors import EntryNotFoundError

from .testing_constants import ENDPOINT_STAGING, TOKEN, USER
from .testing_utils import expect_deprecation, repo_name, with_production_testing
from .testing_utils import (
expect_deprecation,
repo_name,
with_production_testing,
xfail_on_windows,
)


logger = logging.get_logger(__name__)
Expand All @@ -24,6 +29,7 @@ def get_file_contents(path):

@with_production_testing
class CacheFileLayoutHfHubDownload(unittest.TestCase):
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
def test_file_downloaded_in_cache(self):
for revision, expected_reference in (
(None, "main"),
Expand Down Expand Up @@ -136,6 +142,7 @@ def test_file_download_happens_once(self):

self.assertEqual(creation_time_0, creation_time_1)

@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
def test_file_download_happens_once_intra_revision(self):
# Tests that a file is only downloaded once if it's not updated, even across different revisions.

Expand All @@ -152,6 +159,7 @@ def test_file_download_happens_once_intra_revision(self):

self.assertEqual(creation_time_0, creation_time_1)

@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
def test_multiple_refs_for_same_file(self):
with SoftTemporaryDirectory() as cache:
hf_hub_download(MODEL_IDENTIFIER, "file_0.txt", cache_dir=cache)
Expand Down Expand Up @@ -192,6 +200,7 @@ def test_multiple_refs_for_same_file(self):

@with_production_testing
class CacheFileLayoutSnapshotDownload(unittest.TestCase):
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
def test_file_downloaded_in_cache(self):
with SoftTemporaryDirectory() as cache:
snapshot_download(MODEL_IDENTIFIER, cache_dir=cache)
Expand Down Expand Up @@ -230,6 +239,7 @@ def test_file_downloaded_in_cache(self):

self.assertTrue(all([os.path.isfile(l) for l in resolved_snapshot_links]))

@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
def test_file_downloaded_in_cache_several_revisions(self):
with SoftTemporaryDirectory() as cache:
snapshot_download(MODEL_IDENTIFIER, cache_dir=cache, revision="file-3")
Expand Down
5 changes: 2 additions & 3 deletions tests/test_fastai_integration.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
import os
import shutil
from unittest import TestCase, skip

from huggingface_hub import HfApi
Expand All @@ -15,7 +14,7 @@
)

from .testing_constants import ENDPOINT_STAGING, TOKEN, USER
from .testing_utils import expect_deprecation, repo_name, set_write_permission_and_retry
from .testing_utils import expect_deprecation, repo_name, rmtree_with_retry


WORKING_REPO_SUBDIR = f"fixtures/working_repo_{__name__.split('.')[-1]}"
Expand Down Expand Up @@ -76,7 +75,7 @@ def setUpClass(cls):

def tearDown(self) -> None:
try:
shutil.rmtree(WORKING_REPO_DIR, onerror=set_write_permission_and_retry)
rmtree_with_retry(WORKING_REPO_DIR)
except FileNotFoundError:
pass

Expand Down
5 changes: 5 additions & 0 deletions tests/test_file_download.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@
OfflineSimulationMode,
offline,
with_production_testing,
xfail_on_windows,
)


Expand Down Expand Up @@ -243,6 +244,7 @@ def test_dataset_lfs_object(self):
(url, '"95aa6a52d5d6a735563366753ca50492a658031da74f301ac5238b03966972c9"'),
)

@xfail_on_windows(reason="umask is UNIX-specific")
def test_hf_hub_download_custom_cache_permission(self):
"""Checks `hf_hub_download` respect the cache dir permission.

Expand Down Expand Up @@ -473,13 +475,15 @@ def test_hf_hub_url_on_awful_subfolder_and_filename(self):
self.expected_url,
)

@xfail_on_windows(reason="Windows paths cannot contain a '?'.")
def test_hf_hub_download_on_awful_filepath(self):
local_path = hf_hub_download(
self.repo_id, self.filepath, cache_dir=self.cache_dir
)
# Local path is not url-encoded
self.assertTrue(local_path.endswith(self.filepath))

@xfail_on_windows(reason="Windows paths cannot contain a '?'.")
def test_hf_hub_download_on_awful_subfolder_and_filename(self):
local_path = hf_hub_download(
self.repo_id,
Expand All @@ -492,6 +496,7 @@ def test_hf_hub_download_on_awful_subfolder_and_filename(self):


class CreateSymlinkTest(unittest.TestCase):
@unittest.skipIf(os.name == "nt", "No symlinks on Windows")
@patch("huggingface_hub.file_download.are_symlinks_supported")
def test_create_relative_symlink_concurrent_access(
self, mock_are_symlinks_supported: Mock
Expand Down
8 changes: 3 additions & 5 deletions tests/test_hf_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@
repo_name,
require_git_lfs,
retry_endpoint,
set_write_permission_and_retry,
rmtree_with_retry,
use_tmp_repo,
with_production_testing,
)
Expand Down Expand Up @@ -366,9 +366,7 @@ def setUp(self) -> None:
with open(self.nested_tmp_file, "wb+") as f:
f.truncate(1024 * 1024)

self.addCleanup(
lambda: shutil.rmtree(self.tmp_dir, onerror=set_write_permission_and_retry)
)
self.addCleanup(rmtree_with_retry, self.tmp_dir)

@retry_endpoint
def test_upload_file_validation(self):
Expand Down Expand Up @@ -2041,7 +2039,7 @@ def setUpClass(cls):
def setUp(self):
self.REPO_NAME_LARGE_FILE = large_file_repo_name()
if os.path.exists(WORKING_REPO_DIR):
shutil.rmtree(WORKING_REPO_DIR, onerror=set_write_permission_and_retry)
rmtree_with_retry(WORKING_REPO_DIR)
logger.info(
f"Does {WORKING_REPO_DIR} exist: {os.path.exists(WORKING_REPO_DIR)}"
)
Expand Down
5 changes: 2 additions & 3 deletions tests/test_hubmixin.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
import json
import os
import shutil
import unittest
from unittest.mock import Mock

Expand All @@ -9,7 +8,7 @@
from huggingface_hub.utils import is_torch_available, logging

from .testing_constants import ENDPOINT_STAGING, TOKEN, USER
from .testing_utils import expect_deprecation, repo_name, set_write_permission_and_retry
from .testing_utils import expect_deprecation, repo_name, rmtree_with_retry


logger = logging.get_logger(__name__)
Expand Down Expand Up @@ -59,7 +58,7 @@ class HubMixingCommonTest(unittest.TestCase):
class HubMixingTest(HubMixingCommonTest):
def tearDown(self) -> None:
if os.path.exists(WORKING_REPO_DIR):
shutil.rmtree(WORKING_REPO_DIR, onerror=set_write_permission_and_retry)
rmtree_with_retry(WORKING_REPO_DIR)
logger.info(
f"Does {WORKING_REPO_DIR} exist: {os.path.exists(WORKING_REPO_DIR)}"
)
Expand Down
Loading