import: fails with ModuleNotFoundError (win32timezone) when importing private repository using ssh using dvc from Windows Installer #7577

Closed
W1M0R opened this issue Apr 15, 2022 · 9 comments
Labels
awaiting response: we are waiting for your reply, please respond! :)
build: Issues/features related to building dvc install packages.
"P: windows": Related to the Platform: Windows

Comments

@W1M0R

W1M0R commented Apr 15, 2022

Bug Report

Description

According to the guide, we need to use the git/ssh protocol when working with private repositories. Following this recommendation, a dvc import fails with ModuleNotFoundError: No module named 'win32timezone'. Trying again, but this time using https, I get dulwich.client.HTTPUnauthorized: No valid credentials provided (as expected according to your guide).

Reproduce

  1. Install dvc 2.10.1 using the Windows Installer.
  2. Clone a dvc project.
  3. Run dvc import --verbose git@github.com:YourOrg/YourPrivateRepo.git AFolder/ADvcFolder -o AFolder/ADvcFolder

YourOrg/YourPrivateRepo.git - This is a private repo initialised with dvc.
ADvcFolder - Is a dvc tracked folder (e.g. via dvc add AFolder/ADvcFolder in the private repo)

Expected

Import should succeed without error.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.10.1 (exe)
---------------------------------
Platform: Python 3.8.10 on Windows-10-10.0.19044-SP0
Supports:
        azure (adlfs = 2022.2.0, knack = 0.9.0, azure-identity = 1.9.0),
        gdrive (pydrive2 = 1.10.0),
        gs (gcsfs = 2022.3.0),
        hdfs (fsspec = 2022.3.0, pyarrow = 7.0.0),
        webhdfs (fsspec = 2022.3.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        s3 (s3fs = 2022.3.0, boto3 = 1.21.21),
        ssh (sshfs = 2022.3.1),
        oss (ossfs = 2021.8.0),
        webdav (webdav4 = 0.9.5),
        webdavs (webdav4 = 0.9.5)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: local
Workspace directory: NTFS on D:\
Repo: dvc, git

Output of dvc import:

$ dvc import --verbose git@github.com:YourOrg/YourPrivateRepo.git AFolder/ADvcFolder -o AFolder/ADvcFolder
2022-04-15 23:40:52,341 DEBUG: Removing output 'AFolder\ADvcFolder' of stage: 'AFolder\ADvcFolder.dvc'.
2022-04-15 23:40:52,343 DEBUG: Removing 'D:\TestProject\AFolder\ADvcFolder'
Importing 'AFolder/ADvcFolder (git@github.com:YourOrg/YourPrivateRepo.git)' -> 'AFolder\ADvcFolder'
2022-04-15 23:40:52,352 DEBUG: Computed stage: 'AFolder\ADvcFolder.dvc' md5: '0f2babe69537617c1ffa8526e9d72c0a'
2022-04-15 23:40:52,353 DEBUG: 'md5' of stage: 'AFolder\ADvcFolder.dvc' changed.
2022-04-15 23:40:52,355 DEBUG: Creating external repo git@github.com:YourOrg/YourPrivateRepo.git@None
2022-04-15 23:40:52,356 DEBUG: erepo: git clone 'git@github.com:YourOrg/YourPrivateRepo.git' to a temporary dir
2022-04-15 23:40:53,433 ERROR: failed to import 'AFolder/ADvcFolder from 'git@github.com:YourOrg/YourPrivateRepo.git'. - Failed to clone repo 'git@github.com:YourOrg/YourPrivateRepo.git' to 'C:\Users\user\AppData\Local\Temp\tmpbeadgsg7dvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "scmrepo\git\backend\dulwich\__init__.py", line 193, in clone
  File "dulwich\porcelain.py", line 443, in clone
  File "dulwich\client.py", line 535, in clone
  File "dulwich\client.py", line 601, in fetch
  File "dulwich\client.py", line 1088, in fetch_pack
  File "dulwich\client.py", line 1756, in _connect
  File "fsspec\asyn.py", line 85, in wrapper
  File "fsspec\asyn.py", line 65, in sync
  File "fsspec\asyn.py", line 25, in _runner
  File "scmrepo\git\backend\dulwich\asyncssh_vendor.py", line 149, in _run_command
  File "asyncssh\connection.py", line 7687, in connect
  File "asyncio\tasks.py", line 455, in wait_for
  File "asyncssh\connection.py", line 429, in _connect
  File "asyncio\base_events.py", line 1050, in create_connection
  File "asyncio\base_events.py", line 1068, in _create_connection_transport
  File "asyncssh\connection.py", line 7678, in conn_factory
  File "asyncssh\connection.py", line 3064, in __init__
  File "asyncssh\gss_win32.py", line 168, in __init__
  File "sspi.py", line 200, in __init__
ModuleNotFoundError: No module named 'win32timezone'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "dvc\scm.py", line 106, in clone
  File "scmrepo\git\__init__.py", line 143, in clone
  File "scmrepo\git\backend\dulwich\__init__.py", line 196, in clone
scmrepo.exceptions.CloneError: Failed to clone repo 'git@github.com:YourOrg/YourPrivateRepo.git' to 'C:\Users\user\AppData\Local\Temp\tmpbeadgsg7dvc-clone'     

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dvc\commands\imp.py", line 15, in run
  File "dvc\repo\imp.py", line 6, in imp
  File "dvc\repo\__init__.py", line 48, in wrapper
  File "dvc\repo\scm_context.py", line 152, in run
  File "dvc\repo\imp_url.py", line 83, in imp_url
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 36, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 533, in run
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 36, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 557, in _sync_import
  File "dvc\stage\imports.py", line 47, in sync_import
  File "dvc\dependency\repo.py", line 66, in download
  File "dvc\dependency\repo.py", line 107, in _get_used_and_obj
  File "contextlib.py", line 113, in __enter__
  File "dvc\external_repo.py", line 36, in external_repo
  File "dvc\external_repo.py", line 162, in _cached_clone
  File "funcy\decorators.py", line 45, in wrapper
  File "funcy\flow.py", line 274, in wrap_with
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\external_repo.py", line 232, in _clone_default_branch
  File "dvc\scm.py", line 108, in clone
dvc.scm.CloneError: Failed to clone repo 'git@github.com:YourOrg/YourPrivateRepo.git' to 'C:\Users\user\AppData\Local\Temp\tmpbeadgsg7dvc-clone'
------------------------------------------------------------
2022-04-15 23:40:53,453 DEBUG: Analytics is enabled.
2022-04-15 23:40:53,456 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user\\AppData\\Local\\Temp\\tmpnnqcpnag']'
2022-04-15 23:40:53,461 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user\\AppData\\Local\\Temp\\tmpnnqcpnag']'
W1M0R changed the title from "import: fails with ModuleNotFoundError (win32timezone) when importing private repository using ssh" to "import: fails with ModuleNotFoundError (win32timezone) when importing private repository using ssh using dvc from Windows Installer" on Apr 16, 2022
@W1M0R
Author

W1M0R commented Apr 16, 2022

I uninstalled dvc (which was previously installed using the dvc Windows Installer - to get the automated symlink permissions setup). Then I installed dvc using choco. This resolved the issue. It could be that the problem then lies with the dvc Windows Installer, or that I missed a manual installation step.

Output of dvc doctor:

$ dvc doctor
DVC version: 2.10.1 (choco)
---------------------------------
Platform: Python 3.9.0 on Windows-10-10.0.19041-SP0
Supports:
        azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.9.0),
        gdrive (pydrive2 = 1.10.0),
        gs (gcsfs = 2022.3.0),
        webhdfs (fsspec = 2022.3.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        s3 (s3fs = 2022.3.0, boto3 = 1.21.21),
        ssh (sshfs = 2022.3.1),
        oss (ossfs = 2021.8.0)
Cache types: hardlink
Cache directory: NTFS on D:\
Caches: local
Remotes: local
Workspace directory: NTFS on D:\
Repo: dvc, git
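
Since the two `dvc doctor` outputs in this thread differ mainly in the package type shown in parentheses (`exe` vs `choco`), a small helper for pulling that field out of pasted output may be handy when triaging reports like this one. This is only a sketch: `parse_dvc_build` and `DvcBuild` are hypothetical names, and the regular expression assumes the `DVC version: X (Y)` line format seen above.

```python
import re
from typing import NamedTuple, Optional


class DvcBuild(NamedTuple):
    """Version and package type taken from `dvc doctor` output (hypothetical helper type)."""
    version: str
    package: str


# Assumes the line format shown in this thread, e.g. "DVC version: 2.10.1 (exe)".
_VERSION_LINE = re.compile(r"^DVC version:\s*(\S+)\s*\(([^)]+)\)\s*$", re.MULTILINE)


def parse_dvc_build(doctor_output: str) -> Optional[DvcBuild]:
    """Return the version and package type, or None if no matching line is found."""
    match = _VERSION_LINE.search(doctor_output)
    if match is None:
        return None
    return DvcBuild(version=match.group(1), package=match.group(2))


if __name__ == "__main__":
    sample = "DVC version: 2.10.1 (exe)\nPlatform: Python 3.8.10 on Windows-10-10.0.19044-SP0\n"
    print(parse_dvc_build(sample))
```

In this thread, the reports that hit the error all come from `exe` builds, while the `choco` and pip installs did not show it.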

@daavoo daavoo added P: windows Related to the Platform: Windows build Issues/features related to building dvc install packages. labels Apr 18, 2022
@jruehle

jruehle commented Aug 24, 2022

Having the same issue of the missing win32timezone module with Win installer 2.18.0 (both in powershell and git bash). My workaround was installing dvc via pip and a venv.
I think other issues such as #2754 and #7505, as well as this problem discussion, are connected to this. The merged PR #2763 seems to address it, yet we're still hitting the win32timezone issue.
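
For anyone using the pip/venv workaround, a quick way to confirm that the environment can see the modules named in the traceback is to ask Python's import machinery directly. The sketch below is an assumption-laden check, not part of DVC: `missing_modules` is a hypothetical helper, and it only applies to a regular Python environment (it cannot be run inside the frozen installer build).

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that the current interpreter cannot locate."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the traceback in this issue.
    print(missing_modules(["sspi", "win32timezone"]))
```

An empty list means both modules are discoverable in that environment, which would be consistent with the error being specific to how the installer build is packaged.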

@HaddocktheHorrible

HaddocktheHorrible commented Oct 6, 2022

Any idea if this will get attention? I ran into the same issue with the 2.11.0 Windows installer (although pip-installing 2.29.0 failed with a different error, so I had to revert to 2.9.5). It's much more convenient for our team to distribute internally via the binary installer.

Edit: In case anyone else has this issue, the bug is not present in the 2.9.5 windows binary installer.

@pmrowla
Contributor

pmrowla commented Oct 11, 2022

@HaddocktheHorrible have you tried the windows installer for the latest DVC release?

@HaddocktheHorrible

HaddocktheHorrible commented Oct 11, 2022

Hi @pmrowla - not yet. Since I ran into #7702 from the pip version of 2.29.0, I did not bother with the windows installer. But you're right that I can see if I get the win32timezone error on the latest installer too. I will report back...

Update: Confirmed I see the same ModuleNotFoundError: No module named 'win32timezone' error when trying to call dvc import from the 2.30.0 release version.

Here's the traceback leading to that error:

Traceback (most recent call last):
  File "scmrepo\git\backend\dulwich\__init__.py", line 200, in clone
  File "dulwich\porcelain.py", line 538, in clone
  File "dulwich\client.py", line 760, in clone
  File "dulwich\client.py", line 837, in fetch
  File "dulwich\client.py", line 1146, in fetch_pack
  File "dulwich\client.py", line 1792, in _connect
  File "fsspec\asyn.py", line 111, in wrapper
  File "fsspec\asyn.py", line 96, in sync
  File "fsspec\asyn.py", line 53, in _runner
  File "scmrepo\git\backend\dulwich\asyncssh_vendor.py", line 163, in _run_command
  File "asyncssh\connection.py", line 7834, in connect
  File "asyncio\tasks.py", line 442, in wait_for
  File "asyncssh\connection.py", line 437, in _connect
  File "asyncio\base_events.py", line 1090, in create_connection
  File "asyncio\base_events.py", line 1108, in _create_connection_transport
  File "asyncssh\connection.py", line 7825, in conn_factory
  File "asyncssh\connection.py", line 3097, in __init__
  File "asyncssh\gss_win32.py", line 168, in __init__
  File "sspi.py", line 200, in __init__
ModuleNotFoundError: No module named 'win32timezone'

And dvc doctor output:

$ dvc doctor
DVC version: 2.30.0 (exe)
---------------------------------
Platform: Python 3.9.13 on Windows-10-10.0.19043-SP0
Subprojects:

Supports:
        azure (adlfs = 2022.10.0, knack = 0.10.0, azure-identity = 1.10.0),
        gdrive (pydrive2 = 1.10.0),
        gs (gcsfs = 2022.1.0),
        hdfs (fsspec = 2022.1.0, pyarrow = 7.0.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        oss (ossfs = 2021.8.0),
        s3 (s3fs = 2022.1.0, boto3 = 1.20.24),
        ssh (sshfs = 2021.11.2),
        webdav (webdav4 = 0.9.4),
        webdavs (webdav4 = 0.9.4),
        webhdfs (fsspec = 2022.1.0)
Cache types: hardlink
Cache directory: NTFS on C:\
Caches: local
Remotes: s3, s3
Workspace directory: NTFS on C:\
Repo: dvc, git

@HaddocktheHorrible

Hi @pmrowla - last week I tried 2.30.0 but had no success (see the logs in my previous comment). Sorry if this is a double-ping; I realized comment edits might not send notifications...

@efiop efiop closed this as completed in ab3f8fb Nov 24, 2022
@efiop
Member

efiop commented Nov 24, 2022

I'm not able to reproduce, but it looks like we were simply missing a hidden import for PyInstaller. Added one in ab3f8fb. @W1M0R @HaddocktheHorrible @jruehle folks, please give 2.35.2 a try (the packages should be out later today/tomorrow): https://dvc.org/download/win/dvc-2.35.2
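
For readers unfamiliar with the term: PyInstaller bundles only the modules it can detect statically, so a module that is imported dynamically can be left out of the frozen build unless it is declared as a hidden import. The exact change in ab3f8fb is not shown in this thread, so the following is only a sketch of the general mechanism using PyInstaller's `--hidden-import` command-line option. The helper name `pyinstaller_args` and the entry script `dvc_entry.py` are hypothetical placeholders, not DVC's actual build setup.

```python
from typing import List, Sequence


def pyinstaller_args(entry_script: str, hidden_imports: Sequence[str]) -> List[str]:
    """Build a PyInstaller command line that forces extra modules into the bundle."""
    args = ["pyinstaller", "--noconfirm", entry_script]
    for module in hidden_imports:
        # Each --hidden-import names a module the static analysis may miss.
        args += ["--hidden-import", module]
    return args


if __name__ == "__main__":
    # "dvc_entry.py" is a placeholder entry point, not a real DVC file.
    print(" ".join(pyinstaller_args("dvc_entry.py", ["win32timezone"])))
```

The same declaration can also be made through a `hiddenimports` list in a spec file or hook, which is the more common choice for a project with a maintained build script.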

@efiop efiop self-assigned this Nov 24, 2022
@HaddocktheHorrible

@efiop Sorry for the delay - just had a chance to install and try it. Can confirm dvc import is working again as expected for the 2.35.2 binary install. Thanks for the fix!

@ho9science

I had the same problem in version 2.34.0, but after installing 2.35.2 it works without problems. Thank you.
