Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cloning from locally served repo throws TypeError #973

Open
lukjak opened this issue May 30, 2022 · 6 comments
Open

Cloning from locally served repo throws TypeError #973

lukjak opened this issue May 30, 2022 · 6 comments

Comments

@lukjak
Copy link

lukjak commented May 30, 2022

Hello,

I create local repo, serve it with TCPGitServer and then want to clone using p.clone (I am testing my repo-handling code and want to do it with a "real" server).

The code sample below throws "TypeError: Can't mix strings and bytes in path components".

I know why ('some_repo" is passed as bytes to TCPGitServer, which tries to concatenate it with str root path coming from FileSystemBackend) - but I have not idea how to resolve this problem.

Any suggestions? TIA

import logging
import threading
from pathlib import Path

from dulwich import porcelain as p
from dulwich.server import FileSystemBackend, TCPGitServer

logging.basicConfig(level=logging.DEBUG)

BASE = Path("c:/temp/dulwich")
REMOTE_PATH = BASE / "remote"
LOCAL_PATH = BASE / "local"

Path(REMOTE_PATH).mkdir(parents=True, exist_ok=True)
Path(LOCAL_PATH).mkdir(parents=True, exist_ok=True)

try:
    p.init(REMOTE_PATH)

    backend = FileSystemBackend(REMOTE_PATH)
    dul_server = TCPGitServer(backend, b'localhost', 0)
    threading.Thread(target=dul_server.serve).start()
    server_address, server_port = dul_server.socket.getsockname()

    p.clone(f"git://{server_address}:{server_port}/some_repo", LOCAL_PATH)
finally:
    dul_server.shutdown()
@jelmer
Copy link
Owner

jelmer commented May 30, 2022

This is probably a windows-specific issue. Have you tried encoding the path you pass into FileSystemBackend()?

@lukjak
Copy link
Author

lukjak commented May 30, 2022

Yes, both FileSystemBackend() and clone() parameters require str. Otherwise exceptions are thrown due to attempts to use bytes in context of string.

This is for REMOTE_PATH as bytes:

Traceback (most recent call last):
  File "C:/Users/user/AppData/Roaming/JetBrains/PyCharmCE2022.1/scratches/scratch.py", line 20, in <module>
    backend = FileSystemBackend(str(REMOTE_PATH).encode())
  File "C:\Programy\Python3\lib\site-packages\dulwich\server.py", line 203, in __init__
    self.root = (os.path.abspath(root) + os.sep).replace(os.sep * 2, os.sep)
TypeError: can't concat str to bytes

This is for clone source repo URI as bytes:

INFO:dulwich.server:Listening for TCP connections on b'localhost':0
Traceback (most recent call last):
  File "C:/Users/user/AppData/Roaming/JetBrains/PyCharmCE2022.1/scratches/scratch.py", line 25, in <module>
    p.clone(f"git://{server_address}:{server_port}/some_repo".encode(), LOCAL_PATH)
  File "C:\Programy\Python3\lib\site-packages\dulwich\porcelain.py", line 441, in clone
    (client, path) = get_transport_and_path(source, **kwargs)
  File "C:\Programy\Python3\lib\site-packages\dulwich\client.py", line 2347, in get_transport_and_path
    if sys.platform == "win32" and location[0].isalpha() and location[1:3] == ":\\":
AttributeError: 'int' object has no attribute 'isalpha'

@lukjak
Copy link
Author

lukjak commented May 31, 2022

I could easily patch FileSystemBackend.open_repository() to work around this, but I don't have any idea if it would make sense or just mask the real problem which later pops up somewhere else.

Bonus, if you apply such fix and make the example more "realistic" by removing "/some_repo", you get:

 File "C:\Programy\Python3\lib\site-packages\dulwich\refs.py", line 1168, in _set_default_branch
    return head_ref
UnboundLocalError: local variable 'head_ref' referenced before assignment

If you remove just "some_repo", you get:

  File "C:\Programy\Python3\lib\site-packages\dulwich\server.py", line 213, in open_repository
    raise NotGitRepository("Path %r not inside root %r" % (path, self.root))
dulwich.errors.NotGitRepository: Path '/' not inside root 'c:\\temp\\dulwich\\remote\\'

@lukjak
Copy link
Author

lukjak commented Jun 1, 2022

OK, so finally I came up with:

def open_repository(self, path):
    logger.debug("opening repository at %s", path)
    new_path = Path(path.decode() if isinstance(path, bytes) else path)
    # Remove any anchor if it exists
    new_path = Path(*new_path.parts[1:]) if new_path.anchor else new_path
    # Get full path
    abspath = (self.root / new_path).resolve()
    # Check if we traversed out of root or ended up in a wrong place
    if not abspath.is_relative_to(self.root) or not abspath.is_dir():
        raise NotGitRepository("Path %r not inside root %r" % (path, self.root))
    return Repo(str(abspath))

and using it with

with patch.object(sys.modules["dulwich"].server.FileSystemBackend, "open_repository", open_repository):

BTW, yet another problem with my original sample after the fix:

  File "C:\Programy\Python3\lib\site-packages\dulwich\refs.py", line 1168, in _set_default_branch
    return head_ref
UnboundLocalError: local variable 'head_ref' referenced before assignment

because empty repo (just after init()) does not contain any refs.

@jelmer
Copy link
Owner

jelmer commented Jul 6, 2022

The traceback you pasted

Hello,

I create local repo, serve it with TCPGitServer and then want to clone using p.clone (I am testing my repo-handling code and want to do it with a "real" server).

The code sample below throws "TypeError: Can't mix strings and bytes in path components".

I know why ('some_repo" is passed as bytes to TCPGitServer, which tries to concatenate it with str root path coming from FileSystemBackend) - but I have not idea how to resolve this problem.

Any suggestions? TIA

import logging
import threading
from pathlib import Path

from dulwich import porcelain as p
from dulwich.server import FileSystemBackend, TCPGitServer

logging.basicConfig(level=logging.DEBUG)

BASE = Path("c:/temp/dulwich")
REMOTE_PATH = BASE / "remote"
LOCAL_PATH = BASE / "local"

Path(REMOTE_PATH).mkdir(parents=True, exist_ok=True)
Path(LOCAL_PATH).mkdir(parents=True, exist_ok=True)

try:
    p.init(REMOTE_PATH)

    backend = FileSystemBackend(REMOTE_PATH)
    dul_server = TCPGitServer(backend, b'localhost', 0)
    threading.Thread(target=dul_server.serve).start()
    server_address, server_port = dul_server.socket.getsockname()

    p.clone(f"git://{server_address}:{server_port}/some_repo", LOCAL_PATH)
finally:
    dul_server.shutdown()

What's the backtrace you get when you use this code (i.e. don't encode the path as bytes)?

@lukjak
Copy link
Author

lukjak commented Oct 29, 2022

The trace with dulwich 0.20.46 for the above code is:

Traceback (most recent call last):
  File "C:\Programy\Python3\lib\socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "C:\Programy\Python3\lib\socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "C:\Programy\Python3\lib\socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "C:\Users\Łukasz\PycharmProjects\untitled1\.venv\lib\site-packages\dulwich\server.py", line 1165, in _make_handler
    return TCPGitRequestHandler(self.handlers, *args, **kwargs)
  File "C:\Users\Łukasz\PycharmProjects\untitled1\.venv\lib\site-packages\dulwich\server.py", line 1145, in __init__
    socketserver.StreamRequestHandler.__init__(self, *args, **kwargs)
  File "C:\Programy\Python3\lib\socketserver.py", line 747, in __init__
    self.handle()
  File "C:\Users\Łukasz\PycharmProjects\untitled1\.venv\lib\site-packages\dulwich\server.py", line 1155, in handle
    h = cls(self.server.backend, args, proto)
  File "C:\Users\Łukasz\PycharmProjects\untitled1\.venv\lib\site-packages\dulwich\server.py", line 296, in __init__
    self.repo = backend.open_repository(args[0])
  File "C:\Users\Łukasz\PycharmProjects\untitled1\.venv\lib\site-packages\dulwich\server.py", line 207, in open_repository
    abspath = os.path.abspath(os.path.join(self.root, path)) + os.sep
  File "C:\Programy\Python3\lib\ntpath.py", line 117, in join
    genericpath._check_arg_types('join', path, *paths)
  File "C:\Programy\Python3\lib\genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants