This repository has been archived by the owner on Jul 24, 2020. It is now read-only.

UnicodeDecodeError exception in clone command #331

Open
johannesgajdosik opened this issue Aug 27, 2018 · 2 comments

@johannesgajdosik

I am trying to use hg-git to clone a large git repo.
After running for 28 hours, the clone fails with a UnicodeDecodeError.
I would like the clone command to finish successfully.
In my opinion, cloning should not fail just because some file in the repo contains a non-ASCII character.
There is not even a hint as to which file contains the non-ASCII character.

I am using hg-git 0.8.11 and Mercurial 4.5.2 on Gentoo Linux.

Please have a look at this and tell me how to proceed.
Thanks in advance!

gaj@gajdosik /sda3 $ time hg clone git+ssh://xxxxx.xxx.xxx.xx:22/xxx/DefaultCollection/_git/XXXXXX
destination directory: XXXXXX
importing git objects into hg
** unknown exception encountered, please report by visiting
** https://mercurial-scm.org/wiki/BugTracker
** Python 2.7.14 (default, Jun 8 2018, 19:03:03) [GCC 6.4.0]
** Mercurial Distributed SCM (version 4.5.2)
** Extensions loaded: hgk, hggit
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/hg", line 41, in <module>
    dispatch.run()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 88, in run
    status = (dispatch(req) or 0) & 255
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 183, in dispatch
    ret = _runcatch(req)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 324, in _runcatch
    return _callcatch(ui, _runcatchfunc)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 332, in _callcatch
    return scmutil.callcatch(ui, func)
  File "/usr/lib64/python2.7/site-packages/mercurial/scmutil.py", line 154, in callcatch
    return func()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 314, in _runcatchfunc
    return _dispatch(req)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 918, in _dispatch
    cmdpats, cmdoptions)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 673, in runcommand
    ret = _runcommand(ui, options, cmd, d)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 926, in _runcommand
    return cmdfunc()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 915, in <lambda>
    d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
  File "/usr/lib64/python2.7/site-packages/mercurial/util.py", line 1195, in check
    return func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/mercurial/commands.py", line 1449, in clone
    shareopts=opts.get('shareopts'))
  File "/usr/lib64/python2.7/site-packages/mercurial/hg.py", line 661, in clone
    streamclonerequested=stream)
  File "/usr/lib64/python2.7/site-packages/hggit/util.py", line 56, in inner
    return f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/hggit/__init__.py", line 354, in exchangepull
    pullop.cgresult = repo.githandler.fetch(remote.path, heads)
  File "/usr/lib64/python2.7/site-packages/hggit/git_handler.py", line 300, in fetch
    self.update_remote_branches(remote_name, result.refs)
  File "/usr/lib64/python2.7/site-packages/hggit/git_handler.py", line 1467, in update_remote_branches
    self.git.refs[ref_name] = sha
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 290, in __setitem__
    self.set_if_equals(name, None, ref)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 625, in set_if_equals
    realnames, _ = self.follow(name)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 226, in follow
    contents = self.read_ref(refname)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 200, in read_ref
    contents = self.read_loose_ref(refname)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 556, in read_loose_ref
    filename = self.refpath(name)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 482, in refpath
    name = name.decode(sys.getfilesystemencoding())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 20: ordinal not in range(128)

real 1731m21.487s
user 1043m52.445s
sys 91m17.568s
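
The last frame of the traceback decodes a ref name, not file contents, using whatever `sys.getfilesystemencoding()` reports. A minimal sketch of that one step in isolation follows. The ref name below is made up for illustration (the report does not show the real one), and `decode_ref` is a hypothetical helper, not a dulwich function.

```python
def decode_ref(name, encoding):
    """Decode a ref name given as bytes, as the failing frame does."""
    return name.decode(encoding)


# Hypothetical ref name: the bytes 0xc3 0xb6 are the UTF-8 encoding of "ö".
ref_name = b"refs/tags/release-gr\xc3\xb6sse"

try:
    decode_ref(ref_name, "ascii")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first byte the codec could not decode.
    print("ascii decode failed at byte offset %d" % exc.start)

# The same bytes decode without error when the encoding is UTF-8.
decoded = decode_ref(ref_name, "utf-8")
print(len(decoded))
```

This suggests the failure depends on the encoding Python reports for the filesystem as well as on the ref name itself.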

@tamsky

tamsky commented Nov 14, 2018

I've hit this error when tags or named refs contain unicode.
One fix, which avoids cloning any tags or refs, would be to:

  • first use git to clone the upstream repo as a git repo to your local machine:
    git clone git+ssh://xxxxx.xxx.xxx.xx:22/xxx/DefaultCollection/_git/XXXXXX /path/to/local/XXXXXX
  • cd /path/to/local/XXXXXX
    then remove all git tags and refs (left as an exercise for the reader)
  • cd ..
  • hg clone XXXXXX new-hg-git-repo

which should let you succeed in cloning.

good luck, hopefully this helps
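
The "exercise for the reader" step could be narrowed to the refs that are likely to trigger the error. The sketch below assumes that only refs with non-ASCII names are affected, which is not confirmed in this thread. It also assumes `git` is on the PATH and a POSIX system, where `subprocess` accepts byte-string arguments. The helper names are hypothetical; `git for-each-ref` and `git update-ref -d` are standard git commands.

```python
import subprocess


def non_ascii_refs(ref_names):
    """Return the ref names (bytes) that contain any byte outside ASCII."""
    return [name for name in ref_names
            if any(b > 0x7F for b in bytearray(name))]


def list_refs(repo_path):
    """List every ref in the git repository at repo_path as raw bytes."""
    out = subprocess.check_output(
        ["git", "for-each-ref", "--format=%(refname)"], cwd=repo_path)
    return out.splitlines()


def delete_refs(repo_path, ref_names):
    """Delete the given refs from the local clone only (not the upstream)."""
    for name in ref_names:
        subprocess.check_call(["git", "update-ref", "-d", name], cwd=repo_path)
```

One way to use it: call `non_ascii_refs(list_refs("/path/to/local/XXXXXX"))`, inspect the result, and pass it to `delete_refs` only once the list looks right. This changes the local clone only, so it would have to be repeated after every fetch that brings the refs back.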

@johannesgajdosik
Author

Thanks for your reply and suggestion!
The repository is huge and is constantly used by many coworkers. My plan was not only to clone the repo with hg-git but also to work with it every day. Removing the git tags and refs again each time I fetch or pull is not an option.
Can you instead point me to some piece of code that I can tweak so that cloning can continue despite the unicode tags and refs?
Thanks in advance!
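
One place to look, judging only from the traceback, is the encoding that the failing line passes to `decode`: it comes from `sys.getfilesystemencoding()`, and the error message shows it resolved to ASCII. The following sketch prints what the interpreter reports. `canonical_codec` and `report` are hypothetical helper names, and the output depends on the Python version and the environment it is run in.

```python
import codecs
import locale
import sys


def canonical_codec(name):
    """Map an encoding label to Python's canonical codec name."""
    return codecs.lookup(name).name


def report():
    """Collect the encodings Python derives from the current environment."""
    fs = sys.getfilesystemencoding()
    return {
        "filesystem_encoding": fs,
        "canonical": canonical_codec(fs),
        "preferred_locale_encoding": locale.getpreferredencoding(False),
    }


print(report())
```

If the Python 2.7 interpreter that runs hg reports `ascii` here, the locale is probably C or POSIX. Starting hg under a UTF-8 locale (for example via `LC_ALL`, assuming such a locale is installed) may change the reported encoding. Whether that is enough for the clone to complete has not been tested against this repository.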
