UTF-8 path-chars Encoding Issue during poetry install #8550

Closed
4 tasks done
etaMS20 opened this issue Oct 18, 2023 · 9 comments · Fixed by #8565
Labels
kind/bug Something isn't working as expected

Comments

@etaMS20

etaMS20 commented Oct 18, 2023

  • I am on the latest stable Poetry version, installed using a recommended method.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have consulted the FAQ and blog for any relevant entries or release notes.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

Issue

Hello,

there seems to be an issue with UTF-8 encoding during Poetry's dynamic package installation.
My user path contains a non-ASCII character (ä), which seems to cause the problem when the child Python reads its script from stdin (the error says non-UTF-8 code was found and no encoding was declared).
I know it's bad practice to have that in a path, but I'm on a company domain and the user path is set automatically by Windows during startup :<.

(I was able to bypass this issue by downgrading to Poetry 1.5.1.)

Traceback:

poetry install
Installing dependencies from lock file

Package operations: 69 installs, 1 update, 0 removals

  • Installing certifi (2023.7.22)
  • Installing charset-normalizer (3.3.0)
  • Installing idna (3.4)
  • Installing urllib3 (2.0.6)

  CalledProcessError

  Command '['C:\\Users\\MalvinSchädle\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\eta-control-fKGJYgDt-py3.11\\Scripts\\python.exe', '-I', '-W', 'ignore', '-']' returned non-zero exit status 1.

  at C:\Python\Lib\subprocess.py:571 in run
       567|             # We don't call process.wait() as .__exit__ does that for us.
       568|             raise
       569|         retcode = process.poll()
       570|         if check and retcode:
    >  571|             raise CalledProcessError(retcode, process.args,
       572|                                      output=stdout, stderr=stderr)
       573|     return CompletedProcess(process.args, retcode, stdout, stderr)
       574| 
       575| 

The following error occurred when trying to handle this error:


  EnvCommandError

  Command ['C:\\Users\\MalvinSchädle\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\eta-control-fKGJYgDt-py3.11\\Scripts\\python.exe', '-I', '-W', 'ignore', '-'] errored with the following return code 1
  
  Error output:
  SyntaxError: Non-UTF-8 code starting with '\xe4' in file <stdin> on line 9, but no encoding declared; see https://peps.python.org/pep-0263/ for details
  
  
  Input:
  
  import importlib.util
  import json
  import sys
  
  from pathlib import Path
  
  spec = importlib.util.spec_from_file_location(
      "packaging", Path(r"C:\Users\MalvinSchädle\AppData\Roaming\Python\Python311\site-packages\packaging\__init__.py")
  )
  packaging = importlib.util.module_from_spec(spec)
  sys.modules[spec.name] = packaging
  
  spec = importlib.util.spec_from_file_location(
      "packaging.tags", Path(r"C:\Users\MalvinSchädle\AppData\Roaming\Python\Python311\site-packages\packaging\tags.py")
  )
  packaging_tags = importlib.util.module_from_spec(spec)
  spec.loader.exec_module(packaging_tags)
  
  print(
      json.dumps([(t.interpreter, t.abi, t.platform) for t in packaging_tags.sys_tags()])
  )
  

  at C:\Python\Lib\site-packages\poetry\utils\env\base_env.py:363 in _run
      359|                 output = subprocess.check_output(
      360|                     cmd, stderr=stderr, env=env, text=True, **kwargs
      361|                 )
      362|         except CalledProcessError as e:
    > 363|             raise EnvCommandError(e, input=input_)
      364| 
      365|         return output
      366| 
      367|     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:


@etaMS20 etaMS20 added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Oct 18, 2023
@dimbleby
Contributor

This probably wants an explicit encoding="utf-8" on the subprocess calls that pass text=True. I suggest trying that; if it works, a merge request is surely welcome.
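A minimal sketch of that suggestion, not Poetry's actual code: the parent passes a script containing a non-ASCII character to a child interpreter over stdin and sets encoding="utf-8" explicitly, so the bytes written to the pipe are UTF-8 regardless of the parent's locale. The names SCRIPT and run_script are hypothetical, and the child prints ASCII-only JSON so that the child's own stdout encoding does not affect the result.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a script fed to the child interpreter;
# the non-ASCII name mimics the one in the reported path.
SCRIPT = 'import json\nname = "Schädle"\nprint(json.dumps(name))\n'


def run_script(script: str) -> str:
    """Run `script` in a child Python via stdin, encoding the pipes as UTF-8."""
    completed = subprocess.run(
        [sys.executable, "-I", "-W", "ignore", "-"],
        input=script,
        stdout=subprocess.PIPE,
        check=True,
        text=True,
        encoding="utf-8",  # explicit, instead of the locale default
    )
    return completed.stdout


# json.dumps escapes non-ASCII by default, so the child's output is pure ASCII.
decoded = json.loads(run_script(SCRIPT))
print(decoded)
```

Without the encoding argument, text=True falls back to the locale's preferred encoding, which on many Windows setups is not UTF-8.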

@privet-kitty

@dimbleby Hi. Thanks for addressing it. I've only checked the base_env module, but _run appears to be called only to invoke pip or python. So my suggestion is to force Python to use UTF-8 for its standard streams by setting env["PYTHONIOENCODING"] = "utf-8" before calling subprocess.run. This hack worked when I addressed a similar issue in poethepoet. Any concerns?

@dimbleby
Contributor

I don't understand your concern - what do you think goes wrong if we add the parameter to the subprocess calls?

I prefer explicit parameters; it's clearer what they're for.

@dimbleby
Contributor

dimbleby commented Oct 20, 2023

Also, I don't think that your suggestion addresses the issue reported here. AIUI the question isn't whether the Python subprocess is expecting or producing UTF-8; the question is how the input to that Python subprocess - GET_SYS_TAGS - is encoded.
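To illustrate why the encoding of the input matters, here is a small sketch. It assumes the parent's locale encoding is cp1252, a common Western European Windows code page; the thread does not state the reporter's actual code page, and the path is a hypothetical one modelled on the report.

```python
# Hypothetical path modelled on the one in the report.
path = r"C:\Users\MalvinSchädle"

# Assumption: the parent encodes stdin text with cp1252 (locale default).
legacy_bytes = path.encode("cp1252")
# What the child expects when no encoding is declared in the source.
utf8_bytes = path.encode("utf-8")

try:
    legacy_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    # 0xE4 followed by an ASCII letter is not a valid UTF-8 sequence.
    decode_failed = True

print(legacy_bytes)
print(utf8_bytes)
print(decode_failed)
```

Under that assumption, ä is written to the pipe as the single byte 0xE4, which matches the '\xe4' in the SyntaxError above; encoded as UTF-8 it would be the two bytes 0xC3 0xA4.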

@privet-kitty

@dimbleby I completely misunderstood. OK, I see that the problem reported in this ticket concerns encoding that is handled by the parent Python, not the child. Thanks for explaining it, and I apologize for troubling you.

Nevertheless, for your information, here is another potential problem that might appear after you set the encoding on the subprocess call. Please see the following script:

# encoding_test.py
import os
import subprocess
import sys
from typing import Optional

env = dict(os.environ)
counter = 1

def run(enable_i_option: bool, encoding: Optional[str], input: str, env: dict):
    global counter
    print(counter)
    counter += 1
    cmd = [sys.executable] + (["-I"] if enable_i_option else []) + ['-W', 'ignore', '-']
    try:
        output = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            input=input,
            check=True,
            env=env,
            text=True,
            encoding=encoding,
        ).stdout
        print(output)
    except Exception:
        print(sys.exc_info()[1])


# ASCII input
run(True, None, 'print("OK")', env)
# Your suggestion (explicit encoding) with non-ASCII input
run(True, "utf-8", 'print("ÖK")', env)
# My suggestion: drop -I so that the child honors PYTHONIOENCODING
run(False, "utf-8", 'print("ÖK")', env | {"PYTHONIOENCODING": "utf-8"})

In my Windows (Japanese) environment, the output of python encoding_test.py is as follows:

1
OK

2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'cp932' codec can't encode character '\xd6' in position 0: illegal multibyte sequence
Command '['C:\\Users\\hugov\\.pyenv\\pyenv-win\\versions\\3.11.1\\python.exe', '-I', '-W', 'ignore', '-']' returned non-zero exit status 1.
3
ÖK

On the other hand, in Ubuntu (WSL), the script runs without problems:

1
OK

2
ÖK

3
ÖK

I guess your environment produces the latter, whereas in some environments the child Python doesn't assume UTF-8 unless forced.

I haven't checked whether an error actually occurs after fixing Poetry as you suggest. Sorry if I'm still misunderstanding all of this.

@dimbleby
Contributor

I'm not a Windows user and don't have non-ASCII characters in my path, so I'm not well placed to do anything with this issue, and I also don't much care about it anyway.

I could believe that it might be necessary both to encode strings as UTF-8 from the parent Python and to ask the child Python to do the same. Or it might not; I'm really not sure.

Recommend that if you - or any other reader - are hitting problems here: try either or both fixes, and submit a merge request!

@etaMS20
Author

etaMS20 commented Oct 20, 2023 via email

@privet-kitty

FYI: I realized that I didn't need to drop the -I option; just adding -X utf8 to force UTF-8 mode would suffice.
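A sketch of that variant, under the assumption that the child is Python 3.7 or later (where -X utf8 is available): the command keeps -I and adds -X utf8, and the parent decodes the pipes as UTF-8. The variable names are illustrative only and this is not the change that was merged into Poetry.

```python
import subprocess
import sys

# -I stays in place; -X utf8 switches the child into UTF-8 mode so its
# standard streams use UTF-8 regardless of the system code page.
cmd = [sys.executable, "-I", "-X", "utf8", "-W", "ignore", "-"]

completed = subprocess.run(
    cmd,
    input='print("ÖK")',
    stdout=subprocess.PIPE,
    check=True,
    text=True,
    encoding="utf-8",  # parent side: encode stdin and decode stdout as UTF-8
)
output = completed.stdout.strip()
print(output)
```

The design point is that -I makes the child ignore PYTHON* environment variables such as PYTHONIOENCODING, while a command-line option like -X utf8 still takes effect.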


This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
@abn abn removed the status/triage This issue needs to be triaged label Mar 2, 2024
4 participants