Problem with encoding Polish characters in shell #1749

Closed
vojtek opened this issue Nov 18, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@vojtek

vojtek commented Nov 18, 2023

  - trigger: ":pl"
    replace: "{{output}}"
    vars:
      - name: output
        type: shell
        params:
          cmd: "echo 'ąś'"

returns ��

The problem occurs when I use Python. In my script, I retrieve data from the clipboard, transform it, and return it to espanso through print(). To simplify the problem, I have reduced it to the example above.
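
One plausible mechanism for the two replacement characters, sketched under assumptions: the shell emits the text in a legacy Windows code page, and the receiving side decodes those bytes as UTF-8. The code pages below (cp852, cp1250) are guesses typical of Polish Windows setups, not values reported here, and `garble` is a hypothetical helper.

```python
# Sketch: reproduce the garbling by encoding in a legacy code page
# and decoding the resulting bytes as UTF-8.
# ASSUMPTION: cp852 / cp1250 stand in for whatever code page the
# shell actually used; the report does not say which one it was.

def garble(text: str, legacy_codec: str) -> str:
    """Encode with a legacy codec, then decode as UTF-8 with replacement."""
    raw = text.encode(legacy_codec)
    # Bytes that are not valid UTF-8 become U+FFFD (the replacement character).
    return raw.decode("utf-8", errors="replace")

if __name__ == "__main__":
    for codec in ("cp852", "cp1250"):
        # ascii() keeps the demo printable on any console encoding.
        print(codec, ascii(garble("ąś", codec)))
```

If both sides agree on UTF-8, the round trip is lossless, which is why the fixes discussed in this thread all amount to making the producer emit UTF-8.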

Setup information

  • OS: Windows 10
  • Version: espanso 2.1.8

@smeech
Collaborator

smeech commented Nov 21, 2023

This may be the same issue as #1736.

@mistahBen

This is definitely a sticky issue with how Espanso passes characters to the various operating systems and how they parse those characters.

I have a Windows 10 machine running Espanso 2.1.8 (US English installation) and tried to replicate the problem to see if I could get the characters to render as intended.

What I tried:

  1. Simple "trigger/replace" syntax. Result: characters printed correctly.
  2. Using echo to print the characters. Result: characters were transformed to their closest English counterparts ("as").
  3. Saving the text to a plain .txt file with UTF-8 encoding, then using an Espanso trigger to print it with the command Get-Content -path {\file\path\file.txt} -Encoding utf8, and subsequently with -Encoding Unicode. Result: parsed either as question marks ("??") or as unrecognized characters, just as @vojtek experienced, depending on the encoding. (In this step I also tried saving the file in different encodings to see if that helped, but it did not.)

I'm unsure whether there is a specific solution to this at the moment, given that Espanso is 'at the mercy' of the operating system to recognize characters properly, and Windows seems to be the one of the three supported OSes that has this issue the most.
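
The "??" result is consistent with the active code page simply lacking these letters. Here is a small sketch of that failure mode, assuming (this is not confirmed anywhere in the thread) that a US-English console uses code page 437. The helper names are made up for illustration. The "as" result in step 2 may come from a best-fit substitution on the Windows side, which this sketch does not reproduce.

```python
# Sketch: characters missing from a code page cannot survive encoding.
# ASSUMPTION: cp437 models a US-English console and cp852 a Polish one;
# the thread does not state which code page was active.

def encode_lossy(text: str, codec: str) -> bytes:
    """Encode text, substituting b'?' for unrepresentable characters."""
    return text.encode(codec, errors="replace")

def is_representable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    print(encode_lossy("ąś", "cp437"))
    print(is_representable("ąś", "cp437"), is_representable("ąś", "cp852"))
```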

@kpym

kpym commented Dec 15, 2023

This is not an espanso bug, IMO. Espanso uses UTF-8 (probably, or perhaps the encoding of the YAML file is used?), so the shell console should be configured to work with UTF-8. Here is how you can do this with PowerShell (the default shell used by espanso on Windows).

  - trigger: ":pl"
    replace: "{{output}}"
    vars:
      - name: output
        type: shell
        params:
          cmd: "[Console]::OutputEncoding = [System.Text.Encoding]::UTF8; echo 'ąśα'"

Or, if you are willing to use WSL as the shell, you can simply do:

  - trigger: ":pl"
    replace: "{{output}}"
    vars:
      - name: output
        type: shell
        params:
          cmd: "echo 'ąśα'"
          shell: wsl

In both cases the output is ąśα, as expected.

@vojtek
Author

vojtek commented Dec 16, 2023

Thanks @kpym,

Your solution for PS worked. Do you perhaps have a suggestion on how to do this for Python scripts? My example was a simplification of my problem. In reality, I am calling scripts in which print returns Polish characters.

  - trigger: ";;"
    replace: "{{output}}"
    vars:
      - name: output
        type: script
        params:
          args:
            - python
            - "%CONFIG%/scripts/script.py"

The script.py:

print("ąś")

vojtek closed this as completed Dec 16, 2023

@kpym

kpym commented Dec 17, 2023

So your problem is not with the console encoding at all, because you are using a script. Your problem is the Python problem described in "How to make python 3 print() utf8", so check the accepted answer: https://stackoverflow.com/a/3603160.

I have not checked, but probably your script.py should be

import sys

def print_utf8(s):
    sys.stdout.buffer.write(s.encode('utf8'))

print_utf8("ąśα❤")
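
A rough way to check this suggestion without Espanso is to point the function at an in-memory stream that imitates a stdout opened with a legacy encoding. The `stream` parameter, the `capture` helper, and the choice of cp1250 are assumptions added for the check; the snippet above writes to sys.stdout directly.

```python
import io
import sys

def print_utf8(s: str, stream=None) -> None:
    """Write s as UTF-8 bytes, bypassing the stream's text encoding.

    `stream` is a hypothetical parameter added for testing;
    it defaults to sys.stdout as in the snippet above.
    """
    stream = sys.stdout if stream is None else stream
    stream.flush()  # keep ordering with any earlier print() calls
    stream.buffer.write(s.encode("utf-8"))
    stream.buffer.flush()

def capture(text: str, legacy_encoding: str = "cp1250") -> bytes:
    """Return the bytes print_utf8 emits on a stream with a legacy encoding."""
    raw = io.BytesIO()
    # Stand-in for a Windows stdout whose text layer is not UTF-8.
    fake_stdout = io.TextIOWrapper(raw, encoding=legacy_encoding)
    print_utf8(text, fake_stdout)
    return raw.getvalue()

if __name__ == "__main__":
    # Prints the raw bytes that reached the underlying stream.
    print(capture("ąśα❤"))
```

Because the bytes go straight to the underlying buffer, characters outside the legacy code page (α and ❤ here) pass through too. Unlike print(), this function does not append a newline.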

@vojtek
Author

vojtek commented Dec 17, 2023

For Python, this works well:

import sys
sys.stdout.reconfigure(encoding='utf-8')
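
As a sketch of why this helps: reconfigure switches the text layer of the stream to UTF-8, so ordinary print() calls afterwards emit UTF-8 bytes. Below, an in-memory stream opened as cp1250 stands in for the Windows stdout; that encoding and the helper name `bytes_after_reconfigure` are assumptions for illustration only.

```python
import io

def bytes_after_reconfigure(text: str, initial_encoding: str = "cp1250") -> bytes:
    """Print text to a legacy-encoded stream after switching it to UTF-8."""
    raw = io.BytesIO()
    # Stand-in for sys.stdout as opened with a legacy Windows encoding.
    stream = io.TextIOWrapper(raw, encoding=initial_encoding, newline="")
    # Same call as on sys.stdout above (available since Python 3.7).
    stream.reconfigure(encoding="utf-8")
    print(text, file=stream, end="")
    stream.flush()
    return raw.getvalue()

if __name__ == "__main__":
    # Prints the raw bytes produced after the switch to UTF-8.
    print(bytes_after_reconfigure("ąś"))
```

Setting the environment variable PYTHONIOENCODING=utf-8, or starting the interpreter with -X utf8, should have a similar effect without editing the script, though neither was tried in this thread.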
