UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 #21

Closed
PeterStindberg opened this issue Jan 8, 2023 · 11 comments

Comments

@PeterStindberg

PeterStindberg commented Jan 8, 2023

Hi there,

on one script I tried to optimize, I get this kind of error (one per attempt; the position numbers seem to change arbitrarily, though 2912 shows up often):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 4295: ordinal not in range(128)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 200: ordinal not in range(128)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2912: ordinal not in range(128)

I tested my build process with a different script, and it ran fine, so I assume it's something in my current script that the optimizer stumbles over.

My current build process:

  1. Edit the source in VSCode
  2. Paste the file into SL and compile
  3. Copy the Firestorm preprocessed code
  4. Paste the preprocessed code into a file preprocessed.lsl
  5. Run PyOptimizer with the -O +ShrinkNames option

VSCode Build Task macro:

{
    // See https://go.microsoft.com/fwlink/?LinkId=733558
    // for the documentation about the tasks.json format
    "version": "2.0.0",
    "tasks": [
        {
            "label": "optimize",
            "type": "shell",
            "command": "/usr/bin/python ${userHome}/GitHub/LSL-PyOptimizer/main.py ${workspaceFolder}/Optimized/preprocessed.lsl  -O +ShrinkNames -o ${workspaceFolder}/Optimized/optimized.lsl ",
            "problemMatcher": [],
            "group": {
                "kind": "build",
                "isDefault": true
            }
        }
    ]
}
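
For readers who would rather script the last build step than run it from a VSCode task, here is a minimal Python sketch that assembles the same command line. The helper name `optimizer_command` and the example directories are hypothetical; only the flags (`-O +ShrinkNames`, `-o`) and the file names come from the task definition above.

```python
import os


def optimizer_command(workspace, optimizer_dir, python="/usr/bin/python"):
    """Build the argument list used by the build task above.

    workspace and optimizer_dir are placeholders for ${workspaceFolder}
    and the LSL-PyOptimizer checkout; adjust them to your own layout.
    """
    src = os.path.join(workspace, "Optimized", "preprocessed.lsl")
    dst = os.path.join(workspace, "Optimized", "optimized.lsl")
    main_py = os.path.join(optimizer_dir, "main.py")
    # Same order as in tasks.json: input file, options, output file.
    return [python, main_py, src, "-O", "+ShrinkNames", "-o", dst]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.call() to run it.
    print(" ".join(optimizer_command("/path/to/workspace", "/path/to/LSL-PyOptimizer")))
```

Passing the arguments as a list rather than one shell string avoids quoting problems if a path contains spaces.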
@Sei-Lisa
Owner

Sei-Lisa commented Jan 8, 2023 via email

@PeterStindberg
Author

Sure thing

macOS Mojave 10.14.6
Python 2.7.16

Traceback (most recent call last):
  File "/GitHub/LSL-PyOptimizer/main.py", line 782, in <module>
    ret = main(sys.argv)
  File "/GitHub/LSL-PyOptimizer/main.py", line 745, in main
    script = script_header + script_timestamp + outs.output(ts, options)
  File "/GitHub/LSL-PyOptimizer/lslopt/lsloutput.py", line 556, in output
    ret += self.OutCode(node)
  File "/GitHub/LSL-PyOptimizer/lslopt/lsloutput.py", line 504, in OutCode
    ret += self.OutCode(stmt)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2896: ordinal not in range(128)

LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL=

What do you mean by "encoding of your script"?

@Sei-Lisa
Owner

Sei-Lisa commented Jan 8, 2023 via email

@PeterStindberg
Author

Yes, it's UTF-8.

And indeed I am using a few non-ASCII characters. There are passages like this:

list replace = ["&lt;","<","&gt;",">","&rt;",">","&quote;","\"","&quot;","\"","&amp;","&","&cent;","¢","&pound;","£","&yen;","¥","&euro;","€","&copy;","©","&reg;","®","&#39;","'"];

and passages like this:

msg = "*❮ [" + legacy_name + "](https://my.secondlife.com/" + legacy_name_dots +")*";

I can try to comb through them, and see what happens.
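
The byte values in the error messages match the characters quoted above: in UTF-8, ¢ £ ¥ © ® all start with the byte 0xc2, while € and ❮ start with 0xe2. The Python 3 sketch below prints those lead bytes. It also reproduces the wording of the error by decoding UTF-8 data with the ASCII codec. The helper names are hypothetical and are not part of LSL-PyOptimizer, and the sketch only illustrates the kind of error, not where the optimizer raises it.

```python
# Characters taken from the two passages quoted above.
CHARS = ["¢", "£", "¥", "€", "©", "®", "❮"]


def utf8_lead_byte(ch):
    """Return the first byte (as an int) of ch's UTF-8 encoding."""
    return ch.encode("utf-8")[0]


def ascii_decode_error(text):
    """Encode text as UTF-8, try to decode it as ASCII.

    Returns the error message, or None if the text is pure ASCII.
    """
    data = text.encode("utf-8")
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    for ch in CHARS:
        print(ch, hex(utf8_lead_byte(ch)))
    print(ascii_decode_error('msg = "*❮ ['))
```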

@PeterStindberg
Author

Unfortunately, that didn't solve the problem. I replaced every non-ASCII character with llChar(xxxx) and ran the code through https://pages.cs.wisc.edu/~markm/ascii.html to find any non-ASCII characters left over. The code is clean, but the error message is the same:

Traceback (most recent call last):
  File "/GitHub/LSL-PyOptimizer/main.py", line 782, in <module>
    ret = main(sys.argv)
  File "/GitHub/LSL-PyOptimizer/main.py", line 745, in main
    script = script_header + script_timestamp + outs.output(ts, options)
  File "/GitHub/LSL-PyOptimizer/lslopt/lsloutput.py", line 556, in output
    ret += self.OutCode(node)
  File "/GitHub/LSL-PyOptimizer/lslopt/lsloutput.py", line 504, in OutCode
    ret += self.OutCode(stmt)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2917: ordinal not in range(128)
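
The check for leftover non-ASCII characters can also be done locally instead of through a web page. This is a small Python sketch, assuming the script is passed as the first command-line argument; the function name `find_non_ascii` is hypothetical.

```python
import sys


def find_non_ascii(data):
    """Return (line, column, byte) for every byte >= 0x80 in data (bytes).

    Lines and columns are 1-based; columns count bytes, not characters.
    """
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(bytearray(line), start=1):
            if byte >= 0x80:
                hits.append((line_no, col, byte))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read in binary mode so no decoding happens before the check.
    with open(sys.argv[1], "rb") as fh:
        for line_no, col, byte in find_non_ascii(fh.read()):
            print("line %d, column %d: byte 0x%02x" % (line_no, col, byte))
```

An empty result means the file is pure ASCII, which is what the author reports for the cleaned-up script.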

@Sei-Lisa
Owner

Sei-Lisa commented Jan 8, 2023 via email

@PeterStindberg
Author

oh, okay - well, since we assume it's the non-ASCII characters, a test case might be easy to put together

@PeterStindberg
Author

OK, this is the most stripped-down version that still fails:

list visitor_list_new;
list visitor_list_pos_old;
key owner;
key mygroup;
float xRight;
float xLeft;
float yNear;
float yFar;
float zLow;
float zHigh;

encode_and_send(string msg, key thisAvKey)
{
    integer msg_length;
}

key getAvatarGroup (key inAvatar)
{
    key result = NULL_KEY;
    return (result);
}

default
{
    state_entry()
    {
        mygroup = llList2Key(llGetObjectDetails(llGetKey(), [OBJECT_GROUP]), 0);
    }

    timer()
    {
        string msg;
        string legacy_name;
        string legacy_name_dots;
        integer numberOfKeys;
        integer i;
        key thisAvKey;
        vector agentpos;

        for (i = 0; i < numberOfKeys; ++i) {
            thisAvKey = llList2Key(visitor_list_new,i);

            if (TRUE) {

                agentpos = llList2Vector(llGetObjectDetails(thisAvKey, [OBJECT_POS]),0);

                // and break it down to x-y-z
                float avatarx = agentpos.x;
                float avatary = agentpos.y;
                float avatarz = agentpos.z;

                if ((avatarx <= xRight && avatarx >= xLeft && avatary <= yFar && avatary >= yNear && avatarz <= zHigh && avatarz >= zLow) && (getAvatarGroup(thisAvKey) == mygroup)) {

                } 
            }      
        }

        if (TRUE) {
            for (i = 0; i < numberOfKeys; i = i + 2) {
                if (TRUE) {
                    thisAvKey = llList2Key(visitor_list_pos_old,i);
                    legacy_name = llKey2Name(thisAvKey);
                    if (legacy_name != "") {

                        if (TRUE) {
                            msg = "*" + llChar(0x276E) + " [" + legacy_name + "](https://my.secondlife.com/" + legacy_name_dots +")*";
                            encode_and_send(msg, thisAvKey);
                        }
                    } 
                }
            }
        }
    } 
}
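
The reduced script above is pure ASCII, yet the earlier errors name byte 0xe2. One possible explanation is that the optimizer evaluates llChar(0x276E) ahead of time and carries the resulting character through to its output. This is an assumption; the visible part of the thread does not confirm it. The Python 3 sketch below only shows that code point 0x276E is ❮ and that its UTF-8 encoding starts with 0xe2. The helper name `llchar_utf8` is hypothetical.

```python
def llchar_utf8(code_point):
    """Return the UTF-8 bytes of the character with the given code point.

    Mirrors what llChar(code_point) denotes in LSL, for illustration only.
    """
    return chr(code_point).encode("utf-8")


if __name__ == "__main__":
    encoded = llchar_utf8(0x276E)
    # Show the character and the individual bytes of its encoding.
    print(chr(0x276E), [hex(b) for b in encoded])
```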

@Sei-Lisa
Owner

Sei-Lisa commented Jan 9, 2023 via email

@PeterStindberg
Author

Yep, the new version works without any error message, and produces the expected output. Thank you very much!

For laymen: What was the issue?

@PeterStindberg
Author

nvm, found the commit and the explanation
