parser regression: yaml.reader.ReaderError: unacceptable character #x1f64b: special characters are not allowed #250

Closed
CarlFK opened this issue Feb 8, 2019 · 4 comments

@CarlFK

CarlFK commented Feb 8, 2019

wget https://latin.grep.be/~wouter/released.yml
python -m pip install PyYAML
python
>>> import yaml
>>> yaml.safe_load(open('released.yml'))
...
  yaml.reader.ReaderError: unacceptable character #x1f64b

From Yhg1s in #python: "I don't get that error with Python 3.6 and PyYAML 3.12, but I do with 3.13."

@ingydotnet
Member

Here I run PyYAML 3.12 on Python 3.6.7. Unless I am missing something, it gives the same error as you got with 3.13.

curl https://gist.githubusercontent.com/ingydotnet/731fdcfc9c012371b8d55f31a83e31cd/raw/d517439f098e68aca90071fcfa49bb61b941a988/pyyaml-issue-250 | bash

gives:

>> python3 -m venv v3
>> source v3/bin/activate
>> 
>> python --version
Python 3.6.7
>> 
>> python -m pip install pyyaml==3.12
Collecting pyyaml==3.12
  Using cached https://files.pythonhosted.org/packages/4a/85/db5a2df477072b2902b0eb892feb37d88ac635d36245a72a6a69b23b383a/PyYAML-3.12.tar.gz
Building wheels for collected packages: pyyaml
  Running setup.py bdist_wheel for pyyaml: started
  Running setup.py bdist_wheel for pyyaml: finished with status 'error'
  Complete output from command /home/ingy/v3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-zt03jt8w/pyyaml/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpbjzvnijfpip-wheel- --python-tag cp36:
  usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
     or: -c --help [cmd1 cmd2 ...]
     or: -c --help-commands
     or: -c cmd --help
  
  error: invalid command 'bdist_wheel'
  
  ----------------------------------------
  Failed building wheel for pyyaml
  Running setup.py clean for pyyaml
Failed to build pyyaml
Installing collected packages: pyyaml
  Running setup.py install for pyyaml: started
    Running setup.py install for pyyaml: finished with status 'done'
Successfully installed pyyaml-3.12
>> python -c 'import yaml; print(yaml.__version__)'
3.12
>> 
>> wget https://latin.grep.be/~wouter/released.yml
--2019-02-12 11:07:22--  https://latin.grep.be/~wouter/released.yml
Resolving latin.grep.be (latin.grep.be)... 2a01:4f8:140:52e5::2, 46.4.76.168
Connecting to latin.grep.be (latin.grep.be)|2a01:4f8:140:52e5::2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 990406 (967K)
Saving to: ‘released.yml’

     0K .......... .......... .......... .......... ..........  5%  128K 7s
    50K .......... .......... .......... .......... .......... 10%  245K 5s
   100K .......... .......... .......... .......... .......... 15% 2.71M 3s
   150K .......... .......... .......... .......... .......... 20%  327K 3s
   200K .......... .......... .......... .......... .......... 25% 1.12M 2s
   250K .......... .......... .......... .......... .......... 31% 2.45M 2s
   300K .......... .......... .......... .......... .......... 36% 2.78M 1s
   350K .......... .......... .......... .......... .......... 41% 2.81M 1s
   400K .......... .......... .......... .......... .......... 46%  372K 1s
   450K .......... .......... .......... .......... .......... 51% 2.76M 1s
   500K .......... .......... .......... .......... .......... 56% 2.35M 1s
   550K .......... .......... .......... .......... .......... 62% 1.34M 1s
   600K .......... .......... .......... .......... .......... 67% 1.53M 1s
   650K .......... .......... .......... .......... .......... 72% 2.67M 0s
   700K .......... .......... .......... .......... .......... 77% 2.83M 0s
   750K .......... .......... .......... .......... .......... 82% 2.35M 0s
   800K .......... .......... .......... .......... .......... 87%  582K 0s
   850K .......... .......... .......... .......... .......... 93% 1.69M 0s
   900K .......... .......... .......... .......... .......... 98% 2.41M 0s
   950K .......... .......                                    100% 3.93M=1.3s

2019-02-12 11:07:24 (744 KB/s) - ‘released.yml’ saved [990406/990406]

>> python -c 'import yaml; yaml.safe_load(open("released.yml"))'
Traceback (most recent call last):
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/reader.py", line 89, in peek
    return self.buffer[self.pointer+index]
IndexError: string index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/__init__.py", line 94, in safe_load
    return load(stream, SafeLoader)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/__init__.py", line 72, in load
    return loader.get_single_data()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/constructor.py", line 35, in get_single_data
    node = self.get_single_node()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 82, in compose_node
    node = self.compose_sequence_node(anchor)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 111, in compose_sequence_node
    node.value.append(self.compose_node(node, index))
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/composer.py", line 64, in compose_node
    if self.check_event(AliasEvent):
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/parser.py", line 572, in parse_flow_mapping_value
    if not self.check_token(FlowEntryToken, FlowMappingEndToken):
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 248, in fetch_more_tokens
    return self.fetch_double()
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 652, in fetch_double
    self.fetch_flow_scalar(style='"')
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 663, in fetch_flow_scalar
    self.tokens.append(self.scan_flow_scalar(style))
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 1146, in scan_flow_scalar
    chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/scanner.py", line 1186, in scan_flow_scalar_non_spaces
    while self.peek(length) not in '\'\"\\\0 \t\r\n\x85\u2028\u2029':
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/reader.py", line 91, in peek
    self.update(index+1)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/reader.py", line 169, in update
    self.check_printable(data)
  File "/home/ingy/v3/lib/python3.6/site-packages/yaml/reader.py", line 144, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x1f64b: special characters are not allowed
  in "released.yml", position 763688
>> 
>> deactivate
>> rm -fr v3 released.yml

@CarlFK
Author

CarlFK commented Feb 12, 2019

Hmm, well... can you explain why it errors?

unacceptable character #x1f64b

The allowed character range explicitly excludes the C0 control block #x0-#x1F (except for TAB #x9, LF #xA, and CR #xD which are allowed), DEL #x7F, the C1 control block #x80-#x9F (except for NEL #x85 which is allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF.
On input, a YAML processor must accept all Unicode characters except those explicitly excluded above.

https://yaml.org/spec/1.2/spec.html#id2770814
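
The spec passage quoted above can be turned into a small check. The sketch below is a minimal illustration, not PyYAML's actual code: `SPEC_NON_PRINTABLE` encodes the ranges the quote leaves allowed, while `BMP_ONLY_NON_PRINTABLE` is a hypothetical variant that stops at #xFFFD, which is one way a reader could end up rejecting #x1f64b. Both pattern names and the helper `first_rejected` are made up for this example.

```python
import re

# Everything the quoted spec text leaves allowed: TAB, LF, CR, printable
# ASCII, NEL, #xA0-#xD7FF, #xE000-#xFFFD and the planes above the BMP.
SPEC_NON_PRINTABLE = re.compile(
    "[^\x09\x0A\x0D\x20-\x7E\x85\xA0-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Hypothetical variant (an assumption for illustration): the same class
# without the supplementary planes, so any character above #xFFFF is rejected.
BMP_ONLY_NON_PRINTABLE = re.compile(
    "[^\x09\x0A\x0D\x20-\x7E\x85\xA0-\uD7FF\uE000-\uFFFD]"
)


def first_rejected(text, pattern):
    """Return (index, code point) of the first rejected character, or None."""
    match = pattern.search(text)
    if match is None:
        return None
    return match.start(), ord(match.group())


if __name__ == "__main__":
    sample = "wave \U0001F64B"
    for name, pattern in [("spec", SPEC_NON_PRINTABLE),
                          ("bmp-only", BMP_ONLY_NON_PRINTABLE)]:
        hit = first_rejected(sample, pattern)
        if hit is None:
            print(name, "accepts the sample")
        else:
            print(name, "rejects #x%04x at position %d" % (hit[1], hit[0]))
```

Under the spec-derived pattern the emoji passes and only genuinely excluded characters such as #x0 are flagged; under the narrower pattern the emoji is reported as unacceptable.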

@perlpunk
Member

@CarlFK
It currently works with libyaml, and when PyYAML uses the libyaml bindings, but not with the pure-Python implementation in versions <= 3.13.
It was fixed in cf1c86c, so the fix will be in the next release.
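
Until a fixed release is installed, one possible workaround is to prefer the libyaml-backed loader when it is available. The sketch below assumes PyYAML was built with the libyaml bindings, in which case the module exposes `yaml.CSafeLoader`; otherwise it falls back to the pure-Python `yaml.SafeLoader`. The helper name `pick_safe_loader` is hypothetical.

```python
def pick_safe_loader(yaml_module):
    """Return CSafeLoader if the given yaml module exposes it, else SafeLoader.

    The module is passed in as an argument so the choice is easy to inspect;
    with PyYAML installed you would pass the imported ``yaml`` module itself.
    """
    return getattr(yaml_module, "CSafeLoader", None) or yaml_module.SafeLoader


# Usage with PyYAML installed (not executed here):
#
#   import yaml
#   with open("released.yml", encoding="utf-8") as stream:
#       data = yaml.load(stream, Loader=pick_safe_loader(yaml))
```

If the bindings are missing, this silently uses the pure-Python loader, so on versions <= 3.13 the error would still appear in that case.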

@BoubacarDIALLO4

Hi all,

I ran the molecule test, but I get this error:

    return yaml.safe_load(string) or {}
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/__init__.py", line 94, in safe_load
    return load(stream, SafeLoader)
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/__init__.py", line 70, in load
    loader = Loader(stream)
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/loader.py", line 24, in __init__
    Reader.__init__(self, stream)
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/reader.py", line 85, in __init__
    self.determine_encoding()
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/reader.py", line 135, in determine_encoding
    self.update(1)
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/reader.py", line 169, in update
    self.check_printable(data)
  File "/home/diallob/anaconda3/envs/infrastructure-plant/lib/python3.6/site-packages/yaml/reader.py", line 144, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x0000: special characters are not allowed
  in "/tmp/molecule/cybereason_antivirus/default/state.yml", position 0

Can someone help?

Best regards
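
The error above reports #x0000 at position 0, which the spec excerpt quoted earlier lists among the excluded characters. One way to see what the file actually starts with is to look at its first bytes. The sketch below is only a diagnostic aid under that assumption: the helper name `describe_leading_bytes` is hypothetical, and it does not say why the NUL characters are there.

```python
import codecs

# BOMs to recognise; the UTF-32 marks come first because the UTF-32-LE BOM
# begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-8-sig", codecs.BOM_UTF8),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
]


def describe_leading_bytes(raw, sample_size=16):
    """Summarise the first bytes: BOM name (if any) and count of leading NULs."""
    head = raw[:sample_size]
    bom = next((name for name, mark in _BOMS if head.startswith(mark)), None)
    leading_nuls = len(head) - len(head.lstrip(b"\x00"))
    return {"bom": bom, "leading_nuls": leading_nuls, "head": head}


# Usage (path taken from the traceback above, not executed here):
#
#   with open("/tmp/molecule/cybereason_antivirus/default/state.yml", "rb") as f:
#       print(describe_leading_bytes(f.read(16)))
```

A non-zero `leading_nuls` with no BOM would suggest the file content itself begins with NUL bytes rather than YAML text.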
