SyntaxError with a not-existing offset for unicode code #51509

Closed
noam mannequin opened this issue Nov 4, 2009 · 3 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@noam
Mannequin

noam mannequin commented Nov 4, 2009

BPO 7260
Nosy @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2009-11-04.07:49:56.764>
created_at = <Date 2009-11-04.07:04:58.157>
labels = ['interpreter-core']
title = 'SyntaxError with a not-existing offset for unicode code'
updated_at = <Date 2009-11-04.07:49:56.762>
user = 'https://bugs.python.org/noam'

bugs.python.org fields:

activity = <Date 2009-11-04.07:49:56.762>
actor = 'ezio.melotti'
assignee = 'none'
closed = True
closed_date = <Date 2009-11-04.07:49:56.764>
closer = 'ezio.melotti'
components = ['Interpreter Core']
creation = <Date 2009-11-04.07:04:58.157>
creator = 'noam'
dependencies = []
files = []
hgrepos = []
issue_num = 7260
keywords = []
message_count = 3.0
messages = ['94879', '94881', '94883']
nosy_count = 2.0
nosy_names = ['noam', 'ezio.melotti']
pr_nums = []
priority = 'normal'
resolution = 'duplicate'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue7260'
versions = ['Python 3.2']

@noam
Mannequin Author

noam mannequin commented Nov 4, 2009

Hello,

This is from the current svn:

> ./python
Python 3.2a0 (py3k:76104, Nov  4 2009, 08:49:44) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> try:
... 	eval("u'שלום'")
... except SyntaxError as e:
... 	e
... 
SyntaxError('invalid syntax', ('<string>', 1, 11, "u'שלום'"))

As you can see, the offset (11) refers to a nonexistent character, as
the code contains only 7 characters.

Thanks,
Noam
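
For readers following along, a minimal sketch of where the two numbers in the report could come from. It assumes the source string is encoded as UTF-8, which the report itself does not state; the variable names are illustrative only.

```python
# The source string from the report above.
source = "u'שלום'"

# Number of characters (code points) in the string.
char_count = len(source)

# Number of bytes when the same string is encoded as UTF-8 (assumed encoding).
# Each of the four Hebrew letters takes two bytes in UTF-8.
byte_count = len(source.encode("utf-8"))

print(char_count, byte_count)  # 7 11
```

Under that assumption, the reported offset of 11 matches the byte length of the line rather than its character length.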

@noam noam mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 4, 2009
@ezio-melotti
Member

Apparently the position of the caret is based on the number of bytes in
the line, not on the number of characters:
>>> [aaa for x in]
  File "<stdin>", line 1
    [aaa for x in]
                 ^
SyntaxError: invalid syntax
>>> [äää for x in]
  File "<stdin>", line 1
    [äää for x in]
                    ^
SyntaxError: invalid syntax

In this example each ä takes two bytes, so the caret is shifted 3 extra
characters to the right.
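
The byte-versus-character mismatch described above can be illustrated with a small helper. This is only a sketch, not the patch attached to bpo-2382: the function name `byte_offset_to_char_offset` is hypothetical, and it assumes the reported offset is a 1-based index into the UTF-8 encoding of the line, which is inferred from the transcripts in this thread rather than confirmed by them.

```python
def byte_offset_to_char_offset(line: str, byte_offset: int) -> int:
    """Map a 1-based UTF-8 byte offset to a 1-based character offset.

    Hypothetical helper for illustration; assumes `line` was encoded as
    UTF-8 when the byte offset was computed.
    """
    encoded = line.encode("utf-8")
    # Bytes that precede the position the byte offset points at.
    prefix = encoded[: byte_offset - 1]
    # Count the characters in that prefix. errors="ignore" drops a trailing
    # partial character if the offset lands in the middle of a multi-byte
    # sequence.
    return len(prefix.decode("utf-8", errors="ignore")) + 1


# ASCII-only line: byte and character offsets coincide.
print(byte_offset_to_char_offset("[aaa for x in]", 14))

# Three two-byte characters: byte offset 17 maps back to character 14.
print(byte_offset_to_char_offset("[äää for x in]", 17))
```

With this mapping, the offset of 11 from the original report would correspond to character 7, the closing quote of the string.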

@ezio-melotti
Member

This is actually a duplicate of bpo-2382, so I'm closing it.
bpo-2382 also has a patch.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022