Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preparser gets lost with iterated ellipsis_range #17378

Closed
sagetrac-tmonteil mannequin opened this issue Nov 21, 2014 · 22 comments
Closed

Preparser gets lost with iterated ellipsis_range #17378

sagetrac-tmonteil mannequin opened this issue Nov 21, 2014 · 22 comments

Comments

@sagetrac-tmonteil
Copy link
Mannequin

sagetrac-tmonteil mannequin commented Nov 21, 2014

Here is a bug reported during a workshop at Villetaneuse today:

sage: [1..10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sage: len([1..10])
10
sage: [1..len([1..10])]
[1, 2, 3]

After a quick look, it seems that there is a preparsing problem, [1..len([1..10])] should be preparsed as something like:

(ellipsis_range(Integer(1),Ellipsis,len(ellipsis_range(Integer(1),Ellipsis,Integer(10)))))

but it is currently preparsed as:

sage: preparse('[1..len([1..10])]')
'(ellipsis_range(Integer(1),Ellipsis,len([Integer(1),Ellipsis,Integer(10)])))'

(the second ellipsis_range disappeared).

Component: user interface

Author: Nils Bruin

Branch/Commit: 7748c7c

Reviewer: Thierry Monteil, Marc Mezzarobba

Issue created by migration from https://trac.sagemath.org/ticket/17378

@sagetrac-tmonteil sagetrac-tmonteil mannequin added this to the sage-6.5 milestone Nov 21, 2014
@nbruin
Copy link
Contributor

nbruin commented Nov 21, 2014

comment:1

I think we're really out of our depth here. ellipsis_range can actually take multiple .. in its arguments, so we need to distinguish between

[1,2..10,12..20]

and

[1,2..len([10,12..20])]

The code in question is presently (see sage.misc.preparser line 520)

    ix = code.find('..')
    while ix != -1:
        if ix == 0:
            raise SyntaxError("Cannot start line with ellipsis.")
        elif code[ix-1]=='.':
            # '...' be valid Python in index slices
            code = code[:ix-1] + "Ellipsis" + code[ix+2:]
        elif len(code) >= ix+3 and code[ix+2]=='.':
            # '...' be valid Python in index slices
            code = code[:ix] + "Ellipsis" + code[ix+3:]
        else:
            start_list, end_list = containing_block(code, ix, ['()','[]'])
(*)         arguments = code[start_list+1:end_list-1].replace('...', ',Ellipsis,').replace('..', ',Ellipsis,')
            arguments = re.sub(r',\s*,', ',', arguments)
            if preparse_step:
                arguments = arguments.replace(';', ', step=')
            range_or_iter = 'range' if code[start_list]=='[' else 'iter'
            code = "%s(ellipsis_%s(%s))%s" %  (code[:start_list],
                                               range_or_iter,
                                               arguments,
                                               code[end_list:])
        ix = code.find('..')

The problem is in the line marked (*). It really does need to be prepared to make multiple substitutions, but it should leave ".." that are enclosed in some kind of bracket for later substitutions.

This is really one of those cases where string-substitutions aren't really powerful enough. It we'd start with a '..' that has a containing block that isn't properly contained in a containing block of any other '..', rather than the first '..' we find, then we'd be ok. (because inside that containing block all the '..' do belong to the same ellipsis_range).

I'm a little worried about cost with that approach: what happens if a whole ".sage" file gets preparsed that has a whole lot of '..' in it? does this routine then execute on that huge string? execution time quadratic (or worse) in the number of '..' might be problematic.

EDIT: correction: what we should do is: after we have found our initial ix and our start_list,end_list, we should do

    while True:
        new_ix = code.find('..',ix+1,end_list)
        if new_ix == -1:
            break #EXIT HERE if there are no other '..' to consider
        else:
            ix = new_ix
        new_start_list,new_end_list = containing_block(code,ix,['(),'[]'])
        if (new_start_list != start_list) or (new_end_list != end_list):
            start_list = new_start_list
            end_list = new_end_list

This should be pretty quick because it simply zooms in on the block that's under consideration. Those blocks should be pretty small, even in a big file.

This code should probably be edited a bit because there are apparently some occurrences of '...' that need to be treated differently and I'm not doing that here at the moment.

@nbruin
Copy link
Contributor

nbruin commented Nov 21, 2014

comment:2

Incidentally, the comma handling is perhaps a little too permissive:

sage: [1,,2..3]
[1, 2, 3]
sage: [1,,2,3]

Running preparse_ellipsis has the side effect for turning something that should be a syntax error into valid code.

@nbruin
Copy link
Contributor

nbruin commented Nov 22, 2014

@nbruin
Copy link
Contributor

nbruin commented Nov 22, 2014

Commit: 0c0ddd9

@nbruin
Copy link
Contributor

nbruin commented Nov 22, 2014

comment:4

OK, I did not deal with the extra comma replacements (which seem relatively harmless). i think the "..." detection is simply a matter that those "..." can be replaced by Ellipsis but should not trigger the generation of an ellipsis_range/ellipsis_iter. That's my current interpretation.


New commits:

0c0ddd9trac 17378: make sure preparse_ellipsis deals with nested ranges properly

@nbruin
Copy link
Contributor

nbruin commented Nov 22, 2014

Author: nbruin

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 22, 2014

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

7b4b9ectrac 17378: make sure preparse_ellipsis deals with nested ranges properly

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 22, 2014

Changed commit from 0c0ddd9 to 7b4b9ec

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 22, 2014

Changed commit from 7b4b9ec to a0f5dae

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 22, 2014

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a0f5daetrac 17378: make sure preparse_ellipsis deals with nested ranges properly

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Dec 8, 2014

Reviewer: Thierry Monteil

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Dec 8, 2014

Changed author from nbruin to Nils Bruin

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Dec 8, 2014

comment:8

This makes sense and fixes the problem.

Could you just replace #17378 by :trac:`17378` in the doctest so that sphinx can handle it ?

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Jan 7, 2015

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Jan 7, 2015

Changed commit from a0f5dae to 0f8764f

@sagetrac-tmonteil
Copy link
Mannequin Author

sagetrac-tmonteil mannequin commented Jan 7, 2015

New commits:

0f8764f#17378 let sphinx know about trac ticket (reviewer commit).

@vbraun
Copy link
Member

vbraun commented Jan 12, 2015

comment:11

Conflicts with #17396

@nbruin
Copy link
Contributor

nbruin commented Jan 12, 2015

@nbruin
Copy link
Contributor

nbruin commented Jan 12, 2015

Changed commit from 0f8764f to 7748c7c

@nbruin
Copy link
Contributor

nbruin commented Jan 12, 2015

comment:13

hard rebase to resolve conflict


New commits:

7748c7ctrac 17378: make sure preparse_ellipsis deals with nested ranges properly

@mezzarobba
Copy link
Member

Changed reviewer from Thierry Monteil to Thierry Monteil, Marc Mezzarobba

@vbraun
Copy link
Member

vbraun commented Feb 8, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants