Question about scopes and selectors #83

Open
brupelo opened this issue Feb 27, 2019 · 13 comments

@brupelo
Contributor

brupelo commented Feb 27, 2019

@gamecreature Hi Rick, hope this offtopic question won't bother you... Here's the thing, I'd like to implement my own version of Sublime's view.extract_scopes and view.scope_name methods in python but I'm a little bit lost.

So far I've got a Python Oniguruma regex engine ready to go, as well as a few dozen tmLanguage.json files ready to be consumed.

That said, could you please guide/mentor me about what I'd need to implement to achieve this little goal? Asking you cos my knowledge about scoping is still pretty limited and even if I've rewritten some bits of your code I still don't know very well how to use it or what the big picture is hehe :D

Thx in advance!

@gamecreature
Member

gamecreature commented Feb 27, 2019

I will try ;-)

Accessing already parsed scopes:

In the edbee-app code all active scopes are displayed in the status bar:

QString str;
QVector<TextScope*> scopes = textDocument()->scopes()->scopesAtOffset( caret ) ;
for( int i=0,cnt=scopes.size(); i<cnt; ++i ) {
    TextScope* scope = scopes[i];
    str.append( scope->name() );
    str.append(" ");
}
text.append( str );
text.append( QObject::tr(" (%1)").arg( textDocument()->scopes()->lastScopedOffset() ) );

This is how you can access the parsed scopes.
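For a Python port, the same lookup could be sketched roughly as below. This is only a minimal sketch: it assumes the scopes are kept as plain (start, end, name) tuples with an exclusive end, and `ScopedRange`, `scopes_at_offset` and the sample `RANGES` are hypothetical names for illustration, not part of edbee's API.

```python
from typing import List, NamedTuple


class ScopedRange(NamedTuple):
    """Hypothetical stand-in for an edbee scoped range (end is exclusive)."""
    start: int
    end: int
    name: str


def scopes_at_offset(ranges: List[ScopedRange], offset: int) -> List[str]:
    """Return the names of all ranges covering `offset`, outermost first."""
    hits = [r for r in ranges if r.start <= offset < r.end]
    # Assumption: ranges nest properly, so an outer range starts no later
    # and ends no earlier than the ranges it contains.
    hits.sort(key=lambda r: (r.start, -r.end))
    return [r.name for r in hits]


# Sample data, made up for illustration.
RANGES = [
    ScopedRange(0, 1, "punctuation.definition.comment.python"),
    ScopedRange(0, 54, "source.python"),
    ScopedRange(0, 16, "comment.line.number-sign.python"),
]

# Joined with spaces this gives a status-bar style string.
STATUS = " ".join(scopes_at_offset(RANGES, 0))
```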

Scope Parsing

When parsing a file, the parser tries to match the regexps of the grammar rules.
There are multi-line regexps (begin and end) and single-line regexps (match).

TextGrammarRule* TmLanguageParser::createGrammarRule( TextGrammar* grammar, const QVariant& data)
{
    QHash<QString,QVariant> map = data.toHash();
    QString match = map.value("match").toString();
    QString include = map.value("include").toString();
    QString begin = map.value("begin").toString();
    QString name = map.value("name").toString();
    // match filled?
    if( !match.isEmpty() ) {
        TextGrammarRule* rule = TextGrammarRule::createSingleLineRegExp( grammar, name, match );
        QHash<QString,QVariant> captures = map.value("captures").toHash();
        addCapturesToGrammarRule( rule, captures );
        return rule;
    } else if( !include.isEmpty() ) {
        return TextGrammarRule::createIncludeRule( grammar, include );
    } else if( !begin.isEmpty() ) {
        QString end = map.value("end").toString();
        // TODO: contentScopeName
        QString contentScope = name;
        TextGrammarRule* rule = TextGrammarRule::createMultiLineRegExp( grammar, name, contentScope, begin, end );
        // add the patterns
        QList<QVariant> patterns = map.value("patterns").toList();
        addPatternsToGrammarRule( rule, patterns );
        if( map.contains("captures")) {
            QHash<QString,QVariant> captures = map.value("captures").toHash();
            addCapturesToGrammarRule( rule, captures );
            addCapturesToGrammarRule( rule, captures, true );
        }
        if( map.contains("beginCaptures")) {
            QHash<QString,QVariant> captures = map.value("beginCaptures").toHash();
            addCapturesToGrammarRule( rule, captures );
        }
        if( map.contains("endCaptures")) {
            QHash<QString,QVariant> endCaptures = map.value("endCaptures").toHash();
            addCapturesToGrammarRule( rule, endCaptures, true );
        }
        return rule;
    } else {
        TextGrammarRule* rule = TextGrammarRule::createRuleList(grammar);
        // add the patterns
        QList<QVariant> patterns = map.value("patterns").toList();
        addPatternsToGrammarRule( rule, patterns );
        return rule;
    }
}
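On the Python side, the same dispatch over a tmLanguage.json rule could look roughly like this. It is a minimal sketch that only classifies a rule dict in the same order of checks as createGrammarRule above; `classify_rule` and the returned labels are hypothetical names, and captures are ignored.

```python
from typing import Any, Dict


def classify_rule(rule: Dict[str, Any]) -> str:
    """Return which kind of grammar rule a tmLanguage dict describes.

    Follows the order of checks in createGrammarRule: match, include,
    begin, and otherwise a plain list of patterns.
    """
    if rule.get("match"):
        return "single-line"
    if rule.get("include"):
        return "include"
    if rule.get("begin"):
        return "multi-line"
    return "rule-list"


# Made-up sample rules, for illustration only.
SAMPLES = [
    {"name": "constant.numeric", "match": r"\d+"},
    {"include": "#strings"},
    {"name": "comment.block", "begin": r"/\*", "end": r"\*/"},
    {"patterns": [{"include": "#strings"}]},
]
KINDS = [classify_rule(r) for r in SAMPLES]
```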

When a multi-line regexp matches, a scope is opened and becomes active.
The parser stays in this scope until the end regexp matches.
This way it builds a tree of scopes.

See this comment for the algorithm description:

// (INIT) ALGORITHM BELOW:
//
// - first find the current active scopes.
// - all multi-line-scopes that start on this line we need to 'remove'
// - all multi-line-scopes that end on this line we need to set the end to 'unknown'
// - next we must find the ruleset for the current scopes by going into the grammar rules
// WHILE there are chars left to match
// - run all rule reg-exps for this scope and
// - also run all reg-exps for active multi-line scopes to find (a possibly new end-marker)
// - use the rule regexp with the first offset. Add the range to the scopes
// - if this is a begin-block regexp, activate the new ruleset. (check if the end-regexp is here)
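The loop described above can be sketched in Python as follows. This is a heavily simplified sketch and not edbee's implementation: it uses Python's `re` module as a stand-in for Oniguruma, only checks the end regexp of the innermost open scope, and ignores captures and include rules. `Rule`, `scope_line`, `scope_text` and the sample grammar are hypothetical names made up for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical, simplified grammar rule (no captures, no includes)."""
    name: str
    match: Optional[str] = None   # single-line regexp
    begin: Optional[str] = None   # multi-line start regexp
    end: Optional[str] = None     # multi-line end regexp
    patterns: List["Rule"] = field(default_factory=list)


def scope_line(line, line_start, stack, root, out):
    """Scope one line; `stack` holds (rule, begin_offset) of open scopes."""
    pos = 0
    while pos <= len(line):
        candidates = []
        if stack:
            # The end regexp of the innermost open scope competes too.
            m = re.compile(stack[-1][0].end).search(line, pos)
            if m:
                candidates.append((m.start(), -1, "end", None, m))
        rules = stack[-1][0].patterns if stack else root
        for idx, rule in enumerate(rules):
            m = re.compile(rule.match or rule.begin).search(line, pos)
            if m:
                kind = "match" if rule.match else "begin"
                candidates.append((m.start(), idx, kind, rule, m))
        if not candidates:
            break
        # Use the regexp with the first offset; ties go to the end regexp,
        # then to the earliest rule in the list.
        _, _, kind, rule, m = min(candidates, key=lambda c: c[:2])
        if kind == "match":
            out.append((line_start + m.start(), line_start + m.end(), rule.name))
        elif kind == "begin":
            stack.append((rule, line_start + m.start()))
        else:
            open_rule, begin_offset = stack.pop()
            out.append((begin_offset, line_start + m.end(), open_rule.name))
        # Always advance, even after a zero-width match.
        pos = m.end() if m.end() > pos else pos + 1


def scope_text(text, root):
    """Scope a whole text line by line, carrying open scopes across lines."""
    stack, out, offset = [], [], 0
    for line in text.split("\n"):
        scope_line(line, offset, stack, root, out)
        offset += len(line) + 1
    # Scopes still open at the end of the text run to the end.
    while stack:
        open_rule, begin_offset = stack.pop()
        out.append((begin_offset, len(text), open_rule.name))
    return out


ROOT = [
    Rule("constant.numeric", match=r"\d+"),
    Rule("comment.block", begin=r"/\*", end=r"\*/"),
]
TEXT = "1 /* a\n2 */ 3"
SCOPES = scope_text(TEXT, ROOT)
```

Because the open block comment has no child patterns, the digit on the second line is not scoped as a number while the comment is still active; only the end regexp is tried there.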

@brupelo
Contributor Author

brupelo commented Feb 27, 2019

Thank you very much, that helps! today I'll try to allocate some time to port that to the experimental pylime.

Btw... I already asked in some freenode irc channels like {python, pyside2, pyqt, sublimetext} for people's help to make something real out of that little experiment but so far nobody got interested. I guess that's pretty normal in any case, from what I've seen here in github people will just start contributing to very mature projects or they will just open PR requesting features they want :D

@brupelo
Contributor Author

brupelo commented Mar 13, 2019

At this point I've learned all the basics about scopes & selectors, but one thing still remains unclear to me: how extract_scope is implemented in SublimeText. Consider this data, extracted from SublimeText using this command:

import sublime_plugin


class TestScopeCommand(sublime_plugin.TextCommand):

    def run(self, edit, block=False):
        print('-' * 80)

        view = self.view
        # One row per character offset in the view.
        for i in range(view.size()):
            a = i
            b = repr(view.substr(i))
            # str() so the Region can be padded by str.format below.
            c = str(view.extract_scope(i))
            d = repr(view.substr(view.extract_scope(i)))
            e = view.scope_name(i)
            print("{:<5}{:<5}{:<10}{:<65}{}".format(a,b,c,d,e))

used on this test file foo.py:

# I'm a comment

def foo():
    print('# No comment')

If you apply that command to foo.py you'll get the data below, where the 3rd column holds the ranges obtained by extract_scope:

table = [
    [0, '#',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python punctuation.definition.comment.python "],
    [1, ' ',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [2, 'I',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [3, "'",    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [4, 'm',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [5, ' ',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [6, 'a',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [7, ' ',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [8, 'c',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [9, 'o',    (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [10, 'm',   (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [11, 'm',   (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [12, 'e',   (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [13, 'n',   (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [14, 't',   (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [15, '\n',  (0, 16),        "# I'm a comment\n",                                             "source.python comment.line.number-sign.python "],
    [16, '\n',  (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [17, 'd',   (17, 24),       'def foo',                                                       "source.python meta.function.python storage.type.function.python "],
    [18, 'e',   (17, 24),       'def foo',                                                       "source.python meta.function.python storage.type.function.python "],
    [19, 'f',   (17, 24),       'def foo',                                                       "source.python meta.function.python storage.type.function.python "],
    [20, ' ',   (17, 24),       'def foo',                                                       "source.python meta.function.python "],
    [21, 'f',   (17, 24),       'def foo',                                                       "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
    [22, 'o',   (17, 24),       'def foo',                                                       "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
    [23, 'o',   (17, 24),       'def foo',                                                       "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
    [24, '(',   (24, 26),       '()',                                                            "source.python meta.function.parameters.python punctuation.section.parameters.begin.python "],
    [25, ')',   (24, 26),       '()',                                                            "source.python meta.function.parameters.python punctuation.section.parameters.end.python "],
    [26, ':',   (25, 27),       '):',                                                            "source.python meta.function.python punctuation.section.function.begin.python "],
    [27, '\n',  (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [28, ' ',   (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [29, ' ',   (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [30, ' ',   (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [31, ' ',   (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
    [32, 'p',   (32, 38),       'print(',                                                        "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
    [33, 'r',   (32, 38),       'print(',                                                        "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
    [34, 'i',   (32, 38),       'print(',                                                        "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
    [35, 'n',   (32, 38),       'print(',                                                        "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
    [36, 't',   (32, 38),       'print(',                                                        "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
    [37, '(',   (37, 38),       '(',                                                             "source.python meta.function-call.python punctuation.section.arguments.begin.python "],
    [38, "'",   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python punctuation.definition.string.begin.python "],
    [39, '#',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [40, ' ',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [41, 'N',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [42, 'o',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [43, ' ',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [44, 'c',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [45, 'o',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [46, 'm',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [47, 'm',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [48, 'e',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [49, 'n',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [50, 't',   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
    [51, "'",   (38, 52),       "'# No comment'",                                                "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python punctuation.definition.string.end.python "],
    [52, ')',   (51, 53),       "')",                                                            "source.python meta.function-call.python punctuation.section.arguments.end.python "],
    [53, '\n',  (0, 54),        "# I'm a comment\n\ndef foo():\n    print('# No comment')\n",    "source.python "],
]

By looking at those ranges, would you be able to infer what algorithm extract_scope uses behind the curtains? I've already asked in the Sublime forums and nobody has been able to provide a positive answer, so I'm asking you as I'm aware you've got quite a deep knowledge of this subject... crossing fingers :)

Ps. I've tried to extract the asm code from this particular function with the debugger but I haven't been able to even find it... no clue if it's living in plugin_host.exe or sublime_text.exe... hehe ;)

Thanks in advance.

@gamecreature
Member

I don't know exactly what Sublime uses.
But edbee searches for 'multi-line scopes' with a start and end regular expression.
It finds single-line scopes by applying the scopes in a given context (regexps can be conditional on the active scopes).

What Sublime returns in the example above (column 3) are just the multi-line scope ranges.
As you can see, the first character of the comment is also a 'punctuation.definition.comment.python'.
The range for that scope is (0,0)

@brupelo
Contributor Author

brupelo commented Mar 14, 2019

Mmm, not sure if I've understood correctly... Do you mean that the exact equivalent of Sublime's extract_scope is what you call multi-line scopes? What do you mean by the range of that scope being (0,0)? In Sublime, pt 0 gives (0,16).

Anyway, the idea is to use https://github.com/brupelo/pysyntect to mimic Sublime's extract_scope. At this point I know how to get column 5 and I know how to apply selectors... Would that be enough to implement extract_scope? If you could give more details about how to implement it, that would be great. It's the only missing function for me to have a 1:1 equivalent to Sublime's toggle comments on a QScintilla widget, as I've already reverse engineered the rest of the ST functions, i.e. view.insert, view.erase and view.replace ;-)

@brupelo
Contributor Author

brupelo commented Mar 14, 2019

Of course, once I've confirmed it works on QScintilla we could adapt the code to edbee. Toggle comments is one of the most important features a text editor should have, and Sublime's works fantastically well.

@gamecreature
Member

What I see is that extract_scope returns the scope for every character (offset in the document).

It is not efficient, but you could fetch the scopes at every character (pseudo code):

foreach(offset in document) {
  document->scopes()-> scopesAtOffset(offset);
}

I guess it's more efficient to fetch all scopes and fill in the characters yourself.
[Ctrl + Shift + X, S] dumps the scopes (on Mac OS X, [Command + Shift + X, S]).

The DebugCommand.cpp file contains the code which dumps the scopes

https://github.com/edbee/edbee-lib/blob/master/edbee-lib/edbee/commands/debugcommand.cpp

I don't know if this is the answer to your question.
The TextDocumentScopes class contains all scopes that have been parsed
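The "fetch all scopes and fill the characters yourself" idea could be sketched in Python like this. It is a minimal sketch under the assumption that all parsed scopes are available as (start, end, name) tuples with an exclusive end and that they nest properly; `fill_scope_names` and the sample `RANGES` are hypothetical names, not edbee or Sublime API.

```python
from typing import List, Tuple

ScopeRange = Tuple[int, int, str]


def fill_scope_names(length: int, ranges: List[ScopeRange]) -> List[str]:
    """Build the scope-name string for every offset in one pass over the ranges."""
    names = ["" for _ in range(length)]
    # Outer ranges first, so each string lists scopes from outermost to innermost.
    for start, end, name in sorted(ranges, key=lambda r: (r[0], -r[1])):
        for i in range(max(start, 0), min(end, length)):
            names[i] += name + " "
    return names


# Sample data, made up for illustration.
RANGES = [
    (0, 54, "source.python"),
    (0, 16, "comment.line.number-sign.python"),
    (0, 1, "punctuation.definition.comment.python"),
]
NAMES = fill_scope_names(54, RANGES)
```

Each entry keeps a trailing space, which is the same shape as the strings in the 5th column of the table earlier in the thread.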

@brupelo
Contributor Author

brupelo commented Mar 14, 2019

Mmm, hehe, I guess my poor English and the fact that I'm texting from my phone make it difficult to understand me. About the explanation in your previous comment: the name of the scope on each character is something I already know how to get, I've got that information (5th column) at my disposal. What I don't know how to compute is the 3rd column.

Extract_scope receives either a position or a range as input and returns a range as output. Range in this context is just a tuple of 2 integers, start and end offsets. Hopefully now my question makes more sense ;)

@gamecreature
Member

gamecreature commented Mar 14, 2019

The range displayed there seems to be the range of the last active multi-line scope.
(In this example, the 'comment.line.number-sign.python' scope, which spans (0, 15), with 16 exclusive.)

In edbee, the last scope of:

 document->scopes()->multiLineScopedRangesBetweenOffsets(0,0)
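That guess can be written down as a small Python sketch. To be clear, this only encodes the hypothesis from this thread (take the innermost multi-line range that contains the point); it is not Sublime's actual algorithm, and it has not been checked against every row of the table above. `extract_scope_guess` and the sample ranges are hypothetical names and data.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]


def extract_scope_guess(multi_line_ranges: List[Range], point: int) -> Optional[Range]:
    """Return the smallest (start, end) range containing `point`, end exclusive."""
    best = None
    for start, end in multi_line_ranges:
        if start <= point < end:
            if best is None or (end - start) < (best[1] - best[0]):
                best = (start, end)
    return best


# Assumed multi-line ranges for foo.py: whole source, the comment, the string.
RANGES = [(0, 54), (0, 16), (38, 52)]
```

With these assumed ranges, offsets inside the comment give (0, 16), offsets inside the string give (38, 52), and everything else falls back to the whole-file range (0, 54).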

@brupelo
Contributor Author

brupelo commented Mar 14, 2019

Mmm, is that so? Wow, cool... Maybe I should re-evaluate this project then... About the other feature, multi-selection with the mouse: how hard do you think it would be to have it ready? Asking because maybe it's worth implementing toggle comment directly on this project and not on QScintilla...

@brupelo
Contributor Author

brupelo commented Mar 25, 2019

I haven't implemented extract_scope in pyblime yet, but theoretically, once I figure out how to do it, I'll be able to use Sublime's existing code directly and it will work out of the box.

Btw, I've read somewhere in your blog few days ago you liked the demoscene... was one of the main reasons you created edbee so you could use it in some demotool/tool intended for 3d graphics? I'm quite curious about it actually :)

Let me tell you I'm a scener myself and one of the things I love the most is making demotools... creating 3d tools using python and c++ is a really enjoyable and fun experience... highly recommended :)

@gamecreature
Member

It wasn't directly written for the demoscene. But at this moment we use edbee for our own demo-tool...

image

@brupelo
Contributor Author

brupelo commented Apr 1, 2019

Awesome! Cool stuff ;)

Let me tell you I'm a total geek about demotools (especially the ones intended to create 4k/64k) and I know almost every existing tool in the scene... I'm curious about that texture/mesh editor, do you use nodes?

In my case, I'll also use pyblime as a drop-in widget replacement eventually for some of my tools ;)
