Highlighting of SQL inside strings #27

Open
elpres opened this issue Nov 10, 2015 · 16 comments

Comments

@elpres

elpres commented Nov 10, 2015

The stock Python package performs syntax highlighting inside strings containing SQL commands. That's rather useful, even though the implementation might need some more work. In this example:

create = 'CREATE TABLE test (id INT, name TEXT);'
insert = 'INSERT INTO test VALUES (1, "Bob")'

in the second line, words recognized as SQL are highlighted, but not in the first one, where "CREATE TABLE" has the same color as the non-reserved names.

(Actually, as you can see, even Github's Markdown does it, with exactly the same behavior, so perhaps something is wrong with the first line, although it does execute properly)

Do you consider this a useful addition?

@vpetrovykh
Member

Could you specify whether you use Sublime Text or Atom?
The philosophy of MagicPython has been to actually highlight various Python features (i.e. %- and {}-style formatting, regular expressions and docstrings), rather than trying to merge and blend Python with other languages. There are really just too many possibilities there: Python strings can easily be SQL, HTML, XML, JavaScript, not to mention Django or Jinja templates as well as TeX. So usually the best we can do is to try and not highlight highly questionable things (see #28, #23, #12).
There is also a fundamental problem with highlighting other languages inside strings: it's very easy to break the highlighting of an entire source file by an incomplete expression inside a string. Try this in the stock package, I believe it should break the highlighter after the regexp string:

foo = spam(a=1)
bar = r'''regexp['''
baz = spam(a=1)

So far, the above issue seems to have some kind of a generic solution in Sublime Text, but not in Atom.
We have some plans for making user-configurable options for what gets actually highlighted inside the strings. As a configurable option it might be useful to highlight SQL, although at the moment I'm not certain how reliable we could make it, so it wouldn't be a priority for now.
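To illustrate why the string in the snippet above is troublesome: its content is valid Python but an incomplete regular expression, so an embedded regexp grammar that opens a character class at `[` finds no closing `]` to stop on. The sketch below uses Python's own `re` module as a stand-in for an editor grammar, which is only an approximation; the helper name `is_complete_regex` is hypothetical.

```python
import re


def is_complete_regex(pattern: str) -> bool:
    """Return True if `pattern` compiles as a Python regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Content of the raw string from the example: the character class never closes.
print(is_complete_regex("regexp["))
# The same pattern with a closed character class compiles.
print(is_complete_regex("regexp[a-z]"))
```

A highlighter cannot raise an error the way `re.compile` does; it simply keeps looking for the end of the construct, which is how the damage can spill past the closing quotes.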

@DanielOaks

In terms of knowing what sort of string it is, one thing I've seen work decently is to only highlight SQL when it's fully uppercased. Just the keywords and phrases, like CREATE TABLE, INSERT INTO and such, since they tend to not come up much outside of specifically SQL statements.

But however you want to handle it. It would be very nice to have, if it's possible at some point.
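The uppercase-only heuristic described above could look roughly like this. It is a minimal sketch, not how any editor grammar actually does it; the keyword list and the name `looks_like_sql` are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately short keyword list; matching is case-sensitive,
# so only fully uppercased keywords and phrases count.
SQL_KEYWORDS = re.compile(
    r"\b(?:CREATE TABLE|INSERT INTO|SELECT|UPDATE|DELETE FROM)\b"
)


def looks_like_sql(text: str) -> bool:
    """Return True if `text` contains an uppercased SQL keyword or phrase."""
    return SQL_KEYWORDS.search(text) is not None


print(looks_like_sql('INSERT INTO test VALUES (1, "Bob")'))
print(looks_like_sql("Please select an item to update"))
```

Ordinary prose that happens to contain "select" or "update" in lowercase is not flagged, which is the point of the heuristic; SQL written in lowercase would be missed as well.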

@1st1 1st1 added the question label Nov 11, 2015
@elpres
Author

elpres commented Nov 11, 2015

> Could you specify whether you use Sublime Text or Atom?

Sublime Text, and you are correct about that code fragment breaking the syntax highlighter there. Since both examples we've posted show the same irregularities in Sublime and here, on Github, it's probably safe to assume that they both delegate highlighting to the same 3rd party, maybe Pygments, and there is no special handling for these languages embedded in strings built into Sublime.

And I see that this is a tricky thing to implement, but it also does add a lot to the readability of the code if you use SQL (or any of those other languages). So it comes with costs and benefits, and it's your call whether the benefits are worth it.

@vpetrovykh
Member

Hmm... I've looked at the default SQL grammar that both Sublime Text and Atom seem to be using. The good news is that this particular grammar would not cause problems from within python strings (or, indeed, any other language strings that are delimited with quotes). So including SQL highlighting as a user-configurable option could be possible. Stay tuned.

@infininight

@elprans Github uses TextMate-style grammars for highlighting, see github/linguist. For python they use this grammar, MagicPython.

@vpetrovykh One issue with matching SQL inside a string can be begin/end matches matching outside the string boundaries, for example:

'SELECT * FROM "table" -- comment'

I'm looking at solving this in TextMate by matching either the string content (for single line strings) or each line first then passing it off to the SQL grammar in a patterns array in the capture. Unfortunately I believe this feature is only available in Atom not Sublime Text. Unsure what form of grammars Visual Studio Code handles.
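The boundary problem and the "match the string content first" idea can be sketched in plain Python. This is a toy model with two hand-written regexes, not a TextMate grammar; both patterns and all variable names are assumptions made for illustration.

```python
import re

# Toy SQL rule: a line comment runs from "--" to the end of the line.
SQL_LINE_COMMENT = re.compile(r"--.*$")
# Toy Python rule: a single-quoted string, capturing its content.
PY_SINGLE_QUOTED = re.compile(r"'([^'\\]*(?:\\.[^'\\]*)*)'")

source_line = "query = 'SELECT * FROM \"table\" -- comment'"

# Naive approach: run the SQL rule on the raw source line.
# The comment match swallows the closing quote of the Python string.
naive = SQL_LINE_COMMENT.search(source_line).group(0)

# Bounded approach: isolate the string content first, then run the
# SQL rule only inside that content.
content = PY_SINGLE_QUOTED.search(source_line).group(1)
bounded = SQL_LINE_COMMENT.search(content).group(0)

print(repr(naive))
print(repr(bounded))
```

In the naive case the SQL comment rule consumes the closing quote, so the string never appears to end; in the bounded case the SQL rule cannot see past the text it was handed.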

@vpetrovykh
Member

@infininight I understand that generally falling back onto an external grammar can cause issues with string boundaries. It seems that at the moment the only reliable way to use the same basic grammar across Atom, Sublime Text and Visual Studio Code is to include an SQL grammar that is safe to use inside strings (much like we do with regular expressions). This would have to be optional, though, as detecting SQL may cause annoying false positives for all users.

@gtalarico

Sublime's default Python syntax seems to do a good job knowing when to highlight SQL keywords.

[Screenshots: Default Python Tests, Default Python, Magic Python]

@movalex

movalex commented Sep 3, 2018

Sublime uses a not very sophisticated SQL indicator for triple-quote highlighting:
\s*(?:SELECT|INSERT|UPDATE|DELETE|CREATE|REPLACE|ALTER|WITH)\b

Is it possible to implement this in MagicPython? Seems like a good feature, and used both in Atom and Sublime natively (does not work in VSCode).

    # Triple-quoted raw string, unicode or not, will detect SQL, otherwise regex
    - match: '([uU]?r)(""")'
      captures:
        1: storage.type.string.python
        2: meta.string.python string.quoted.double.block.python punctuation.definition.string.begin.python
      push:
        - meta_content_scope: meta.string.python string.quoted.double.block.python
        - match: '(?={{sql_indicator}})'
          set:
            - meta_scope: meta.string.python string.quoted.double.block.python
            - match: '"""'
              scope: punctuation.definition.string.end.python
              set: after-expression
            - match: ''
              push: scope:source.sql
              with_prototype:
                - match: '(?=""")'
                  pop: true
                - include: escaped-unicode-char
                - include: constant-placeholder
        - match: '(?=\S)'
          set:
            - meta_scope: meta.string.python string.quoted.double.block.python
            - match: '"""'
              scope: punctuation.definition.string.end.python
              set: after-expression
            - match: ''
              push: scope:source.regexp.python
              with_prototype:
                - match: '(?=""")'
                  pop: true
                - include: escaped-unicode-char
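The indicator pattern quoted above can be tried out in Python to see which string bodies it would treat as SQL. This is only an approximation: Sublime matches line by line with its own regex engine, whereas this sketch applies the pattern once to the whole string body; the name `starts_like_sql` is hypothetical.

```python
import re

# The indicator pattern as quoted in the comment above.
SQL_INDICATOR = re.compile(
    r"\s*(?:SELECT|INSERT|UPDATE|DELETE|CREATE|REPLACE|ALTER|WITH)\b"
)


def starts_like_sql(string_body: str) -> bool:
    """Return True if the text after the opening quotes begins with an SQL keyword."""
    return SQL_INDICATOR.match(string_body) is not None


print(starts_like_sql("\n    SELECT id FROM test"))  # leading whitespace is allowed
print(starts_like_sql("select id from test"))        # lowercase is not detected
print(starts_like_sql("WITHOUT a keyword"))          # \b rejects longer words
print(starts_like_sql(r"^\d+$"))                     # falls through to regexp
```

Because the check only looks at the first keyword and is case-sensitive, it is cheap but easy to fool in both directions, which matches the "not very sophisticated" description.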

@pauliusbaranauskas

This would help my daily work a lot, since I have to use sql in pyspark. My text editor of choice is VSCode.

@ma7555

ma7555 commented Sep 3, 2019

Is this issue dead?

@1st1
Member

1st1 commented Sep 3, 2019

Well, we have no plans for adding SQL highlighting. There are many cons/pros about this feature, see the above discussion.

@selik

selik commented Nov 22, 2019

@1st1 That's one of the first features I missed when trying out VSCode. I'm pretty happy with the way it works in Sublime Text.

@AlJohri

AlJohri commented Jan 16, 2020

@1st1 is it possible to add a directive comment that will change the syntax highlighting for a multiline string? I'm in a weird situation where I need to write python code within a python script and I wanted to use syntax highlighting.

@task
def get_cookies():

	import textwrap

	# some directive to make the below docstring use python syntax highlighting (or SQL as above)
	script = textwrap.dedent(f"""
	import browser_cookie3
	cj = browser_cookie3.chrome()
	cookies1 = cj._cookies['domain1']['/']
	cookies2 = cj._cookies['.domain2']['/']
	print('VALUE1', cookies1['value1'].value)
	print('VALUE2', cookies1['value2'].value)
	print('VALUE3', cookies2['value3'].value)
	""").strip()

	shell(f"""
		python3 -m venv .venv &&
		.venv/bin/pip install browser_cookie3 &&
		.venv/bin/python -c "{script}"
	""")

@sairam4123

I have the same problem in VSCode. Can anyone fix the issue?

@1st1
Member

1st1 commented Jul 6, 2020

> @1st1 is it possible to add a directive comment that will change the syntax highlighting for a multiline string? I'm in a weird situation where I need to write python code within a python script and I wanted to use syntax highlighting.

Unfortunately not: highlighters are very limited in what they see / how they can act :( They are basically big regular expressions without any means to do any code analysis (even a rudimentary one).

@munro

munro commented Sep 28, 2021

FWIW I'm currently using the python-string-sql extension, but it's a bit annoying that I have to litter my code with --sql & --end-sql to get highlighting. The issue with this extension is that I have to add --end-sql; otherwise all the Python code after the string ends is broken.

https://marketplace.visualstudio.com/items?itemName=ptweir.python-string-sql

Also, as another option, PyCharm was nice because I could explicitly set the highlighting with # language=SQL before my string, and it would highlight the code inside the string appropriately.

I don't mind explicitly setting the syntax highlighting with a comment—though auto detect would be nice—my main gripe is having to add --end-sql at the end with python-string-sql.
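For comparison, the two explicit styles mentioned above might look like this in a script. This is a sketch based only on the descriptions in this comment: the exact marker placement that python-string-sql expects is an assumption, and the table and values are borrowed from the example at the top of the thread. Both markers are ordinary comments (a Python comment and SQL comments), so the code runs unchanged with the standard `sqlite3` module.

```python
import sqlite3

# PyCharm style: a comment before the string names the embedded language.
# language=SQL
create = "CREATE TABLE test (id INT, name TEXT)"

# python-string-sql style (assumed placement): SQL comments inside the
# string mark where highlighting starts and where it must end.
insert = """--sql
INSERT INTO test VALUES (1, 'Bob')
--end-sql"""

conn = sqlite3.connect(":memory:")
conn.execute(create)
conn.execute(insert)
rows = conn.execute("SELECT id, name FROM test").fetchall()
print(rows)
conn.close()
```

The comment-before-the-string form leaves the SQL text itself untouched, while the in-string markers become part of the query that is sent to the database.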
