Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lexer SQL to use whitespace token - regarding #1905 #1908

Merged
merged 4 commits into from Oct 6, 2021

Conversation

blu-base
Copy link
Contributor

@blu-base blu-base commented Oct 4, 2021

This PR specifes whitespace-only matches with the respective Text.Whitespace token.
It is submitted in the context of #1905. The changes are minimal, and related test files have been adapted.


Anyhow, I noticed a possible inconsitency and would like a second opinion on how to deal with it.

While the following regex does not include the trailing white space, the sqlite prompt does.

re_prompt = re.compile(r'^(\S.*?)??[=\-\(\$\'\"][#>]')

for match in line_re.finditer(data):
line = match.group()
if line.startswith('sqlite> ') or line.startswith(' ...> '):
insertions.append((len(curcode),
[(0, Generic.Prompt, line[:8])]))
curcode += line[8:]
else:
if curcode:
yield from do_insertions(insertions,
sql.get_tokens_unprocessed(curcode))
curcode = ''
insertions = []
if line.startswith('SQL error: '):
yield (match.start(), Generic.Traceback, line)
else:
yield (match.start(), Generic.Output, line)

In this state there are Prompt Token lexed such as tests/examplefiles/psql/psql_session.txt.output line 27 (matched by the re_prompt regex) which are trailed by a Whitespace token. However, the sqlite> prompt does include the trailing space in the token group (matched via the startswith expressions). The latter has an example in tests/examplefiles/sqlite3/sqlite3.sqlite3-console.output line 98 (at the end of the diff)

@Anteru
Copy link
Collaborator

Anteru commented Oct 4, 2021

Thanks for preparing this change! On the SQLite prompt: I'd like to see it produce whitespace and not put it into the prompt itself, similar to the PgSQL prompt. I assume if you add manual whitespace after the prompt it will actually highlight it as whitespace? In which case all whitespace should have been marked as such.

@blu-base
Copy link
Contributor Author

blu-base commented Oct 6, 2021

The two additional commits deal with the sqlite prompt, and complete this PR from my perspective EDIT see below.

Thanks to the well prepared test file, i caught a bug I'd introduce with the second commit only. Leaving the trailing whitespace "in the open" for further lexing, it could result in grouping the space with other long running tokens, such as string literals in the test file.

@blu-base
Copy link
Contributor Author

blu-base commented Oct 6, 2021

OK! Now the sqlite diff looks much more as expected.

This should complete it.

@Anteru Anteru self-assigned this Oct 6, 2021
@Anteru Anteru added the changelog-update Items which need to get mentioned in the changelog label Oct 6, 2021
@Anteru Anteru merged commit 0ef9b78 into pygments:master Oct 6, 2021
@Anteru
Copy link
Collaborator

Anteru commented Oct 6, 2021

Merged, thanks a lot! Output looks indeed quite nice!

@blu-base blu-base deleted the sql_whitespaces branch October 8, 2021 22:00
@Anteru Anteru removed the changelog-update Items which need to get mentioned in the changelog label Dec 12, 2021
@Anteru Anteru added this to the 2.11.0 milestone Dec 12, 2021
@Anteru Anteru linked an issue Dec 12, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WIP - Whitespace lexing after Keywords - collecting issue
2 participants