Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scintillua #916

Closed
wants to merge 0 commits into from
Closed

Scintillua #916

wants to merge 0 commits into from

Conversation

moesasji
Copy link
Contributor

This is a first stab at bringing the lexers in sync with upstream. What has been done:

  1. all lexers unchanged in vis since last sync updated.
  2. added upstream lexers not present in vis.
  3. where patches were obvious the lexer was updated with the changes pushed upstream as pull-requests.
  4. opened a pull-request for all lexers in vis not in upstream, but left them in the legacy format for now.

There are still a number of lexers to do, but those are not that obvious or the commits I see in vis seems to have removed functionality. I'll add a list of those to the bug-report on this.

Copy link
Owner

@martanne martanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this!

A few general/administrative comments:

  • please use one commit per lexer to import the upstream version
  • then apply commits per lexer with our logically grouped changes on top of it
  • start the commit message with lexers/$lang: followed by a short description

More concretely, this would result into commit history similar to this:

  • lexers: update copyright dates
  • lexers/actionscript: import upstream revision 09e0ff0
  • lexers/ada: import upstream revision 09e0ff0
  • ...
  • lexers/jq: new upstream lexer revision 09e0ff0
  • ...
  • lexers/elixir: import upstream revision 09e0ff0 in the long description you can mention that the white space fix is already present.
  • lexers/perl: import upstream revision 09e0ff0
  • lexers/perl: fix long regex patterns preserve the long description from the original vis commit
  • ...

I hope you get the idea. This allows to more easily revert language specific changes and incorporate changes once merged upstream.

Thanks again!

@moesasji
Copy link
Contributor Author

I get the idea and now working on cleaning this up made easier by @orbitalquark already merging the pull-requests that made sense.

Copy link
Owner

@martanne martanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moving into the right direction, thanks! I added a minor style related inline comment.

We will eventually have to update the lexer README with updated references to the new API and lexer template.

There are still a number of lexers to do, but those are not that obvious or the commits I see in vis seems to have removed functionality. I'll add a list of those to the bug-report on this.

Please provide this list. The removed functionality is likely due to performance issues (e.g. #797, #726), maybe upstream also has suggestions how to improve it. In general the same methodology, i.e. first committing the upstream version followed by our own changes on top, should be applicable.

Thanks again.

lua/plugins/filetype.lua Outdated Show resolved Hide resolved
@moesasji
Copy link
Contributor Author

moesasji commented Dec 20, 2020

We will eventually have to update the lexer README with updated references to the new API and lexer template.

Agreed, just left that one for now.

Please provide this list. The removed functionality is likely due to performance issues (e.g. #797, #726), maybe upstream also has suggestions how to improve it. In general the same methodology, i.e. first committing the upstream version followed by our own changes on top, should be applicable.

Most of the ones that I haven't yet touched are ones that required a bit more thought as I opted to do the easy ones first. The list of files that still need a look with comments added:

  • README.md
  • ansi_c.lua (at first glance changes didn't look compatible)
  • bash.lua (more than just a couple)
  • erlang.lua (vis commit removes a lot and appears wrong)
  • fennel.lua (not the same origin - vis version seems more extensive)
  • html.lua (performance fix)
  • markdown.lua (not sure how vis changes fit with upstream)
  • php.lua (more than just a couple, not an easy one)
  • scheme.lua (more than just a couple, not an easy one)
  • text.lua (upstream empty, vis whitespace only)
  • toml.lua (initially didn't see why indent error got removed)
  • wsf.lua (performance fix)
  • xml.lua (performance fix)
  • yaml.lua (performance fix)

btw) the ones labeled performance fix should be not be difficult, but I just left them out in my first stab at it.

@ghost
Copy link

ghost commented Dec 21, 2020

  • fennel.lua (not the same origin - vis version seems more extensive)

Vis version is not more extensive, it's just that the vis copy of Scintillua does not support inheritance.
Upstream Scintillua made it possible lexers to inherit from each other, in order to avoid duplication:

fennel.lua:9

@martanne
Copy link
Owner

  • bash.lua (more than just a couple)

The most recent changes are related to here documents, @silasdb can probably make sure they work as expected with the upstream version? The commit message of 21727ae also gives an example with variable substitution.

  • erlang.lua (vis commit removes a lot and appears wrong)

I am not exactly sure what the intention was in #632, the description is very vague. Maybe @emfa can clarify, until then drop the changes and go with the upstream version.

  • markdown.lua (not sure how vis changes fit with upstream)

Check whether #823 still happens with the upstream version. (Not sure whether it actually works correctly in vis either).

  • php.lua (more than just a couple, not an easy one)

The issue fixed in 0ff02a7 is that ?> should break out of PHP mode. The other changes just add constants and keywords, it is probably best to fetch the latest ones from the the documentation and incorporate them into the upstream version.

  • scheme.lua (more than just a couple, not an easy one)

6455498 has descriptive commit message, maybe @ii8 would be willing to update it to the new upstream API

  • text.lua (upstream empty, vis whitespace only)

This is such that replacement symbols are visible, see 807ba6f. It is not relevant for upstream. We should just convert it to the new API.

  • toml.lua (initially didn't see why indent error got removed)

This might have been a misunderstanding on my part.

@moesasji
Copy link
Contributor Author

  • text.lua (upstream empty, vis whitespace only)

This is such that replacement symbols are visible, see 807ba6f. It is not relevant for upstream. We should just convert it to the new API.

  • toml.lua (initially didn't see why indent error got removed)

This might have been a misunderstanding on my part.

To prevent double work. I've opened pull request on these and several other ones, see: upstream That would tell whether or not they apply to upstream.

@silasdb
Copy link
Contributor

silasdb commented Dec 21, 2020

Hi! Is it a better practice to send lexers PRs to upstream instead of sending them to vis?

@martanne
Copy link
Owner

Is it a better practice to send lexers PRs to upstream instead of sending them to vis?

In principle sending them upstream is preferable. However, this assumes both projects use the same format which is not yet the case, but is the goal of this PR. Hence, if you could make sure that the upstream bash.lua works as expected and contains all the necessary fixes that would be appreciated.

@moesasji
Copy link
Contributor Author

I'm hitting a bit of a glitch with the upstream markdown lexer. Out of the box the upstream version fails to render the headers in red and none of the patches that are applied to the legacy version in vis would affect that behavior.

As the code in the lexer looks fine to me....is this problem caused at this end, i.e. did we miss something in the changes made in vis.lua to be able to switch away from the legacy lexer?

@ghost
Copy link

ghost commented Dec 21, 2020

Hi @moesasji!

I think the glitch is due to the Markdown lexer using themeable colors (as opposed to hard-coded ones):
markdown.lua:19
markdown.lua:54

Some themes have a colors table defined and exported - solarized.lua:23, but the default one doesn't.
(And even solarized.lua doesn't currently have the black color for the hr rule above.)
If you do :set theme solarized, you'll see that the headers become red.

For a quick test, you can try replacing lexer.colors.red with 'red'.
But in the long term, I think it's better to add a colors table to all the themes, so vis could use unmodified lexers from Scintillua.

@martanne
Copy link
Owner

It should be fixed by abee985?

@moesasji
Copy link
Contributor Author

It should be fixed by abee985?

That indeed fixes the issue for me. Markdown headers have the right color again. Thanks for the quick fix and @lepotic for spotting it!

btw) with this being a subset probably worth keeping fixing the color theme on the todo-list to get the colors one would expect.

@ghost
Copy link

ghost commented Dec 21, 2020

It should be fixed by abee985?

Yes, and it's better than what I suggested (duplicating the table in each theme). :)

@moesasji
Copy link
Contributor Author

Not sure what I did wrong to trigger build failures for this recent sets of commits. Any thoughts on where the mistake is would be greatly appreciated?

In any case this should be getting closer. Things left to do:

Copy link
Owner

@martanne martanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked yet what might be wrong, maybe it is related to the text lexer which is used as a default?

Could you please drop your merge commit and instead rebase the changes on top of my latest scintillua branch.

Also some of the latest commit messages related to the performance improvements could probably be improved a bit.

Thanks again!

local l = require('lexer')
local lexer = require('lexer')
local token, word_match = lexer.token, lexer.word_match
local P, S = lpeg.P, lpeg.S
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove all the unnecessary code here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I stuck to the default part of the template, but will take a look at what I can drop from it.

  2. I thought I had rebased on top of your scintillua branch, but clearly need to 'fight' a bit more with git.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the changes requested and also split the push in two parts, which (after a long wait) indeed shows that modifying the text lexer to the new format is what triggers the build failures.

I can retry without the white space patch, but that will probably make no difference?

local lex = lexer.new('text')

-- Whitespace
lex.add_rule('whitespace', lexer.token(lexer.WHITESPACE, lexer.space^1))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moesasji it has to be lex:add_rule (a colon, not a dot).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The joy of being far too used to using C based languages. Thanks for spotting this typo!

@ghost
Copy link

ghost commented Dec 22, 2020

I found where the Markdown lexer goes wrong, but don't have a solution yet.
Actually, it's not the lexer's fault (literally the same lexer works fine in Textadept), more like the vis integration is missing something.

If you do

-lex:add_style('code', lexer.styles.embedded .. {eolfilled = true})
+lex:add_style('code', {back = 'red'})

code blocks will be highlighted.

lexer.styles.embedded is a table with a metatable, adding some magic I don't understand.
There are more lexers that use lexer.styles.* and are probably subtly broken as well.
I'll see what I can do tomorrow.

@moesasji
Copy link
Contributor Author

moesasji commented Dec 22, 2020

Actually, it's not the lexer's fault (literally the same lexer works fine in Textadept), more like the vis integration is missing something.

If you do

-lex:add_style('code', lexer.styles.embedded .. {eolfilled = true})
+lex:add_style('code', {back = 'red'})

code blocks will be highlighted.

It appears to be the same cause as the earlier markdown issues as these styles appear to be defined in the color-themes used for textadept, see for example this one. Still puzzling as this one is set in the default themes except for zenburn

@martanne
Copy link
Owner

The issue is indeed related. Not sure what the best/most elegant way to fix it is. This patch seems to work in my very brief test.

That also suggest we should probably review the (default) themes.

@moesasji
Copy link
Contributor Author

moesasji commented Dec 23, 2020

With the last commits I've pushed we've reached the state I'm pretty much done with the bits I'm confident dealing with. To summarize what is remaining:

  • fix issues with theme templates
  • scheme patch that @ii8 contributed still needs to be applied: upstream issue here
  • ansi_c had a couple of updates not yet applied: upstream issue here
  • bash needs fix for nested variables: upstream issue here
  • bash upstream needs heredoc patches that @silasdb has contributed; but our fix is actually not complete. Upstream issue with a file to test valid heredoc usage here
  • the performance patches don't work as expected upstream. Needs checking once template issues are fixed.

@martanne
Copy link
Owner

martanne commented Dec 28, 2020

Thanks. I rebased once again, incorporating the latest upstream changes to the PHP, jq and text lexers.

I also moved the lexer template back into the README for now.

Another TODO item is to port existing lexers using either deprecated functions or the old style:

I saw that upstream (@orbitalquark) also has an issue tracking the progress.

@eworm-de
Copy link
Collaborator

Was this closed by intention?

@moesasji
Copy link
Contributor Author

moesasji commented Dec 31, 2020

Was this closed by intention?

No, this issue is still being worked on. Github for some reason closes a pull-request when the branches are in sync as I made the mistake to push to my branch after the rebase and I can't reopen it.

@moesasji
Copy link
Contributor Author

moesasji commented Jan 1, 2021

@martanne Did you have any luck figuring out the best way to fix the issue with the new color theming as mentioned here?

@mcepl
Copy link
Contributor

mcepl commented Jul 11, 2022

According to upstream “All lexers except rest have been migrated.”

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants