Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Custom Minimap Section Header Marker Detection RegExp ✨ #210271

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

pouyakary
Copy link
Contributor

@pouyakary pouyakary commented Apr 12, 2024

This fixes #209904

This PR adds the ability to define custom regexes for finding the section markers. This not only makes possible to change the MARK: to say Title:, but also to define any custom style. For example the Comment 6's Title Comment:

A note on my implementation was at first I wanted to have an array of regexes and support multiple templates. But then I was afraid of both how much chaos this can bring to projects and how much computationally expensive it would be. So I intentionally limited this for uniformity in the codebase and keeping the performance acceptable.

Also, I think by having this we can eliminate the mark/region boolean checkpoints in the settings and just have one regexp option. (or support some predefined enums like mark and region that automatically translate to their regexes`)

@Die4Ever
Copy link

Die4Ever commented Apr 12, 2024

would this be something you could adjust in workspace's settings.json? because I would love to just do

function\\s*(.*)$

@gjsjohnmurray
Copy link
Contributor

@Die4Ever at the moment I think the match must be in a comment

@pouyakary
Copy link
Contributor Author

@Die4Ever yeah that is it exactly

@pouyakary pouyakary changed the title Custom Minimap Section Header Marker Detection RegExp Feat: Custom Minimap Section Header Marker Detection RegExp ✨ Apr 13, 2024
@pouyakary
Copy link
Contributor Author

@gjsjohnmurray thanks a lot for the review and acceleration of this PR!

@pouyakary
Copy link
Contributor Author

A note on my implementation was at first I wanted to have an array of regexes and support multiple templates. But then I was afraid of both how much chaos this can bring to projects and how much computationally expensive it would be. So I intentionally limited this for uniformity in the codebase and keeping the performance acceptable.

@Keavon
Copy link

Keavon commented Apr 14, 2024

Question: does this support multi-line regular expressions? I often use the style

// =======
// FOO BAR
// =======

So I'd like to be able to use an expression like ^===+$\n^.*$\n^===+$ to detect at least three equal signs, then anything on the next line, then another three-or-more equal signs on the third line.

@pouyakary
Copy link
Contributor Author

pouyakary commented Apr 15, 2024

@Keavon sadly no. The current implementation feeds one line at a time to the regexp. And for performance reasons I kept the architecture the same. What I think ultimately must be done is for the section headers to be defined through API as well as this regexes so that language extensions can define their own. This way for example I can define them in my own extension and you can develop your own extension where a speciality tool scans the codes and finds these. Reading your comment, I though about workarounds like feeding 3 lines at a time to the regexp, but then it becomes too complicated for users to define.

All that being said, I think the best solution to this problem is still the way Xcode has done this. Me and you we put characters representing a line in the code so that the separation is visible. However in Xcode the editor applies an actual line decoration when it detects the hyphen after the MARK: sign:

I think the one line solution is performance-wise perfection and vscode must adopt the line decoration (If I find a way to do it I certainly will make the decoration in a future PR). Currently I have created the regexp in away that it takes two match groups labeled label and separator. So for example the mark can be written as:

/MARK:\s*(?<separator>-\s?)\s*(?<label>.+)$/d

Here if the match group separator matches anything but an empty string, we conclude that the line must be rendered. And I hope this information be used to render the decoration.

@Keavon
Copy link

Keavon commented Apr 15, 2024

I figured that'd be the case, no worries. Just thought I'd ask. I think the simplicity and performance tradeoff makes the one-line solution worth it.

@pouyakary
Copy link
Contributor Author

@Keavon well it was really nice of you to do so, and I'm happy I met you, I loved what you're doing with Graphite. For the time being, I'm going to wait to see how someone from the vscode team will respond. If the idea of extensions marking the titles are considered, then it will be the best solution, otherwise maybe we can configure the system to detect some mark in the regexp as a linebreak and then according to that we can change the number of lines fed into the regexp. Maybe that be the second best solution that also suits your needs.

@pouyakary
Copy link
Contributor Author

@alexdima can you please check this out?

@bhallionOhbibi
Copy link

system to detect some mark in the regexp as a linebreak and then according to that we can change the number of lines fed into the regexp

It would be so great to enable multi-line support of some sort via the regex.
Usually thoses kind of marks are multi-line for lack of better solution to make them standout.
And while the MARK: is no ideal, being forced to rely on only one line will also be quite limiting (imo).

@pouyakary
Copy link
Contributor Author

pouyakary commented May 18, 2024

@bhallionOhbibi I agree. Unfortunately there has not been any reaction from the vscode team and this issue has either been ignored or lost. So I'm waiting for an official reaction to see if they are okay with this and if I should peruse the multiline solution.

However for what it is worth I believe that (in my experience) regular expressions are hard for even the average developer, and specifying the multiline regexp will become so hard to even debug. two solutions come to my mind:

  • Array of Regular Expressions • for each line. If the line we are parsing is tested with the first regexp we accept the line and test the next line with the second regexp and so on. it will be both performant and easier to understand. But then I have to be sure that vscode team will not kill effort to continue and make it so.

  • API for ExtensionstmLanguage file are one of the most brilliant breakthroughs of the text editor era. It made possible to have precise token colors without a full fledged language-specific parser working. And it also meant that the logic for the language grammar could be shared across editors. Yet, it was too complex to be written. Things like treesitter happened and they were all right, but the killer feature was when vscode API made possible for the actual language server to specify tokens. And about this, the regexp feels like the tmLanguage. I think it would be a more better if you could make your title style an extension and then anyone who wishes to have it will install it. Simple. And then there can be an architecture for the extension to make it easier to debug and more performant. (like lazy evaluation and partial evaluation, but those can be applied to the regular expressions solution as well).

Ultimately, we have to wait to get a response from the vscode team and see what should be done (or even if it should be done)

@Keavon
Copy link

Keavon commented May 18, 2024

I really like the idea of the array of regular expressions where it passing one lets it move to the next and so on. But yeah, hopefully this gets attention, otherwise it's not worth your time if they abandon this.

@Die4Ever
Copy link

Die4Ever commented Jun 5, 2024

@Die4Ever at the moment I think the match must be in a comment

Could it be made to work outside of comments? In some languages you could easily make a section header for every function

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow the new minimap section headers to use configurable tags
6 participants