feat: support basic regex analysers in XML #828

Merged: 1 commit, Aug 21, 2023

alecthomas
Owner

The `<analyse>` element contains a regex to match against the input, and a score that is applied if the pattern matches.

The scores of all matching patterns for a lexer are summed.

Replaces #815, #813 and #826.

@alecthomas merged commit a20cd7e into master on Aug 21, 2023
2 checks passed
@alecthomas deleted the aat/xml-analyse branch on August 21, 2023 at 19:32
@gandarez
Contributor

Why are the scores summed instead of returning early? This won't work unless there is some control flow that lets the developer decide whether a score should be added or not.

For example, C# Aspx:

if csharpAspxAnalyzerPageLanguageRe.MatchString(text) {
	return 0.2
}

if csharpAspxAnalyzerScriptLanguageRe.MatchString(text) {
	return 0.15
}

return 0

@gandarez
Contributor

Pygments ex1 and ex2

@alecthomas
Owner Author

Summing the scores seems like a generally more useful approach to me. For the two PRs you've sent, it makes no sense to exit early: if a file has both `#include <` and `using namespace`, those are strong signals.

If you'd like different behaviour, send a PR. I could see an extra attribute like single="true" being useful.

@gandarez
Contributor

I think it would be better if the attribute lived on the root `analyse` node. My suggestion is to change it like this. What do you think?

<config>
  <analyse single="true">
    <regex value="(?m)^\s*#include &lt;" score="0.1"/>
    <regex value="(?m)^\s*#ifn?def " score="0.1"/>
  </analyse>
</config>

@alecthomas
Owner Author

Yeah great idea!
