added syntest example to run ST syntax tests #44

keith-hall · 2017-03-10T13:18:24Z

This PR adds a new example called "syntest", which will parse and execute ST's "syntax_test_" files.

Currently, there is no way to look up a syntax definition in a SyntaxSet by its original file path, which is how tests specify which syntax definition to use. For now, it just lets syntect choose based on the extension of the test file.

I used the regular "regex" crate to achieve the test line parsing - feel free to replace this with "onig" if you don't want the extra reference - I don't quite have enough experience yet to know how to use it effectively.

Also, if the syntax test files use Windows line endings, it has to replace "\r" with nothing, because otherwise the regular expressions don't match expressions like "$\n?" properly, but perhaps it would be better to update the main parsing module to fix this?
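A minimal sketch of the "\r" workaround described above, using only the standard library. The helper names `normalize_line_endings` and `lines_with_newlines` are hypothetical and are not part of syntect or of this PR; the sketch only shows the idea of handing the parser lines that end in a bare "\n".

```rust
/// Remove carriage returns so that a line taken from a CRLF file ends in "\n".
/// (Hypothetical helper name, for illustration only.)
fn normalize_line_endings(line: &str) -> String {
    line.replace("\r", "")
}

/// Split file contents into lines that keep their trailing "\n",
/// normalizing each one on the way.
fn lines_with_newlines(contents: &str) -> Vec<String> {
    contents
        .split_inclusive('\n')
        .map(normalize_line_endings)
        .collect()
}
```

Each returned line could then be passed to the parser in the same way as a line read from a file with Unix line endings.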

I've checked and it shows success on the XML, HTML and JSON syntax tests. I also tried modifying them to cause a test to fail, and syntest correctly picked that up.

However, it seems to detect a failure in the Haskell syntax test file, and it looks like some meta scopes are being applied multiple times - so I'd appreciate if you could take a look and see if it's a flaw in my syntest logic somewhere or a bug in the parser. Thanks!

cargo run --example syntest testdata/Packages/Haskell/
    Finished debug [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/examples/syntest testdata/Packages/Haskell/`
Testing file testdata/Packages/Haskell/syntax_test_haskell.hs
The test file references syntax definition file: Packages/Haskell/Haskell.sublime-syntax
Assertion selector "- comment.line.double-dash.haskell" from line 6 failed on line 5, column range 0-1 (with text ['2']) has scope [<source.haskell>, <comment.line.double-dash.haskell>, <comment.line.double-dash.haskell>, <comment.line.double-dash.haskell>, <comment.line.double-dash.haskell>, <constant.numeric.haskell>]
Ok(FailedAssertions(1, 422))
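For readers unfamiliar with the test format, here is a rough sketch of how one assertion line could be turned into a column range plus a selector. It uses plain string operations rather than the "regex" crate used in the PR, treats byte offsets as columns (so it assumes ASCII lines), and the `Assertion` type and `parse_assertion` function are hypothetical names, not the code in examples/syntest.rs.

```rust
/// A parsed assertion: the column range on the line under test and the selector text.
#[derive(Debug, PartialEq)]
struct Assertion {
    begin: usize,
    end: usize,
    selector: String,
}

/// Parse a line such as `-- <- - comment.line` or `//  ^^^ meta.class`.
/// Returns None when the line is not an assertion line.
fn parse_assertion(line: &str, comment_token: &str) -> Option<Assertion> {
    let token_col = line.find(comment_token)?;
    // Only whitespace may precede the comment token on a test line.
    if !line[..token_col].trim().is_empty() {
        return None;
    }
    let after_token = token_col + comment_token.len();
    let rest = &line[after_token..];
    let trimmed = rest.trim_start();
    // Column at which the first non-space character after the token sits.
    let offset = after_token + (rest.len() - trimmed.len());

    // `<-` asserts against the column where the comment token starts.
    if let Some(selector) = trimmed.strip_prefix("<-") {
        return Some(Assertion {
            begin: token_col,
            end: token_col + 1,
            selector: selector.trim().to_string(),
        });
    }

    // A run of `^` asserts against the columns the carets occupy.
    let carets = trimmed.chars().take_while(|&c| c == '^').count();
    if carets == 0 {
        return None;
    }
    Some(Assertion {
        begin: offset,
        end: offset + carets,
        selector: trimmed[carets..].trim().to_string(),
    })
}
```

With this sketch, `parse_assertion("-- <- - comment.line.double-dash.haskell", "--")` yields the column range 0-1 and the selector "- comment.line.double-dash.haskell", which lines up with the failure reported in the output above.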

keith-hall · 2017-03-10T13:53:31Z

I vote for something fishy in easy::ScopeRegionIterator, because if I do println!("{:?}", &ops); after state.parse_line, I get what I would expect, but when I println!("{:?}", op); inside the loop, I am missing the last pop:

[(0, Push(<source.haskell>)), (0, Push(<comment.line.double-dash.haskell>)), (0, Push(<punctuation.definition.comment.haskell>)), (2, Pop(1)), (57, Pop(1))]
Noop
Push(<source.haskell>)
Push(<comment.line.double-dash.haskell>)
Push(<punctuation.definition.comment.haskell>)
Pop(1)
[]

keith-hall · 2017-03-10T15:09:03Z

The following change seems to help:

--- syntect-master/src/easy.rs
+++ syntect-syntest/src/easy.rs
@@ -148,10 +148,9 @@
 impl<'a> Iterator for ScopeRegionIterator<'a> {
     type Item = (&'a str, &'a ScopeStackOp);
     fn next(&mut self) -> Option<Self::Item> {
-        let next_str_i = if self.index >= self.ops.len() {
-            if self.last_str_index >= self.line.len() {
-                return None;
-            }
+        let next_str_i = if self.index > self.ops.len() {
+            return None;
+        } else if self.index == self.ops.len() {
             self.line.len()
         } else {
             self.ops[self.index].0

With this change, the "Batch File" and "Haskell" syntax tests start working, and the number of failures in "PHP" is reduced significantly.
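To make the effect of the patch easier to see in isolation, here is a self-contained sketch of the same iteration logic. `Op`, `RegionIter` and `regions` are simplified, hypothetical stand-ins for syntect's ScopeStackOp and ScopeRegionIterator; only the index handling mirrors the diff above.

```rust
/// Simplified stand-in for a scope stack operation (hypothetical type).
#[derive(Debug, PartialEq, Clone)]
enum Op {
    Push(&'static str),
    Pop(usize),
    Noop,
}

static NOOP: Op = Op::Noop;

struct RegionIter<'a> {
    ops: &'a [(usize, Op)],
    line: &'a str,
    index: usize,
    last_str_index: usize,
}

fn regions<'a>(ops: &'a [(usize, Op)], line: &'a str) -> RegionIter<'a> {
    RegionIter { ops, line, index: 0, last_str_index: 0 }
}

impl<'a> Iterator for RegionIter<'a> {
    type Item = (&'a str, &'a Op);

    fn next(&mut self) -> Option<Self::Item> {
        // There is one more region than there are ops: the text before the
        // first op, then the text that follows each op.
        if self.index > self.ops.len() {
            return None;
        }
        let next_str_i = if self.index == self.ops.len() {
            self.line.len()
        } else {
            self.ops[self.index].0
        };
        let substr = &self.line[self.last_str_index..next_str_i];
        self.last_str_index = next_str_i;
        // The first region has no preceding op, so it is paired with a Noop.
        let op = if self.index == 0 {
            &NOOP
        } else {
            &self.ops[self.index - 1].1
        };
        self.index += 1;
        Some((substr, op))
    }
}
```

For the line "ab" with ops `[(0, Push("comment")), (2, Pop(1))]`, this yields `("", Noop)`, `("ab", Push("comment"))` and `("", Pop(1))`. The final Pop sits at the end of the line, which is the case the old `last_str_index >= self.line.len()` check returned None for before yielding it.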

trishume

This is just a preliminary review. I have to go right now but I can do a complete review soon.

In general this looks awesome. Thanks so much for writing this. It seems to be finding a bunch of bugs, but that's a good thing since after we get these to pass I'll feel a lot more confident that things work.

trishume · 2017-03-10T15:49:59Z

examples/syntest.rs

+}
+
+fn main() {
+    let ss = SyntaxSet::load_defaults_newlines(); // note we load the version with newlines


For this case it might be good to use SyntaxSet::load_from_folder(argv[0]) so that you can use it for syntax development without recompiling every time. Once we get syntect to a point where you can actually rely on it to do things correctly that is...

Actually, since this is using the version with newlines included and load_from_folder defaults to without newlines, you can just do what load_from_folder does yourself:

let mut ss = SyntaxSet::new();
ss.load_syntaxes(args[0], true).unwrap();
ss.link_syntaxes();

great idea! however, in my testing using cargo run --example syntest testdata/Packages/HTML/, args[0] refers to the example in the target/debug folder, and it would have to traverse up a few folder levels to find the testdata/Packages folder where the syntaxes are, so we will need to come up with a suitable workaround I think.

Sorry I meant args[1] and then you would pass testdata/Packages as the first argument and it would run tests for all languages, although it's kinda nice to be able to run tests for only one language I guess. Maybe have the first command line argument be the same as now, and then if there's a second command line argument load from that folder, otherwise just load the defaults.

I was thinking something along those lines too :) thanks, I'll give it a go
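The argument layout agreed on in this thread could be sketched roughly as follows. `Options` and `parse_args` are hypothetical names chosen for illustration, and the sketch stops short of calling syntect: it only decides which path holds the tests and whether a syntax folder was supplied.

```rust
use std::path::PathBuf;

/// Hypothetical result of parsing the command line:
///   args[0] = program name
///   args[1] = path to the syntax tests (same meaning as now)
///   args[2] = optional folder to load syntaxes from
#[derive(Debug, PartialEq)]
struct Options {
    tests_path: PathBuf,
    syntaxes_folder: Option<PathBuf>,
}

fn parse_args(args: &[String]) -> Result<Options, String> {
    let tests_path = args
        .get(1)
        .map(PathBuf::from)
        .ok_or_else(|| "usage: syntest <tests path> [syntaxes folder]".to_string())?;
    // When this is None, the caller would fall back to the built-in defaults.
    let syntaxes_folder = args.get(2).map(PathBuf::from);
    Ok(Options { tests_path, syntaxes_folder })
}
```

In a real `main`, the slice would come from `std::env::args().collect::<Vec<String>>()`. A `Some` folder would then be loaded with the three calls quoted earlier in this thread, and `None` would keep the existing load_defaults_newlines behaviour.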

trishume · 2017-03-10T15:53:58Z

examples/syntest.rs

+            println!("Testing file {}", entry.path().display());
+            let result = test_file(&ss, entry.path(), true);
+            println!("{:?}", result);
+            if let Err(_) = result {


It would be great if there were a flag/argument to make it not die on the first failing file. It would be easier to get a sense of how much was broken then.

it currently doesn't die on the first failing file (unless there is a panic), it just ensures that when it has run through all the files, it will exit with an appropriate exit code :) maybe we could add a flag though so that it optionally can fail fast, to save time (not that these syntax tests take long at all!) when all the caller wants to know is whether there were any failures

Oh whoops. I just skimmed over the code briefly and missed that this didn't actually stop anything.

I think the behaviour you describe is perfect for now. I don't see any pressing need to add a fail-fast flag.
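The "run everything, then report through the exit code" behaviour described in this thread can be illustrated with a small sketch. The `Outcome` enum is loosely modelled on the `FailedAssertions(1, 422)` output shown earlier, but its exact shape, and the `exit_code` helper, are assumptions made for illustration rather than the code in examples/syntest.rs.

```rust
/// Hypothetical per-file outcome of running one syntax test file.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// All assertions passed; holds the total number of assertions.
    Success(usize),
    /// Some assertions failed; holds (failed, total).
    FailedAssertions(usize, usize),
}

/// Decide the process exit code only after every file has been tested,
/// so that a failing file never stops the remaining files from running.
fn exit_code(results: &[Result<Outcome, String>]) -> i32 {
    let any_failed = results.iter().any(|r| match r {
        Ok(Outcome::Success(_)) => false,
        _ => true,
    });
    if any_failed { 1 } else { 0 }
}
```

A caller would collect one result per test file in a loop and finish with `std::process::exit(exit_code(&results))`. A fail-fast flag, if ever wanted, would amount to breaking out of that loop at the first non-success result.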

trishume · 2017-03-10T16:06:31Z

Cargo.toml

@@ -24,6 +24,7 @@ rustc-serialize = "^0.3"
 bincode = "0.6"
 flate2 = "^0.2"
 fnv = "^1.0"
+regex = "0.2.1"


For now the use of regex is fine. I may port it to onig before or after it is merged.

However, it should be under dev-dependencies since it is only used in an example. See rust-lang/cargo#523

keith-hall · 2017-03-11T19:11:20Z

It seems like the test failures are due to scopes from meta_content_scope being missing in the scope stack.
For example, Java:

Testing file testdata/Packages/Java/syntax_test_java.java
The test file references syntax definition file: Packages/Java/Java.sublime-syntax
Assertion selector "meta.class" from line 20 failed on line 19, column range 7-12 (with text ['c', 'l', 'a', 's', 's']) has scope [<source.java>, <storage.type.java>]

trishume · 2017-03-11T20:14:06Z

@keith-hall interesting. Thanks for diagnosing this. I'm going to see if I can review this PR later today and maybe fix some of the bugs it found. If not today I'll try and do it tomorrow.

keith-hall · 2017-03-12T08:10:59Z

looking at the Java example a bit more closely, it might be related to how set interacts with meta scopes - in ST, meta_content_scope and meta_scope apply to the match that sets a different context

trishume · 2017-03-12T15:57:48Z

@keith-hall I'm working on reviewing this. Just a heads up I pushed a commit to the branch refactoring the looking up of syntaxes by path (thanks for enabling maintainer commits). You may want to git pull and take a look at the commit.

keith-hall · 2017-03-12T16:04:13Z

@trishume nice one, that's a much better solution than what I had, thanks! :)

trishume · 2017-03-12T19:16:03Z

@keith-hall reviewed the fix to ScopeRegionIterator and committed the cleanup I did while doing so. It looks good, thanks for the debugging. You may want to pull.

Now all that's left for me to review is syntest.rs.

trishume · 2017-03-12T20:27:03Z

@keith-hall I just reviewed the syntest.rs file and it looks good. I pushed a commit to rearrange some error handling code and fix a few small things. I recommend you review the whitespace-insensitive diff to see what actually changed and what was just me changing the indentation of a large segment: 3e95235?w=1

I've now reviewed everything in this PR and am pretty happy with it. There are still the bugs that keep the tests from actually passing, but I'm thinking I/we can fix those in separate PRs. I'm ready to merge this now if you are. If you're ready to merge, I think I'll do an interactive rebase to remove the dump changes from the first commit that changes them, just to avoid increasing the repo size unnecessarily.

After merging we can create issues for all the bugs we find. The one that makes the C# test suite crash at least already has an issue: #37

I'm currently looking at why the ASP test has so many failures. I think walkdir might actually be giving us the tests in a different order (possibly because of different OSs, I'm on OSX). Mine runs ASP (with many failures), then Batch (success), then C# (success), then another C# test (crash #37). The ASP failures may or may not be the same underlying bug as you are encountering with Java.

trishume · 2017-03-12T20:27:43Z

Also I notice that Travis is failing, but only the nightly one is failing, the stable one passes. And nightly fails on cargo --version so I think it's just Rust nightly that's broken on Travis.

keith-hall · 2017-03-12T20:44:47Z

@trishume yep looks good, I'm ready to merge - I agree that it makes sense to have separate PRs for the remaining issues :) thanks for the cleanup and associated commentary - makes it super easy to follow along :) it's a pleasure working with you!

do you want me to create the issue for the Java failures? I was thinking that it might make sense to tackle that first, then everything related to the same underlying issue will go away and make it easier to investigate what (if anything) else is wrong...

I was curious about those Travis failures - what's the benefit of building against nightly, if I may ask, when it's so unstable? :)

robinst · 2017-03-13T01:09:20Z

Wow, nice change! After all the bugs are fixed, are you planning to run this as part of regular tests?

trishume · 2017-03-13T01:17:00Z

@robinst yah I'm thinking maybe move it to a module and then use it from both the example and cargo test.

keith-hall · 2017-03-13T10:36:59Z

@trishume FYI, I've fixed the Java failures over at https://github.com/forkeith/syntect/tree/set, and it reduces the number of ASP test failures quite drastically too. I'm not sure if I will get it to a state where it is Pull Request-able or not, but wanted you to see my progress so we can collaborate rather than work separately to achieve the same thing.

trishume · 2017-03-13T19:54:06Z

@keith-hall awesome. I'll take a look at it.

Don't worry about duplicating work in the next few days. I'll be busy with school work until this weekend so I won't be doing any work on syntect (other than Githubbing).

This change makes it so that it can't crash on non-UTF8 paths, makes lookup by file path more efficient, and avoids allocating a hash map data structure that is never used as a map. It also adds a test of the functionality.

Signed-off-by: Tristan Hume <tris.hume@gmail.com>

trishume reviewed Mar 10, 2017

keith-hall force-pushed the syntest branch from 3d65934 to 6f5ccc8 Compare March 11, 2017 13:49

keith-hall force-pushed the syntest branch from df9425d to d2bc03a Compare March 12, 2017 06:12

keith-hall force-pushed the syntest branch 3 times, most recently from aafe342 to b03fe13 Compare March 12, 2017 15:00

keith-hall and others added 8 commits March 16, 2017 12:02

added syntest example to run ST syntax tests

8a957ff

fix bug in ScopeRegionIterator skipping the last operation on the line

29a24e5

don't ignore assertions after the newline, assert them against newline

5eac892

store original path when syntax is loaded, and use that in syntest

1904dc2

Commit amended later to remove changes to packdump files, to reduce repo size.

support exclude-only scope selectors that don't start with a space

09b64b4

make test behaviour more like ST's when not only tests are on the line

a4030a4

i.e. the ASP syntax tests make use of this behavior, to test whether line continuation punctuation works properly

Clean and comment ScopeRegionIterator

ac8b32b

I did this to make it clearer what was going on, so that I could understand it again and review that it was doing the right thing, since I clearly got it wrong the first time.

trishume and others added 2 commits March 16, 2017 12:05

Clean up syntest.rs example

ee5482f

The error handling in test_file was rearranged to use the try! macro and ok_or combinator. This decreases indentation and makes the code easier to read. It is recommended to view this diff without whitespace to make it clearer what actually changed.

trim scope selector at test token end

045dd0c

i.e. allow text after the assertion to not interfere with the assertion. This is how ST works when it executes the syntax tests.

trishume force-pushed the syntest branch from cb5353e to 045dd0c Compare March 16, 2017 16:10

trishume merged commit f5e01f1 into trishume:master Mar 17, 2017

keith-hall deleted the syntest branch March 17, 2017 12:26
