-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move libsass specific tests to own repository #1484
Conversation
Any behavioral specs that are relevant to LibSass are relevant to all Sass implementations, so they should remain in this repository where every implementation can test against them. The same goes for any new regression specs: they should be added to the shared language spec repository. At a higher level, it's inappropriate to try to fork the language specs and close all your outstanding PRs because we've instituted a protocol for adding new ones. The plan to refactor the spec repo is well-documented in the issue tracker, and the PR that landed the style guide (#1363) was open for comment for a full month, with an explicit invitation to file issues about it after the fact. Workflows change, and part of working with a team is being flexible about adapting to workflows that support the broader team even if they aren't ideal for you in particular. |
@mgreter Ping—please re-open your other pull requests to keep LibSass's specs up-to-date. |
@mgreter Ping again. This radio silence isn't an effective way to collaborate. |
Can I hijack this for a moment? I am a drive-by contributor to this repository and think I have contributed 50 or so commits to this repository so maybe my opinion does not weigh much, but maybe relevant to other drive-by contributors. I am totally new to the HRX format and I have few suggestions/questions regarding the format itself and its use by the sass/sass-spec repo
And finally a pretty important question relevant to this issue in particular. I think we currently store both the spec ("how things really should be" or "machine-readable and testable language specification") and the regression tests ("this thing was broken in the past for implementation Zeta"). Since I am not a core sass developer and I my contributions to libsass itself have been tiny (mostly whining) I believe I was mostly contributing things that came out of bug reports, mostly from node-sass I am somewhat familiar with. My ideal workflow is the following:
This is a regression test case and is rarely minimal and does not describe Therefore I will almost never satisfy rules "DO test only one thing per spec" in those tests, because they are produced before even the issue is understood at all. I'd like to keep them around for a whole product lifecycle even if they are somehow duplicating language spec tests. Can't speak for @mgreter but this is what I think from the perspective of a small drive-by contributor that mostly deals with regressions and user reported bugs. Maybe keeping implementation specific regression tests in another repo is not a very bad idea overall - the language spec suite could become much cleaner (no todo flags etc.) in that case. Or we should have a common repo, but regression tests should be nicely integrated into the test suite. |
@saper That's a big post, and I don't really have time to respond to each question individually. I think a lot of those questions can be answered by reading through the HRX spec and looking at examples of how various HRX-ified test cases look in this repo. For concrete suggestions, feel free to file separate issues so we can discuss there. To address your larger point about regression tests, I don't consider them to be a totally separate class from spec tests the same way you do. I think of a bug that needs a regression test as an indication that the original spec tests weren't comprehensive enough, so the regression test itself should be in the same form and the same location as the original spec tests. This is true even if it involves an intersection of multiple language features—and in fact we have many non-regression tests that cover multiple features. There's always a way to represent that intersection without any unrelated behavior making it less clear. The style guide aims to encode one specific, consistent way of doing that. |
@nex3 please tell me which ones you mean, I checked and the only ones I saw were from WIP branches. |
@nex3 that PR was merged, so how do you expect me to "please re-open your other pull requests to keep LibSass's specs up-to-date."?? Main reason to pull out libsass spec tests is as @saper wrote: "I think we currently store both the spec ("how things really should be" or "machine-readable and testable language specification") and the regression tests ("this thing was broken in the past for implementation Zeta")." |
I think my point is that it might be difficult to coerce regression tests into the style guide. Which is fine, because I think this style guide is good for spec tests.
The style guide says: DO test only one thing per spec One feature per test, multiple features per test?
Yes, but it takes time, requires significant effort and can introduce bugs in the process.
The way I think now about regression tests is that they should preferably be the unedited versions of original bug reports (subject to sanity of course). The overlap between "clean" specs (this is how the feature looks like) and "dirty" regression tests is expected. User code being nice or ugly documents the actual usage of the language features, sometimes way beyond the imagination of the language creator. The regression test should just stay there, preferably identified by a bug number - as the style guide suggests in PREFER referring to issues when marking specs as Even if a particular regression issue gets closed as going against the spec, the corresponding regression could still be salvaged to test a hopefully improved error reporting. While specs grow and get better over time, regression tests are write-once-stay-ugly features. It might be worthwhile to have a clear separation between the two and apply the style guide rules to the specs test only. |
Sorry you're right. I guess I was thinking about the non-style-guide-ified pull requests you landed.
It's useful to test things that were broken for one implementation in other implementations as well. The point of the style guide is to ensure that those tests are as minimal as possible, rather than just copying whatever code the user originally reported. This helps emphasize exactly what the problem being tested was.
I don't think that's true. Consider #1486 and #1487 as counter-examples: both of these add regression tests for specific issues that still follow the style guide and minimally target the issues in question.
Another way to phrase that requirement is, "make the test as minimal as possible while still exercising the behavior in question". There are plenty of tests, both written before a feature landed and written to guard against regressions for a specific issue, that cover the intersection of multiple features. The point is that a single test shouldn't make a bunch of assertions at once the way, say, this one does.
I agree that it takes work to generate clear and minimal specs—I've personally been shouldering the bulk of that work as I go through and rewrite specs (or write fresh specs) for existing language features that follow the style guide. The style guide is designed to make that as smooth as possible by providing clear guidelines on what an ideal spec should look like. But ultimately, it is work, and that work is valuable: it's an investment in the long-term maintainability and flexibility of the repo.
What's the value you see in having these specs "stay ugly"? Is it just a question of minimizing the effort of initially adding them? Because I think it's essentially always worth the effort to avoid taking on technical debt for a project like this that's intended to live a long time. Those "ugly" tests come with significant costs relative to minimal, style-guide-compliant regression tests. When I was first enabling specs for Dart Sass, complex tests that covered multiple features posed a huge hassle for determining which features the new implementation actually supported. When they use deprecated features, they need to be manually migrated forward to new versions of the language even when those features aren't the point of the test. And whenever they do need to be changed, it's next to impossible to figure out if doing so actually preserves the meaning of the test because it's totally unclear what a minimal reproduction of the actual bug would even look like. Keeping a test case around is not free. Whether we decide to keep it ugly or not, it needs to be maintained over time, and that maintenance requires understanding what it's testing and why. On the other hand, it's legitimately valuable to test regressions across all implementations, because regressions represent a lack of coverage in the original test suite and a real breakage that happened in practice. These factors together point to making regression tests clear, well-organized, and maintainable just like all the other tests in the repo. |
This implies that specs need to be improved, and regression tests need to be added, necessitating twice the work to test one thing. Wouldn't it be more efficient to focus on cleanly improving the spec? I think @nex3 makes a great point about "ugly tests". They are essentially a lose-lose—simultaneously technical debt and missed opportunities to test across Sass implementations. |
So what is the outcome? Can we have libsass specific bug-report regression specs in a separate repo or not? I don't know what else to write beside pretty please with a cherry on top! We already miss quite a lot of regression tests in our unit tests: https://github.com/sass/libsass/issues?q=is%3Aissue+is%3Aclosed+label%3A%22Dev+-+Needs+Test%22. I don't have enough free time (and honestly also not the motivation) to break them down to an acceptable minimal level as currently set out in the readme. Won't go into details why this would be difficult (e.g. null-ptr access under certain nesting conditions), but I'd rather have them "ugly" than don't have them at all. P.s. as a concrete exercise, how would you expect me/us to break down regressions like https://github.com/sass/sass-spec/blob/master/spec/libsass-closed-issues/issue_245443/input.scss or https://github.com/sass/sass-spec/blob/master/spec/libsass-todo-issues/issue_221267/input.scss or P.p.s @nex3 "Sorry you're right. I guess I was thinking about the non-style-guide-ified pull requests you landed.": Feel free to point me to the PR you're meaning: https://github.com/sass/sass-spec/commits?author=mgreter&since=2018-01-01&until=2020-03-01 ... I only changed libsass specific stuff beside the fix in the spec runner that first broke our CI and then after my fix yours :-/ |
cd1b55a
to
c89119d
Compare
Add specs for selector-extend()
Since this isn't language-level behavior but specific to LibSass, it would be a great candidate for testing in
Thanks for pointing to concrete examples—let's look at the last one's input: I can't easily tell why Sass is parsing this incorrectly. Is it the empty comment If the error is indeed caused by a
This minimal test makes it clear what we need to fix in Dart Sass and guards against that brokenness in future Sass implementations. If the error actually has multiple causes, having small tests for each is also good. Say the other causes are:
When fixing the problem in a different implementation, there are now 3 minimal specs that we can address individually. IMO that gives more certainty that we've implemented the correct behavior, compared to trying to get |
One of the reasons we do not know answers to those questions in advance we need regression tests. Currently we have the following workflow in libsass:
Analysis at 4 is difficult and may not always be complete even during the bug fixing. Now back to the example at hand: Suppose we the parentheses bug in libsass 1.0. We have produced the following test (spec) in a process of fixing the bug:
Now, libsass 4.0 comes out and, again,
fails but for a different reason. Some code refactoring made this to pass:
but this to fail:
for at least 5 parentheses. Without a regression test, a "clean" spec test does not alert us of a problem in the 4.0 release. Does adding Did we re-introduce old bug because we didn't store the original problem? Yes. Guarding against reintroduction of old bugs is a primary goal of regression testing. The tests @mgreter provided are non-functional ones, they do not teach us about sass language and the desired functionality. They expose dark corners of the libsass implementation. Sure, Dart Sass and other implementations should be able to pass regression tests as well and they might help to find bugs as well. But we won't be able to share them if the some form of a style guide will be forced on us. Implementations that do not have multiple output styles (compact, compressed, etc.) will not be able to run those regression tests though as they are implementation-specific. |
I would like to keep them in sass-spec under libsass folder OR anywhere else (that's the main purpose of this PR), but since I did so my commit rights have been revoked by @nex3. So what do you expect me to do, I rather can follow the readme or don't do anything at all! We had a workflow in libsass to add a regression test for every bug that was reported, eg. #1473 (comment) or #1481 (as @saper also pointed out). So can we move "libsass specific tests to own repository" or not!?? They are not gone, anybody feel free to make minimal spec tests off of it, but I/we just want to have a 1 to 1 regression test for each issue!! Btw. @Awjin the above issue is because of stack limitation (and fuzzy people) in c/c++, which dart-sass also has in comon, but we as libsass get far worse reputation for it: https://www.cvedetails.com/vulnerability-list/vendor_id-16627/Libsass.html
PLEASE 🙏 |
@xzyfer @saper I'm also OK to just copy the existing regressions tests to a new repo and add the outstanding tests to it. I feel like this is currently the only way to keep our libsass regression test policy. I don't think we can get an acceptance under the sass github"folder" so we probably have to live with it living in my own "mgreter" space. I'll give a few more days, but will start to copy and to cover outstanding/existing regressions test there! |
Will seek another way to keep libsass regressions! |
Yes, I would just rename the repo to make sure it is not the "spec" repo. We just need to hook it into the libsass CI I think? |
I'm on board with a separate repo. We get a lot of external fuzzing that
doesn't group neatly into language features.
…On Sat, 7 Mar 2020, 1:40 pm Marcin Cieślak, ***@***.***> wrote:
Yes, I would just rename the repo to make sure it is not the "spec" repo.
We just need to hook it into the libsass CI I think?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1484?email_source=notifications&email_token=AAENSWBBIY6LKYH5WMKWGT3RGGXZLA5CNFSM4JAFICSKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEODM2DY#issuecomment-596036879>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAENSWDYN6Y6HVHQU37UFATRGGXZLANCNFSM4JAFICSA>
.
|
Since the workflow we used to have for libsass issues seems no longer welcome,
I would like to move all libsass spec tests to a separate repository so we can
at least have those tested with libsass CI. Once the move is done I try to
adjust all of our libsass CI specs to test and pass both repos.
And to also ensure we don't regress output mode specific stuff again.
https://github.com/mgreter/libsass-specs/tree/incubating [WIP]