Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow-up on #484, #473 and #470: looking for ways to print fewer quoted strings #485

Closed
GerHobbelt opened this issue Apr 16, 2019 · 6 comments

Comments

@GerHobbelt
Copy link

This continues the discussion that's at the end of pullreq #484 as I came to the conclusion that patching isPlainSafe() in js-yaml dumper is not the way forward when you want fewer quoted strings, but instead should look for ways to make js-yaml dump string scalars in literal or folded style using the yaml-specified block chomping indicators where necessary.

References:

How to prevent mistakes / introducing bugs while working on this?

Fuzzer galore

The fuzzer box that's part of the js-yaml test rig caught a few mistakes already (test/25-dumper-fuzzy.js) and it will surely help in the future.

Including Test Suites from elsewhere

Another thing to add would (IMO) be the inclusion of a (augmented/patched/whatchammacallit)
version of the yaml/yaml-test-suite. (https://github.com/yaml/yaml-test-suite)

FYI: Here's commit GerHobbelt@41bb115 which is part of a dev path on my local js-yaml fork. Do heed the warning in the commit's comments: 60 out of ~220 tests pass, the others still need work done on the test rig to produce a test that's doable for js-yaml and usable to tickle all the edge cases.

Maybe @perlpunk has some ideas 😉 how I might approach this? The basic idea right now is to take the yaml-test-suite and integrate its basics in the js-yaml test rig (i.e. have the js-yaml test rig run those tests and see what barfs or not), while yaml spec parts which are not supported by js-yaml are edited out, without patching the original .tml test spec files directly -- so that we can update that set any time anyone wishes.

That's my 2 cents so far; all critique appreciated, particularly when you think this approach is fundamentally flawed. (Note: I'm not shooting for a yaml-test-matrix clone, but for a unit test set that can be run on node via make test with little to no fuss and still catch 99.9% of the nasties. 😄 )

@GerHobbelt
Copy link
Author

FYI: dialing up the cycle count of the fuzz box from 100 to 3000 is in this dev commit: GerHobbelt@587b02c

(It's got some debug stuff in there too. The numRuns is the magic trick to increase your chances of hitting the edge cases dug up by the fuzzer every time - compare #483)

@perlpunk
Copy link

perlpunk commented Apr 16, 2019

@GerHobbelt the test suite has valid and invalid documents. You can identify them by the error tag or the error data point.
You can either use the node testml-compiler from here https://github.com/testml-lang/testml, or you can use a data release (or the data branch https://github.com/yaml/yaml-test-suite/tree/data), which has all the data from the testml files split into a file for each data point. No need to parse the tml files yourself.
The data branch is not guaranteed to stay (might be force-pushed), using a release tag would be better, but I haven't created a release for a while. I should do that. It's still in the works.

For testing parsing just use those tests which have an in-json data point (or in.json file), which you can then compare to the result from loading the in-yaml data point.

I would add a list of tests locally that are known to fail, so they can be skipped.
Alternatively you could make a whitelist of tests that are known to pass.

My approach for my own perl parser YAML::PP currently is: I created an orphan branch in my repo and added the yaml-test-suite data branch as a subrepo https://github.com/ingydotnet/git-subrepo which I can update when necessary.

btw, if you use IRC, you can find @ingydotnet and me (aka tinita) on freenode #yaml-dev and #testml

@puzrin
Copy link
Member

puzrin commented Apr 16, 2019

How to prevent mistakes / introducing bugs while working on this?

From my experience, the only way is to read all yaml spec and try to understand it :). That's why big changes are rare - very few people have time to dive into details.

@eemeli
Copy link

eemeli commented May 28, 2019

The way I handle this in yaml is to start trying to stringify a string as a plain value (unless the node has a different preference), and then escalate from there to block values & single/double-quoted values as appropriate. For plain in particular it's necessary to try and re-parse values matching /^[\w.+-]+$/ to avoid conflicts with any schema-dependent values like true or dates.

Regarding tests, you may be interested in looking at what I've got, which includes everything I could extract from the spec, a harness for the yaml-test-suite, and a number of corner cases that have come up. Getting the errors to zero on all that was a bit of work, but it has really paid off in the long run.

@perlpunk
Copy link

I also have my own testcases for strings, but not yet happy with the format:
https://github.com/perlpunk/YAML-PP-p5/blob/master/examples/yaml-schema.yaml
It has examples for 4 schemas: Failsafe, JSON, Core, and YAML 1.1
This is partially specific to perl.
Maybe we can put that into the yaml-test-suite at some point.

@puzrin
Copy link
Member

puzrin commented Dec 18, 2020

Close as "endless". Let me know if i missed something important and reopen required.

@puzrin puzrin closed this as completed Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants