Validating static site generator data files #13

Closed
wkoszek opened this issue Jul 12, 2016 · 4 comments

wkoszek commented Jul 12, 2016

Hi,

I think I may find yamllint useful, but I'm trying to understand which rules it uses to validate files. The spec? I ask because I wrote a mini-lint script in Ruby and ran it on my sample files:

wget https://raw.githubusercontent.com/wkoszek/me/master/scripts/yamllint.rb
wget https://raw.githubusercontent.com/wkoszek/me/master/source/blog/2016-06-27-what-docker-really-is.md
chmod 755 yamllint.rb
./yamllint.rb 2016-06-27-what-docker-really-is.md

So my understanding is that static site generator files (YAML front matter + text) are valid YAML: the front matter in an .md file is the first YAML document, and whatever comes after the closing --- is a second document. The command above works, which indicates the file is correct enough for Ruby's YAML module. If you uncomment the pp part of the script, you'll see that the front matter is loaded. The content isn't (which is OK for now), but it might be loaded if I modify the script a bit: http://stackoverflow.com/questions/19325251/ruby-yaml-multiple-documents
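One way to sidestep the question of whether the body is valid YAML is to check only the front matter. The following is a minimal sketch, assuming the common layout where the file starts with a `---` line and a second `---` line closes the front matter. `split_front_matter` and the sample text are hypothetical and are not part of yamllint or of the Ruby script above.

```python
def split_front_matter(text):
    """Split a front-matter file into (front_matter, body).

    Assumes the first line is '---' and a later '---' line closes the
    front matter. Returns (None, text) when that layout is not found.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].rstrip() == "---":
            front_matter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return front_matter, body
    # No closing delimiter found: treat the whole file as body.
    return None, text


# Hypothetical sample file, for illustration only.
sample = (
    "---\n"
    "title: Example post\n"
    "---\n"
    "This is paragraph one.\n"
    "# comment\n"
    "This is paragraph two.\n"
)

front_matter, body = split_front_matter(sample)
print(front_matter, end="")
```

Only `front_matter` would then be handed to a YAML parser or linter; the Markdown body is left alone.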

However, yamllint is pickier with this file:

yamllint 2016-06-27-what-docker-really-is.md
2016-06-27-what-docker-really-is.md
  27:1      error    syntax error: expected '<document start>', but found '<scalar>'

I'm trying to understand what's wrong with this line, since it seems valid to me and the error points somewhere in the middle of my content.

Any ideas?

adrienverge (Owner) commented Jul 12, 2016

Hi @wkoszek,

yamllint uses rules to detect style problems, but also looks for syntax errors. For this purpose, it uses the Python YAML library. This is where the problem is: the pyyaml library doesn't like the comment at line 25 in your file.

The issue can be reduced to this simple YAML file:

---
This is paragraph one.
# comment
This is paragraph two.

Importing this file through Python's pyyaml lib results in an error. But if you remove the comment, it works fine. You can check it with the following command:

python -c 'import sys, yaml; yaml.dump(yaml.safe_load(sys.stdin), sys.stdout)' < /tmp/bad-yaml

I'm not sure whether this is invalid YAML or a pyyaml bug. According to the official YAML spec:

Comments are a presentation detail and must not have any effect on the serialization tree or representation graph.

To conclude, I see two solutions here:

  • Solution 1: File a bug at pyyaml

  • Solution 2: Make the string an explicit YAML literal block scalar, for example:

    --- |
      This is paragraph one.
      # comment
      This is paragraph two.
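Solution 2 can be applied mechanically. The sketch below is one possible way to do it, not something yamllint provides: the hypothetical helper `to_literal_document` indents every non-blank line of a text body and puts it under a `--- |` header, so that a line starting with `#` becomes part of the string instead of being read as a comment.

```python
def to_literal_document(body, indent=2):
    """Render `body` as a YAML document holding one literal block scalar."""
    prefix = " " * indent
    indented = [
        prefix + line if line.strip() else ""  # keep blank lines empty
        for line in body.splitlines()
    ]
    return "--- |\n" + "\n".join(indented) + "\n"


body = "This is paragraph one.\n# comment\nThis is paragraph two.\n"
print(to_literal_document(body), end="")
```

This assumes the first non-blank line of the body has no leading spaces; otherwise an explicit indentation indicator (for example `|2`) would be needed on the header line.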

wkoszek (Author) commented Jul 12, 2016

@adrienverge Thanks. That's helpful. Filed an issue here: yaml/pyyaml#30

adrienverge (Owner) commented

Sorry @wkoszek, after re-reading the YAML specifications (both 1.1 and 1.2), I realize this is not a PyYAML bug, but a problem with your YAML document.

As stated in paragraph 3.2.3.3 of the spec:

Comments must not appear inside scalars, but may be interleaved with such scalars inside collections.

This explains why the following document, which contains only one big string scalar, is parsed OK:

---
This is paragraph one.

This is paragraph two.

but the following isn't, because the string scalar has a comment in the middle:

---
This is paragraph one.
# comment
This is paragraph two.
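The two documents above can be compared side by side. This is a small sketch that assumes PyYAML is installed (`pip install pyyaml`); `try_parse` is a hypothetical helper, and `yaml.safe_load` is used here instead of `yaml.load` so that it also runs on recent PyYAML releases.

```python
import yaml  # PyYAML, a third-party package

WITHOUT_COMMENT = "---\nThis is paragraph one.\n\nThis is paragraph two.\n"
WITH_COMMENT = "---\nThis is paragraph one.\n# comment\nThis is paragraph two.\n"


def try_parse(text):
    """Return (True, value) if `text` loads as a single YAML document,
    otherwise (False, error message)."""
    try:
        return True, yaml.safe_load(text)
    except yaml.YAMLError as exc:
        return False, str(exc)


ok_plain, value = try_parse(WITHOUT_COMMENT)
ok_commented, message = try_parse(WITH_COMMENT)

print("without comment:", ok_plain, repr(value))
print("with comment:   ", ok_commented, message)
```

The first document should load as a single string, while the second should be rejected, because the comment ends the plain scalar and the parser then finds a second scalar where it expects a new document.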

wkoszek (Author) commented Jul 28, 2016

@adrienverge Let's close this guy here, and maybe keep the discussion open in pyyaml. I guess we can't really do much about it anyway.

wkoszek closed this as completed Jul 28, 2016