Colons in metadata make Jekyll sad #2030

Closed
gjtorikian opened this Issue Feb 11, 2014 · 12 comments

@gjtorikian
Member

Consider this metadata:

---
description:
  We are conducting research with developers at: large companies, small companies, and individuals with side projects.
---

Generating this file throws the following error:

      Generating...   Liquid Exception: undefined method `to_str' for #<Hash:0x0000010f4535c0> in _includes/_post.html, included in categories/research/index.html

See the problem? The "developers at:" portion looks like the key of another YAML hash to Jekyll, but it isn't one at all!
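To see what the parser hands back for this metadata, here is a minimal sketch. It assumes Ruby's standard-library YAML (Psych) rather than Jekyll or SafeYAML, and the variable names are only illustrative:

```ruby
require "yaml"

# The front matter from the report, without the surrounding --- lines.
front_matter = <<~YAML
  description:
    We are conducting research with developers at: large companies, small companies, and individuals with side projects.
YAML

data = YAML.safe_load(front_matter)

# The value of "description" is a nested mapping, not a string:
# everything before the ": " became a key.
puts data["description"].class
puts data["description"].keys.first
```

A template that then treats `description` as a string is the likely source of the `to_str` error.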

I'm not sure if we can legitimately escape all colons. Is there a way we can fix this, though?

Member

Y u no quote?

---
description:
  "We are conducting research with developers at: large companies, small companies, and individuals with side projects."
---

Tried on my rig and it seemed to work...

Owner
parkr commented Feb 11, 2014

If you try

---
description: "We are conducting research with developers at: large companies, small companies, and individuals with side projects."
---

it should work!
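A quick way to check the quoted form, sketched with Ruby's standard-library YAML (Psych) as a stand-in for whatever parser Jekyll is using (an assumption, with illustrative variable names):

```ruby
require "yaml"

quoted = <<~YAML
  description: "We are conducting research with developers at: large companies, small companies, and individuals with side projects."
YAML

description = YAML.safe_load(quoted)["description"]

# With quotes, the colon is part of the string rather than a key separator.
puts description.class
puts description.include?("developers at: large companies")
```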

Owner
parkr commented Feb 11, 2014

Whoop, @troyswanson beat me to it ;)

@parkr parkr added the Question label Feb 11, 2014
Member

For the first time in my life!

Member
ixti commented Feb 11, 2014

For long lines I would suggest:

description: >
  We are conducting research with developers at:
  large companies, small companies,
  and individuals with side projects.
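
As a rough check of the folded form, the sketch below parses it with Ruby's standard-library YAML (Psych), which is an assumption about the parser and not Jekyll's own code path. Inside a `>` block the colon is literal text, and the line breaks are folded into spaces:

```ruby
require "yaml"

folded = <<~YAML
  description: >
    We are conducting research with developers at:
    large companies, small companies,
    and individuals with side projects.
YAML

folded_description = YAML.safe_load(folded)["description"]

# A single string comes back; note that ">" keeps one trailing newline
# (use ">-" to strip it).
puts folded_description.class
puts folded_description.strip
```
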
Member

To clarify: this isn't a problem for me, but I had to help a less technical colleague.

When the same actions produce the right result 99% of the time and then falter on the inclusion of a :, it seems to me that Jekyll should either be a little more permissive or a little more descriptive with the error. We've just now essentially trained someone to learn a new rule ("always wrap with quotes!") when we should be making it as simple as possible.

Just my two cents! As I alluded to in the OP maybe it's an unsolvable problem. Computers are stupid tough.

@parkr parkr referenced this issue in dtao/safe_yaml Feb 12, 2014
Closed

Be more descriptive about multiple colons #54

Owner
parkr commented Feb 12, 2014

Jekyll should either be a little more permissive or a little more descriptive

I agree – I wish it were possible for us to change this. Two reasons it's an NP-hard problem:

  1. We can't know what type a key's value should be, so we can't say "whoopsie! you meant a string there, right? I got a hash".
  2. Treating a line with a second : as a single key-value pair goes against the YAML spec, because a : is what distinguishes a key from a value. We'd have to break from the spec to automagically fix this.

I've asked if there's anything we can do in the YAML parser we use: dtao/safe_yaml#54

We could also start allowing TOML Front-Matter.

Contributor
dtao commented Feb 12, 2014

Is there something special about the key "description" here? Here are the two possibilities as I see it:

  1. You could have some way to indicate to the YAML engine: "always treat the value(s) for this/these key(s) as X"; I would definitely be OK w/ supporting that in SafeYAML
  2. The YAML engine could have some heuristic that's like, "Whoa, that is waaay longer than an ordinary key... maybe the user just meant for this whole thing to be a string?"

The second approach seems a lot shakier to me.

That said, @parkr suggests that maybe the YAML engine could be "more descriptive": that's also something I could get behind. Like—configurable by some option, naturally—SafeYAML could detect certain unusual things like really long keys and output warnings in these cases, e.g.

Detected unusually long key "[unusually long key]" --
did you mean for the "description" property to be a string?
If so, enclose the text in quotes.

Even as I type that I can sense developers elsewhere in the world rolling their eyes, or something. But I'm not strictly against doing something like that.

Still, the first option seems best, if it's a possibility.
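
The long-key warning could be sketched roughly as follows. This is not SafeYAML code: it runs on the result of Ruby's standard-library YAML (Psych), the helper name `long_key_warnings` is hypothetical, and the 30-character threshold is an arbitrary assumption.

```ruby
require "yaml"

# Arbitrary threshold for "unusually long" (assumption, tune as needed).
MAX_KEY_LENGTH = 30

# Hypothetical helper: look one level down for nested keys that read
# more like a sentence than like a key.
def long_key_warnings(data, max_length: MAX_KEY_LENGTH)
  data.each_with_object([]) do |(key, value), warnings|
    next unless value.is_a?(Hash)

    value.each_key do |nested_key|
      next unless nested_key.to_s.length > max_length

      warnings << "Detected unusually long key \"#{nested_key}\" -- " \
                  "did you mean for the \"#{key}\" property to be a string? " \
                  "If so, enclose the text in quotes."
    end
  end
end

parsed = YAML.safe_load(<<~YAML)
  description:
    We are conducting research with developers at: large companies, small companies, and individuals with side projects.
YAML

warnings = long_key_warnings(parsed)
puts warnings
```

A length threshold is a blunt heuristic, which fits the "shakier" caveat above: a legitimately long key would trigger a false warning.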

tamouse commented Feb 13, 2014

Another option: stop expecting computers to know what you want. Follow the spec.

Contributor
dtao commented Apr 4, 2014

I'm planning on releasing a new version of SafeYAML soon and was curious if there are any updates from jekyll's end here (maybe discussions that took place outside this thread?).

@tamouse, it feels to me like you're maybe attacking a straw man in your last comment? I don't think anyone's proposing anything crazy like deviating from the YAML spec. The goal is just to be as helpful as possible to users, who may not understand a relatively technical error like "undefined method 'to_str' for #<Hash:0x0000010f4535c0>".

My thinking is that it might be useful for SafeYAML to support a custom schema, for when you know what keys you're expecting and what types the values should be. Or maybe a "shallow" option, for when the data you're expecting is supposed to be only key-value pairs, with no nested maps within maps.
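
A "shallow" check could look something like the sketch below. It is application-side Ruby over the standard-library YAML (Psych) result, not an existing SafeYAML option, and `nested_map_keys` is a hypothetical name:

```ruby
require "yaml"

# Hypothetical "shallow" check: list the top-level keys whose values
# parsed as maps, which a flat key-value document should never contain.
def nested_map_keys(data)
  data.select { |_key, value| value.is_a?(Hash) }.keys
end

front_matter = YAML.safe_load(<<~YAML)
  title: Research
  description:
    We are conducting research with developers at: large companies, small companies, and individuals with side projects.
YAML

offenders = nested_map_keys(front_matter)
puts "Unexpected nested maps under: #{offenders.join(', ')}" unless offenders.empty?
```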

On the other hand, if it turns out this isn't really important, I won't bother doing anything.

Owner
parkr commented Apr 4, 2014

My thinking is that it might be useful for SafeYAML to support a custom schema, for when you know what keys you're expecting and what types the values should be. Or maybe a "shallow" option, for when the data you're expecting is supposed to be only key-value pairs, with no nested maps within maps.

I really like the first idea here, where Jekyll could be like "warn me if the YAML you read in deviates from a particular schema" or something. With Jekyll, it's incredibly important to be as helpful as possible with warnings. I would love to see a way to mitigate problems with the colons whether by saying "oh hey I encountered some YAML that looks wonky is this what you meant?" or by just flipping out when you encounter multiple colons on the same line.

@parkr parkr closed this Jul 31, 2014
Contributor
dtao commented Aug 1, 2014

OK just spitballing here.

Let's say you have this YAML:

foo: 123
bar: "blah"
description:
  We are conducting research with developers at: large companies, small companies, and individuals with side projects.

Then let's say you could provide SafeYAML with some sort of "schema" in the call to load, like:

SafeYAML.load(yaml, nil, :schema => {
  :foo => String,
  :bar => Array,
  :description => String
})

SafeYAML might then throw an exception with something like:

Expected "foo" to be a String, but got a Number
  - Did you forget to put quotes around it?
Expected "bar" to be an Array, but got a String
Expected "description" to be a String, but got a Hash
  - Is there a colon in the string? You may need to put quotes around the whole thing.

The idea here is that, obviously, SafeYAML could check the actual YAML against the expectations you provided and throw an exception if any of them isn't met. Then it could also provide helpful hints for common mistakes (like the one at the source of this ticket), where applicable.

All that said, I'm torn on whether this belongs in SafeYAML or not. On the one hand I could see it being useful, and having the logic to do this verification in the library itself could save a lot of developers (potentially) the trouble of implementing it themselves. On the other hand, of course every application is going to have plenty of its own validation. For example, Jekyll could certainly do some sanity checks on whatever it gets back from SafeYAML.load before proceeding... such as that description is a string.
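
The application-side version of that sanity check might be sketched like this. It assumes Ruby's standard-library YAML (Psych) in place of `SafeYAML.load`, uses string keys because that is what the parser returns, and `schema_errors` is a hypothetical helper, not part of either library:

```ruby
require "yaml"

# Hypothetical helper: compare parsed data against the expected type for
# each key and collect human-readable complaints.
def schema_errors(data, schema)
  schema.each_with_object([]) do |(key, expected), errors|
    value = data[key]
    next if value.is_a?(expected)

    message = "Expected \"#{key}\" to be #{expected}, but got #{value.class}"
    if expected == String && value.is_a?(Hash)
      message += " - is there a colon in the string? " \
                 "You may need to put quotes around the whole thing."
    end
    errors << message
  end
end

yaml = <<~YAML
  foo: 123
  bar: "blah"
  description:
    We are conducting research with developers at: large companies, small companies, and individuals with side projects.
YAML

schema = { "foo" => String, "bar" => Array, "description" => String }
errors = schema_errors(YAML.safe_load(yaml), schema)
puts errors
```

Whether this lives in the parser or in the application, the check is the same; only the place where the expected types are declared changes.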

So maybe this would be a waste of effort? I'm undecided. I'll think about it.

@jekyllbot jekyllbot locked and limited conversation to collaborators Feb 27, 2017