Archetype fields are parsed/sorted upon instantiations #452

Closed
yacoob opened this issue Aug 29, 2014 · 18 comments · Fixed by #3605

Comments

@yacoob
Contributor

yacoob commented Aug 29, 2014

I've created archetypes/post.md containing some fields. Most of them are just there to remind me that I should fill them in:

---
zyzio: url
hg:
  - foo
aliases:
  - bar
cover_img: url
social_img: url

---

After hugo new post/foo.md, it turned out that hugo unhelpfully sorted all of the fields:

---
aliases:
- bar
cover_img: url
date: 2014-08-29T23:18:10+01:00
hg:
- foo
social_img: url
title: foo
zyzio: url

---

Not really helpful :(

@natefinch
Contributor

Yeah, the problem is that Hugo isn't just copying the contents of the
fields, it's parsing them and then spitting them back out along with a
couple of default values. It's nontrivial to fix, because it's not that we're
sorting them on purpose, that's just how the underlying libraries work.

You're not the first person to complain about it, and I think you're right,
Hugo should maintain the order. I'll take a look at the code and see how
hard that would be.

@spf13 spf13 added this to the v0.13 milestone Aug 30, 2014
@halostatue
Contributor

I’ve just been looking at this for the last hour or so, and I’m not sure it’s going to be easy to do. The order of keys is determined by the front matter creator.

If the front matter were a specific type rather than map[string]interface {}, it might be possible to implement this, but it would mean that there would have to be custom encoders/decoders written for that type for each of the parsers (and that may not be easy to do).

Strictly speaking, this isn’t even a matter of the underlying libraries; it’s a deliberate design decision of Go itself that map iteration happens in a pseudorandom manner (it’s explicitly unordered iteration).
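To illustrate the point about unordered iteration, here is a minimal Go sketch. It assumes the front matter has already been decoded into a `map[string]interface{}`, as described above; `sortedKeys` is a hypothetical helper, not a Hugo function. Ranging over the map directly gives no order guarantee, so code that wants stable output has to impose an order itself, and sorting the keys is the simplest choice. Go's `encoding/json`, for example, sorts map keys when marshalling, which is the kind of behavior that produces the alphabetical output in the report.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, so callers can
// iterate deterministically instead of relying on Go's map iteration order.
// Hypothetical helper for illustration only.
func sortedKeys(m map[string]interface{}) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Field names taken from the archetype in the original report.
	frontMatter := map[string]interface{}{
		"zyzio":      "url",
		"hg":         []string{"foo"},
		"aliases":    []string{"bar"},
		"cover_img":  "url",
		"social_img": "url",
	}

	// Ranging over frontMatter directly may visit keys in a different
	// order on every run; ranging over the sorted keys is stable, but the
	// order written in the archetype is already lost at this point.
	for _, k := range sortedKeys(frontMatter) {
		fmt.Printf("%s: %v\n", k, frontMatter[k])
	}
}
```

Either way, the original key order cannot be recovered from the map: it was discarded when the parser filled it.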

@mohae
Contributor

mohae commented Nov 2, 2014

It's not surprising that no solution was found as that is the proper behavior and not a bug, according to YAML specs. To do otherwise would be to not follow the specs. In these situations, a sequence is to be used. http://yaml.org/spec/1.2/spec.html#id2765608

I also feel I need to address the 'blaming' of Go for this, as it reflects a misunderstanding of maps and hashes from a technical perspective. Maps are hash-backed, hashes have no order themselves, and it's impossible to infer order from hashes; sorting files by checksum, for example, would not produce anything useful. This is not unique to Go, which is why the YAML spec says the same thing.

Go does not support ordering of map entries, and its answer, to use a more appropriate data structure when ordering is required or to provide methods that produce ordered output, seems reasonable.

I much prefer it to the way other languages allow developers to rely on a side-effect that they shouldn't, ordering of hashes. It leads to numerous problems; Chef's re-ordering of recipes and Salt's re-ordering of their state files, both to the bane of developers writing those files who expect that the code they write will have their order preserved, are two examples that have occurred in recent years.

This bug report is in the same class, imo, and a result of the same misunderstanding of maps as data structures.

@halostatue
Contributor

@mohae Go originally supported insertion-ordered maps, but the core language designers changed that and changed map iteration to always use a pseudorandom starting hash for iteration so that it would be impossible to depend on that.

You’re right that hashtables are inherently unordered, but it is possible to spec your maps such that they support insertion ordering. The folks behind Go simply chose not to do so (and to make it that much harder to do so). I don’t blame them—it’s a perfectly cromulent decision to make. It’s just an observation that even if the YAML spec said nothing about ordering, Go makes it very difficult to do otherwise.

When I was looking at the solution, I was looking from the perspective of writing an OrderedMap structure that would act like map[string]interface{} but also work with insertion ordering—but I decided that I don’t know enough Go to be able to implement that…and it wouldn’t help in any case because we are using parsers that insert into map[string]interface{} so I would lose the sorting battle before it even got started.

All that said, I think that this just has to be closed because it is pathologically unfixable from a Go perspective (unless, again, Hugo were to switch to streaming parsers and perform the bulk of the parsing itself into an OrderedMap type of structure; I have no interest in implementing something like that).
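For readers wondering what the OrderedMap idea mentioned above could look like, here is a minimal sketch. The type and its methods are hypothetical and are not part of Hugo or the Go standard library. It pairs a slice of keys (for order) with a regular map (for lookup). As noted in the comment above, it would only help if the parsers inserted into it directly instead of into `map[string]interface{}`.

```go
package main

import "fmt"

// OrderedMap is a hypothetical insertion-ordered map, for illustration only.
type OrderedMap struct {
	keys   []string               // keys in first-insertion order
	values map[string]interface{} // regular map for fast lookup
}

// NewOrderedMap returns an empty OrderedMap ready for use.
func NewOrderedMap() *OrderedMap {
	return &OrderedMap{values: make(map[string]interface{})}
}

// Set stores value under key. A new key is appended to the order;
// an existing key keeps its original position.
func (m *OrderedMap) Set(key string, value interface{}) {
	if _, exists := m.values[key]; !exists {
		m.keys = append(m.keys, key)
	}
	m.values[key] = value
}

// Get returns the value stored under key and whether it was present.
func (m *OrderedMap) Get(key string) (interface{}, bool) {
	v, ok := m.values[key]
	return v, ok
}

// Keys returns a copy of the keys in insertion order.
func (m *OrderedMap) Keys() []string {
	return append([]string(nil), m.keys...)
}

func main() {
	m := NewOrderedMap()
	m.Set("zyzio", "url")
	m.Set("hg", []string{"foo"})
	m.Set("aliases", []string{"bar"})

	// Iteration follows insertion order, not alphabetical order.
	for _, k := range m.Keys() {
		v, _ := m.Get(k)
		fmt.Printf("%s: %v\n", k, v)
	}
}
```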

@bep
Member

bep commented Nov 15, 2014

What I would like a little better than the current situation, and what should be easy to implement, is to add some kind of sorting prior to writing the front matter to file. Even alphabetic would be better than today's shuffle.

Maybe also weight some fields higher than others, such as title and date? I have been a little bit annoyed about this when browsing my content on GitHub.

EDIT IN: I notice Hugo uses a string map also internally, which makes the above NOT so easy to implement. But there are sorted map implementations out there.

@derekperkins
Contributor

Does it really need to parse the parameters at that point? Couldn't it just parse them to ensure validity, then copy the original into the new post? It shouldn't be too hard to do a simple string insertion of the date or any other auto-injected parameters.

@bep
Member

bep commented Nov 20, 2014

@derekperkins I haven't looked closely into this one, but the parsing is delegated to different frameworks (YAML, TOML, JSON) -- when Hugo gets the data it is a Go map, and there has been no guaranteed iteration order in those maps since Go 1.

@halostatue
Contributor

@derekperkins, @bjornerik is correct; that matches what I found in my quick investigation, where I noted that we would still need to write custom encoders and decoders to guarantee an order for the map.

You also can’t not parse the archetype front matter, because the archetype may be coming from the theme that you’re using and you may have indicated that you prefer a different front matter format than what the archetype indicated (with the MetaDataFormat option).

@derekperkins
Contributor

This is what I came up with to solve the problem.

// This is my archetype
+++
[Main]
    title               = "title"
    description         = "insert_description_here"
    author              = "Derek|Joe|Tanner"

[ID]
    slug                = "slug"
    disqus_identifier   = "this_needs_to_be_unchangeable_and_unique"

[Taxonomies]
    devtags             = ["tag1", "tag2"]
    personas            = ["cat1", "cat2"]
    series              = ["x", "y"]
+++

When I create a new post, this is what comes out. Comments are stripped out and the map is alphabetized, but the grouping solves most of my issues.

+++
date = "2014-11-20T18:01:54-07:00"
title = "newpost"

[ID]
  disqus_identifier = "this_needs_to_be_unchangeable_and_unique"
  slug = "mynewpost"

[Main]
  author = "Tanner"
  description = "insert_description_here"
  title = "title"

[Taxonomies]
  devtags = ["tag1", "tag2"]
  personas = ["cat1", "cat2"]
  series = ["x", "y"]
+++

@anthonyfok anthonyfok added the Bug label Jan 14, 2015
@anthonyfok
Member

I noticed this discrepancy while going through the documentation, and did some testing. With the example archetypes/default.md of:

+++
tags = ["x", "y"]
categories = ["x", "y"]
+++

Hugo v0.11 (hugo_0.11_amd64_linux new post/test.md) gives:

+++
categories = ["x", "y"]
tags = ["x", "y"]
title = "test"
date = 2015-01-14T02:26:19Z
+++

whereas Hugo v0.12 and above (up to HEAD of v0.13-DEV) gives:

+++
categories = ["x", "y"]
date = "2015-01-13T19:20:04-07:00"
tags = ["x", "y"]
title = "test"

+++

I couldn't get -f yaml to work with Hugo v0.11, but oh well... :-)

So, regardless of the root cause, to the end users it is a regression: at least for TOML, it used to keep the proper order in v0.11, but now it sorts all the variables alphabetically and messes things up. I haven't looked into it any deeper, though I wonder if a commit in Hugo causes this, or whether an external library changed.

Just my 2 cents. :-)

@halostatue
Contributor

That doesn’t look right, @anthonyfok. As you indicated, if you have this archetype:

+++
tags = ["x", "y"]
categories = ["x", "y"]
+++

then Hugo v0.11 (hugo_0.11_amd64_linux new post/test.md) gives:

+++
categories = ["x", "y"]
tags = ["x", "y"]
title = "test"
date = 2015-01-14T02:26:19Z
+++

Notice that categories and tags are now sorted and not the same order as the archetype. The later versions also sort, but insert a few other fields.

As I indicated, the Go team has gone out of its way to ensure that map structures do not iterate the same way every time (without providing an omap implementation that would make this work for us).

That said, I did a little more digging, and the folks behind YAML have possibly done us a favour if we use yaml.MapSlice instead of map[string]interface{}. It’s not perfect (you lose fast lookup…), but I don’t know of another way to have an ordered map in Go at this point.

@anthonyfok
Member

I stand corrected. Thank you for your detailed explanation, @halostatue.

(I didn't even read my tests correctly. Note to self: Next time, read the whole thread before speaking.)

Thanks for digging deeper into this: the new yaml.MapSlice in yaml.v2 sounds interesting indeed.

Cheers,
Anthony

@spf13 spf13 modified the milestones: v0.13, v0.14 Feb 22, 2015
@anthonyfok anthonyfok modified the milestones: v0.15, v0.14 Sep 16, 2015
@dimo414
Contributor

dimo414 commented Oct 1, 2015

Just to clarify, the reason Hugo needs to parse and re-write the archetype files is just to insert the date and title fields, correct? I know it's a hack, but why not simply insert those strings into the file contents right after the ---/+++? While map data structures may not be ordered, data mappings often are, and if the parser available for Go can't support that use case, it seems like (for this specific task) it shouldn't be used at all.
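The string-insertion idea suggested above can be sketched roughly as follows. This is not Hugo code: `injectFields` is a hypothetical helper, and it assumes the extra lines are already written in the archetype's own front matter format. The archetype is never decoded, so the order of the existing fields is untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// injectFields inserts extra front matter lines directly after the opening
// delimiter line ("---" or "+++") and leaves everything else untouched.
// Hypothetical helper for illustration; the caller must supply lines that
// are valid for the archetype's format (YAML for "---", TOML for "+++").
func injectFields(archetype string, extra []string) (string, error) {
	lines := strings.Split(archetype, "\n")
	first := strings.TrimSpace(lines[0])
	if first != "---" && first != "+++" {
		return "", fmt.Errorf("no front matter delimiter on first line: %q", lines[0])
	}

	out := make([]string, 0, len(lines)+len(extra))
	out = append(out, lines[0])     // opening delimiter
	out = append(out, extra...)     // injected fields
	out = append(out, lines[1:]...) // original content, order preserved
	return strings.Join(out, "\n"), nil
}

func main() {
	archetype := "---\nzyzio: url\nhg:\n  - foo\n---\n"
	out, err := injectFields(archetype, []string{
		"title: foo",
		"date: 2014-08-29T23:18:10+01:00",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

This sketch does not address the objection raised earlier in the thread: if the archetype has to be converted to a different front matter format (the MetaDataFormat option), plain string insertion is not enough.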

@bep
Member

bep commented Oct 1, 2015

@dimo414 you've got a point. It would make the code a little more complex, but it's doable, I guess. Well-tested PRs are welcome.

@bep bep changed the title Archetype fields are sorted upon instantiations Archetype fields are parsed/sorted upon instantiations Feb 28, 2017
@bep bep modified the milestones: v0.21, future Apr 13, 2017
@bep bep modified the milestones: v0.22, v0.21 May 9, 2017
@bep bep modified the milestones: v0.22, v0.23 Jun 7, 2017
@bep bep modified the milestones: v0.23, v0.24 Jun 16, 2017
@bep bep self-assigned this Jun 16, 2017
@bep bep added the InProgress label Jun 16, 2017
bep added a commit to bep/hugo that referenced this issue Jun 16, 2017
bep added a commit to bep/hugo that referenced this issue Jun 18, 2017
This commit removes the fragile front matter decoding, and takes the provided archetype file as-is and processes it as a template.

This also means that we no longer will attempt to fill in default values for `title` and `date`.

The upside is that it is now easy to create these values in a dynamic way:

```toml
+++
title = {{ .Name | title }}
date = {{ .Date }}
draft = true
+++
```

You can currently use all of Hugo's template funcs, but the data context is currently very shallow:

* `.Type` gives the archetype kind provided
* `.Name` gives the target file name without extension.
* `.Path` gives the target file name
* `.Date` gives the current time as RFC3339 formatted string

The above will probably be extended in gohugoio#1629.

Fixes gohugoio#452
Updates gohugoio#1629
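The approach described in the commit message, treating the archetype as a template instead of decoding it, can be approximated with Go's standard `text/template` package. The sketch below is not Hugo's implementation: `archetypeData`, `titleCase`, and `renderArchetype` are hypothetical names, `titleCase` is a simple stand-in for Hugo's `title` func, and the template values are quoted here on the assumption that the output should be valid TOML. Because the file is rendered as text, every field stays exactly where the archetype author put it.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
	"time"
	"unicode"
)

// archetypeData mirrors the fields listed in the commit message.
// The struct itself is an assumption made for this sketch.
type archetypeData struct {
	Type string // archetype kind
	Name string // target file name without extension
	Path string // target file name
	Date string // current time, RFC3339 formatted
}

// titleCase upper-cases the first letter of each whitespace-separated word.
// It is a simplified stand-in for Hugo's title template func.
func titleCase(s string) string {
	words := strings.Fields(s)
	for i, w := range words {
		r := []rune(w)
		r[0] = unicode.ToUpper(r[0])
		words[i] = string(r)
	}
	return strings.Join(words, " ")
}

// renderArchetype executes the archetype source as a text template.
// Nothing is decoded into a map, so field order is preserved as written.
func renderArchetype(src string, data archetypeData) (string, error) {
	tmpl, err := template.New("archetype").
		Funcs(template.FuncMap{"title": titleCase}).
		Parse(src)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := tmpl.Execute(&b, data); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	src := "+++\ntitle = \"{{ .Name | title }}\"\ndate = \"{{ .Date }}\"\ndraft = true\n+++\n"
	out, err := renderArchetype(src, archetypeData{
		Type: "post",
		Name: "foo",
		Path: "post/foo.md",
		Date: time.Now().Format(time.RFC3339),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```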
@bep bep closed this as completed in #3605 Jun 18, 2017
bep added a commit that referenced this issue Jun 18, 2017
This commit removes the fragile front matter decoding, and takes the provided archetype file as-is and processes it as a template.

This also means that we no longer will attempt to fill in default values for `title` and `date`.

The upside is that it is now easy to create these values in a dynamic way:

```toml
+++
title = {{ .BaseFileName | title }}
date = {{ .Date }}
draft = true
+++
```

You can currently use all of Hugo's template funcs, but the data context is currently very shallow:

* `.Type` gives the archetype kind provided
* `.Name` gives the target file name without extension.
* `.Path` gives the target file name
* `.Date` gives the current time as RFC3339 formatted string

The above will probably be extended in #1629.

Fixes #452
Updates #1629
@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 18, 2022