Support metadata on markdown files and pages #97

Open
egeozcan opened this Issue Sep 17, 2013 · 44 comments

egeozcan commented Sep 17, 2013

The layout files and partials can use some inline metadata about the current page being viewed. For example, a title attribute from an article. I guess, Jekyll syntax could be used. Something like: https://raw.github.com/egeozcan/egeozcan.github.com/master/_posts/2012-02-07-by-the-way.md

...which would be accessible by current.data.title. Wouldn't it be great?

Collaborator

kennethormandy commented Sep 17, 2013

Hey @egeozcan, thanks very much for opening an issue. We’ve definitely discussed this, and something similar was also brought up in #45, so there is interest.

I think you already know this, but just so we’re on the same page, here’s how I’d try what you’re after. You could bring in JB/setup as a partial on that blog post’s _layout.jade or _layout.ejs. That would be in the same folder as your posts, along with a _data.json file, which could look like this:

{
  "by-the-way": {
    "category": "about",
    "tags": ["intro", "javascript"],
    "date": "2012-02-07"
  }
}

The _data.json approach is definitely different than Jekyll’s, which can be really useful: it centralises related metadata. Front matter interfering with this benefit would be my biggest concern. I don’t think @sintaxi or I were that enthusiastic about supporting YAML inside Markdown, though I did think @ryanfitzer’s MultiMarkdown suggestion was interesting. But then if Markdown supports front matter, but I write some fancier blog posts in Jade or EJS, do they then have to support front matter, too?

Hopefully I’m not sounding negative—that’s just why it hasn’t been done thus far. It’s really great to hear other perspectives on how people think this should work, so if you have more thoughts or examples of how you’d like to use it, that’d be really helpful. Thanks!

egeozcan commented Sep 17, 2013

Thanks for the clear response @kennethormandy, and no, you don't sound negative at all. I totally understand how you made this decision and I'd like to point to some problems with this approach and offer a solution.

Problems
  • When using source control, merge errors occur when a lot of people edit json files, such as the closing brackets being parsed as the same line. This problem could theoretically be mitigated by using a json-aware diff tool but I've yet to find any.
  • Scalability. When the number of files grows, it becomes harder and harder to maintain the _data.json.
  • When trying to use the data from the layout, such as using the title in the page header, there's no way to easily access the contents of the _data.json. (I guess this can be solved by just merging the parsed data to the current object though)
Solution

Prepending the data to the files, as my initial suggestion, would break the compatibility of the files with their parsers when used outside harp, and is not a good practice overall for separation of concerns. It also could be problematic to parse if the same annotations are used in a theoretical future markup syntax.

It seems to me that the best solution would be to allow json and yaml files, prefixed with an underscore and named the same as the document that they'll attach, to be appended to the current object when rendering.

Example:

_by-the-way.json

{
    "title": "By the way",
    "category": "about",
    "tags": ["intro", "javascript"],
    "date": "2012-02-07"
}

So that I should be able to do this in layout:

<html>
  <head>
    <title><%- current.data.title %> | My Awesome Harp Based Blog</title>
  </head>
  <body>
    <%- yield %>
  </body>
</html>

Would this be too hard?

ryanfitzer commented Sep 17, 2013

The _data.json approach is definitely different than Jekyll’s, which can be really useful

One of the big reasons I like the front matter approach (regardless of syntax) is the reduced friction when creating content. Having to create/update 2 separate files isn't ideal.

While I do like the ability to define meta outside of the content file, it would be in cases when that meta doesn't add any valuable context/meaning to the content.

But when that meta does add context/meaning, I want it in the same file.

Collaborator

kennethormandy commented Sep 18, 2013

@ryanfitzer Well put. That’s definitely a strong use case for front matter, it’s come up a couple of times before. So you would like to have both in some format? Which would supersede the other if they both had a title, for example?


@egeozcan Very thorough, thank you. I definitely agree on the separation of concerns. I think this is the idea behind the _data.json file already, actually. It sounds like what you’re asking for is actually already possible, although it would all take place in one file. I tend to think that editing one small thing in many files is more difficult than many small things in one file. Anyway, here’s how I do what you’re suggesting.

The App

app/
  |- _harp.json
  |- _layout.ejs
  |- index.ejs
  |+ posts/
      |- _data.json
      |- there-is-no-spoon.md
      |- by-the-way.md

_harp.json

{
  "globals": {
    "title": "egeozcan",
    "tagline": "My Awesome Harp Based Blog"
  }
}

posts/_data.json

{
  "there-is-no-spoon": {
    "title": "There Is No Spoon",
    "tags": ["intro", "personal"],
    "date": "2012-02-07"
  },
  "by-the-way": {
    "title": "By The Way",
    "tags": ["intro", "javascript"],
    "date": "2012-02-07"
  }
}

_layout.ejs

<!DOCTYPE html>
<html>
  <head>
    <title><%= title %> | <%= tagline %></title>
  </head>
  <body>
    <%- yield %>
  </body>
</html>

index.ejs

<h1><%= title %></h1>
<ul>
  <% for (var slug in public.posts.data) { %>
    <% var post = public.posts.data[slug] %> 
    <li>
      <a href="posts/<%= slug %>">
        <%= post.title %>
      </a>
    </li>
  <% } %>
</ul>

ryanfitzer commented Sep 18, 2013

@kennethormandy The <title> tag in @egeozcan example is using the post's title. In yours it's the global title.

I ran into this limitation as well.

Collaborator

kennethormandy commented Sep 18, 2013

Actually, if you’re using the latest version of Harp, it will use the current context’s title! (You can update with sudo npm update harp -g).

So, if I’m at /posts/there-is-no-spoon, there’s a corresponding title in the _data.json, so my title tag will be <title>There Is No Spoon | My Awesome Harp Based Blog</title>. There is no _data.json file in the root directory, so there is no metadata for the index page. This means title falls back to whatever’s in the _harp.json, if there is anything. So, on the index page, the title is <title>egeozcan | My Awesome Harp Based Blog</title>.

You could even take this further. If I wanted to add a different tagline on the By The Way post:

_data.json

{
  "there-is-no-spoon": {
    "title": "There Is No Spoon",
    "tags": ["intro", "personal"],
    "date": "2012-02-07"
  },
  "by-the-way": {
    "title": "By The Way",
    "tags": ["intro", "javascript"],
    "date": "2012-02-07",
    "tagline": "This is my Harp post"
  }
}

Now, on /posts/by-the-way, the title will be <title>By The Way | This is my Harp post</title>.
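
The fallback described above can be pictured as a shallow merge in which the current page's entry from _data.json is laid over the globals from _harp.json. This is only a sketch of the behaviour as described in this thread, not Harp's actual implementation; `resolveLocals` is a hypothetical name.

```javascript
// Hypothetical helper: NOT Harp's real code, just the precedence described above.
// `globals` mirrors the "globals" object in _harp.json,
// `dirData` mirrors a parsed posts/_data.json, `slug` is the current page.
function resolveLocals(globals, dirData, slug) {
  const pageData = (dirData && dirData[slug]) || {};
  // Later sources win: page metadata overrides globals.
  return Object.assign({}, globals, pageData);
}

const siteGlobals = { title: "egeozcan", tagline: "My Awesome Harp Based Blog" };
const postsData = {
  "there-is-no-spoon": { title: "There Is No Spoon" },
  "by-the-way": { title: "By The Way", tagline: "This is my Harp post" }
};

const postLocals = resolveLocals(siteGlobals, postsData, "by-the-way");
const indexLocals = resolveLocals(siteGlobals, undefined, "index"); // no root _data.json

console.log(`${postLocals.title} | ${postLocals.tagline}`);
console.log(`${indexLocals.title} | ${indexLocals.tagline}`);
```

With this ordering, a page only needs to list the keys it wants to change; everything else is inherited from the globals.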

ryanfitzer commented Sep 18, 2013

@kennethormandy Nice! Glad to see the update. Thanks for pointing that out.

Which would supersede the other if they both had a title, for example?

Not sure. My sense is that the most intuitive scenario would be for the content file to overwrite the json. But that's because I see the content as the most local context as far as scope. Others may see it differently.

Can you see a use case that would make a good case for the opposite?

egeozcan commented Sep 18, 2013

Thanks a lot for the great examples! They'll help a lot.

I agree with @ryanfitzer about the local overwriting global. But what if we had local under the "current" object? So we could do something like:

<%= current.title || title %>

This isn't a deal-breaker for me though. Most probably also wouldn't mind if local just overwrites.

And I still think that having everything in a _data.json file is not scalable (though I hate using that word). We definitely need individual metadata support, be it an external file or an annotation in the article/page itself.

Owner

sintaxi commented Sep 18, 2013

So there are many pros and cons (both subjective and objective) to front-matter and IMHO once you add them both up the disadvantages far outweigh the advantages. Here it is as I see it.

Pros

Edit one file instead of two

This is many people's first instinct and it has some merit for sure. If one is optimizing for efficiency (as we all should be) it seems unnecessary to have to add hello-world.md and edit the _data.json file when it could just be adding a hello-world.md file. This makes perfect sense until you start to see all the negative side effects of this system.

Cons

Front-matter does not give you order.

It must be pointed out that front-matter alone only gives you local variables. We still need a way to order content. With front-matter we are at the mercy of the filesystem for how things are ordered, and this rarely comes out as desired. There are two common workarounds for this that, in my opinion, are both terrible.

1) Have a naming convention in the filename that allows you to order things.

for example, instead of:

posts/
  |- _data.json
  |- hello-world.md
  |- hello-brazil.md
  |- hello-canada.md

I would have:

posts/
  |- 1_hello-world.md
  |- 2_hello-brazil.md
  |- 3_hello-canada.md

Now imagine you have dozens of files and you want to add in a new file or reorder something: you would have to rename every file. What a pain in the ass this would be. Not to mention, having the URL and filename not match would be confusing. In the case of _data.json you would just order the json object the way you want it.

2) Have a blessed property such as date in the front-matter that gives you order.

This is what Jekyll does which I think works reasonably well in a case of a blog which is one of the things that makes Jekyll "blog aware". This is one of the main reasons Jekyll becomes super awkward to use once you are building something other than a blog.

Many have felt this pain when working with Jekyll...

Harp's _data.json approach makes this a cinch and more importantly, there is only one mechanism that covers ordering regardless of if you are displaying blog posts or a navigation or anything else.

Front-matter is an anti-pattern.

It deserves to be mentioned, having files that are half YAML, half markup is semantically incorrect. This alone probably shouldn't be enough to rule out front-matter. As we all know, building great systems is all about knowing when to break the rules, but it has to be seen as a drawback of this approach.

Breaking this rule causes text editors to freak out and pushes complexity onto syntax highlighters. This is not cool.

Front-matter is punishing on performance

Harp is very fast, and we want to keep it this way. One of the reasons it is so fast is that it does a lot of things in parallel, such as building the file system tree. During this step it walks the file system, opens every _data.json, and builds the state for the public object for iterating over. Having all the metadata in _data.json files is one of the reasons harp can do this so quickly even with large projects. Harp can do this so quickly that we rebuild this state between every request when in development mode. This is why you can edit your _harp.json, _data.json, or a template and simply refresh the browser to see the changes.

If we supported front-matter we would have to open every template to fetch its metadata. This would have a significant effect on performance, and we would very likely run out of file descriptors on the file system, which means we would have to throttle how many files we open at a time. All this impacts performance and complexity.
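
The throttling mentioned above, limiting how many files are open at once, can be sketched with a small concurrency-limited map. This is a generic pattern and not code from Harp or terraform; `mapWithLimit`, the limit of 2, and the `fakeRead` stand-in for real file access are all illustrative assumptions.

```javascript
// Run `worker` over `items`, with at most `limit` calls in flight at once.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function run() {
    while (next < items.length) {
      const i = next++; // claim an index; JS is single-threaded, so no race here
      results[i] = await worker(items[i], i);
    }
  }
  const runners = [];
  for (let n = 0; n < Math.min(limit, items.length); n++) runners.push(run());
  await Promise.all(runners);
  return results;
}

// Stand-in for opening a template to read its front matter.
let openCount = 0;
let maxOpen = 0;
async function fakeRead(name) {
  openCount++;
  maxOpen = Math.max(maxOpen, openCount);
  await new Promise((resolve) => setTimeout(resolve, 5));
  openCount--;
  return name.toUpperCase();
}

const done = mapWithLimit(["a.md", "b.md", "c.md", "d.md", "e.md"], 2, fakeRead);
done.then((out) => console.log(out, "max open at once:", maxOpen));
```

The cost sintaxi describes is visible in the shape of the code: every extra file means another pass through the worker, and the limit trades file descriptor pressure against total wall-clock time.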

It's worth mentioning that Jekyll has a good reason for not making performance a priority: it works strictly as a static site generator and therefore assets are always served with a static web server. Harp, on the other hand, IS a static web server. It has to be fast.

YAML is bloated

Not that front-matter has to be YAML, it could be JSON, but I might as well address this point.

The YAML spec is 80 pages long and implementations are complex and have had known security issues. JSON is sooo simple. Parsers exist everywhere, and it is secure. Not saying JSON is better than YAML, just saying it's a better choice for a high performance web-server such as Harp.

TL;DR

  • Front-matter alone is not enough, we would also need ways to address ordering items.
  • Front-matter is an anti-pattern and pushes complexity to text-editors
  • Front-matter is horrible for performance.

Hope this helps with understanding the rationale behind not supporting front-matter. A lot of thought went into evaluating these tradeoffs. I see how people have become used to using front-matter since Jekyll has become such a popular tool. BTW - I don't want to come across as slagging on Jekyll, I think it has been a great tool. Harp has the luxury of hind-sight since it is a new tool. So we have the benefit of fixing the mistakes Jekyll made, one of which IMHO is front-matter.

egeozcan commented Sep 18, 2013

Thanks for the detailed explanation. Could you please also comment on allowing data files per document, while keeping support for _data.json? For example, having _my-article.json next to my-article.md or my-article.jade. The data in the individual files could be processed as if they were part of the _data.json file in their directory when compiling (added under a key of their name).
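
The proposal can be sketched as a merge step that runs when a directory's metadata is built. Everything here is hypothetical: Harp does not support per-document files like `_my-article.json`, and `mergeSidecars` is an invented name. The sketch takes file contents as plain strings so it stays independent of the file system. Letting the per-document file win over `_data.json` is one possible precedence choice, not something this thread settles.

```javascript
// dirData:  parsed _data.json for a directory (may be empty).
// sidecars: map of file name -> raw JSON text, e.g. "_my-article.json".
function mergeSidecars(dirData, sidecars) {
  const merged = Object.assign({}, dirData);
  for (const [fileName, text] of Object.entries(sidecars)) {
    if (fileName === "_data.json" || fileName === "_harp.json") continue;
    const match = /^_(.+)\.json$/.exec(fileName);
    if (!match) continue; // not a per-document metadata file
    const slug = match[1];
    // Assumption: per-document values override same-named keys from _data.json.
    merged[slug] = Object.assign({}, merged[slug], JSON.parse(text));
  }
  return merged;
}

const mergedData = mergeSidecars(
  { "by-the-way": { title: "By The Way", date: "2012-02-07" } },
  {
    "_my-article.json": '{"title": "My Article", "tags": ["intro"]}',
    "_by-the-way.json": '{"title": "By the way"}'
  }
);
console.log(JSON.stringify(mergedData, null, 2));
```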

Owner

sintaxi commented Sep 19, 2013

Yeah sure.

Having a matching _my-article.json metadata file would still suffer from the performance issue, as we would have to open n files per directory to build our metadata object, but at least in this case the performance hit is opt-in, unlike front-matter, where we would have to open every file regardless to see if there is metadata or not. So I suppose this idea could be entertained.

The main problem I would see with this is that there would now be two ways to do the same thing. That isn't much of a problem, other than that people may get confused when they have a _data.json file and end up overriding the metadata with a _my-article.json file. Are you having a hard time with _data.json? are you finding it difficult to maintain this file or does it feel unsavoury to you?

ryanfitzer commented Sep 19, 2013

@sintaxi Thanks for taking the time to explain your reasoning so thoroughly.

Are you having a hard time with _data.json? are you finding it difficult to maintain this file or does it feel unsavoury to you?

In my case, migrating a blog started in 2006 with 1500+ posts makes the single json file very tedious.

It deserves to be mentioned, having files that are half YAML, half markup is semantically incorrect.

Agreed. My use case was specific to Markdown, where YAML doesn't have that problem. But I definitely agree with your point for other file types.

Front-matter is punishing on performance

Great point. Hadn't thought about it this way. The tradeoff in my situation is more friction on the user's side.

but at least in this case the performance hit is opt-in

I like thinking of these features as opt-in. With @egeozcan's feature, if the files are present, Harp would use them. For the front matter feature (not that it would need to be YAML), a flag in the _data.json could be used to dictate if a directory's content files contain the meta.

One way or another, I appreciate the thought behind how Harp is trying to balance performance.

egeozcan commented Sep 19, 2013

@sintaxi yes, single _data.json file isn't maintainable; especially when you have many articles and many people working on those articles.

holic commented Oct 7, 2013

We are looking to migrate our company blog to Harp.io and it's a bummer to see the _data.json requirement. With 3-4 different authors, hundreds of articles, and dozens in the pipeline, a single file to manage them is not ideal.

utensil commented Oct 8, 2013

Human Author's Perspective

If you are blogging with, say, Markdown, you would certainly want to write tags or title and so on in the file, not somewhere else.

Summary information like indexes are supposed to be generated, but not hand-crafted, and _data.json is exactly such a thing.

While having _data.json for something global is nice, being forced to do some "register"-like stuff isn't pleasant, from the author's perspective.

Front-matter is intuitive for the human who is writing the content. If there are any technical problems, they should be left to machines and programmers. Let the design feel human and work for humans, not the other way around.

Programmer's Perspective

Front-matter of Jekyll syntax is clear and easy to strip before parse phase. Markdown and other parsers should not see the front-matter at all.
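
As a rough illustration of the stripping step, here is a sketch that separates a leading `---` block from the body so the Markdown parser never sees it. It is not Jekyll's or Harp's code: `splitFrontMatter` is a hypothetical name, and the front matter is returned as raw text instead of being parsed, since real YAML parsing would need a proper library.

```javascript
// Split a "---\n...\n---\n" block at the very start of a document from the body.
// Returns { frontMatter: string | null, body: string }.
function splitFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/.exec(source);
  if (!match) return { frontMatter: null, body: source };
  return { frontMatter: match[1], body: source.slice(match[0].length) };
}

const sampleDoc = [
  "---",
  "title: By the way",
  "tags: [intro, javascript]",
  "---",
  "# Hello",
  "",
  "Body text."
].join("\n");

const parts = splitFrontMatter(sampleDoc);
console.log(parts.frontMatter); // the two metadata lines
console.log(parts.body);        // Markdown only, starting at "# Hello"
```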

And BTW, the GitHub markdown renderer is now front-matter-aware: it will recognize the front-matter and render it as a table. Why? Because many people are writing front-matter, like in Jekyll, like in Middleman. I can't think of any reason that one should say no to front-matter.

Cons Aren't Really Cons

Let's see the Cons listed in @sintaxi 's post:

Front-matter alone is not enough, we would also need ways to address ordering items.

@sintaxi already presented the solutions. And the solutions are better than just fine to me.

Even if order really matters that much, and the solutions suck, order is not something specific to individual files, but something global, so it's exactly what _data.json is supposed to do.

_data.json is good by itself and it brought in new features, but it doesn't do what front-matter is good at.

Front-matter is an anti-pattern and pushes complexity to text-editors

Front-matter is anti-pattern? Where does this statement come from?

Half YAML, half markup is semantically incorrect ? What about Javascripts in HTML? What about markdown filters in Jade?

I think text-editors can handle that; they have handled formats that are far more screwed up. And one of the design purposes of highlight.js is to recognize half-half-like code, and most editors can recover from a piece of incorrect code and keep going.

Front-matter is horrible for performance.

That's a premature assertion. I can't see why harp can't maintain a cached json file and simply check the time stamps next round, or take some other measures.
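
The timestamp idea could look roughly like the sketch below, where metadata is re-read only when a file's modification time changes. This is a hypothetical illustration, not how Harp works; `createMetadataCache` and the `parse` callback are invented for the example. Note that it still needs one `stat` call per file on every pass, so it saves reading and parsing but not the per-file system call.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Cache parsed metadata per file, keyed by path and invalidated by mtime.
function createMetadataCache(parse) {
  const cache = new Map(); // path -> { mtimeMs, value }
  return function get(filePath) {
    const { mtimeMs } = fs.statSync(filePath);
    const hit = cache.get(filePath);
    if (hit && hit.mtimeMs === mtimeMs) return hit.value; // unchanged: reuse
    const value = parse(fs.readFileSync(filePath, "utf8"));
    cache.set(filePath, { mtimeMs, value });
    return value;
  };
}

// Demo with a temporary JSON file.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "meta-cache-"));
const tmpFile = path.join(tmpDir, "_data.json");
fs.writeFileSync(tmpFile, '{"hello": {"title": "Hello"}}');

let parseCount = 0;
const getMeta = createMetadataCache((text) => {
  parseCount++;
  return JSON.parse(text);
});

getMeta(tmpFile);
getMeta(tmpFile); // second call is served from the cache
console.log("parse calls:", parseCount);
```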

And again, don't just consider the performance of the machine, consider the effort of humans maintaining the horrible _data.json; that's a much more crucial performance overhead.

TL;DR

Forgive me for being harsh, but lack of front-matter is really why I'm not moving to harp. And everything else about harp seems so great...

ghost assigned sintaxi Oct 10, 2013

egeozcan commented Oct 25, 2013

I know it doesn't add to the discussion but I really wanted to say that I'm patiently waiting for any news about this.

colinscroggins commented Oct 25, 2013

Just wanted to chime in and say that while I am really liking Harp, I really miss the easy pairing of meta and content in the same file (front-matter).

If JSON is easier/faster than YAML, then I am fine with that, but I want to write that meta once and then use template logic to make changes to things like post display order (by title, by date, by creation, etc.). The current Harp solution just feels klunky compared with how the rest of Harp works.
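
Ordering by a field in template logic is possible with the data object as it stands. The sketch below assumes the `_data.json` shape shown earlier in this thread, uses made-up sample dates, and introduces a hypothetical `sortedPosts` helper that turns the keyed object into an array and sorts it by a chosen field.

```javascript
// Convert { slug: meta } into a sorted array of { slug, ...meta }.
// ISO dates ("YYYY-MM-DD") sort correctly as plain strings.
function sortedPosts(data, field = "date", descending = true) {
  const posts = Object.keys(data).map((slug) => Object.assign({ slug }, data[slug]));
  posts.sort((a, b) => {
    const cmp = a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0;
    return descending ? -cmp : cmp;
  });
  return posts;
}

// Sample data for illustration only.
const samplePosts = {
  "there-is-no-spoon": { title: "There Is No Spoon", date: "2012-02-07" },
  "by-the-way": { title: "By The Way", date: "2012-03-15" },
  "hello-world": { title: "Hello World", date: "2011-12-01" }
};

const newestFirst = sortedPosts(samplePosts).map((p) => p.slug);
const byTitle = sortedPosts(samplePosts, "title", false).map((p) => p.title);
console.log(newestFirst);
console.log(byTitle);
```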

jcswart commented Oct 27, 2013

First I want to thank the authors for their work! You guys rock. That said, I have noticed a few things in the last few days. My experience comes as a developer that wanted to convert an old neglected Jekyll blog into a self hosted & served Harp site.

In my opinion the problems that I encountered stem from the fact that Harp is entertaining competing ideas:

  1. Static content generation
  2. A fast development platform for developers using modern techniques (Less, Jade, etc.)
  3. A server for said content

The problem is that separately these work really well, but all together are sort of tangled as it stands.

The story for harp serve is really great. Making UI tweaks, building out pages with Jade, etc is all very fast and feels fluid.

But then you finish developing the site and something happens: you want to create content. At this point the story gets muddy. Because when you create content you have to maintain the file: my-post.jade and the _data.json. This is a pain, especially if someone other than a developer will be creating the content.

If I had a magic lamp that granted wishes I would ask for:

  1. harp serve --production When I specify this, dynamic files will be compiled once and only once on initialization, and then served from their cached locations. Without pre-compiling large assets the current production server can take almost a second to serve pages. Long by modern patience levels.
  2. harp post new 'My new post' This command creates a new file in /posts/my-new-post.(jade|md). In addition, it updates the _data.json with the proper meta information. This would alleviate a lot of the where-is-the-front-matter rage.
  3. Better integration of content with templates. I want to specify the title of a blog post, the most recent blog post, etc. without _data.json hackery in the template. Again an issue with lacking front-matter.

I think that ultimately Harp is a great contribution and I thank the authors for their efforts. However it does not appear that Harp is very suited to my workflow: create the site once and then add individual pieces of content over and over. Right now development on harp is great, but continued creation of additional content, ie: blog posts is not as great.

I hope my constructive feedback will be helpful. Thank you!

Collaborator

kennethormandy commented Oct 27, 2013

Thanks everyone, there’s a lot of great feedback for us here. We discuss this issue pretty frequently, and while I don’t have any specific answers, I just wanted to say this issue hasn’t been forgotten by any means. I really appreciate everyone writing about their personal experiences with Harp, it helps us make much more informed design decisions.

@jcswart I also just wanted to address your two other points:

  1. I believe this can be accomplished now, it’s on the Harp server page, but we could definitely expand on it:

    Harp is production ready, by specifying an environment variable we add extra LRU caching to make your site run even faster.

NODE_ENV=production harp server --port 3000

Hopefully that helps, and if not, feel free to open another issue.

  2. Personally, I can’t see this happening as part of the CLI. You could probably make some sort of script that could do this for you, but it’s a very blog or static site generator-centric approach. Harp is great for those things, but probably won’t have features for that single, specific use case like Jekyll does. That said, some kind of interface built upon Harp for managing content and metadata could be great, I just don’t think it will take the form of that feature, that’s all. It is helpful feedback, though, so thanks!

Owner

sintaxi commented Nov 3, 2013

@jcswart @utensil @colinscroggins @holic @egeozcan @ryanfitzer

Thank you all for expressing your thoughts on this topic. All other arguments aside, at this time adding front-matter would have an extremely large impact on performance, especially for larger apps. Performance is very important for this project. Although at this time we are unwilling to make this compromise, I have drafted a plan to implement front-matter in a way that might have a manageable performance hit, though this would take significant changes in #sintaxi/terraform (something I would like to do anyway). I think we will table this discussion until these changes are in terraform, where we can debate the pros/cons on the merits of the design with the performance compromise hopefully out of the picture. Sound good?

It's worth mentioning that Jekyll has recently added harp-style data files http://jekyllrb.com/docs/datafiles/

@jcswart regarding your "magic lamp" feature requests, nothing wrong with that idea, though that seems like the responsibility of another tool, not harp itself.

-b

Contributor

edrex commented Nov 7, 2013

For anyone migrating, I wrote a small script to convert Jekyll post metadata to the Harp format:

https://npmjs.org/package/jekyll2harp

Contributor

edrex commented Nov 7, 2013

Not to keep this thread going, but I'd like to point out that the ordering of keys in a JSON object is not guaranteed, and in fact many intermediate representations don't preserve order. It is the case that V8 does, but it's a little funny to rely on the ordering of keys to specify post listing order.

Owner

sintaxi commented Nov 7, 2013

Fantastic! nice work.

You are correct that the order of objects in "JSON" is not guaranteed. However, ordering in Harp is. If V8 for any reason changes their API in this regard we will seek alternate JSON parsing methods to ensure the behaviour in Harp does not change.
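
One caveat worth illustrating, assuming the listing order comes from ordinary object key enumeration: JavaScript engines keep string keys in insertion order, but keys that look like non-negative integers are listed first, in ascending numeric order, wherever they appear in the file. The slugs below are made up to show the effect; a post named `2013`, for instance, would not stay where it was written in `_data.json`.

```javascript
// Keys as they might appear, in this order, in a _data.json file.
const rawJson = '{"zebra": 1, "apple": 2, "10": 3, "2": 4}';
const keyOrder = Object.keys(JSON.parse(rawJson));
console.log(keyOrder); // integer-like keys come first: [ '2', '10', 'zebra', 'apple' ]

// An array keeps an explicit order whatever the slugs look like.
// (This is only a contrast; it is not the format Harp's _data.json uses.)
const asArray = JSON.parse('[{"slug": "zebra"}, {"slug": "10"}, {"slug": "2"}]');
const arrayOrder = asArray.map((p) => p.slug);
console.log(arrayOrder); // [ 'zebra', '10', '2' ]
```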

Thanks for writing this library. Should be a great resource for people coming to harp from jekyll.

egeozcan commented Nov 27, 2013

I would like to kindly ask if there is any news about this. Is there any way we can contribute?

jorgepinon commented Feb 14, 2014

Just to address the performance concern, and if this topic is still being considered, one approach may be to honor front matter only during a compile down to flat files.

gilbert commented Jul 9, 2014

Performance seems like a strange reason to not implement front matter. Caching should solve this problem pretty easily, especially since you would only need to cache when you're done editing the content, i.e. production.

I see front matter as a complement to _data.json. At the moment, _data.json overrides values in _harp.json. Wouldn't it make sense to have front matter override values in _data.json?

ixley commented Jul 9, 2014

I can't speak to performance issues one way or another, but from a usability perspective, I find front matter a much more maintainable and friendly way to manage page metadata.

andreyvit commented Sep 29, 2014

Another vote here; just ruled out Harp as an option for my company web site because of the lack of front matter.

zeke referenced this issue in sintaxi/terraform Oct 3, 2014

m-o-e commented Oct 22, 2014

+1 for frontmatter. For me it's also the single reason why I can't use harp.

ir-g commented Oct 27, 2014

Another +1 for front matter - It is the exclusive reason that I'm on Jekyll for my blog, etc. I use harp for some web apps, but it currently doesn't provide nice blog posts, etc with front matter.

jamesalexanderdickerson commented Nov 20, 2014

+1 for front matter. If harp had this, Jekyll could not compete.

gilesbowkett commented Mar 23, 2015

just a suggestion - if somebody were to write up a pull request implementing front matter, and CC everybody who threw in a +1 on this ticket, that would probably be useful.

you can find steps towards an implementation in this commit:

sintaxi/terraform@7cc5a2f

dominykas commented Apr 17, 2015

This is a very long discussion for something that seems like a trivial feature... Made a pull request sintaxi/terraform#89 - this simply strips off the front matter before rendering markdown.

Next up - I intend to use @edrex's code (haven't looked at it yet, probably needs to be extracted into separate, smaller modules?) to update _data.json from markdown before running Harp or via git hooks or something. Either way, for me personally, this will be the 80/20, allowing me to move further with harpifying my blog.
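
A pre-build step along these lines might look like the sketch below. It is not @edrex's code or anything from the pull request; `buildDataJson` is a hypothetical helper, and it only understands flat `key: value` lines between `---` markers, which is a deliberate simplification of YAML. A real script would use a proper YAML parser.

```javascript
// files: map of file name -> file contents for one directory.
// Returns an object shaped like _data.json, keyed by slug.
function buildDataJson(files) {
  const data = {};
  for (const [fileName, source] of Object.entries(files)) {
    if (!fileName.endsWith(".md")) continue;
    const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
    if (!match) continue; // no front matter: nothing to record
    const meta = {};
    for (const line of match[1].split(/\r?\n/)) {
      const sep = line.indexOf(":");
      if (sep === -1) continue; // skip lines this toy parser cannot read
      meta[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
    }
    data[fileName.slice(0, -3)] = meta; // "by-the-way.md" -> "by-the-way"
  }
  return data;
}

const generated = buildDataJson({
  "by-the-way.md": "---\ntitle: By the way\ndate: 2012-02-07\n---\nHello.",
  "plain.md": "No metadata here."
});
console.log(JSON.stringify(generated, null, 2));
```

Run before Harp starts (or from a git hook), a step like this would keep authoring in one file while Harp still reads only `_data.json`.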

Not sure what all the fuss with "harmful", "bad practice", etc discussion is all about. Keeping metadata together with data (i.e. same file, not some other big file where you can't really find anything) is much more future proof and maintainable and transferable between folders, projects, etc. Both approaches can also happily live together in the same project.

hcschuetz commented May 19, 2015

+1 for front matter from me too. There's simply some data (e.g. title and author) that logically belongs to a particular file and other data (e.g. ordering, if it cannot be determined otherwise) that belongs to the directory, that is, into _data.json (or, maybe, _data.yaml).

I understand @sintaxi's performance and "antipattern" concerns. But couldn't these be solved by making front matter an opt-in choice? So by default no front matter is expected. But the existence of front matter might be advertised to harp by some configuration. And since harp prefers convention over configuration, we might also advertise front matter by a file name such as my-very-interesting-post.md.yaml-fm. Here .yaml-fm indicates a multipart file type consisting of YAML-formatted front matter and then some non-YAML data. And similarly there would be .json-fm.

This way harp need not open each and every .md file to search for front matter. (Or is checking the file names already too slow?) Applications without extreme performance requirements can use the convenient front-matter approach without being forced into premature optimization. Only if performance really turns out to be an issue, one has to bite the bullet and move data from front-matter to a centralized data file.
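
The file-name convention could be checked without opening any file. In this sketch the `.yaml-fm` and `.json-fm` suffixes come from the comment above, while `frontMatterFormat` is a hypothetical helper name; nothing like it exists in Harp.

```javascript
// Decide from the name alone whether a file advertises front matter.
// Returns "yaml", "json", or null when no front matter is expected.
function frontMatterFormat(fileName) {
  const match = /\.(yaml|json)-fm$/.exec(fileName);
  return match ? match[1] : null;
}

console.log(frontMatterFormat("my-very-interesting-post.md.yaml-fm")); // yaml
console.log(frontMatterFormat("my-article.jade.json-fm"));             // json
console.log(frontMatterFormat("plain-post.md"));                       // null
```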

orenmizr commented Jun 4, 2016

long thread. so no frontmatter for harp? i wanted to use it for editing my tags for post within the md file. having it decoupled isn't comfortable. should we really consider performance issues for static generated systems... it's not the user's runtime.

@allanwhite

allanwhite commented Jul 5, 2016

Greetings. I know this thread is a bit old, but I just wanted to add another voice to the "we need file-level front-matter/metadata" discussion.

Harp looks wonderful, and lets me use Jade and Markdown seamlessly. But, the centralized _data.json is difficult for humans when it comes to managing content. "Wait, the title & front-matter are in this one big file?" I hear.

Perhaps if Jade allowed multi-line data objects that were more JSON-like, this wouldn't be a problem; pretty much all my pages are unique in my use case, and require unique data structures. An optional myfile.json for each myfile.jade would be amazingly useful.

Suggestions moving forward:

  • Consider having opt-in, per-file metadata in JSON or (preferred) YAML format. If you feel YAML is too bloated, then we could parse it with gulp-yaml first.
  • The whole pattern/antipattern thing... I would go as far as to say that the convenience for humans of managing content & data in a per-entry way outweighs raw performance. People can use the compilation feature and wait for their files to be generated. Warning people that this feature is slower might help with any who grumble about performance.
  • I wonder if a separate, async task could be looking for data objects (yml/json) and combining them into one somehow. Then, people can write all the little ones they want, and it gets combined into the global data object before rendering. :looks at node programmer:

These are the sort of things that non-programmers like me think about!

I'm bummed that I don't think we can use Harp for our projects - I can't sell a monolithic data file that's manually managed. Writers & designers really don't want to hand-edit JSON files.

Harp is an amazing project, I hope these suggestions spark new ideas and directions. Thank you for all you've done!
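
The third suggestion above (collecting many small per-page data files into one object before rendering) could look roughly like the Node.js sketch below. It is an illustration only: the `myfile.json` next to `myfile.jade` layout is the one proposed in this comment, not an existing Harp feature, and `collectPageData` is a made-up name.

```javascript
const fs = require("fs");
const path = require("path");

// Collect every "<name>.json" in a directory into one object keyed by page name,
// giving the same shape as a hand-written _data.json.
function collectPageData(dir) {
  const data = {};
  for (const file of fs.readdirSync(dir).sort()) {
    // Skip non-JSON files and underscore-prefixed files such as _data.json itself.
    if (path.extname(file) !== ".json" || file.startsWith("_")) continue;
    const name = path.basename(file, ".json");
    data[name] = JSON.parse(fs.readFileSync(path.join(dir, file), "utf8"));
  }
  return data;
}
```

The combined object could then be written out as `_data.json` in a build step, so writers edit only the small per-page files.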

@sintaxi

Owner

sintaxi commented Jul 5, 2016

@allanwhite are you aware that you can use _data.js instead of _data.json? It allows you to execute arbitrary code and return an object literal. So you can basically organize your data any way you want, or even talk to a web service to get the data.

module.exports = {}
@egeozcan

egeozcan commented Jul 6, 2016

@sintaxi can we return a promise of an object literal?

off-topic: wow, it's been nearly 3 years since I opened this issue^^ thanks for all the input, everyone!

@allanwhite

allanwhite commented Jul 6, 2016

@sintaxi I was not - that's very interesting! Thanks for reading and responding.

@jimjkelly

jimjkelly commented Jul 6, 2016

Hmm, is the use of _data.js in the docs? If so, I missed it - that's super useful! If not, could it get added?

@egeozcan

egeozcan commented Jul 6, 2016

@callumflack

callumflack commented Jul 20, 2016

Off topic: love @sintaxi's use of a Dr. Octagon lyric as placeholder text.

@atav1k

atav1k commented Apr 30, 2017

I know this thread is a bit old, but I can't seem to kick my Harp habit. It's not a deal breaker for me, as I'm only building a small 20-product store, but with a headless CMS like NetlifyCMS, front matter would be ideal, even if a script parses all front matter into the _data.json file & CSS hides it.

@Yajo

Contributor

Yajo commented Sep 27, 2018

Harp already supports CoffeeScript. Why not use CSON?

YAML is way better, but if you don't want to support it and humans tend to hate writing JSON, it's a good possible solution (to one of the problems).

Of course, Harp could support all 3 formats, as it does with templating engines.
