Make pages and posts instances of Jekyll::Document. #3169

Closed
parkr opened this Issue Nov 29, 2014 · 14 comments

Member

parkr commented Nov 29, 2014

Can't take credit for this idea. @benbalter said it. The idea is that posts are a sort of collection, and pages are as well if you kind of tilt your head and squint a little. It would make sense for them all to share a common interface. We abolish the Convertible module and make anything that was once convertible a Document, et voilà.

@parkr parkr added this to the 3.0 milestone Nov 29, 2014

alfredxing Nov 29, 2014

Member

This would also help make post/page/document behaviour more consistent. But then we'd have to manage the special case where pages are in the root directory...

Perhaps we can have:

  • Everything is what a Page is now (though we can rename the class)
  • Collections are simply a directory with Pages in them
  • We abolish the special _ prefix for collections; all subdirectories are collections.
  • Posts are a collection
    • No more dates in post filenames, instead dates go in front matter
    • No more special _posts directory, it'll just be posts
  • Then, everything is iterable via site.[collection]

It's probably a bit too much, but this is definitely in the general direction I'd like to see Jekyll take.

parkr Nov 30, 2014

Member

Whoa, whoa, we can't break every site ever written with Jekyll. This should be an internal structural change only. I can almost jive with your proposal except abolishing explicit collections. site.pages would be whatever isn't a collection (including posts), or a data or static file. It's more a catch-all for one-off pieces of content. 😄

alfredxing Nov 30, 2014

Member

Told you it was a bit far-fetched. Not sure how this is going to work out though. A Document must be a part of a Collection, and if the root directory is a Collection, we end up with just another directory structure...

I agree that posts are a collection, but I'm not so convinced about pages.

parkr Nov 30, 2014

Member

Hm, I generally agree that pages aren't collection documents. Each page is as separate from the other pages as each collection is from the other collections. They are their own entities. I think @benbalter was the one who proposed that everything be a collection, and I suppose I took it too literally. Generally:

  1. Posts should be an automatic collection
  2. If possible, _data should be a collection. This causes issues with data access (no more site.data.FILE_NAME), as site.collection is an Array, not a Hash. But if there is an agreeable solution, I'd love to explore it.
  3. I would like anything that we render to be put through the Renderer class, rather than having any knowledge whatsoever of how it is rendered and transformed. It just has content (raw, unchanged) and output (rendered & transformed).
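
One possible way around the Array-versus-Hash problem in point 2, as a minimal sketch only: a collection-like object that enumerates like an Array but still answers lookups by file name like a Hash. `KeyedCollection` and its methods are hypothetical names for illustration, not part of Jekyll.

```ruby
# Hypothetical sketch, not Jekyll's API: a container that iterates like an
# Array while still allowing lookup by file name like a Hash.
class KeyedCollection
  include Enumerable

  def initialize
    @entries = {} # insertion-ordered: name => parsed data
  end

  # Register an entry under its base file name (e.g. "authors").
  def add(name, data)
    @entries[name.to_s] = data
    self
  end

  # Array-style iteration over the values, in insertion order.
  def each(&block)
    @entries.each_value(&block)
  end

  # Hash-style access by name.
  def [](name)
    @entries[name.to_s]
  end
end

data = KeyedCollection.new
data.add("authors", { "parkr" => "Member" })
data.add("books", [{ "title" => "Example" }])

data.count       # number of entries, via Enumerable
data["authors"]  # the parsed contents registered under "authors"
```

Whether template access such as site.data.FILE_NAME could be routed through a `[]` method like this is an open assumption, not something the sketch proves.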

Thoughts?

benbalter Nov 30, 2014

Contributor

My original pitch (and what I still think is the best path forward) was to keep behavior identical, but internally, we'd dogfood our own abstractions. If I make a plugin that works with Posts, there's no reason I should have a parallel logic path for (other) collections.

Posts are very logically a specialized collection (with custom behavior for things like dates and categories).

Architecturally, I don't see why Pages can't be a collection. I understand that they're more logically distinct than e.g., Posts, but they're still used as a collection, in practice (how many times have you used for page in site.pages?). Yes, they don't have a logical relationship to each other internally like Posts do, but when you add Posts, and other custom collections to the mix, they suddenly share a lot in common on a macro level, especially for most major use cases.

@parkr what's your argument against keeping behavior the same, but simply making Pages an automatic collection? I'm just trying to minimize wheel reinvention here.

an Array, not a Hash. But if there is an agreeable solution, I'd love to explore it.

Maybe this isn't practical, but the idea was that internal collections would be automatically created user-collections, but with special logic baked in. In the case of data, they wouldn't appear as site.collections (maybe they would?), and would appear as site.data. If this makes sense, I was hoping we could make Data (or Page, or Post) simply extend the collection class, and add custom logic / override defaults where necessary.
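
A rough sketch of what "extend the collection class and override defaults" could look like. Every class and method name here (`BasicCollection`, `listed_in_collections?`, and so on) is made up for illustration and is not Jekyll's actual code.

```ruby
# Hypothetical sketch, not Jekyll's classes: built-in content types as
# subclasses of a generic collection that override a few defaults.
class BasicCollection
  attr_reader :label

  def initialize(label)
    @label = label
  end

  # Default: collections live in an underscore-prefixed directory.
  def directory
    "_#{label}"
  end

  # Default: collections are listed alongside user-defined ones.
  def listed_in_collections?
    true
  end
end

class PostsCollection < BasicCollection
  def initialize
    super("posts")
  end
end

class DataCollection < BasicCollection
  def initialize
    super("data")
  end

  # Data stays reachable on its own rather than appearing in the list.
  def listed_in_collections?
    false
  end
end

all = [PostsCollection.new, DataCollection.new, BasicCollection.new("books")]
visible = all.select(&:listed_in_collections?).map(&:label)
# visible is ["posts", "books"]
```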

mattr- Nov 30, 2014

Member

My original pitch (and what I still think is the best path forward) was to keep behavior identical, but internally, we'd dogfood our own abstractions. If I make a plugin that works with Posts, there's no reason I should have a parallel logic path for (other) collections.

I'm with @benbalter on this one. One of the things I've been doing with my copious amounts of free time lately is looking for places in Jekyll that would be served by some refactoring, and this is one of those places. I'll start to explore some of these ideas in the next few days.

@mattr- mattr- added refactor and removed fix labels Nov 30, 2014

alfredxing Nov 30, 2014

Member

@benbalter Thanks for explaining!

I think the best way to go about implementing this would be:

  1. Generalize the Document, Collection, and Renderer classes if needed to accommodate pages and posts
  2. Make Post and Page subclasses of Document
  3. Instead of initializing site.posts and site.pages as Array objects, initialize them as Collections
  4. Read in pages and posts into the initialized Collections and read in any other user-defined collections

Another question coming out of this would be: should static files (currently in site.static_files) be in a separate Collection (and make StaticFile a subclass of Document), or do we keep them in an array like we do now?
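
The four steps above could be sketched roughly as follows. The class names mirror the discussion, but the bodies and method signatures are invented for illustration and should not be read as Jekyll's real internals.

```ruby
# Hypothetical sketch of the four steps; bodies are invented.
class Document
  attr_reader :path, :collection

  def initialize(path, collection)
    @path = path
    @collection = collection
  end
end

# Step 2: Post and Page become subclasses of Document.
class Post < Document; end
class Page < Document; end

class Collection
  attr_reader :label, :docs

  def initialize(label, document_class = Document)
    @label = label
    @document_class = document_class
    @docs = []
  end

  # Step 4: read a list of paths into documents of the right class.
  def read(paths)
    paths.each { |path| @docs << @document_class.new(path, self) }
    self
  end
end

class Site
  attr_reader :posts, :pages, :collections

  def initialize
    # Step 3: Collections instead of bare Arrays.
    @posts = Collection.new("posts", Post)
    @pages = Collection.new("pages", Page)
    @collections = {}
  end

  def read(posts:, pages:, custom: {})
    @posts.read(posts)
    @pages.read(pages)
    custom.each do |label, paths|
      @collections[label] = Collection.new(label).read(paths)
    end
    self
  end
end

site = Site.new.read(
  posts: ["_posts/2014-11-29-hello.md"],
  pages: ["about.md"],
  custom: { "books" => ["_books/dune.md"] }
)
site.posts.docs.first.class # Post
```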

parkr Nov 30, 2014

Member

@parkr what's your argument against keeping behavior the same, but simply making Pages an automatic collection? I'm just trying to minimize wheel reinvention here.

@benbalter My argument is that they aren't in any way the same, so having a site.pages array/collection at all is counter to the semantics we should be following. If our code doesn't mimic semantics of the usage, then contributions to the codebase become more difficult to make. I want Page to be a subclass of the Document class, but I don't think we should be linking Document so closely to the Collection the way we are now. A Document is any set of content and front matter that can be converted and rendered. It shouldn't say anything about its relationship with any other Document. Rather, membership in a Collection defines the Document's relationship to the other constituent Documents in that Collection. The Collection defines the relationship – if the Document moves into another Collection or outside any Collection at all, then the original relationship is broken and nullified.

I don't want to reinvent the wheel, I just don't think Page objects should belong to a Collection. My perspective is that having no relationship at all is not a relationship, the way atheism is not a religion. The lack of something is not a special version of the something lacking.

Instead of initializing site.posts and site.pages as Array objects, initialize them as Collections

@alfredxing Let's make Collection include the Enumerable module. Just have to define #each and it'll all work. A Collection is essentially an Array with a couple special methods attached (and semantic meaning).

should static files (currently in site.static_files) be in a separate Collection (and make StaticFile a subclass of Document), or do we keep them in an array like we do now?

They aren't really a Collection either, per se. And they don't convert or render. So I think they're separate from Document and Collection altogether. But we should make sure Collections can include static files.
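
To make the Enumerable suggestion concrete, here is a minimal sketch. `EnumerableCollection` and the `Doc` struct are hypothetical stand-ins, not Jekyll's classes; the point is only that defining #each is enough to get map, select, and friends, and that a document can exist without belonging to any collection.

```ruby
# Hypothetical sketch: a collection that gets Array-like behaviour by
# including Enumerable and defining #each. Not Jekyll's implementation.
Doc = Struct.new(:title, :collection)

class EnumerableCollection
  include Enumerable

  attr_reader :label

  def initialize(label)
    @label = label
    @docs = []
  end

  # Membership is what ties a document to its siblings; the document
  # itself only records which collection (if any) it belongs to.
  def <<(doc)
    doc.collection = self
    @docs << doc
    self
  end

  # The only method Enumerable requires.
  def each(&block)
    return enum_for(:each) unless block
    @docs.each(&block)
    self
  end
end

books = EnumerableCollection.new("books")
books << Doc.new("Dune") << Doc.new("Emma")
standalone_page = Doc.new("About") # a document outside any collection

books.map(&:title)                                          # ["Dune", "Emma"]
books.select { |d| d.title.start_with?("E") }.map(&:title)  # ["Emma"]
standalone_page.collection                                  # nil
```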

afeld Jan 9, 2015

Contributor

Ha, just came to this same epiphany independently... which I guess makes me the Leibniz (the less well-known or handsome one) to your Newton 📈

Really, I think everything could be represented as collections (pages, posts, and data). Content is data! Everything is data!

data all the things

Making the APIs between those existing data types completely consistent is really appealing, and it would simplify the internals a lot too. Might be worth thinking through what that implementation (rewrite?) would look like, and layering on APIs (maybe via a plugin gem?) to provide backwards compatibility.

@alfredxing alfredxing referenced this issue Jan 18, 2015

Closed

3.0 RELEASE GAMEPLAN #3324

7 of 7 tasks complete

alfredxing Jan 19, 2015

Member

I'm experimenting with this (posts only right now), and got it to somewhat work. A big issue I came across is how we are going to handle all of the post-specific features (like categories, tags, dates, permalink styles, etc.)?

parkr Jan 19, 2015

Member

post-specific features (like categories, tags, dates, permalink styles, etc.)?

For categories & tags, we need an idea of group_by filters. Essentially that's what they do. The permalink styles could be applied with front matter defaults (or even just like them, i.e. collection name-based). As for dates, we'd need to bring over the MATCHER and valid? ideas to Document.
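
A small sketch of both ideas, under stated assumptions: `DatedDocument` is a hypothetical class, and its `MATCHER` is an illustrative pattern for "YYYY-MM-DD-slug.ext" file names rather than the constant Jekyll actually uses. The grouping at the end shows how a category index falls out of a plain group_by.

```ruby
require "date"

# Hypothetical sketch, not Jekyll's code: a date-bearing document plus
# group_by-style grouping for categories.
class DatedDocument
  # Illustrative pattern for "YYYY-MM-DD-slug.ext" file names.
  MATCHER = /\A(\d{4}-\d{2}-\d{2})-(.+)(\.[^.]+)\z/

  attr_reader :name, :categories

  def self.valid?(name)
    MATCHER.match?(name)
  end

  def initialize(name, categories = [])
    raise ArgumentError, "unexpected file name: #{name}" unless self.class.valid?(name)
    @name = name
    @categories = categories
  end

  def date
    Date.parse(name[MATCHER, 1])
  end

  def slug
    name[MATCHER, 2]
  end
end

docs = [
  DatedDocument.new("2014-11-29-documents.md", ["jekyll"]),
  DatedDocument.new("2015-01-19-posts.md", ["jekyll", "ruby"]),
]

# Build a category => documents index; a document with several
# categories appears under each of them.
by_category = docs
  .flat_map { |doc| doc.categories.map { |cat| [cat, doc] } }
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last) }

by_category["jekyll"].map(&:slug) # ["documents", "posts"]
DatedDocument.valid?("about.md")  # false
```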

fulldecent Feb 16, 2015

Contributor

I'm chiming in as a user IMHO to discuss our API.

The _ prefix is good, it clearly demarcates Jekyll-land and asset-land

For example, just looking at it, I know that /assets/bear.jpg will wind up at example.com/assets/bear.jpg. Consider a world without _ prefixes, and you are editing index.html and wanted to link to cars/redCar.html... you would need to first check _config.yml to see if cars/ is a "special Jekyll folder".

"Pages" and "Folders" are preferred terminology, with "Posts" being a specific type of "Page"

This is intuitive and keeps the conventions web authors have used for decades. Here's an example of something that makes sense:

{{ site.books | where:"ISBN","4109147509740" | map:"url" }}

This searches /path/to/site/_books/* and the Front Matter of each page there. Which is no more special than searching JSON files in the _data/ directory:

{{ site.data.bookList | where:"ISBN","4109147509740" | map:"author" }}

The only difference is that url is magically calculated for site.books because those are pages, but not for the data file bookList.json because that's not a page.
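
For readers more at home in Ruby than Liquid, the two lookups above amount to a filter followed by a projection. This is a minimal sketch over plain hashes: the sample records are invented, the helper names `where` and `pluck` are hypothetical, and it does not claim to be how Liquid implements its filters.

```ruby
# Invented sample records standing in for the front matter of the pages
# in _books/ (or for the contents of a data file).
books = [
  { "ISBN" => "4109147509740", "url" => "/books/first.html", "author" => "A. Writer" },
  { "ISBN" => "0000000000000", "url" => "/books/second.html", "author" => "B. Writer" },
]

# Rough equivalent of: where:"ISBN","4109147509740"
def where(items, key, value)
  items.select { |item| item[key] == value }
end

# Rough equivalent of: map:"url"
def pluck(items, key)
  items.map { |item| item[key] }
end

urls = pluck(where(books, "ISBN", "4109147509740"), "url")
authors = pluck(where(books, "ISBN", "4109147509740"), "author")
```

In this sketch the records already carry a "url" key; in the scenario described above that value would be computed for pages and simply absent for data-file entries.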

Fuck backwards compatibility

That's why we use Semantic Versioning. After 10 years, PHP still makes an average of one backwards-compatibility-breaking release per year. By year 2023 we should be in a better position than PHP, and that starts by making good, hard decisions now.

fj Apr 14, 2015

I wrote the short SpicyJekyll plugin (blog post, repo) to address some of the limitations that custom collections have in Jekyll and to make them more fully-featured.

I think it could serve as a useful template for unifying collection behavior in Jekyll 3 and future versions as per the subject of this issue, since it's fairly opinionated about reasonable minimums for a default collection:

  • a total ordering over the elements of a collection
  • a name for the collection
  • a filename format for members of that collection

and in exchange you get:

  • next and previous links for each element of the collection
  • fail-fast checking for required properties rather than silent failures
  • stable, filename-based permalinks
  • logging of collection actions during build time
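
As an illustration of the first two benefits only, here is a hedged sketch. It does not reproduce SpicyJekyll's actual API: `OrderedCollection`, the "order" key, and the required-key list are all assumptions made up for this example.

```ruby
# Hypothetical sketch of an ordered collection with next/previous links
# and fail-fast validation. Not SpicyJekyll's API.
class OrderedCollection
  REQUIRED_KEYS = %w[title order].freeze

  def initialize(name, items)
    @name = name
    items.each { |item| check_required!(item) }
    # Ordering: sort on an explicit "order" key (assumed unique per item).
    @items = items.sort_by { |item| item["order"] }
  end

  def next_item(item)
    index = @items.index(item)
    @items[index + 1] if index
  end

  def previous_item(item)
    index = @items.index(item)
    @items[index - 1] if index && index > 0
  end

  private

  # Fail fast: raise at build time instead of silently skipping an item.
  def check_required!(item)
    missing = REQUIRED_KEYS - item.keys
    return if missing.empty?
    raise ArgumentError, "#{@name}: missing #{missing.join(', ')} in #{item.inspect}"
  end
end

first  = { "title" => "Intro", "order" => 1 }
second = { "title" => "Setup", "order" => 2 }
chapters = OrderedCollection.new("chapters", [second, first])

chapters.next_item(first)      # the "Setup" item
chapters.previous_item(first)  # nil, since "Intro" comes first
```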

parkr Oct 26, 2015

Member

Closed by #4055.

@parkr parkr closed this Oct 26, 2015

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017
