Use `Jekyll::Post`s for both LSI indexing and lookup. #3629

Merged
merged 1 commit into from Apr 14, 2015

Conversation

Projects
None yet
4 participants
@justinweiss
Contributor

justinweiss commented Mar 30, 2015

When looking for related posts, Jekyll was indexing Jekyll::Post
objects, but finding related posts based on Jekyll::Post#content. This
caused two problems:

  1. Ruby 2.2 will warn on == if <=> throws an exception (and future Ruby
    versions will surface that exception). Because Strings can't be
    compared with Jekyll::Posts, this warning was appearing all the time
    while searching for related posts.
  2. LSI won't return a post itself when searching for related posts. But
    LSI could never tell that we were searching on a post, since Jekyll
    passed post content, not a post object. With this fix, we can remove the
    - [post] from Jekyll::RelatedPosts#find_related.

This is a more accurate fix for #3484.

Use `Jekyll::Post`s for both LSI indexing and lookup.
When looking for related posts, Jekyll was indexing `Jekyll::Post`
objects, but finding related posts based on `Jekyll::Post#content`. This
caused two problems:

1. Ruby 2.2 will warn on == if <=> throws an exception (and future Ruby
versions will surface that exception). Because `String`s can't be
compared with `Jekyll::Post`s, this warning was appearing all the time
while searching for related posts.

2. LSI won't return a post itself when searching for related posts. But
LSI could never tell that we were searching on a post, since Jekyll
passed post content, not a post object. With this fix, we can remove the
`- [post]` from `Jekyll::RelatedPosts#find_related`.

This is a more accurate fix for #3484.
@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks Mar 30, 2015

Contributor

👍

Contributor

envygeeks commented Mar 30, 2015

👍

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Mar 30, 2015

Member

Not sure if lsi should be this coupled to the model, considering that Jekyll::Post will be a goner by the launch of 3.0.

@mattr-, thoughts?

Member

parkr commented Mar 30, 2015

Not sure if lsi should be this coupled to the model, considering that Jekyll::Post will be a goner by the launch of 3.0.

@mattr-, thoughts?

@Ch4s3 Ch4s3 referenced this pull request in jekyll/classifier-reborn Apr 11, 2015

Closed

Warning on lsi.rb line 237 #34

@justinweiss

This comment has been minimized.

Show comment
Hide comment
@justinweiss

justinweiss Apr 14, 2015

Contributor

Any other thoughts on this?

Contributor

justinweiss commented Apr 14, 2015

Any other thoughts on this?

@@ -42,7 +42,7 @@ def build_index
end
def lsi_related_posts
- self.class.lsi.find_related(post.content, 11) - [post]
+ self.class.lsi.find_related(post, 11)

This comment has been minimized.

@parkr

parkr Apr 14, 2015

Member

LSI should really be based on the content of the post, no? Doesn't this change the functionality of the indexer?

@parkr

parkr Apr 14, 2015

Member

LSI should really be based on the content of the post, no? Doesn't this change the functionality of the indexer?

This comment has been minimized.

@justinweiss

justinweiss Apr 14, 2015

Contributor

It shouldn't -- both add_item and lsi_related_posts will call #to_s on what you pass into it. I'd assume you could also leave this function the same, and change build_index to call add_item(x.content), but this way seemed clearer to me.

Regardless, I feel like build_index and lsi_related_posts should be using the same kind of object -- either both Posts or both Strings. That way, you won't have to - [post] at the end, because classifier-reborn is already doing that on its side.

@justinweiss

justinweiss Apr 14, 2015

Contributor

It shouldn't -- both add_item and lsi_related_posts will call #to_s on what you pass into it. I'd assume you could also leave this function the same, and change build_index to call add_item(x.content), but this way seemed clearer to me.

Regardless, I feel like build_index and lsi_related_posts should be using the same kind of object -- either both Posts or both Strings. That way, you won't have to - [post] at the end, because classifier-reborn is already doing that on its side.

This comment has been minimized.

@parkr

parkr Apr 14, 2015

Member

Oh, ok great! 👍

@parkr

parkr Apr 14, 2015

Member

Oh, ok great! 👍

parkr added a commit that referenced this pull request Apr 14, 2015

@parkr parkr merged commit 0072865 into jekyll:master Apr 14, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

parkr added a commit that referenced this pull request Apr 14, 2015

@justinweiss

This comment has been minimized.

Show comment
Hide comment
@justinweiss

justinweiss Apr 14, 2015

Contributor

Thanks!

Contributor

justinweiss commented Apr 14, 2015

Thanks!

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Apr 14, 2015

Member

Thank YOU! ❤️ Sorry it took so long.

Member

parkr commented Apr 14, 2015

Thank YOU! ❤️ Sorry it took so long.

@justinweiss justinweiss deleted the justinweiss:lsi_search_on_posts branch Apr 14, 2015

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.