Adding Algolia search #3

Merged
tomjoht merged 1 commit into tomjoht:master from pixelastic:search on Mar 28, 2018

@pixelastic
Contributor

pixelastic commented Feb 15, 2018

Hello,

I'm the author of the jekyll-algolia plugin. I've integrated the plugin in a fork of your website and would like to know what you think of it. It replaces your current search bar with one that displays results instantly as you type your keywords (and handles typos). It also re-uses the styling that already exists on your website.

You can try it live on https://peaceful-stonebraker-517e10.netlify.com/

How it works:

I saw that you already gave the previous plugin (algoliasearch-jekyll) a try, so I assume you already are familiar with how Algolia works :)

In a nutshell, the plugin adds the `jekyll algolia` command. When executed, it extracts all relevant information from your website and pushes it to Algolia. Every time a user types a character in the search bar, the page queries the Algolia API, which returns the most relevant results. Those results are then transformed into HTML elements and displayed in the page.
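
To make the flow above concrete, here is a minimal Python sketch. It is not the plugin's code and it does not call the Algolia API: `ParagraphExtractor`, `build_records` and `search` are hypothetical helpers, the record fields are an assumption, and the substring match is only a stand-in for Algolia's relevance ranking and typo tolerance. It indexes one record per `<p>` element, which is a simplification of the plugin's one-record-per-matching-node behavior.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of every <p> element in a rendered page."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while the parser is outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def build_records(url, title, html):
    """Hypothetical record shape: one record per paragraph of the page."""
    parser = ParagraphExtractor()
    parser.feed(html)
    return [{"url": url, "title": title, "content": text} for text in parser.paragraphs]


def search(records, query):
    """Naive stand-in for the remote search: case-insensitive substring match."""
    needle = query.lower()
    return [record for record in records if needle in record["content"].lower()]


page = "<h1>Docs</h1><p>Map the user touchpoints.</p><p>Write the reference.</p>"
records = build_records("/docs/", "Docs", page)

# Simulate a user typing "touch" one character at a time.
for typed in ("t", "to", "tou", "touc", "touch"):
    hits = search(records, typed)
    print(typed, [hit["content"] for hit in hits])
```

Each simulated keystroke re-runs the query, which mirrors the loop the search bar performs against the remote index.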

Why I am doing this:

I'm submitting this PR hoping that you will integrate it into your website.

I've been using your website as a test environment for the plugin, as this is one of the largest Jekyll websites I've found. I also particularly enjoyed working on it, as your writing about technical documentation is very interesting. By having the plugin visible on your website, I'm also hoping that more developers / tech writers will know that it's possible to have fast and relevant search, even on static websites.

How to run it automatically:

Note that for this demo, I'm using Netlify to host a preview of the website and run the `jekyll algolia` command automatically each time I push to GitHub. You're currently using GitHub Pages, which doesn't allow running arbitrary third-party plugins. There are ways to make it work on GitHub Pages as well, using either Travis or some webhooks. If you think this PR is interesting, we can work out the details of making it run automatically on each push.

Last notes:

I had to change one of your posts, as the inclusion of {{ content }} was causing it to re-render the page recursively, which made the resulting records too large for Algolia to accept.

I'm also currently pushing all data to my own Algolia account. We'll have to switch this to your own account if you accept the PR. It currently uses ~30k records (which is more than the 10k records included in our free plan), but I'd be happy to give you access to our Open-Source plan that can go as high as 100k records.

Anyway, enough talking, let me know what you think of this fork and any feedback is appreciated :)

@tomjoht

Owner

tomjoht commented Feb 15, 2018

Hi Tim, I appreciate the sample integration and demo. It looks sharp and has a quick response. However, excluding {{ content }} is problematic because without it, the search depends on keywords appearing in the frontmatter only. For example, if you search for "touchpoints," which is a keyword that appears in the body and not the frontmatter, you'll get different results between Algolia and Google.

Any thoughts on how to index {{ content }}? Isn't search going to deliver better results if all the content is actually indexed?

Also, I wasn't aware that my site could be indexed for free in Algolia. When I checked this previously, I found that I had too many records.

@pixelastic

Contributor

pixelastic commented Feb 15, 2018

Oh wait, I think there's some confusion about what I meant about me modifying {{ content }} :)

I was talking about this post, which looks "broken" on your current blog (it recursively includes the content of the page inside itself). On my fork, I wrapped the example calls to {{ content }} in {% raw %} and {% endraw %} tags to avoid the issue.

[Screenshots: the post as rendered on the current blog, and on the fork]
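
The effect of the {% raw %} wrapper can be illustrated with a toy renderer in Python. This is not Liquid and not how Jekyll is implemented: `render` is a hypothetical function that only knows about {{ content }} and raw blocks, just enough to show why an unescaped example pulls the page body into the text while an escaped one is printed literally.

```python
import re

# Matches a raw block and captures the text between the two tags.
RAW_BLOCK = re.compile(r"\{% raw %\}(.*?)\{% endraw %\}", re.DOTALL)
CONTENT_TAG = "{{ content }}"


def render(template, content):
    """Toy renderer: expand {{ content }} everywhere except inside raw blocks."""
    out = []
    pos = 0
    for match in RAW_BLOCK.finditer(template):
        # Text before the raw block is rendered normally.
        out.append(template[pos:match.start()].replace(CONTENT_TAG, content))
        # Text inside the raw block is emitted verbatim, tags removed.
        out.append(match.group(1))
        pos = match.end()
    out.append(template[pos:].replace(CONTENT_TAG, content))
    return "".join(out)


page_body = "<p>Full page body</p>"
unescaped = "Use {{ content }} in your layout."
escaped = "Use {% raw %}{{ content }}{% endraw %} in your layout."

print(render(unescaped, page_body))  # the page body is injected into the sentence
print(render(escaped, page_body))    # the tag is shown as written
```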

> Any thoughts on how to index {{ content }}? Isn't search going to deliver better results if all the content is actually indexed?

So to be clear, the whole content of the page is actually indexed, not just what is in the front-matter :). But you're right that your "touchpoint" query gives weird results. I think there is some issue created by the fact you have a weight attribute in your front-matter and I do use a weight attribute as well. I'm going to have a look at that :)
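
One way such a clash can happen, sketched in Python. This is a guess at the general mechanism, not the plugin's actual code: the function names, the `_ranking` key and the record layout are all hypothetical.

```python
def build_record_flat(front_matter, plugin_fields):
    """Merge everything into one flat dict; later keys overwrite earlier ones."""
    record = dict(plugin_fields)
    record.update(front_matter)  # a user-defined "weight" replaces the plugin's
    return record


def build_record_namespaced(front_matter, plugin_fields):
    """Keep plugin data under a reserved key so user keys cannot collide with it."""
    record = dict(front_matter)
    record["_ranking"] = dict(plugin_fields)  # hypothetical reserved key
    return record


# Set by the site author in the page front matter.
front_matter = {"title": "Touchpoints", "weight": 12.0}
# Hypothetical ranking data computed by an indexing tool.
plugin_fields = {"weight": {"heading": 90, "position": 3}}

flat = build_record_flat(front_matter, plugin_fields)
safe = build_record_namespaced(front_matter, plugin_fields)

print(flat["weight"])                              # ranking data has been lost
print(safe["weight"], safe["_ranking"]["weight"])  # both values survive
```

In the flat version the ranking data disappears without any error, which would be consistent with results that look oddly ordered rather than obviously broken.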

> Also, I wasn't aware that my site could be indexed for free in Algolia. When I checked this previously, I found that I had too many records.

If we follow the pricing by the book, you have too many records to fit in the free plan. But we also have an Open-Source plan that we can give to people promoting the sharing of knowledge, and I would be totally OK giving you access to this one.

@tomjoht

Owner

tomjoht commented Feb 15, 2018

Ahh, so you are indexing content. In that case, I'm much more interested and will want to take a closer look. Also, knowing that I could qualify under the open source plan is awesome.

I like the power and control that Algolia provides. It's something I'd like to become much more knowledgeable about.

Re the issue you described earlier, I do have a weight frontmatter tag. It's arbitrary, though, and I could potentially change it.

Also note that I have two other microsites:

These other sites are separate Jekyll projects, each in their own repos. However, a search on one site will surface content from the others, because I just kick off a site search for the root domain (idratherbewriting.com). I'm not sure what you think the best approach there is.

@pixelastic

Contributor

pixelastic commented Feb 15, 2018

Great! Happy to see this is something you would be interested in. Also, if you have any specific questions about how Algolia works, feel free to ask :)

For the weight key, I think it's more my responsibility, as the plugin's creator, to make sure I'm not messing with users' arbitrary front-matter options. I'll see how I can fix this problem directly on my side, but ideally you should not have to rename any keys on your side. I created an issue on the plugin and will handle it.

As for the two other websites, my suggestion would be to have 3 independent searches. As they are 3 different projects, they can each have their own jekyll-algolia configuration and push to 3 different indices (using the same Algolia account). A search on "Documenting APIs" will only surface results from that website.

If you want to have a "meta" search that allows searching all 3 websites at once, it's doable, but I don't think that's the easiest path for your users.

In a second phase, I would also like to suggest adding DocSearch to your documentation theme. But let's not get too carried away ;)

@pixelastic

Contributor

pixelastic commented Mar 20, 2018

Hello again,

I updated the plugin to v1.2, which should fix all the issues we mentioned in this thread. The weight you have in your pages does not interfere with the plugin anymore, which makes the touchpoint query we discussed much more relevant. You can check the updated demo here.

I also added the plugin to Documenting APIs (demo, repo) and Simplifying Complexity (demo, repo) so you can see what it would look like.

How would you like to move from here?

If you merge the PR, you will need to replace the application id with your own. You will need to create an Algolia account for that if you don't already have one. Once you have it, you can send me an email (tim at algolia dot com) with your application ID and I'll upgrade it to the Open-Source version.

Let me know if you have any questions, I'd be happy to answer them :)

@pixelastic

Contributor

pixelastic commented Mar 28, 2018

I updated the PR to have all the changes in one commit only. You can check it live on https://idratherbewriting-search.netlify.com/.

I'm still using my application ID and search API key so that the live demo works correctly, but you should replace them with your own credentials when you merge it. I've also upgraded your account for I'd Rather Be Writing to our Open-Source plan. You can now host up to 100k records (your website currently uses ~30k).

I've also submitted PRs to your other two repositories, but those two already fit in the Community (free) plan as they only need ~2k records each (and the plan offers up to 10k).

Happy to help on any more questions you might have. I'm excited to see it live on your website soon :)

@tomjoht tomjoht merged commit d0fa1b0 into tomjoht:master Mar 28, 2018

@pixelastic pixelastic deleted the pixelastic:search branch Mar 28, 2018

@tomjoht

Owner

tomjoht commented Mar 28, 2018

Thanks for making these updates! I really appreciate it. I merged the 3 PRs, updated the app ID and search key in the config files, and also ran the index in each of the projects. I looked at the searches on each site, and they seem to be working as designed. If you want to confirm that everything looks good, feel free. I still need to explore all the code changes to understand the implementation better in the theme, and I'll do that within a couple of weeks or so, but so far it looks great. Thanks again.

@tomjoht

Owner

tomjoht commented Mar 28, 2018

Tim, when I search for "Simplified Technical English", I don't get any results. I'm looking for this post: http://idratherbewriting.com/2017/01/25/hyperste-simplified-technical-english-asd-ste100/

Do you know why this post isn't indexed in the Algolia index?

@tomjoht

Owner

tomjoht commented Mar 28, 2018

I think the issue is that the plugin isn't indexing the content correctly. I get errors warning me that some entries are too large. For example:

The jekyll-algolia plugin could not push one of your records as it exceeds the
10 Kb size limit.

The plugin will create one record for each element matching your
`nodes_to_index` value (currently set to "p,.summary, li").
Each record should not weight more than 10 Kb.
One of your records weights 37.02 Kb and has been rejected.

objectID: 26d1ffabbabdf93b16076f5127d7100e
title:    Twenty Usability Tips for Your Blog — Condensed from Dozens of
Bloggers' Experiences
url:      /2007/04/09/twenty-usability-tips-for-your-blog-condensed-from-dozens-of-bloggers-experiences/

Most probable keys causing the issue:
   html (17.84 Kb), content (17.82 Kb), summary (0.24 Kb)

Complete log of the record has been extracted to:
   /Users/tomjoht/projects/idratherbewriting/jekyll-algolia-record-too-big-26d1ffabbabdf93b16076f5127d7100e.log

This issue can be caused by malformed HTML preventing the parser to correctly
grab the content of the nodes. Double check that the page actually renders
correctly with a regular `jekyll build`.

If you're having trouble solving this issue, feel free to file a bug on GitHub,
ideally with a link to a repository where we can reproduce the issue.
  https://github.com/algolia/jekyll-algolia/issues

@pixelastic

Contributor

pixelastic commented Mar 29, 2018

Going to check that right now :)

@pixelastic

Contributor

pixelastic commented Mar 29, 2018

I understand what is going on. It seems that a few of your posts generate records that are too big. I was doing my tests on a privileged account with higher quotas, so I didn't see this issue until now.
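
For readers who want to see what "too big" means in practice, here is a minimal Python sketch of a size check similar in spirit to the one in the error message above. It is not the plugin's implementation: the helper names are hypothetical, the limit is assumed to be 10,000 bytes, and measuring the UTF-8 JSON serialization is an assumption about how record size is counted.

```python
import json

SIZE_LIMIT_BYTES = 10_000  # assumption: "10 Kb" taken as 10,000 bytes


def serialized_size(value):
    """Size in bytes of a value once serialized as UTF-8 JSON."""
    return len(json.dumps(value, ensure_ascii=False).encode("utf-8"))


def heaviest_keys(record, top=3):
    """Return the `top` (key, size) pairs with the largest values, biggest first."""
    sizes = {key: serialized_size(value) for key, value in record.items()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:top]


def split_oversized(records, limit=SIZE_LIMIT_BYTES):
    """Partition records into those that fit the limit and those that do not."""
    accepted, rejected = [], []
    for record in records:
        target = accepted if serialized_size(record) <= limit else rejected
        target.append(record)
    return accepted, rejected


# Made-up records: one small, one inflated by large "html" and "content" values.
records = [
    {"objectID": "a", "title": "Short post", "content": "A short paragraph."},
    {"objectID": "b", "title": "Long post", "html": "x" * 18_000, "content": "y" * 17_000},
]

accepted, rejected = split_oversized(records)
for record in rejected:
    print(record["objectID"], serialized_size(record), heaviest_keys(record, top=2))
```

Listing the heaviest keys is what makes such a report actionable: in the log above, `html` and `content` account for almost all of the rejected record.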

The good news is that the "records too big" issue is actually something I can fix from inside the plugin, without requiring you (or others) to upgrade your account. I have a working version on my machine, and I now need to test it properly before releasing a new version of the gem. I might not be able to do it before the weekend, but I'll finish it early next week.

In the meantime, you can revert to using my account for the search, it still contains everything.

@tomjoht

Owner

tomjoht commented Mar 29, 2018

Sweet, thanks for finding a solution. I'll wait for your update to the plugin. In the meantime, I did switch the app ID and search key back to yours so that the search results would show more results.

@tomjoht

Owner

tomjoht commented Mar 31, 2018

I noticed an issue with the UI integration. When the site loads, the "Search Results" title and "Search by Algolia" image first appear for a split second before being hidden. As a result, the site jumps a bit.

Also, if I open the JS Console, I see an error related to the join code. I made a short little video (my voice is hardly audible) here: https://www.screencast.com/t/hQhX0mBN508

Would it be possible for you to address these two issues? I explored the site jumping issue a bit but couldn't figure it out.

@tomjoht

Owner

tomjoht commented Apr 4, 2018

Any news on updates? Just thought I'd check in.

@pixelastic

Contributor

pixelastic commented Apr 4, 2018

I'm about to release a new version of the plugin that will fix these issues. I also discovered a performance bottleneck (pretty visible on your website), which I fixed as well. I'm double-checking on your website, and then I'll release the new version.

@pixelastic

Contributor

pixelastic commented Apr 4, 2018

I released the new version and also submitted PRs to your other repositories to fix the flickering and JavaScript issues. Let me know how it works this time :)
