gitbook searching often misses references #387

Closed
aronatkins opened this issue Apr 10, 2017 · 9 comments

@aronatkins

I have noticed that the gitbook search often fails to find obvious matches in the text.

Here is a sample book:

---
title: "bookdown search"
output: bookdown::gitbook
---

# Introduction

This is a document containing Red Hat Enterprise Linux/CentOS references.

# Important Section

Here is a section that is quite important. It also mentions Red Hat Enterprise Linux/CentOS.

Searching for "cent" or "Cent" finds nothing. Searching for "Linux/Cent" does find a match.

I rendered with:

bookdown::render_book("index.Rmd")

Here is the search_index.json:

[
["index.html", "bookdown search 1 Introduction", " bookdown search 1 Introduction This is a document containing Red Hat Enterprise Linux/CentOS references. "],
["important-section.html", "2 Important Section", " 2 Important Section Here is a section that is quite important. It also mentions Red Hat Enterprise Linux/CentOS. "]
]

Here is my session information:

devtools::session_info('bookdown')
rmarkdown::pandoc_version()
system('pdflatex --version')
> devtools::session_info('bookdown')
Session info ----------------------------------------------------------------------------
 setting  value                       
 version  R version 3.2.4 (2016-03-10)
 system   x86_64, darwin13.4.0        
 ui       RStudio (1.1.138)           
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/New_York            
 date     2017-04-10                  

Packages --------------------------------------------------------------------------------
 package   * version    date       source                            
 backports   1.0.5      2017-01-18 cran (@1.0.5)                     
 base64enc   0.1-3      2015-07-28 CRAN (R 3.2.0)                    
 bitops      1.0-6      2013-08-17 CRAN (R 3.2.0)                    
 bookdown    0.3.17     2017-04-10 Github (rstudio/bookdown@327e886) 
 caTools     1.17.1     2014-09-10 CRAN (R 3.2.0)                    
 digest      0.6.12     2017-01-27 cran (@0.6.12)                    
 evaluate    0.10       2016-10-11 CRAN (R 3.2.5)                    
 highr       0.6        2016-05-09 CRAN (R 3.2.5)                    
 htmltools   0.3.5      2016-03-21 CRAN (R 3.2.4)                    
 jsonlite    1.4        2017-04-08 cran (@1.4)                       
 knitr       1.15.19    2017-04-10 Github (yihui/knitr@6f166e2)      
 magrittr    1.5        2014-11-22 CRAN (R 3.2.0)                    
 markdown    0.7.7      2015-04-22 CRAN (R 3.2.0)                    
 mime        0.5        2016-07-07 CRAN (R 3.2.5)                    
 Rcpp        0.12.9     2017-01-14 cran (@0.12.9)                    
 rmarkdown   1.4.0.9001 2017-04-10 Github (rstudio/rmarkdown@b7434dc)
 rprojroot   1.2        2017-01-16 cran (@1.2)                       
 stringi     1.1.5      2017-04-07 cran (@1.1.5)                     
 stringr     1.2.0      2017-02-18 cran (@1.2.0)                     
 yaml        2.1.14     2016-11-12 cran (@2.1.14)                    
> rmarkdown::pandoc_version()
[1] ‘1.17.2’
> 
> system('pdflatex --version')
pdfTeX 3.14159265-2.6-1.40.16 (TeX Live 2015)
kpathsea version 6.2.1
Copyright 2015 Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
There is NO warranty.  Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
Compiled with libpng 1.6.17; using libpng 1.6.17
Compiled with zlib 1.2.8; using zlib 1.2.8
Compiled with xpdf version 3.04
@aronatkins
Author

Even simpler: If my book has the sentence:

I walked by the foo/bar store.

I can search for "foo/bar" but not "bar".

@yihui
Member

yihui commented Apr 11, 2017

The (client-side) search engine is based on lunr.js (https://lunrjs.com). I didn't tweak any options in this library, and it seems there is a problem with tokenizing (perhaps the slash is not recognized as a word separator). I'll take a look at the lunr docs.
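
To illustrate the suspected cause, here is a minimal Python sketch of prefix search over tokens. It does not reproduce lunr.js; the two separator patterns and the helper names `tokenize` and `prefix_search` are assumptions made up for this example. It only shows that when the slash is not a separator, "Linux/CentOS" stays a single token, so a query for "cent" has nothing to match.

```python
import re

# Assumed separator patterns, for illustration only (not copied from lunr.js).
WHITESPACE_OR_HYPHEN = re.compile(r"[\s\-]+")
WITH_SLASH = re.compile(r"[\s\-/]+")


def tokenize(text, separator):
    """Lower-case the text, split on the separator, strip edge punctuation."""
    pieces = separator.split(text.lower())
    tokens = (re.sub(r"^\W+|\W+$", "", piece) for piece in pieces)
    return [token for token in tokens if token]


def prefix_search(query, text, separator):
    """Return the tokens of `text` that start with the lower-cased query."""
    q = query.lower()
    return [token for token in tokenize(text, separator) if token.startswith(q)]


text = "It also mentions Red Hat Enterprise Linux/CentOS."

# Slash is not a separator: "linux/centos" is one token.
print(prefix_search("cent", text, WHITESPACE_OR_HYPHEN))        # []
print(prefix_search("Linux/Cent", text, WHITESPACE_OR_HYPHEN))  # ['linux/centos']

# Slash is a separator: "centos" becomes its own token.
print(prefix_search("cent", text, WITH_SLASH))                  # ['centos']
```

The same sketch covers the "foo/bar" case reported above: "bar" only becomes findable once the slash splits the token.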

@aronatkins
Author

@yihui yihui added this to the v0.4 milestone Apr 11, 2017
@yihui yihui closed this as completed in 966e6cf Apr 11, 2017
@yihui
Member

yihui commented Apr 11, 2017

Thanks! Should be fixed now.

@nicholaelaw

Hi @yihui, it seems that the search function does not support Chinese at all. Searching for Chinese characters returns nothing.

Can you confirm this?

@yihui
Member

yihui commented May 13, 2017

You are right. It does not support CJK languages.

@olivernn

@nicholaelaw @yihui there are plugins for many languages, but not Chinese. A great way to contribute would be to help with adding a Chinese language extension.
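
As a rough illustration of why separator-based tokenizing struggles with Chinese, the Python sketch below compares whitespace splitting with overlapping character bigrams. Bigram indexing is a general technique named here as an assumption; it is not something lunr.js or bookdown is confirmed to provide, and the function names and sample sentence are hypothetical.

```python
import re


def whitespace_tokens(text):
    """Split on whitespace only, as a separator-based tokenizer would."""
    return text.split()


def char_bigrams(text):
    """Overlapping two-character tokens from each run of CJK ideographs."""
    tokens = []
    for run in re.findall(r"[\u4e00-\u9fff]+", text):
        if len(run) == 1:
            tokens.append(run)
        tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


sentence = "这是一个重要的章节"  # "This is an important section"

# With no spaces in the sentence, the whole sentence is a single token,
# so the word "重要" ("important") is not a token of its own.
print(whitespace_tokens(sentence))
print("重要" in whitespace_tokens(sentence))  # False

# With bigrams, "重要" appears as an indexable token.
print("重要" in char_bigrams(sentence))       # True
```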

@homerhanumat
Contributor

homerhanumat commented Sep 8, 2017

Update: Ignore this post, the issue is with the gitbook search, not with bookdown.

It seems this issue is cropping up again.

Sample book (the file is index.Rmd):

--- 
title: "Testing"
author: "me"
output: bookdown::gitbook
---

# Testing

Let's take as much time as we have.

abcdefghijklmnopqrstuvwxyz

Contents of _bookdown.yml:

book_filename: "testing"
output_dir: docs

Now run bookdown::render_book('index.Rmd').

Searching docs/testing.html for "m" returns matches; searching for "a" returns none. (Only a few letters of the alphabet produce matches.)

Contents of docs/search_index.json:

[
["index.html", "Testing 1 Testing", " Testing me 1 Testing Let’s take as much time as we have. abcdefghijklmnopqrstuvwxyz "]
]

Session info:

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] compiler_3.4.1  backports_1.1.0 magrittr_1.5    bookdown_0.5    rprojroot_1.2  
 [6] htmltools_0.3.6 tools_3.4.1     rstudioapi_0.6  yaml_2.1.14     Rcpp_0.12.12   
[11] stringi_1.1.5   rmarkdown_1.6   knitr_1.17      stringr_1.2.0   digest_0.6.12  
[16] evaluate_0.10.1

Bookdown version is 0.5.

@github-actions

github-actions bot commented Nov 6, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2020