search does not find results for distopia (and others...) #202

Open
orbeckst opened this issue Sep 27, 2021 · 12 comments
Labels
bug search site search, SEO

Comments

@orbeckst
Member

orbeckst commented Sep 27, 2021

I searched for "distopia" and "CalcBondsOrtho" and got no hits, even though its sitemap was added in #201 (issue #200).

This is probably similar to #199, but it is not yet clear why the Algolia search does not seem to index these parts of the site. Perhaps the config options need to be adjusted.

EDIT: While investigating the issue, it became clear that other docs are also not being indexed, or that the index is not being updated. I am making this issue a catch-all.

@orbeckst orbeckst added bug search site search, SEO labels Sep 27, 2021
@orbeckst orbeckst mentioned this issue Sep 27, 2021
2 tasks
@orbeckst
Member Author

I fixed a missing tag in eb82c82 but I doubt that this will fix everything... let's give it 24h.

@hmacdope
Member

Thanks @orbeckst! I am not much help here ...

@orbeckst
Member Author

orbeckst commented Oct 6, 2021

I raised algolia/docsearch-configs#4699, perhaps we can get some help from the algolia folks.

I also created PR algolia/docsearch-configs#4700 to include code samples (pre tag) and definition items in definition lists (dt tags) in the index.
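
As a rough illustration of that kind of change (not the contents of the actual PR), the sketch below extends the `text` selector of a legacy DocSearch-style config so that `pre` and `dt` elements are also picked up. The starting selector strings and the `index_name` value are assumptions; the real mdanalysis.json may look different.

```python
import json

# Hypothetical starting config; the real mdanalysis.json may differ.
config = {
    "index_name": "mdanalysis",
    "selectors": {
        "lvl0": "h1",
        "lvl1": "h2",
        "text": "p, li",
    },
}


def add_text_selectors(cfg, extra):
    """Return a copy of cfg whose 'text' selector also matches the extra tags."""
    selectors = dict(cfg["selectors"])
    current = [s.strip() for s in selectors["text"].split(",")]
    for tag in extra:
        if tag not in current:  # avoid duplicate entries
            current.append(tag)
    selectors["text"] = ", ".join(current)
    return {**cfg, "selectors": selectors}


# Include code samples (pre) and definition-list terms (dt) in the indexed text.
updated = add_text_selectors(config, ["pre", "dt"])
print(json.dumps(updated, indent=2))
```

The original `config` is left untouched, so the two versions can be compared side by side when checking what the scraper picks up.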

@orbeckst
Member Author

orbeckst commented Oct 9, 2021

I installed the docsearch-scraper locally and I’m able to run it so I can now debug more easily.

@orbeckst
Member Author

orbeckst commented Oct 9, 2021

Well... maybe not that simple:

$ ./docsearch run ../docsearch-configs/configs/mdanalysis.json
...
algoliasearch.exceptions.RequestException: Record quota exceeded. Change plan or delete records.

Nb hits: 10415
previous nb_hits: 85975

Will need to see how to work within these limitations.

@hmacdope
Member

hmacdope commented Oct 9, 2021 via email

@orbeckst
Member Author

I don't think this applies when you're running through their infrastructure as an approved open source documentation site.

What I am trying is to use their "commercial" analytics infrastructure on the free plan with our own scraped index. I am really only interested in running the scraper and seeing how changing the config file changes what it picks up. If I find some time I'll just try to disable the sending of results and retain the local index building.

@orbeckst
Member Author

To run docsearch without committing changes to the actual index, see orbeckst/docsearch-scraper#1 for the required code change.
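
For readers who only want the general idea, here is a minimal sketch of a "dry run" pattern: records are collected in memory instead of being sent anywhere. This is not the change in the linked PR and does not use the scraper's real classes; `DryRunIndex`, `add_records`, and `push_records` are hypothetical names made up for illustration.

```python
class DryRunIndex:
    """Hypothetical stand-in for a remote index: stores records locally only."""

    def __init__(self):
        self.records = []

    def add_records(self, batch):
        # Nothing is uploaded; the batch is just kept for inspection.
        self.records.extend(batch)


def push_records(index, records, batch_size=2):
    """Send records to the given index object in batches; return the count."""
    for start in range(0, len(records), batch_size):
        index.add_records(records[start:start + batch_size])
    return len(records)


# Made-up records, only to show the flow.
scraped = [
    {"url": "https://example.org/docs/a.html", "text": "alpha"},
    {"url": "https://example.org/docs/b.html", "text": "beta"},
    {"url": "https://example.org/docs/c.html", "text": "gamma"},
]

index = DryRunIndex()
n_hits = push_records(index, scraped)
print(f"Nb hits: {n_hits}")
```

Because the index object is passed in, the same `push_records` call could be pointed at a real client later without changing the scraping logic.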

@orbeckst orbeckst changed the title from "search does not find results for distopia" to "search does not find results for distopia (and others...)" Oct 10, 2021
@orbeckst
Member Author

Sitemaps for multiple projects are now broken because a release string was inserted into the URL even though this does not reflect the deployment URL. My suspicion is that something changed in the sphinx_sitemap plugin or in our Sphinx configuration.

It seems unlikely that this is due to the GH Actions workflow because PMDA (which has been using Travis and not Actions) also has the same problem.

@orbeckst
Member Author

Looking at the docs for the sitemap plugin, the solution appears to be to set the following in conf.py

sitemap_url_scheme = "{link}"

so that the version is not included.
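
To see why this setting matters, here is a simplified sketch of how a URL scheme template turns into a sitemap URL. It is not the plugin's actual code; `sitemap_url` is a hypothetical helper, the base URL is a placeholder, and `"{lang}{version}{link}"` is what I understand the plugin's documented default scheme to be.

```python
def sitemap_url(base_url, scheme, lang="", version="", link=""):
    """Join the base URL with the path produced by the scheme template."""
    # Unused keyword arguments are simply ignored by str.format.
    return base_url + scheme.format(lang=lang, version=version, link=link)


base = "https://example.org/docs/"  # placeholder, not the real deployment URL
page = "index.html"

# Assumed default scheme: the release string ends up in the URL.
with_version = sitemap_url(base, "{lang}{version}{link}", version="2.0.0/", link=page)

# Scheme from this comment: only the page link is appended.
without_version = sitemap_url(base, "{link}", version="2.0.0/", link=page)

print(with_version)
print(without_version)
```

With `"{link}"` the generated entry matches a deployment where the docs are not served from a per-release subdirectory.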

@orbeckst
Member Author

New PR algolia/docsearch-configs#4751 but there are still many pages that are not showing up. See the PR for notes.

@orbeckst
Member Author

With #211, the changes to the config are done through the Crawler web interface. Initial testing indicates that even after switching to v3 (see PR #212) we still have the same issues that are noted here. Furthermore, I see multiple versions of the docs (1.0.1, 1.1.1, 2.0.0, ...) show up. Only stable should really show up.
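
As a sketch of the filtering I have in mind (the logic only, not actual Crawler configuration), the snippet below keeps URLs whose path contains a `stable` segment and drops versioned ones. The URLs are placeholders and the assumption that stable docs live under a `stable` path segment is mine; `is_stable` is a hypothetical helper.

```python
from urllib.parse import urlparse


def is_stable(url, stable_segment="stable"):
    """Return True if one of the URL's path segments equals stable_segment."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return stable_segment in segments


# Placeholder URLs, only to illustrate the mix of versions seen in the results.
urls = [
    "https://example.org/docs/stable/index.html",
    "https://example.org/docs/1.0.1/index.html",
    "https://example.org/docs/2.0.0/api.html",
    "https://example.org/docs/stable/api.html",
]

stable_only = [u for u in urls if is_stable(u)]
print(stable_only)
```

Matching on whole path segments avoids accidentally keeping a page that merely has "stable" somewhere in its file name.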
