docs(search): append scraped API records to algolia index in CI #9366
I ran this script locally and generated a |
All code is hacks and this is an effective one. +1 from me.
    "docs/reference/expression-temporal.qmd",
]

HORRID_REGEX = re.compile(r"\|\s*\[(\w+)\]\((#[\w.]+)\)\s*\|\s*(.*?)\s*\|")
😱
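For anyone reading along, here is a minimal sketch of what that regex pulls out of a quartodoc-style summary table. The sample table rows and the `scrape_rows` helper are made up for illustration and are not taken from the script in this PR; only `HORRID_REGEX` is copied from the diff.

```python
import re

# Copied from the diff above: matches rows like
# "| [name](#anchor) | description |"
HORRID_REGEX = re.compile(r"\|\s*\[(\w+)\]\((#[\w.]+)\)\s*\|\s*(.*?)\s*\|")

# Hypothetical QMD table content, invented for this example.
SAMPLE_QMD = """\
| Name | Description |
| --- | --- |
| [strftime](#ibis.expr.types.temporal.TimestampValue.strftime) | Format a timestamp as a string. |
| [year](#ibis.expr.types.temporal.TimestampValue.year) | Extract the year component. |
"""


def scrape_rows(text):
    """Return (name, anchor, description) tuples for each matching table row."""
    rows = []
    for line in text.splitlines():
        match = HORRID_REGEX.match(line)
        if match is not None:
            rows.append(match.groups())
    return rows


rows = scrape_rows(SAMPLE_QMD)
for name, anchor, description in rows:
    print(name, anchor, description)
```

The header and separator rows are skipped because they contain no `[name](#anchor)` link, so only the method rows produce tuples.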
env:
  ALGOLIA_WRITE_API_KEY: ${{ secrets.ALGOLIA_WRITE_API_KEY }}
  ALGOLIA_APP_ID: HS77W8GWM1
  ALGOLIA_INDEX: prod_ibis
Is there a way to avoid duplicating these envs across steps? Like a top-level env mapping? Not a big deal.
Probably. I'll try to consolidate in a follow-up when I add a few more tweaks to the algolia index creation.
@@ -0,0 +1,72 @@
from __future__ import annotations  # noqa: INP001
This honestly isn't that horrid of a script. A few more comments/docstrings to explain the method to the madness would help though. "This script generates records for algolia to search for all methods/functions because ...., the records look like ...."
Tried to document a bit more. Time to 🚢
I agree with Jim, I've seen and done much worse 😂
There's a larger issue here: the quarto `search.json` doesn't seem to include many of the items we generate using `quartodoc`, which makes the docs search unhelpful for anyone looking up method names in particular. This is definitely a hack, but I've tried uploading these records manually and it makes a noticeable improvement. And yes, I am scraping through QMD files to grab the anchors, descriptions, and method names, and yes that's gross, but computers are gross. I'm planning to spend more time understanding how we can better augment the algolia index so our search is more useful, but this is both a start and a proof-of-concept that we can append to our existing index.
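A rough sketch of the "append to the existing index" idea, assuming each scraped (name, anchor, description) triple becomes one record. The field names other than `objectID`, the href layout, and the `build_record` helper are assumptions for illustration; the actual record shape used by the script is not shown in this thread. Algolia identifies records by `objectID`, so deriving it from the page and anchor should let a re-run overwrite a record instead of duplicating it.

```python
import os.path


def build_record(page, name, anchor, description):
    """Build one search record for a scraped method (hypothetical shape).

    `page` is the QMD source path, `anchor` is the in-page fragment
    (including the leading '#').
    """
    base, _ext = os.path.splitext(page)
    href = f"{base}.html{anchor}"  # assumed URL layout, not confirmed
    return {
        "objectID": href,  # stable ID so repeated uploads replace, not append
        "href": href,
        "title": name,
        "text": description,
    }


# Example input, invented for this sketch.
scraped = [
    ("strftime", "#ibis.expr.types.temporal.TimestampValue.strftime",
     "Format a timestamp as a string."),
]
records = [
    build_record("docs/reference/expression-temporal.qmd", *row)
    for row in scraped
]
print(records[0]["objectID"])
```

The resulting list of dictionaries is what would be handed to the Algolia client for upload; the upload call itself is left out here because it depends on the client version in use.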
9f48233 to 15c86e1