Add documentation for spellchecker and spellcheck docs #2025
Note: I briefly messed around with the spellchecker, and wasn't immediately able to get it working on Linux. It looks like the script as it currently stands is Mac-only, but even manually installing Aspell, I couldn't get it to use
Thanks @zmbc for drawing this to our attention! Will have a look into it :)
Thanks again @zmbc. Can I just check which Linux distribution and version you're running? I'm assuming you tried to run the spellchecker following the instructions in this PR - is that correct? If so, it is probably not working because extra dependencies are required for Linux, e.g. we've been able to run the spellchecker with a Dockerfile using Alpine 3.14, but this required installing some additional dependencies:
@zslade Sorry -- I should have given a bit more information. I'm on Ubuntu 20.04 via Windows Subsystem for Linux. I installed aspell via conda. I can run the spellchecker script (if I comment out the Mac-specific part) but I have no reason to believe that it is using the LibreOffice dictionaries.
Running on
Thanks @zmbc. Just so I'm really clear - the spellchecker runs for you (spellchecks the docs) but doesn't pass (finds spelling errors)? We are still in the process of updating As a rough check, could you please Would you also be able to tell us which Mac-specific parts you commented out? This will help us if we decide to build a system-agnostic version :) Many thanks!
I get
Correct, and a lot of them, so I figured that wasn't working correctly. And, I'm pretty sure it's not using the
I commented out these lines, since they use Homebrew, which is Mac-specific: splink/scripts/pyspelling/spellchecker.sh, lines 13 to 15 in 1d942f0.
I replaced them with conda package installation (aspell and go-yq), which is cross-platform. Happy to contribute this if you are interested. Would you be open to switching to that, or would you want to have both?
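As a rough illustration of the "both" option, the dependency step could pick an installer based on what is available. This is only a sketch: the function name is hypothetical, the conda-forge channel is an assumption, and the Homebrew package names are a guess at what the Mac-specific lines install (they are not copied from the script). The conda package names (aspell and go-yq) are the ones mentioned above.

```python
import platform
import shutil


def dependency_install_command(system=None, have_conda=None):
    """Return an argument list that would install the spellchecker's dependencies.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if system is None:
        system = platform.system()
    if have_conda is None:
        have_conda = shutil.which("conda") is not None

    if have_conda:
        # Cross-platform route; the conda-forge channel is an assumption.
        return ["conda", "install", "--yes", "-c", "conda-forge", "aspell", "go-yq"]
    if system == "Darwin":
        # Mac-only route; package names are assumed, not taken from the script.
        return ["brew", "install", "aspell", "yq"]
    raise RuntimeError(
        f"No supported installer found on {system}; install aspell and yq manually."
    )


print(dependency_install_command(system="Linux", have_conda=True))
```

Preferring conda when it is present keeps one code path for Linux, WSL and macOS, with Homebrew kept only as a fallback for Mac users who do not have conda.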
Ah okay, I think this was the crux of my misunderstanding. I was working on doc updates in #2083 and thought I needed to get this spellcheck passing. The name of this PR indicated to me that the feature was done, just not documented.
@ThomasHepworth, I think all mistakes have been corrected now 🙌 . See the latest diff for the decisions I made related to your PR #2102.
Two minor comments and then you can merge this.
This is really great, thank you so much for pulling it together! Sorry it was so much work.
Type of PR
Is your Pull Request linked to an existing Issue or Pull Request?
Docs spellchecker update: #2000
Original spellchecker PR: #1588
Give a brief description for the solution you have provided
I have updated the documentation with guidance on how to use the docs spellchecker, following on from this issue #2000.
All docs have been spellchecked to set a baseline for 'spellchecked' docs. This involves
- custom_dictonary.txt, for words which aren't being picked up by LibreOffice dictionaries, e.g. words like Splink
- pyspelling.yml, to bypass certain patterns or sections in the docs, e.g. code blocks

@ThomasHepworth and I decided not to make the spellchecker part of CI/CD with a GitHub Action at this stage, as it would require some non-trivial configuration and a new check box on the PR template will probably suffice.
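To make the idea of ignore rules concrete, here is a minimal sketch of stripping code from Markdown before the remaining text is spellchecked. It is not how pyspelling implements its filters and does not reproduce the rules in pyspelling.yml; the regular expressions and the function name are hypothetical, written only for this illustration.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical patterns: a fenced code block, and an inline code span.
FENCED_CODE = re.compile(rf"^{FENCE}.*?^{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE)
INLINE_CODE = re.compile(r"`[^`\n]+`")


def strip_ignored(markdown: str) -> str:
    """Remove fenced code blocks and inline code so only prose is left to check."""
    without_fences = FENCED_CODE.sub("", markdown)
    return INLINE_CODE.sub("", without_fences)


sample = (
    "Some prose.\n"
    f"{FENCE}python\n"
    "pycodeblockmistakeee\n"
    f"{FENCE}\n"
    "Use `inlinecodemistkakeeee` here.\n"
)
print(strip_ignored(sample))
```

The deliberate misspellings inside the code block and the inline span are removed before any spellchecking would happen, which is the behaviour the test text below is designed to exercise.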
Code for testing ignore rules in pyspelling.yml
General text
This is some text that does not contain a mistake
startoffilemistake
colons
With a massive thanks to external contributor @hanslemm, Splink now supports :simple-postgresql: Postgres. To get started, check out the Postgres Topic Guide.
[:colon-mistake: RSS feed]
:octicons-duplicate-24-mistake:
Code block ignore test
python code
pycodeblockmistakeee
alt python blocks
inline code
inlinecodemistkakeeee
l.first_name = r.first_name and substr(l.surname,1,1) = substr(r.surname,1,1)
.blog post headers
date: 2024-01-23
authors:
categories:
anchor tags
anchormistake
Test text between diff blocks
??? info "Example of convergence output"
links
create_function function
text between
$$
$$\text{}
mathcodeblockmistake
\something $$
Almost 100%, say 98%$inlinemathmistake m \approx 0.98$
compound words
Equi-join
Card content ignore test
::cards::
[
{
cardblockmistakee
},
]
::/cards::
auto generated code
::: splink.linker.Linker
handler: python
selection:
members:
- cluster_pairwise_predictions_at_threshold
- compare_two_records
- compute_tf_table
- deterministic_link
- find_matches_to_new_records
- load_settings
- load_model
- load_settings_from_json
- predict
rendering:
show_root_heading: false
show_source: true
BibTeX blocks
error logger
To enable the logging of multiple errors in a singular check, or across multiple checks, an ErrorLogger class is available for use.
The ErrorLogger operates in a similar way to working with a list, allowing you to add additional errors using the append method. Once you've logged all of your errors, you can raise them with the raise_and_log_all_errors method.
??? note "ErrorLogger in practice"
```py
from splink.exceptions import ErrorLogger
```
endoffilemistake
PR Checklist