Typos in Numba Documentation #9234

Closed
2 tasks done
SridharCR opened this issue Oct 10, 2023 · 5 comments · Fixed by #9236


@SridharCR
Contributor

Typos in Numba Documentation

  • I have tried using the latest released version of Numba (most recent is
    visible in the change log (https://github.com/numba/numba/blob/main/CHANGE_LOG)).
  • I have included a self contained code sample to reproduce the problem.
    i.e. it's possible to run as 'python bug.py'.

There are multiple typos in the Numba documentation. I have listed a few of them below.

  1. .\source\user\installing.rst:8: compatability ==> compatibility (https://numba.readthedocs.io/en/stable/user/installing.html#compatibility)

  2. .\source\user\parallel.rst:133: noticable ==> noticeable
    (https://numba.readthedocs.io/en/stable/user/parallel.html#explicit-parallel-loops)

  3. .\source\proposals\jit-classes.rst:198: inhertance ==> inheritance
    (https://numba.readthedocs.io/en/stable/proposals/jit-classes.html#inheritance)

etc.

Happy to fix all the typos and add some tests to prevent typos in the future.

@esc
Member

esc commented Oct 10, 2023

@SridharCR thank you for raising this! How many typos are we talking about here, and what would be the scope of your proposed PR? Also, what would you suggest using to prevent such typos in the future?

@SridharCR
Contributor Author

@esc In the docs folder, approximately 76 typos are reported. I'm thinking of integrating codespell into the project's pre-commit hooks or GitHub workflow to prevent typos in the future. It also helps with listing the issues and fixing them with tooling instead of by hand.

Please share your thoughts on this.

@esc
Member

esc commented Oct 10, 2023

@SridharCR thank you for pointing us in this direction and thank you for your efforts to improve Numba. I tried codespell just now and will summarize my findings here:

I ran the following (all in directory docs):

 💣 zsh» codespell **/*.rst | wc -l
75

So this suggests that there are 75 typos.

They are:

 💣 zsh» codespell **/*.rst
source/cuda-reference/kernel.rst:657: hge ==> he
source/cuda/cuda_array_interface.rst:515: documen ==> document
source/cuda/examples.rst:349: integeration ==> integration
source/cuda/minor_version_compatibility.rst:3: Compatiblity ==> Compatibility
source/extending/overloading-guide.rst:173: re-use ==> reuse
source/proposals/jit-classes.rst:198: inhertance ==> inheritance
source/proposals/jit-classes.rst:205: HSA ==> HAS
source/reference/jit-compilation.rst:45: re-use ==> reuse
source/reference/jit-compilation.rst:46: re-use ==> reuse
source/reference/jit-compilation.rst:439: nin ==> inn, min, bin, nine
source/reference/jit-compilation.rst:441: nin ==> inn, min, bin, nine
source/reference/jit-compilation.rst:450: nin ==> inn, min, bin, nine
source/reference/jit-compilation.rst:500: nin ==> inn, min, bin, nine
source/reference/jit-compilation.rst:500: nin ==> inn, min, bin, nine
source/reference/jit-compilation.rst:500: nin ==> inn, min, bin, nine
source/release-notes.rst:336: precesion ==> precision, precession
source/release-notes.rst:348: optmized ==> optimized
source/release-notes.rst:678: Sargeant ==> Sergeant
source/release-notes.rst:680: Collison ==> Collision, Collusion
source/release-notes.rst:780: Collison ==> Collision, Collusion
source/release-notes.rst:850: Sargeant ==> Sergeant
source/release-notes.rst:872: Collison ==> Collision, Collusion
source/release-notes.rst:1043: infomation ==> information
source/release-notes.rst:1059: Collison ==> Collision, Collusion
source/release-notes.rst:1099: rquest ==> request, quest
source/release-notes.rst:1122: Collison ==> Collision, Collusion
source/release-notes.rst:1149: Collison ==> Collision, Collusion
source/release-notes.rst:1202: Collison ==> Collision, Collusion
source/release-notes.rst:1302: seperate ==> separate
source/release-notes.rst:1334: Collison ==> Collision, Collusion
source/release-notes.rst:1356: Collison ==> Collision, Collusion
source/release-notes.rst:1361: Collison ==> Collision, Collusion
source/release-notes.rst:1381: Collison ==> Collision, Collusion
source/release-notes.rst:1392: hsa ==> has
source/release-notes.rst:1407: Collison ==> Collision, Collusion
source/release-notes.rst:1415: encounted ==> encountered, encounter
source/release-notes.rst:1442: Collison ==> Collision, Collusion
source/release-notes.rst:1464: Collison ==> Collision, Collusion
source/release-notes.rst:1548: Collison ==> Collision, Collusion
source/release-notes.rst:1598: Collison ==> Collision, Collusion
source/release-notes.rst:1600: Collison ==> Collision, Collusion
source/release-notes.rst:1720: Collison ==> Collision, Collusion
source/release-notes.rst:1727: Collison ==> Collision, Collusion
source/release-notes.rst:1732: Collison ==> Collision, Collusion
source/release-notes.rst:1733: Collison ==> Collision, Collusion
source/release-notes.rst:1793: Collison ==> Collision, Collusion
source/release-notes.rst:1862: Collison ==> Collision, Collusion
source/release-notes.rst:1993: Collison ==> Collision, Collusion
source/release-notes.rst:1995: Collison ==> Collision, Collusion
source/release-notes.rst:2003: Collison ==> Collision, Collusion
source/release-notes.rst:2024: initilising ==> initialising
source/release-notes.rst:2058: Collison ==> Collision, Collusion
source/release-notes.rst:2227: implmentation ==> implementation
source/release-notes.rst:2272: HSA ==> HAS
source/release-notes.rst:2568: Manuel ==> Manual
source/release-notes.rst:3778: simplication ==> simplification, implication
source/release-notes.rst:4163: HSA ==> HAS
source/release-notes.rst:4306: Unneccessary ==> Unnecessary
source/release-notes.rst:4444: unsiged ==> unsigned
source/release-notes.rst:4588: tranpose ==> transpose
source/release-notes.rst:4890: proceses ==> processes
source/release-notes.rst:4977: recipies ==> recipes
source/release-notes.rst:5379: HSA ==> HAS
source/release-notes.rst:5440: HSA ==> HAS
source/release-notes.rst:5488: HSA ==> HAS
source/release-notes.rst:5795: superceded ==> superseded
source/release-notes.rst:5835: exeception ==> exception
source/release-notes.rst:5842: ambigous ==> ambiguous
source/user/faq.rst:251: HSA ==> HAS
source/user/installing.rst:8: compatability ==> compatibility
source/user/parallel.rst:133: noticable ==> noticeable
source/user/talks.rst:18: Manuel ==> Manual
source/user/talks.rst:19: Tunnell ==> Tunnel
source/user/vectorize.rst:147: nin ==> inn, min, bin, nine
upcoming_changes/README.rst:11: relase ==> release

Looking closer at this list, there are several duplicates and false positives. For example "Collison" is a name and "HSA" is a real term: https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture
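To see at a glance which words dominate such a report, the output can be tallied per flagged word. The following is a minimal sketch, not part of codespell itself: it assumes report lines have the `path:line: word ==> suggestion` shape shown above, and the names `LINE_RE` and `tally_words` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the report above:
#   source/user/faq.rst:251: HSA ==> HAS
LINE_RE = re.compile(r"^(?P<path>.+?):(?P<line>\d+): (?P<word>\S+) ==> (?P<fix>.+)$")


def tally_words(report):
    """Count how often each flagged word appears, ignoring case."""
    counts = Counter()
    for raw in report.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            counts[match.group("word").lower()] += 1
    return counts


# A handful of lines copied from the report above.
sample = """\
source/release-notes.rst:678: Sargeant ==> Sergeant
source/release-notes.rst:680: Collison ==> Collision, Collusion
source/release-notes.rst:780: Collison ==> Collision, Collusion
source/user/faq.rst:251: HSA ==> HAS
source/release-notes.rst:1392: hsa ==> has
"""

for word, count in tally_words(sample).most_common():
    print(f"{count:3d}  {word}")
```

Words that appear many times with the same "correction" (a surname, an acronym) are good candidates for an ignore list.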

Following this, I crafted an ignore file as such:

 💣 zsh» cat IGNORE
hsa
hge
collison
nin
sargeant
manuel
tunnell

Running this again yields:

 💣 zsh» codespell -I IGNORE  **/*.rst | wc -l
27

This reduces the number of reported typos by almost two thirds, from 75 to 27.
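The effect of the ignore file can be approximated offline on a saved report. This is a rough sketch under stated assumptions: it compares words case-insensitively (which matches what the lowercase IGNORE entries did to "HSA" and "Collison" above, but is not a statement about codespell's internals), and `load_ignore_words` and `filter_report` are hypothetical helper names.

```python
def load_ignore_words(text):
    """One word per line, as in the IGNORE file; blank lines are skipped."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def filter_report(lines, ignore):
    """Keep only report lines whose flagged word is not in the ignore set."""
    kept = []
    for line in lines:
        head, sep, _ = line.partition(" ==> ")
        if not sep:
            continue  # not a "word ==> suggestion" line
        # head looks like "source/user/faq.rst:251: HSA"
        word = head.rsplit(": ", 1)[-1]
        if word.lower() not in ignore:
            kept.append(line)
    return kept


ignore = load_ignore_words("hsa\ncollison\nnin\n")
report = [
    "source/user/faq.rst:251: HSA ==> HAS",
    "source/release-notes.rst:680: Collison ==> Collision, Collusion",
    "source/user/installing.rst:8: compatability ==> compatibility",
]
for line in filter_report(report, ignore):
    print(line)
```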

Looking closer at the results themselves:

 💣 zsh» codespell -I IGNORE  **/*.rst
source/cuda/cuda_array_interface.rst:515: documen ==> document
source/cuda/examples.rst:349: integeration ==> integration
source/cuda/minor_version_compatibility.rst:3: Compatiblity ==> Compatibility
source/extending/overloading-guide.rst:173: re-use ==> reuse
source/proposals/jit-classes.rst:198: inhertance ==> inheritance
source/reference/jit-compilation.rst:45: re-use ==> reuse
source/reference/jit-compilation.rst:46: re-use ==> reuse
source/release-notes.rst:336: precesion ==> precision, precession
source/release-notes.rst:348: optmized ==> optimized
source/release-notes.rst:1043: infomation ==> information
source/release-notes.rst:1099: rquest ==> request, quest
source/release-notes.rst:1302: seperate ==> separate
source/release-notes.rst:1415: encounted ==> encountered, encounter
source/release-notes.rst:2024: initilising ==> initialising
source/release-notes.rst:2227: implmentation ==> implementation
source/release-notes.rst:3778: simplication ==> simplification, implication
source/release-notes.rst:4306: Unneccessary ==> Unnecessary
source/release-notes.rst:4444: unsiged ==> unsigned
source/release-notes.rst:4588: tranpose ==> transpose
source/release-notes.rst:4890: proceses ==> processes
source/release-notes.rst:4977: recipies ==> recipes
source/release-notes.rst:5795: superceded ==> superseded
source/release-notes.rst:5835: exeception ==> exception
source/release-notes.rst:5842: ambigous ==> ambiguous
source/user/installing.rst:8: compatability ==> compatibility
source/user/parallel.rst:133: noticable ==> noticeable
upcoming_changes/README.rst:11: relase ==> release

Many of the typos are in the release notes, which are a snapshot of what happened. I don't think those are too bad. Discounting the entries in release-notes and also re-use vs reuse (a stylistic choice IMHO) leaves us with:

source/cuda/cuda_array_interface.rst:515: documen ==> document
source/cuda/examples.rst:349: integeration ==> integration
source/cuda/minor_version_compatibility.rst:3: Compatiblity ==> Compatibility
source/proposals/jit-classes.rst:198: inhertance ==> inheritance
source/user/installing.rst:8: compatability ==> compatibility
source/user/parallel.rst:133: noticable ==> noticeable
upcoming_changes/README.rst:11: relase ==> release

I think these 7 are legitimate typos that you could make a PR for, that would be fine. It's less than 10% of what codespell reported by default, but still nice finds! 🙌
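The manual triage described above (drop the release notes, drop "re-use") could be expressed as a small filter over the report. This is only an illustrative sketch: `SKIP_FILES`, `STYLE_WORDS` and `triage` are invented names, and the line format is assumed to be the one codespell printed above.

```python
from pathlib import PurePosixPath

SKIP_FILES = {"release-notes.rst"}  # historical snapshot, left as-is
STYLE_WORDS = {"re-use"}            # stylistic choice rather than a typo


def triage(report_lines):
    """Drop findings in skipped files and findings for stylistic variants."""
    kept = []
    for line in report_lines:
        head, sep, _ = line.partition(" ==> ")
        if not sep:
            continue
        # head looks like "source/release-notes.rst:336: precesion"
        location, _, word = head.rpartition(": ")
        path = location.rsplit(":", 1)[0]
        if PurePosixPath(path).name in SKIP_FILES:
            continue
        if word.lower() in STYLE_WORDS:
            continue
        kept.append(line)
    return kept


findings = [
    "source/extending/overloading-guide.rst:173: re-use ==> reuse",
    "source/release-notes.rst:336: precesion ==> precision, precession",
    "source/user/parallel.rst:133: noticable ==> noticeable",
    "upcoming_changes/README.rst:11: relase ==> release",
]
for line in triage(findings):
    print(line)
```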

About the GitHub action or other automation: given the large number of false positives, I am not convinced that this could be automated very well and I would like to keep a human in the loop to be on the safe side. So, instead I think it will be best for the release manager (RM) to run codespell manually on the docs before a release to catch anything that was missed during review. This is a 5-10 minute task once every release so I think that is feasible. Moving forward on this idea, an additional bullet would be added to the release checklists to make sure codespell is run.
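For the manual pre-release run, a small wrapper might save retyping the command. This is a hedged sketch, not an agreed project script: it assumes codespell is installed and on `PATH`, that the ignore file is named `IGNORE` and lives in the docs directory as in the session above, and the function names are hypothetical. It only prints findings and leaves the judgment to the release manager.

```python
import shutil
import subprocess
from pathlib import Path


def build_codespell_command(rst_files, ignore_file="IGNORE"):
    """Assemble the same invocation used above: codespell -I IGNORE <files>."""
    return ["codespell", "-I", ignore_file, *rst_files]


def run_check(docs_dir="docs"):
    """Run codespell over every .rst file below docs_dir; return report lines."""
    docs = Path(docs_dir)
    if shutil.which("codespell") is None or not docs.is_dir():
        print("codespell or the docs directory is missing; nothing checked")
        return []
    files = sorted(p.relative_to(docs).as_posix() for p in docs.rglob("*.rst"))
    if not files:
        return []
    # codespell exits non-zero when it finds something, so do not use check=True.
    result = subprocess.run(
        build_codespell_command(files),
        cwd=docs,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for finding in run_check():
        print(finding)
```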

@esc
Member

esc commented Oct 10, 2023

@SridharCR thank you again for pointing out codespell, I went and used it on the llvmlite project too and found two typos there:

numba/llvmlite#996

❤️

@SridharCR
Contributor Author

SridharCR commented Oct 10, 2023

Cool @esc, then I can fix the typos for now, and we can skip the automated spell checks.
