-
Notifications
You must be signed in to change notification settings - Fork 182
[CI] fix spelling mistakes in the codebase by adding codespell #2550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
daal4py/sklearn/cluster/dbscan.py
Outdated
alg = daal4py.dbscan( | ||
method="defaultDense", | ||
fptype=fpt, | ||
fptype=getFPType(XX), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@icfaust Could this be made as an optional check producing some report like the code coverage ones? Otherwise, I don't think it'd be helpful to require changes like this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could only find this: https://stackoverflow.com/questions/75500530/how-to-avoid-disabling-a-pre-commit-hook so it looks to not be possible. Let me know why you think not fixing these issues is considered better (given this will reduce review burden). If something is truly desired there are ways to in a case-by-case basis to disable the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make this specific case of 'fpt' disable the check as an example here? Could be a good test for the CI logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be a file under .github/.codespellrc
which can deselect words for the whole codebase, the whole list is currently : als,coo,inout,nd,ons
. I have changed that specific circumstance to an inline ignore to show the case-by-case basis as was desired.
@icfaust Since this appears to take a lot longer to run than the usual linting, could it be moved into a separate job? |
The last 5 merges to main The run for this PR: 54s https://dev.azure.com/daal/daal4py/_build/results?buildId=47748&view=results I don't understand? Could you link me to where you were looking? |
Yes, seems it's fine - I might have been looking at the waiting time which was longer. |
gotcha. it seems that a bunch of PRs were merged today to oneDAL and sklearnex all at once so wait times for runners went up quite a lot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR integrates codespell into the repository to automatically fix spelling mistakes across the codebase and documentation. The fixes improve text clarity and enhance the professional quality of the project.
- Corrects various misspellings in comments and documentation.
- Adds a pre-commit hook configuration for codespell.
- Provides a configuration file for codespell.
Reviewed Changes
Copilot reviewed 75 out of 75 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
examples/daal4py/decision_forest_classification_default_dense.py | Corrects spelling in comments (e.g. from "Traiing" to "Trailing") |
examples/daal4py/covariance.py | Fixes comment typo ("ourselfs" to "ourselves"). |
examples/daal4py/association_rules.py | Corrects multiple misspellings in comments. |
doc/sources/quick-start.rst | Fixes typo in documentation text. |
doc/sources/kaggle/regression.rst | Corrects misspellings ("paramters" to "parameters"). |
doc/sources/kaggle/classification.rst | Corrects multiple spelling errors in steps description. |
doc/sources/kaggle/automl.rst | Fixes spelling mistakes in the kernel example documentation. |
doc/sources/ideas.rst | Corrects typos and improves phrasing in project ideas. |
doc/sources/daal4py.rst | Fixes comment typos in code example references. |
doc/sources/array_api.rst | Corrects typo ("witout" to "without") in documentation. |
doc/sources/about_daal4py.rst | Fixes spelling mistakes in several commentary sections. |
deselected_tests.yaml | Corrects comment typo regarding tree count behavior. |
daal4py/sklearn/utils/validation.py | Fixes typos in comment descriptions. |
daal4py/sklearn/tree/decision_tree.py | Fixes a comment typo ("corrsponding" to "corresponding"). |
daal4py/sklearn/svm/svm.py | Corrects spelling in error message text. |
daal4py/sklearn/manifold/_t_sne.py | Fixes typo in exception message ("suported" to "supported"). |
daal4py/sklearn/cluster/dbscan.py | Adds codespell ignore comments for variable naming. |
.pre-commit-config.yaml | Adds codespell hook configuration. |
.github/scripts/doc_release.sh | Corrects typo in comment ("verison" to "version"). |
.github/.codespellrc | New configuration file for codespell. |
Comments suppressed due to low confidence (1)
examples/daal4py/decision_forest_classification_default_dense.py:50
- The comment now uses 'Trailing' which might be a typo. If this comment is intended to refer to the result produced by a training algorithm, consider changing 'Trailing' to 'Training' for better clarity.
# Trailing result provides (depending on parameters) model,
/intelci: run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to add this, thanks for doing it! On commented stuff - excellent. On variable names, I am fine to leave or keep as is. On functions and args - we should be careful but looks like theyre used internally here. But I don't think changing inpt->inp is necessary anyway.
@@ -972,7 +972,7 @@ def prepare_hlwrapper(self, ns, mode, func, no_dist, no_stream): | |||
itype, | |||
"const", | |||
dflt, | |||
inpt=True, | |||
inp=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not so sure about this one
/intelci: run |
Description
Root out all the misspellings in the codebase by using
codespell
(https://github.com/codespell-project/codespell) a tool currently also in use in oneTBB. This can be simply added as a pre-commit hook along with our other linters. This should improve the quality of the codebase and docs at little cost (5 seconds or less). This will improve the overall professional quality we show externally.This PR includes fixes for all hits which were found (73 files).
No performance benchmarks necessary as it is a lint.
PR should start as a draft, then move to ready for review state after CI is passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, PR with docs update doesn't require checkboxes for performance while PR with any change in actual code should have checkboxes and justify how this code change is expected to affect performance (or justification should be self-evident).
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing
Performance