[DOC] Programmatically fix (all) typos #5424

Merged
merged 7 commits into sktime:main Oct 22, 2023

Conversation

kianmeng (Contributor)

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR fixes typos found via:

  • codespell -S docs,./sktime/datasets -L bu,fpr,binay,afer,nd,methodd,strat,mape,lik,mapp,ser,te,ot,ue,fof,fwe,oen,vie,yur,ned,ths,als,mor,foor,reasy,dum,fo,yhe,mke,bui,som,caf,wqs,slq,veiw,yse
  • typos --format brief

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, I've added myself and possibly others to the CODEOWNERS file - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.


@@ -181,7 +181,7 @@ See docstring of ``AggrDist`` and ``FlatDist``.
AggrDist
FlatDist

Advanced time series kernels that cannot be expressed as aggrgates or flat applicates:
Collaborator

not a typo

Contributor

aggrgates is

Collaborator

no, applicates

Collaborator

I mean, it is correcting applicates to applicants, which is a false positive.

It is not correcting aggrgates to aggregates, which is a false negative.

Contributor

I mean that on the same line, "aggrgates" is a typo; "aggregates" is the correct spelling.

Collaborator

yes, that's what I was trying to say in my last comment.

"applicates" -> "applicants" should not be corrected; "applicates" was correct.
So this is a false positive, because the spellchecker detects it but should not.
In my initial comment I was referring to this one.

"aggrgates" -> "aggregates" should be corrected, "aggrgates" is incorrect.
So this is a false negative, because the spellchecker does not detect it but should.

In my initial comment I overlooked this one, thank you for pointing this out, @szepeviktor!

Having said that, this is a very very specialized PR and a very peculiar location to help out at, but thanks nonetheless. Might I convince you to contribute to sktime in the future?

Contributor

> Might I convince you to contribute to sktime in the future?

I am highly blind to business features - which are the most important to you.
All software is the same cake to me; they just have different (thin) icing on top.

What I can do is

  • care about every byte in the repo
  • listen to what the (Python) ecosystem says
  • be in contact with dependency authors
  • run never before seen tests
  • find design errors
  • find implementation errors
  • invent end-to-end tests
All my thoughts and tools are public here on GitHub.

The submerged part of your iceberg is my territory.

Contributor

🥊 cakes vs. icebergs

Collaborator

all of these sound very nice - everyone wants to eat the cake but not enough iceberg miners...

the largest impact is perhaps bugfixes?
https://github.com/sktime/sktime/projects/18

There are also some larger projects and design questions if you are interested - we can chat at one of the meetups.

@@ -783,7 +783,7 @@ def is_flat(obj):


class _ColumnEstimator:
"""Mixin class with utilities for by-column applicates."""
"""Mixin class with utilities for by-column applicants."""
Collaborator

not a typo

@@ -10,7 +10,7 @@
from sktime.utils.validation._dependencies import _check_soft_dependencies


-def is_initalised_estimator(estimator: BaseEstimator) -> bool:
+def is_initialised_estimator(estimator: BaseEstimator) -> bool:
Collaborator

public function names should not be changed, even if they contain a typo
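
For illustration, the deprecation-friendly alternative would be to add the correctly spelt name and keep the misspelt public name as a deprecated alias for one release cycle. A minimal sketch, in which the stub body and warning text are illustrative rather than sktime's actual implementation:

```python
import warnings


def is_initialised_estimator(estimator) -> bool:
    """Check whether estimator is an initialised instance (illustrative stub)."""
    # placeholder logic for the sketch; the real check lives in sktime
    return not isinstance(estimator, type)


def is_initalised_estimator(estimator) -> bool:
    """Deprecated alias of is_initialised_estimator, kept for one deprecation cycle."""
    warnings.warn(
        "is_initalised_estimator is deprecated and will be removed in a future "
        "release; use is_initialised_estimator instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return is_initialised_estimator(estimator)
```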

@@ -39,7 +39,7 @@
"EditDist",
"CNNClassifier",
"FCNClassifier",
"InceptionTimeClassifer",
"InceptionTimeClassifier",
Collaborator

this has an impact on testing, but it is fine (the estimator should be excluded but was not)

@@ -79,7 +79,7 @@ def _make_series(
index = _make_index(n_timepoints, index_type)
if n_columns == 1 and return_mtype is None or return_mtype == "pd.Series":
return pd.Series(data.ravel(), index)
-elif return_mtype is None or return_mtype == "pd.DataFrane":
+elif return_mtype is None or return_mtype == "pd.DataFrame":
Collaborator

oh, this changes the logic. I hope it does not break the test...

@fkiraly (Collaborator) left a comment

Thanks a lot, very useful!

Blocking:

  • a small number of suggested changes are false positives, I marked them
  • small number of instances: we must not change names of public functions or variables without deprecation - that's perhaps something for a separate PR where we respect the deprecation procedure: https://www.sktime.net/en/stable/developer_guide/deprecation.html
  • but just to reinforce, changes to private function and variable names, including tests that do not appear in public checker modules, are fine.

Non-blocking, but perhaps an idea for another PR:

it would be nice to have this as an automated pre-commit action or a regularly, automatically opened PR! We would of course have to think about protecting function names, known false positives, etc.
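
For illustration, one lightweight way to approximate this without a separate workflow would be a test that shells out to codespell with the same skip and ignore lists used in this PR. A rough sketch, assuming codespell is available on the runner; the ignore list shown is abbreviated and would need curating against the false positives discussed above:

```python
import shutil
import subprocess

import pytest


@pytest.mark.skipif(shutil.which("codespell") is None, reason="codespell not installed")
def test_no_new_typos():
    """Fail if codespell reports typos outside the skipped paths and ignore list."""
    result = subprocess.run(
        [
            "codespell",
            "-S", "docs,./sktime/datasets",  # paths to skip, as in this PR
            "-L", "fpr,mape,ser,te,ot",  # abbreviated ignore list of known false positives
        ],
        capture_output=True,
        text=True,
    )
    # codespell exits with a non-zero status when it finds anything
    assert result.returncode == 0, f"codespell found typos:\n{result.stdout}"
```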

@fkiraly fkiraly changed the title from [DOC] Fix typos to [DOC] Programmatically fix (all) typos on Oct 14, 2023
@fkiraly fkiraly added the documentation (Documentation & tutorials) and maintenance (Continuous integration, unit testing & package distribution) labels on Oct 14, 2023
@fkiraly (Collaborator) commented Oct 14, 2023

@kianmeng, I'd say you can get both the doc and maintenance badges for this - the maintenance one as this is a programmatic way to fix all the typos.

@szepeviktor (Contributor)

@kianmeng There is a mistake in the last commit.
"applicates" is okay.

fkiraly previously approved these changes Oct 21, 2023

@fkiraly (Collaborator) left a comment

Approved, thanks a lot!

Would be nice to have this as a regular CI element.

@fkiraly (Collaborator) commented Oct 21, 2023

FYI @yarnabrina, I updated this PR with the new CI workflow, and (as expected) it triggers tests for all modules.

However, there are a number of timeout failures which I cannot diagnose - any idea what is going on here?
Noting that the timeouts do not seem to occur on the full diagnostic run, or on other PRs.

@yarnabrina (Collaborator)

@fkiraly, as far as I can see, these are the errors in the new CI workflow: https://github.com/sktime/sktime/actions/runs/6599294180

There are two clear patterns:

  1. transformations failed in python 3.10 on all operating systems, complaining that SignatureTransformer does not support 3.10 and above. This looks like a proper error, as the python_version tag seems to be ignored.

  2. networks failed on all python versions on all operating systems, as pytest failed to collect any unit tests. It seems that directory indeed has no unit tests in its tests sub-directory, which is very surprising to me. I'm not very familiar with this component, so can you comment on that please? If it's decided to skip tests on this, we can remove it from CI.

  3. I couldn't find any timeout, can you please share a link?

Regarding failures in the old CI workflow, one seems to come from an occasional pytest-xdist parallelisation problem, and the other, from classification, looks like a proper error. But given that it didn't occur in the new CI or in any other job even in the old CI, it may be something of a sporadic nature? I'm not sure, so I cannot really comment.


Also, I note a few other things:

  1. The new CI is not testing all directories, as I created extras for a few major ones only. Should we create new extras and/or plan how to integrate these into the new CI, if you and other core developers agree to eventually replace the old CI?

  2. This PR fixes a lot of typos automatically in all components. I did the same thing in two of my open diagnostic PRs to test the new CI: [MNT] fix typos in base module #5313 and [MNT] fix typos in forecasting module #5314, which I planned to merge. Do you want me to close those in favour of this PR ([DOC] Programmatically fix (all) typos #5424), which handles a lot more, to avoid conflict/confusion/etc.?

  3. This is a question for @kianmeng. I see you are using both codespell and typos. I'm not familiar with typos, but codespell does have pre-commit hooks. In my experience, though, it's difficult to automatically control skipping the spell check for a few words or lines with codespell, as a word may indeed be wrong by typographical mistake in one place while being intentionally introduced somewhere else (not saying it's recommended, but I know it happens sometimes, and indeed I've seen it in our codebase as well). As @fkiraly has asked above as well, do you have any suggestions on how to handle all these cases automatically? Does typos help here in any way?

@kianmeng (Contributor, Author)

@yarnabrina Unfortunately, nothing I can think of - perhaps @szepeviktor has a better approach?

@fkiraly (Collaborator) commented Oct 22, 2023

hm, since the "other things" discussion is off-topic for this PR, should we move to #5101? I'll reply there to the discussion that is not specific to failures in this PR's CI.

@fkiraly (Collaborator) commented Oct 22, 2023

> @fkiraly, as far as I can see, these are the errors in the new CI workflow: https://github.com/sktime/sktime/actions/runs/6599294180

Yes.

> 1. transformations failed in python 3.10 on all operating systems, complaining that SignatureTransformer does not support 3.10 and above. This looks like a proper error, as the python_version tag seems to be ignored.

Hm, yes. So it is either an error with the new CI, or there are at least two errors, since it does not error out on the old CI. I.e., it is either:

  • an error in the new CI: it collects estimators that it should not
  • an error in the old CI: it does not collect estimators that it should, so it did not detect a second error relating to the python version check.

A third scenario, where the trafo is used as a component, is unlikely, as the error message is not from the constructor.

> 2. networks failed on all python versions on all operating systems, as pytest failed to collect any unit tests. It seems that directory indeed has no unit tests in its tests sub-directory, which is very surprising to me. I'm not very familiar with this component, so can you comment on that please? If it's decided to skip tests on this, we can remove it from CI.

The networks module is under construction, @luca-miniati and @achieveordie are afaik the most knowledgeable about it. It contains networks used in estimators in other modules, so most - if not all - tests will be indirect, by integration in other estimators.

Mid-term, clearly, we want to have systematic tests for the network module, too.

> I couldn't find any timeout, can you please share a link?

Sorry, confusion on my side - I did not interpret the error logs correctly.

@fkiraly (Collaborator) commented Oct 22, 2023

FYI @kianmeng, most of the above does not have to do with your PR, so there is no action needed on your side - we are just using it to test new CI elements.

@fkiraly (Collaborator) commented Oct 22, 2023

> Do you want me to close those in favour of this PR (#5424), which handles a lot more, to avoid conflict/confusion/etc.?

I was aware, @yarnabrina - my idea was, we first use this PR to test the full CI - like you were doing in the two other PRs - and then merge the "typo fixing" PRs in this sequence:

  • this
  • the two of yours

In particular, I would not close yours.
(I also hoped this would work silently if there are no issues coming from the new CI, but it didn't...)

Why: the fixes may not be identical, and in merging, any conflicts have a good chance of indicating actual errors in the typo fixes.
This also tests the new CI triggers for file changes.

@fkiraly (Collaborator) commented Oct 22, 2023

Re SignatureTransformer, this check works as it should:

from sktime.utils.validation._dependencies import _check_estimator_deps
from sktime.transformations.panel.signature_based import SignatureTransformer

_check_estimator_deps(SignatureTransformer)

so I now find it more likely that there's something in collection in the new CI?

@achieveordie (Collaborator)

> It contains networks used in estimators in other modules, so most - if not all - tests will be indirect, by integration in other estimators.

That's correct. The networks module is an abstraction of the TensorFlow / PyTorch implementations of various estimators that are used by the classifier/regressor modules. The networks contain the model implementation, which will be the same for both kinds of estimators, minus the head.

Therefore, there are no tests for this module.

@fkiraly (Collaborator) commented Oct 22, 2023

> Therefore, there are no tests for this module.

Well, there could be...

  • testing build_network runs
  • testing tags and the like
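
For illustration, such a smoke test could be parametrised over the network classes. A rough sketch, in which the class list and the build_network signature are assumptions to be checked against the networks module:

```python
import pytest

# hypothetical list of network classes to cover; the real list would come from
# the sktime.networks module or a registry lookup
NETWORK_CLASSES = []  # e.g. [CNNNetwork, FCNNetwork, ...]


@pytest.mark.parametrize("network_cls", NETWORK_CLASSES)
def test_build_network_runs(network_cls):
    """Smoke test: constructing the network and calling build_network should not error."""
    network = network_cls()
    # assumed call signature build_network(input_shape); adjust to the actual interface
    layers = network.build_network((10, 3))
    assert layers is not None
```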

@achieveordie (Collaborator)

Wouldn't we end up testing the same module twice during a single run? Once directly via networks/tests and once indirectly by other modules that use these networks?

@yarnabrina (Collaborator) commented Oct 22, 2023

> so I now find it more likely that there's something in collection in the new CI?

I am biased for sure, so if you don't mind, can you please share your python version and operating system? And whether esig is installed in the environment or not?


My reasoning for why the old CI does not fail is this:

'esig==0.9.7; python_version < "3.10"'

This is from all_extras_pandas2, which does not even install esig in case the python version is >=3.10. So the test is not collected.

Based on my understanding, this should not be the general solution, as the same soft dependency may be required for other estimators. I was thinking that python version support in extras should not be affected by specific estimators and should be handled based on tags only, which does not seem to be working. Let me know if I am making sense.
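
For illustration, the tag-based route would read the version bound off the estimator itself rather than encoding it in the extras. A small sketch, assuming the tag is named python_version as discussed above and that get_class_tag is available on the estimator class:

```python
from sktime.transformations.panel.signature_based import SignatureTransformer

# the python_version tag holds a version specifier such as "<3.10" (value shown is assumed)
bound = SignatureTransformer.get_class_tag("python_version", tag_value_default=None)
print(bound)
```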

@fkiraly (Collaborator) commented Oct 22, 2023

> can you please share your python version and operating system? And whether esig is installed in the environment or not?

here's my show_versions output:

System:
    python: 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]
executable: c:\ProgramData\anaconda3\envs\sktime-skbase-311\python.exe
   machine: Windows-10-10.0.22621-SP0

Python dependencies:
          pip: 23.2.1
       sktime: 0.24.0
      sklearn: 1.3.1
       skbase: 0.6.0
        numpy: 1.25.2
        scipy: 1.11.2
       pandas: 2.1.1
   matplotlib: 3.8.0
       joblib: 1.3.2
        numba: None
  statsmodels: 0.14.0
     pmdarima: 2.0.3
statsforecast: None
      tsfresh: None
      tslearn: None
        torch: None
   tensorflow: None
tensorflow_probability: None

I do not have esig installed.

The error message I get is this, as expected:
ModuleNotFoundError: type requires python version to be <3.10, but system python version is 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)].

(there seems to be an issue with the print, it should be estimator name, but logic seems ok)

@yarnabrina (Collaborator)

> (there seems to be an issue with the print, it should be estimator name, but logic seems ok)

I agree. The new CI fails with the same error as well, so I assumed the logic in test collection is okay.

=========================== short test summary info ============================
FAILED sktime/transformations/panel/signature_based/tests/test_method.py::test_generalised_signature_method - ModuleNotFoundError: SignatureTransformer requires python version to be <3.10, but system python version is 3.10.13 (main, Aug 28 2023, 08:28:42) [GCC 11.4.0].
FAILED sktime/transformations/panel/signature_based/tests/test_method.py::test_window_error - ModuleNotFoundError: SignatureTransformer requires python version to be <3.10, but system python version is 3.10.13 (main, Aug 28 2023, 08:28:42) [GCC 11.4.0].

My interpretation is this needs to be changed:

@pytest.mark.skipif(
    not _check_soft_dependencies("esig", severity="none"),
    reason="skip test if required soft dependency esig not available",
)
def test_generalised_signature_method():

Since SignatureTransformer has a specific python version requirement, viz <3.10, it should use _check_estimator_deps instead of _check_soft_dependencies, and that should solve this error.
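
For illustration, the suggested change would look roughly as follows. A sketch assuming _check_estimator_deps accepts a severity argument analogous to _check_soft_dependencies; fkiraly suggests run_test_for_class as the preferred mechanism in the next comment:

```python
import pytest

from sktime.transformations.panel.signature_based import SignatureTransformer
from sktime.utils.validation._dependencies import _check_estimator_deps


@pytest.mark.skipif(
    # checks both soft dependencies and the python version bound of the estimator
    not _check_estimator_deps(SignatureTransformer, severity="none"),
    reason="skip test if esig or the required python version is not available",
)
def test_generalised_signature_method():
    ...
```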

@fkiraly (Collaborator) commented Oct 22, 2023

> it should use _check_estimator_deps instead of _check_soft_dependencies, and that should solve this error.

Yes, it should use run_test_for_class, actually.

My explanation though would be, can the python bound on SignatureTransformer be relaxed? But that's off-topic.

Fix is here: #5474

@fkiraly (Collaborator) left a comment

Failures are due to the new CI, and we have now understood them - so, merging.

@fkiraly fkiraly merged commit 416acec into sktime:main Oct 22, 2023
147 of 165 checks passed
@fkiraly (Collaborator) commented Oct 22, 2023

@kianmeng, you could get the doc and maintenance badges if you would like.

@kianmeng (Contributor, Author)

🥳 🥳 🥳 🥳 🥳

yarnabrina added a commit to yarnabrina/sktime-fork that referenced this pull request Oct 25, 2023
…heck

* origin/main:
  [ENH] Add tecator dataset for time series regression as `sktime` onboard dataset (sktime#5428)
  [MNT] fix typos in `forecasting` module (sktime#5314)
  [MNT] fix typos in `base` module (sktime#5313)
  [DOC] Programmatically fix (all) typos (sktime#5424)
yarnabrina pushed a commit that referenced this pull request Oct 27, 2023
This PR fixes a minor issue in one of the error messages in
`_check_python_version`.

Example failure here:
#5424 (comment)
fkiraly pushed a commit that referenced this pull request Oct 29, 2023
#### Reference Issues/PRs

Follow-up #5424
Thank you @kianmeng

#### What does this implement/fix? Explain your changes.

Found a few more misspellings.
@fkiraly (Collaborator) commented Oct 29, 2023

> @kianmeng, you could get the doc and maintenance badges if you would like.

For the badges, you have to make a PR to .all-contributorsrc, @kianmeng

Labels
documentation (Documentation & tutorials), maintenance (Continuous integration, unit testing & package distribution)
Projects
None yet
5 participants