Merge pull request #1120 from andife/master
#DC spelling
weixuanfu committed Oct 5, 2020
2 parents 593763d + 5865228 commit e1ab570
Showing 6 changed files with 9 additions and 9 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -54,7 +54,7 @@ Click on the corresponding links to find more information on TPOT usage in the d

### Classification

Below is a minimal working example with the the optical recognition of handwritten digits dataset.
Below is a minimal working example with the optical recognition of handwritten digits dataset.

```python
from tpot import TPOTClassifier
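# The diff view cuts the README snippet off after this import; what follows is
# a minimal sketch of such a digits-classification example (the dataset loader,
# split sizes, and TPOT settings below are illustrative assumptions, not taken
# from this diff).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, train_size=0.75, test_size=0.25)

tpot = TPOTClassifier(generations=5, population_size=50, verbosity=2)
tpot.fit(X_train, y_train)              # evolve pipelines on the training split
print(tpot.score(X_test, y_test))       # accuracy on the held-out split
tpot.export('tpot_digits_pipeline.py')  # write the best pipeline as Python code
```
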
2 changes: 1 addition & 1 deletion docs_sources/api.md
@@ -227,7 +227,7 @@ Flag indicating whether the TPOT version checker should be disabled.
The update checker will tell you when a new version of TPOT has been released.
</blockquote>

<strong>log_file</strong>: io.TextIOWrapper or io.StringIO, optional (defaul: sys.stdout)
<strong>log_file</strong>: io.TextIOWrapper or io.StringIO, optional (default: sys.stdout)
<br /><br />
<blockquote>
Save progress content to a file.
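
The `log_file` option documented in this hunk can be exercised as in the sketch below, assuming a text-mode file handle; the file name and the commented-out training data are placeholders, not part of this diff.

```python
from tpot import TPOTClassifier

# Any text-mode handle (io.TextIOWrapper) or an io.StringIO buffer should be
# acceptable as the sink, per the parameter types documented above.
with open("tpot_progress.log", "w") as log_file:
    tpot = TPOTClassifier(generations=5, population_size=20,
                          verbosity=2, log_file=log_file)
    # tpot.fit(X_train, y_train)  # progress output goes to tpot_progress.log
```
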
2 changes: 1 addition & 1 deletion docs_sources/releases.md
@@ -75,7 +75,7 @@

- We refined parameters in VarianceThreshold and FeatureAgglomeration.

- TPOT now supports using memory caching within a Pipeline via a optional `memory` parameter.
- TPOT now supports using memory caching within a Pipeline via an optional `memory` parameter.

- We improved documentation of TPOT.

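The pipeline `memory` caching mentioned in the release note above can be switched on roughly as follows; a minimal sketch, assuming the argument accepts `'auto'` (a temporary cache directory) or a path string:

```python
from tpot import TPOTClassifier

# Cache fitted pipeline steps across generations to avoid refitting them;
# 'auto' is assumed to request a temporary caching directory.
tpot = TPOTClassifier(generations=5, population_size=20,
                      verbosity=2, memory='auto')
```
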
4 changes: 2 additions & 2 deletions docs_sources/using.md
@@ -576,7 +576,7 @@ If a specific operator, e.g. `SelectPercentile`, is preferred for usage in the 1

## FeatureSetSelector in TPOT

`FeatureSetSelector` is a special new operator in TPOT. This operator enables feature selection based on *priori* expert knowledge. For example, in RNA-seq gene expression analysis, this operator can be used to select one or more gene (feature) set(s) based on GO (Gene Ontology) terms or annotated gene sets Molecular Signatures Database ([MSigDB](http://software.broadinstitute.org/gsea/msigdb/index.jsp)) in the 1st step of pipeline via `template` option above, in order to reduce dimensions and TPOT computation time. This operator requires a dataset list in csv format. In this csv file, there are only three columns: 1st column is feature set names, 2nd column is the total number of features in one set and 3rd column is a list of feature names (if input X is pandas.DataFrame) or indexes (if input X is numpy.ndarray) delimited by ";". Below is a example how to use this operator in TPOT.
`FeatureSetSelector` is a special new operator in TPOT. This operator enables feature selection based on *priori* expert knowledge. For example, in RNA-seq gene expression analysis, this operator can be used to select one or more gene (feature) set(s) based on GO (Gene Ontology) terms or annotated gene sets Molecular Signatures Database ([MSigDB](http://software.broadinstitute.org/gsea/msigdb/index.jsp)) in the 1st step of pipeline via `template` option above, in order to reduce dimensions and TPOT computation time. This operator requires a dataset list in csv format. In this csv file, there are only three columns: 1st column is feature set names, 2nd column is the total number of features in one set and 3rd column is a list of feature names (if input X is pandas.DataFrame) or indexes (if input X is numpy.ndarray) delimited by ";". Below is an example how to use this operator in TPOT.

Please check our [preprint paper](https://www.biorxiv.org/content/10.1101/502484v1.article-info) for more details.
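
A feature-set file in the three-column layout described above, together with a direct call to the operator, might look like the sketch below; the column headers, set names, and the `subset_list`/`sel_subset` argument names are assumptions for illustration.

```python
import pandas as pd
from tpot.builtins import FeatureSetSelector

# A toy feature-set list: set name, feature count, ";"-delimited feature names.
pd.DataFrame({
    "Subset":   ["GO_0006915", "GO_0008283"],
    "Size":     [3, 2],
    "Features": ["TP53;CASP3;BAX", "MYC;CCND1"],
}).to_csv("subsets.csv", index=False)

# Keep only the features belonging to one named set.
selector = FeatureSetSelector(subset_list="subsets.csv", sel_subset="GO_0006915")
# X must be a pandas.DataFrame whose columns include the listed gene names:
# X_reduced = selector.fit_transform(X, y)
```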

@@ -665,7 +665,7 @@ To use your Dask cluster to fit a TPOT model, specify the ``use_dask`` keyword w
estimator = TPOTEstimator(use_dask=True, n_jobs=-1)
```

This will use use all the workers on your cluster to do the training, and use [Dask-ML's pipeline rewriting](https://dask-ml.readthedocs.io/en/latest/hyper-parameter-search.html#avoid-repeated-work) to avoid re-fitting estimators multiple times on the same set of data.
This will use all the workers on your cluster to do the training, and use [Dask-ML's pipeline rewriting](https://dask-ml.readthedocs.io/en/latest/hyper-parameter-search.html#avoid-repeated-work) to avoid re-fitting estimators multiple times on the same set of data.
It will also provide fine-grained diagnostics in the [distributed scheduler UI](https://distributed.readthedocs.io/en/latest/web.html).

Alternatively, Dask implements a joblib backend.
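
That joblib route could look roughly like the sketch below, assuming a local dask.distributed cluster and small, illustrative TPOT settings (none of this is taken from the documentation shown in this diff):

```python
import joblib
from dask.distributed import Client
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

client = Client()  # start or attach to a local Dask cluster

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target)

tpot = TPOTClassifier(generations=2, population_size=10, verbosity=2)
with joblib.parallel_backend("dask"):  # route joblib-parallel work to the cluster
    tpot.fit(X_train, y_train)
```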
6 changes: 3 additions & 3 deletions tpot/base.py
@@ -1632,7 +1632,7 @@ def _operator_count(self, individual):
----------
individual: list
A grown tree with leaves at possibly different depths
dependending on the condition function.
depending on the condition function.
Returns
-------
@@ -1683,7 +1683,7 @@ def _generate(self, pset, min_, max_, condition, type_=None):
min_: int
Minimum height of the produced trees.
max_: int
Maximum Height of the produced trees.
Maximum height of the produced trees.
condition: function
The condition is a function that takes two arguments,
the height of the tree to build and the current
@@ -1696,7 +1696,7 @@ def _generate(self, pset, min_, max_, condition, type_=None):
-------
individual: list
A grown tree with leaves at possibly different depths
dependending on the condition function.
depending on the condition function.
"""
if type_ is None:
type_ = pset.ret
2 changes: 1 addition & 1 deletion tpot/config/regressor.py
@@ -116,7 +116,7 @@
'power_t': [0.5, 0.0, 1.0, 0.1, 100.0, 10.0, 50.0]
},

# Preprocesssors
# Preprocessors
'sklearn.preprocessing.Binarizer': {
'threshold': np.arange(0.0, 1.01, 0.05)
},
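
The entries in this hunk follow TPOT's `'module.ClassName': {parameter: candidate values}` configuration shape; a custom dictionary of the same shape can be passed through the `config_dict` argument, roughly as sketched below (the particular operators chosen here are illustrative assumptions):

```python
import numpy as np
from tpot import TPOTRegressor

# Same shape as the built-in regressor configuration shown above.
custom_config = {
    'sklearn.preprocessing.Binarizer': {
        'threshold': np.arange(0.0, 1.01, 0.05)
    },
    'sklearn.linear_model.ElasticNetCV': {
        'l1_ratio': np.arange(0.0, 1.01, 0.05),
        'tol': [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
    },
}

tpot = TPOTRegressor(generations=5, population_size=20,
                     config_dict=custom_config, verbosity=2)
```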
