Merge pull request #60 from rodrigo-arenas/0.6.X
[PR] Grammar improvements
rodrigo-arenas committed Jul 2, 2021
2 parents da53f41 + b0e8d8f commit 7c15bf4
Showing 9 changed files with 84 additions and 84 deletions.
18 changes: 9 additions & 9 deletions CONTRIBUTING.md
@@ -7,38 +7,38 @@ You can contribute with documentation, examples/tutorials, reviewing pull reques
helping answer questions in issues, creating visualizations, maintaining project
infrastructure, and creating new tests.

Code contributions are always welcome, from simple bug fixes, to new features.
Also consider contributing to the documentation,
Code contributions are always welcome, from simple bug fixes to new features.
Also, consider contributing to the documentation,
and reviewing open issues, it is the easiest way to get started.

When working on your local computer, make sure to install the development dependencies with:
```bash
pip install -r dev-requirements.txt
```

If you have questions, you can open an issue (tag it as question).
If you have questions, you can open an issue (tag it as a question).

We encourage you to follow these guidelines:

* Fork this project, make the changes you expect to merge and make a pull request
* If the work you are making is related to some issue, please mention in the comments
that you are working on it, so other people know and no duplicate you work.
* If you are working in a new feature, or have an idea, consider first opening an issue
so people know in what you are working on and possible give some guidelines
that you are working on it, so other people know and no duplicate your work.
* If you are working on a new feature, or have an idea, consider first opening an issue
so people know what you are working on and possibly give some guidelines
* Commit all changes by pull request (PR)
* A PR solves one problem (do not mix problems together in one PR) with the
* A PR solves one problem (do not mix problems in one PR) with the
minimal set of changes
* The changes should come with their respective tests and documentation
* Describe why you are proposing those changes
* Please run black on top of the package to keep the formatting style
```bash
black .
```
* Make sure all the test are passing, by running in the root of the project
* Make sure all the tests are passing, by running in the root of the project
```bash
pytest sklearn_genetic
```
* We can not merge if the tests fails.
* We can not merge if the tests fail.

# External References

10 changes: 5 additions & 5 deletions README.rst
@@ -37,8 +37,8 @@ Documentation is available `here <https://sklearn-genetic-opt.readthedocs.io/>`_
Main Features:
##############

* **GASearchCV**: Principal class of the package, holds the evolutionary cross validation optimization routine.
* **Algorithms**: Set of different evolutionary algorithms to use as optimization procedure.
* **GASearchCV**: Principal class of the package, holds the evolutionary cross-validation optimization routine.
* **Algorithms**: Set of different evolutionary algorithms to use as an optimization procedure.
* **Callbacks**: Custom evaluation strategies to generate early stopping rules,
logging (into TensorBoard, .pkl files, etc) or your custom logic.
* **Plots**: Generate pre-defined plots to understand the optimization process.
@@ -51,7 +51,7 @@ Visualize the progress of your training:

.. image:: docs/images/progress_bar.gif

Real time metrics visualization and comparison across runs:
Real-time metrics visualization and comparison across runs:

.. image:: https://github.com/rodrigo-arenas/Sklearn-genetic-opt/blob/master/docs/images/tensorboard_log.png?raw=true

@@ -165,10 +165,10 @@ Contributing
############

Contributions are more than welcome!
There are lots of opportunities on the on going project, so please get in touch if you would like to help out.
There are lots of opportunities on the ongoing project, so please get in touch if you would like to help out.
Also check the `Contribution guide <https://github.com/rodrigo-arenas/Sklearn-genetic-opt/blob/master/CONTRIBUTING.md>`_

Big thanks to the people who are helping this project!
Big thanks to the people who are helping with this project!

|Contributors|_

4 changes: 2 additions & 2 deletions docs/index.rst
@@ -10,8 +10,8 @@ scikit-learn models hyperparameters tuning, using evolutionary algorithms.

This is meant to be an alternative from popular methods inside scikit-learn such as Grid Search and Randomized Grid Search.

Sklearn-genetic-opt uses evolutionary algorithms from the deap package to choose set of hyperparameters
that optimizes (max or min) the cross validation scores, it can be used for both regression and classification problems.
Sklearn-genetic-opt uses evolutionary algorithms from the deap package to choose a set of hyperparameters
that optimizes (max or min) the cross-validation scores, it can be used for both regression and classification problems.
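A minimal sketch of what that looks like in practice, assuming the GASearchCV class and the Integer/Continuous space helpers documented in the tutorials (argument names are kept scikit-learn style and may vary slightly between releases):

.. code:: python3

   from sklearn.datasets import load_digits
   from sklearn.ensemble import RandomForestClassifier
   from sklearn_genetic import GASearchCV
   from sklearn_genetic.space import Integer, Continuous

   X, y = load_digits(return_X_y=True)

   # Only the boundaries of each hyperparameter are given, not a fixed grid of values
   param_grid = {
       "n_estimators": Integer(50, 200),
       "min_weight_fraction_leaf": Continuous(0.0, 0.5),
   }

   evolved_estimator = GASearchCV(
       estimator=RandomForestClassifier(),
       param_grid=param_grid,
       cv=3,
       scoring="accuracy",
       population_size=10,
       generations=20,
   )
   evolved_estimator.fit(X, y)
   print(evolved_estimator.predict(X[:5]))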

Installation:
#############
10 changes: 5 additions & 5 deletions docs/release_notes.rst
@@ -29,15 +29,15 @@ Features:

- **on_start**: When the evolutionary algorithm is called from the GASearchCV.fit method.

- **on_step:** When the evolutionary algorithm finish a generation (no change here).
- **on_step:** When the evolutionary algorithm finishes a generation (no change here).

- **on_end:** At the end of the last generation.

^^^^^^^^^^
Bug Fixes:
^^^^^^^^^^

* A missing statement was making that the callbacks starts to get evaluated from generation 1, ignoring generation 0.
* A missing statement was making that the callbacks start to get evaluated from generation 1, ignoring generation 0.
Now this is properly handled and callbacks work from generation 0.

^^^^^^^^^^^^
@@ -115,7 +115,7 @@ Docs:
* Added user guide on "Understanding the evaluation process"
* Several guides on contributing, code of conduct
* Added important links
* Docs requirement are now independent of package requirements
* Docs requirements are now independent of package requirements

^^^^^^^^^
Internal:
@@ -187,15 +187,15 @@ Features:
* Enabled deap's eaMuPlusLambda algorithm for the optimization process, now is the default routine
* Added a logbook and history properties to the fitted GASearchCV to make post-fit analysis
* ``Elitism=False`` now implements a roulette selection instead of ignoring the parameter
* Added the parameter keep_top_k to control the amount of solutions if the hall of fame (hof)
* Added the parameter keep_top_k to control the number of solutions if the hall of fame (hof)

^^^^^^^^^^^^
API Changes:
^^^^^^^^^^^^

* Refactored the optimization algorithm to use DEAP package instead
of a custom implementation, this causes the removal of several methods, properties and variables inside the GASearchCV class
* The parameter encoding_length has been removed, it's not longer required to the GASearchCV class
* The parameter encoding_length has been removed, it's no longer required to the GASearchCV class
* Renamed the property of the fitted estimator from `best_params_` to `best_params`
* The verbosity now prints the deap log of the fitness function,
it's standard deviation, max and min values from each generation
28 changes: 14 additions & 14 deletions docs/tutorials/basic_usage.rst
@@ -21,12 +21,12 @@ The optimization is made by evolutionary algorithms with the help of the
It works by defining the set of hyperparameters to tune, it starts with a randomly sampled set of options (population).
Then by using evolutionary operators as the mating, mutation, selection and evaluation,
it generates new candidates looking to improve the cross-validation score in each generation.
It'll continue with this process until a number of generations is reached or until a callback criteria is met.
It'll continue with this process until a number of generations is reached or until a callback criterion is met.

Example
-------

First lets import some dataset and others scikit-learn standard modules, we'll use
First let's import some dataset and other scikit-learn standard modules, we'll use
the `digits dataset <https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html>`__.
This is a classification problem, we'll fine-tune a Random Forest Classifier for this task.

@@ -40,7 +40,7 @@ This is a classification problem, we'll fine-tune a Random Forest Classifier for
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
Lets first read the data, split it in our training and test set and visualize some of the data points:
Let's first read the data, split it in our training and test set and visualize some of the data points:

.. code:: python3
@@ -60,7 +60,7 @@ We should see something like this:

.. image:: ../images/basic_usage_digits_0.png

Now, we must define our param_grid, similar to scikit-learn, is a dictionary with the models hyperparameters.
Now, we must define our param_grid, similar to scikit-learn, which is a dictionary with the model's hyperparameters.
The main difference with for example sckit-learn's GridSearchCv,
is that we don't pre-define the values to use in the search,
but rather, the boundaries of each parameter.
@@ -81,7 +81,7 @@ Notice that in the case of *'boostrap'*, as it is a categorical variable, we do
As well, in the 'min_weight_fraction_leaf', we used an additional parameter named distribution,
this is useful to tell the optimizer from which data distribution it can sample some random values during the optimization.
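As a rough sketch of such a param_grid (illustrative values, not the collapsed snippet from this diff), assuming the Categorical, Integer and Continuous classes from sklearn_genetic.space:

.. code:: python3

   from sklearn_genetic.space import Categorical, Integer, Continuous

   param_grid = {
       # Categorical variables list their possible choices, no boundaries needed
       "bootstrap": Categorical([True, False]),
       # Integer and Continuous variables only define their boundaries
       "max_depth": Integer(2, 30),
       "n_estimators": Integer(100, 500),
       # The optional distribution tells the optimizer how to sample random values
       "min_weight_fraction_leaf": Continuous(0.0, 0.5, distribution="uniform"),
   }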

Now, we are ready to set the GASearchCV, its the object that will allow us to run the fitting process using evolutionary algorihtms
Now, we are ready to set the GASearchCV, its the object that will allow us to run the fitting process using evolutionary algorithms
It has several options that we can use, for this first example, we'll keep it very simple:

.. code:: python3
@@ -100,9 +100,9 @@ It has several options that we can use, for this first example, we'll keep it ve
n_jobs=-1,
verbose=True)
So now the setup in ready, note that are others parameters that can be specified in GASearchCV,
So now the setup is ready, note that are other parameters that can be specified in GASearchCV,
the ones we used, are equivalents to the meaning in scikit-learn, besides the one already explained,
is worth to mention that the "metric" is going to be used as the optimization variable,
is worth mentioning that the "metric" is going to be used as the optimization variable,
so the algorithm will try to find the set of parameters that maximizes this metric.
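A sketch of how that setup might look with the options just described; the argument names follow the public scikit-learn-style API as far as we can tell, with scoring playing the role of the "metric" mentioned above:

.. code:: python3

   from sklearn.ensemble import RandomForestClassifier
   from sklearn.model_selection import StratifiedKFold
   from sklearn_genetic import GASearchCV

   clf = RandomForestClassifier()
   cv = StratifiedKFold(n_splits=3, shuffle=True)

   evolved_estimator = GASearchCV(
       estimator=clf,
       param_grid=param_grid,   # the search space defined above
       cv=cv,
       scoring="accuracy",      # the optimization "metric"
       population_size=10,
       generations=35,
       n_jobs=-1,
       verbose=True,
   )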

We are ready to run the optimization routine:
@@ -127,8 +127,8 @@ This log, shows us the metrics obtained in each iteration (generation), this is
* **fitness_max:** The maximum individual score of all the models in this generation.
* **fitness_min:** The minimum individual score of all the models in this generation.

After fitting the model, we have some extra methos to use the model right away.
It will use by default the best set of hyperparameters it found, based in the cross-validation score:
After fitting the model, we have some extra methods to use the model right away.
It will use by default the best set of hyperparameters it found, based on the cross-validation score:

.. code:: python3
@@ -142,8 +142,8 @@ In this case, we got an accuracy score in the test set of 0.93

.. image:: ../images/basic_usage_accuracy_2.jpeg

Now lets use a couple more functions available in the package.
The first one, will help us to see the evolution of our metric over the generations
Now, let's use a couple more functions available in the package.
The first one will help us to see the evolution of our metric over the generations

.. code:: python3
@@ -165,10 +165,10 @@ sklearn-genetic-opt comes with a plot function to analyze this log:
.. image:: ../images/basic_usage_plot_space_4.png

What this plot shows us, is the distribution of the sampled values for each hyperparameter.
We can see for example in the *'min_weight_fraction_leaf'* that the algorithm mostly sampled values bellow 0.15.
What this plot shows us, is the distribution of the sampled values for each hyperparameter.
We can see for example in the *'min_weight_fraction_leaf'* that the algorithm mostly sampled values below 0.15.
You can also check every single combination of variables and the contour plot that represents the sampled values.
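A sketch of how these two helpers might be called on the fitted estimator, assuming both live in sklearn_genetic.plots:

.. code:: python3

   import matplotlib.pyplot as plt
   from sklearn_genetic.plots import plot_fitness_evolution, plot_search_space

   # Fitness (metric) value obtained in each generation
   plot_fitness_evolution(evolved_estimator)
   plt.show()

   # Distribution of the sampled values for each hyperparameter
   plot_search_space(evolved_estimator)
   plt.show()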

This concludes our introduction to the basic sklearn-genetic-opt usage.
Further tutorials will cover the GASearchCV parameters, callbacks,
different optimization algorithms and more advanced usage.
different optimization algorithms and more advanced use cases.
28 changes: 14 additions & 14 deletions docs/tutorials/callbacks.rst
@@ -6,11 +6,11 @@ Introduction

Callbacks can be defined to take actions or decisions over the optimization
process while it is still running.
Common callbacks includes different rules to stop the algorithm or log artifacts.
Common callbacks include different rules to stop the algorithm or log artifacts.
The callbacks are passed to the ``.fit`` method
of the :class:`~sklearn_genetic.GASearchCV` class.

The callbacks are evaluated at start of the training using the `on_start` method,
The callbacks are evaluated at the start of the training using the `on_start` method,
at the end of each generation fit using `on_step` method and at the
end of the training using `on_end`, so it looks like this:

@@ -22,7 +22,7 @@ until that training point.

.. image:: ../images/callbacks_log_0.png

Now lets see how to use them, we'll take
Now let's see how to use them, we'll take
the data set and model used in :ref:`basic-usage`. The available callbacks are:

* ProgressBar
@@ -69,14 +69,14 @@ ConsecutiveStopping
-------------------

This callback stops the optimization if the current metric value
is no greater that at least one metric from the last N generations.
is no greater than at least one metric from the last N generations.

It requires us to define the number of generations to compare
against the current generation and the name of the metric we want
to track.

For example, if we want to stop the optimization after 5 iterations
where the current iteration (sixth) fitness value is worst that all
where the current iteration (sixth) fitness value is worst than all
the previous ones (5), we define it like this:

.. code:: python3
@@ -81,7 +81,7 @@ Now we just have to pass it to the estimator during the fitting
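A sketch of that configuration and the fit call (the original snippet is collapsed in the hunk above; the constructor arguments are an assumption based on the description):

.. code:: python3

   from sklearn_genetic.callbacks import ConsecutiveStopping

   # Stop if the current 'fitness' is not better than at least one of the last 5 generations
   callback = ConsecutiveStopping(generations=5, metric='fitness')
   evolved_estimator.fit(X_train, y_train, callbacks=callback)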
DeltaThreshold
--------------
This callback stops the optimization if the absolute difference
between the current and last metric is less or equals than a threshold.
between the current and last metric is less or equals to a threshold.

It just requires the threshold and the metric name, for example
using the 'fitness_min' value:
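A sketch under the same assumptions, with DeltaThreshold taking the threshold and the metric name:

.. code:: python3

   from sklearn_genetic.callbacks import DeltaThreshold

   # Stop if the 'fitness_min' value changes by less than 0.001 between generations
   callback = DeltaThreshold(threshold=0.001, metric='fitness_min')
   evolved_estimator.fit(X_train, y_train, callbacks=callback)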
@@ -112,11 +112,11 @@ This callback stops the optimization if the difference in seconds between the st
first set of hyperparameters fit, and the current generation time is greater than a time threshold.

Remember that this is checked after each generation fit, so if the first (or any) generation fit takes
longer that the threshold, it won't stop the fitting process until is done with the current generation
longer than the threshold, it won't stop the fitting process until is done with the current generation
population.

It requires the total_seconds parameters, for example stopping if the time is greater
that one minute:
than one minute:

.. code:: python3
@@ -128,7 +128,7 @@ that one minute:
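A sketch of this timer-based rule; the class name and total_seconds argument are assumptions taken from the description above:

.. code:: python3

   from sklearn_genetic.callbacks import TimerStopping

   # Checked after each generation: stop once more than 60 seconds have passed
   callback = TimerStopping(total_seconds=60)
   evolved_estimator.fit(X_train, y_train, callbacks=callback)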
ThresholdStopping
-----------------
It stops the optimization if the current metric
is greater or equals than the define threshold.
is greater or equals to the defined threshold.

For example, if we want to stop the optimization
if the 'fitness_max' is above 0.98:
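A sketch of that threshold rule, assuming ThresholdStopping accepts the threshold and metric name:

.. code:: python3

   from sklearn_genetic.callbacks import ThresholdStopping

   # Stop as soon as the 'fitness_max' reaches 0.98 or more
   callback = ThresholdStopping(threshold=0.98, metric='fitness_max')
   evolved_estimator.fit(X_train, y_train, callbacks=callback)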
@@ -151,9 +151,9 @@ within this package due it's usually a sensitive and heavy dependency::

pip install tensorflow

It only requires to define the folder where you want to log your run, and optionally, a run_id, so
your consecutive runs doesn't mix up.
If the run_id is not provided, it will create a subfolder with the current datetime of your run.
It only requires defining the folder where you want to log your run, and optionally, a run_id, so
your consecutive runs don't mix up.
If the run_id is not provided, it will create a subfolder with the current date-time of your run.

.. code:: python3
@@ -162,8 +162,8 @@ If the run_id is not provided, it will create a subfolder with the current datet
evolved_estimator.fit(X, y, callbacks=callback)
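A sketch of how that callback might be constructed before the fit call shown above; the folder and run_id values are illustrative only:

.. code:: python3

   from sklearn_genetic.callbacks import TensorBoard

   # Logs go to ./logs/run_1; omit run_id to get a date-time subfolder instead
   callback = TensorBoard(log_dir="./logs", run_id="run_1")
   evolved_estimator.fit(X, y, callbacks=callback)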
While the model is being trained you can see in real time the metrics in Tensorboard.
If you have run more that 1 GASearchCV model and use the TensordBoard callback using
While the model is being trained you can see in real-time the metrics in Tensorboard.
If you have run more than one GASearchCV model and use the TensordBoard callback using
the same log_dir but different run_id, you can compare the metrics of each run, it looks
like this for the fitness in three different runs:

8 changes: 4 additions & 4 deletions docs/tutorials/custom_callback.rst
@@ -19,7 +19,7 @@ or ``False``. It expects the parameter `estimator`.
``True`` means that the optimization must stop, ``False``, means it can continue.
It expects the parameters `record`, `logbook` and `estimator`.

**on_end:** This method is called at the end of the las generation or after an stopping
**on_end:** This method is called at the end of the las generation or after a stopping
callback meets its criteria. It expects the parameters `logbook` and `estimator`,
it should return ``None`` or ``False``.

@@ -29,7 +29,7 @@ Example
-------

In this example, we are going to define a dummy callback that
stops the process if there have been more that `N` fitness values
stops the process if there have been more than `N` fitness values
bellow a threshold value.

The callback must have three parameters: `record`, `logbook` and `estimator`.
@@ -61,7 +61,7 @@ So to check inside the logbook, we could define a function like this:
return False
As sklearn-genetic-opt expects all this logic in a single object, we must define a class
that will have all this parameters, so we can rewrite it like this:
that will have all these parameters, so we can rewrite it like this:


.. code-block:: python
@@ -121,7 +121,7 @@ Now, let's expend it to add the others method, just to print a message:
print("I'm done with training!")
So that is it, now you can initialize the DummyThreshold
and pass it to a in the ``fit`` method of a
and pass it to in the ``fit`` method of a
:class:`~sklearn_genetic.GASearchCV` instance:

.. code-block:: python