DOC: Fix many spelling errors
bashtage committed Jul 21, 2019
1 parent ec4aa25 commit 6afbf66
Showing 342 changed files with 1,082 additions and 983 deletions.
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -13,8 +13,8 @@ A clear and concise description of what the problem is. Ex. I'm always frustrate
#### Describe the solution you'd like
A clear and concise description of what you want to happen.

#### Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
#### Describe alternatives you have considered
A clear and concise description of any alternative solutions or features you have considered.

#### Additional context
Add any other context about the feature request here.
4 changes: 2 additions & 2 deletions CONTRIBUTING.rst
@@ -35,7 +35,7 @@ For a pull request to be accepted, you must meet the below requirements. This gr
Linting
~~~~~~~

Due to the way we have the CI builds set up, the linter won't do anything unless the environmental variable $LINT is set to a truthy value.
Due to the way we have the CI builds set up, the linter will not do anything unless the environmental variable $LINT is set to a truthy value.

- On MacOS/Linux

@@ -46,7 +46,7 @@ Due to the way we have the CI builds set up, the linter won't do anything unless
How to Submit a Pull Request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So you want to submit a patch to `statsmodels` but aren't too familiar with github? Here are the steps you need to take.
So you want to submit a patch to `statsmodels` but are not too familiar with github? Here are the steps you need to take.

1. `Fork <https://help.github.com/articles/fork-a-repo>`_ the `statsmodels repository <https://github.com/statsmodels/statsmodels>`_ on Github.
2. `Create a new feature branch <https://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging>`_. Each branch must be self-contained, with a single new feature or bugfix.
2 changes: 1 addition & 1 deletion COPYRIGHTS.txt
@@ -5,7 +5,7 @@ statsmodels contains code or derivative code from several other
packages. Some modules also note the author of individual contributions, or
author of code that formed the basis for the derived or translated code.
The copyright statements for the datasets are attached to the individual
datasets, most datasets are in public domain, and we don't claim any copyright
datasets, most datasets are in public domain, and we do not claim any copyright
on any of them.

In the following, we collect copyright statements of code from other packages,
2 changes: 1 addition & 1 deletion docs/make.bat
@@ -28,7 +28,7 @@ if errorlevel 9009 (
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.If you do not have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
4 changes: 2 additions & 2 deletions docs/source/_static/mktree.js
@@ -3,7 +3,7 @@
*
* Dual licensed under the MIT and GPL licenses.
* This basically means you can use this code however you want for
* free, but don't claim to have written it yourself!
* free, but do not claim to have written it yourself!
* Donations always accepted: https://www.JavascriptToolbox.com/donate/
*
* Please do not link to the .js files on javascripttoolbox.com from
@@ -103,7 +103,7 @@ function convertTrees() {
setDefault("nodeLinkClass","bullet");
setDefault("preProcessTrees",true);
if (preProcessTrees) {
if (!document.createElement) { return; } // Without createElement, we can't do anything
if (!document.createElement) { return; } // Without createElement, we cannot do anything
var uls = document.getElementsByTagName("ul");
if (uls==null) { return; }
var uls_length = uls.length;
4 changes: 2 additions & 2 deletions docs/source/_templates/autosummary/class.rst
@@ -2,7 +2,7 @@

{% block methods %}
{% if methods %}
.. HACK -- the point here is that we don't want this to appear in the output, but the autosummary should still generate the pages.
.. HACK -- the point here is that we do not want this to appear in the output, but the autosummary should still generate the pages.
.. autosummary::
:toctree:
{% for item in all_methods %}
@@ -15,7 +15,7 @@

{% block attributes %}
{% if attributes %}
.. HACK -- the point here is that we don't want this to appear in the output, but the autosummary should still generate the pages.
.. HACK -- the point here is that we do not want this to appear in the output, but the autosummary should still generate the pages.
.. autosummary::
:toctree:
{% for item in all_attributes %}
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -148,7 +148,7 @@
# html_theme = 'default'

if 'htmlhelp' in sys.argv:
# html_theme = 'statsmodels_htmlhelp' #doesn't look nice yet
# html_theme = 'statsmodels_htmlhelp' #does not look nice yet
html_theme = 'default'
print('################# using statsmodels_htmlhelp ############')
else:
6 changes: 3 additions & 3 deletions docs/source/contrasts.rst
@@ -7,7 +7,7 @@ Patsy: Contrast Coding Systems for categorical variables

A categorical variable of K categories, or levels, usually enters a regression as a sequence of K-1 dummy variables. This amounts to a linear hypothesis on the level means. That is, each test statistic for these variables amounts to testing whether the mean for that level is statistically significantly different from the mean of the base category. This dummy coding is called Treatment coding in R parlance, and we will follow this convention. There are, however, different coding methods that amount to different sets of linear hypotheses.

In fact, the dummy coding is not technically a contrast coding. This is because the dummy variables add to one and are not functionally independent of the model's intercept. On the other hand, a set of *contrasts* for a categorical variable with `k` levels is a set of `k-1` functionally independent linear combinations of the factor level means that are also independent of the sum of the dummy variables. The dummy coding isn't wrong *per se*. It captures all of the coefficients, but it complicates matters when the model assumes independence of the coefficients such as in ANOVA. Linear regression models do not assume independence of the coefficients and thus dummy coding is often the only coding that is taught in this context.
In fact, the dummy coding is not technically a contrast coding. This is because the dummy variables add to one and are not functionally independent of the model's intercept. On the other hand, a set of *contrasts* for a categorical variable with `k` levels is a set of `k-1` functionally independent linear combinations of the factor level means that are also independent of the sum of the dummy variables. The dummy coding is not wrong *per se*. It captures all of the coefficients, but it complicates matters when the model assumes independence of the coefficients such as in ANOVA. Linear regression models do not assume independence of the coefficients and thus dummy coding is often the only coding that is taught in this context.

To have a look at the contrast matrices in Patsy, we will use data from UCLA ATS. First let's load the data.

@@ -72,7 +72,7 @@ Here we used `reference=0`, which implies that the first level, Hispanic, is the
contrast.matrix[hsb2.race-1, :][:20]
This is a bit of a trick, as the `race` category conveniently maps to zero-based indices. If it does not, this conversion happens under the hood, so this won't work in general but nonetheless is a useful exercise to fix ideas. The below illustrates the output using the three contrasts above
This is a bit of a trick, as the `race` category conveniently maps to zero-based indices. If it does not, this conversion happens under the hood, so this will not work in general but nonetheless is a useful exercise to fix ideas. The below illustrates the output using the three contrasts above

.. ipython:: python
@@ -113,7 +113,7 @@ Sum coding compares the mean of the dependent variable for a given level to the
res = mod.fit()
print(res.summary())
This correspons to a parameterization that forces all the coefficients to sum to zero. Notice that the intercept here is the grand mean where the grand mean is the mean of means of the dependent variable by each level.
This corresponds to a parameterization that forces all the coefficients to sum to zero. Notice that the intercept here is the grand mean where the grand mean is the mean of means of the dependent variable by each level.

.. ipython:: python
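The sum-coding hunk above states that the fitted intercept is the grand mean, i.e. the mean of the per-level means of the dependent variable. A minimal numpy-only sketch of that claim — hypothetical data, not the UCLA hsb2 dataset used in the docs:

```python
import numpy as np

# Hypothetical per-level data: 3 levels with deliberately unequal sizes.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=m, size=n) for m, n in [(1.0, 5), (2.0, 7), (6.0, 3)]]
y = np.concatenate(groups)

# Sum (deviation) coding for k=3 levels: k-1 columns, last level coded -1.
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
labels = np.repeat([0, 1, 2], [5, 7, 3])
X = np.column_stack([np.ones(len(y)), S[labels]])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
grand_mean = np.mean([g.mean() for g in groups])  # mean of the level means
print(np.allclose(beta[0], grand_mean))  # True
```

The identity is exact even with unbalanced groups: the saturated model fits each level mean, and summing level0 = b0+b1, level1 = b0+b2, level2 = b0-b1-b2 gives 3*b0.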
2 changes: 1 addition & 1 deletion docs/source/datasets/dataset_proposal.rst
@@ -128,7 +128,7 @@ Remaining problems:
we want to avoid loading all the data in memory? Can we use memory
mapped arrays ?
- Missing data: I thought about subclassing both record arrays and
masked arrays classes, but I don't know if this is feasable, or even
masked arrays classes, but I do not know if this is feasible, or even
makes sense. I have the feeling that some Data mining software use
Nan (for example, weka seems to use float internally), but this
prevents them from representing integer data.
8 changes: 4 additions & 4 deletions docs/source/dev/git_notes.rst
@@ -157,7 +157,7 @@ change history by::
git log --oneline --graph

It pays to take care of things locally before you push them to github. So when
in doubt, don't push. Also see the advice on keeping your history clean in
in doubt, do not push. Also see the advice on keeping your history clean in
:ref:`merge-vs-rebase`.

.. _pull-requests:
@@ -193,7 +193,7 @@ One last thing to note. If there has been a lot of work in upstream/master
since you started your patch, you might want to rebase. However, you can
probably get away with not rebasing if these changes are unrelated to the work
you have done in the `shiny-new-feature` branch. If you can avoid it, then
don't rebase. If you have to, try to do it once and when you are at the end of
do not rebase. If you have to, try to do it once and when you are at the end of
your changes. Read on for some notes on :ref:`merge-vs-rebase`.

Advanced Topics
@@ -221,7 +221,7 @@ the warnings
Namely, **always make a new branch before doing a rebase**. This is good
general advice for working with git. I would also add **never use rebase on
work that has already been published**. If another developer is using your
work, don't rebase!!
work, do not rebase!!

As for merging, **never merge from trunk into your feature branch**. You will,
however, want to check that your work will merge cleanly into trunk. This will
@@ -253,7 +253,7 @@ however. To delete the branch on github, do::
.. Squashing with Rebase
.. ^^^^^^^^^^^^^^^^^^^^^
.. You've made a bunch of incremental commits, but you think they might be better off together as one
.. You have made a bunch of incremental commits, but you think they might be better off together as one
.. commit. You can do this with an interactive rebase. As usual, **only do this when you have local
.. commits. Do not edit the history of changes that have been pushed.**
2 changes: 1 addition & 1 deletion docs/source/dev/index.rst
@@ -60,7 +60,7 @@ greatly helps the job of maintaining and releasing the software a shared effort.
How to Submit a Pull Request
----------------------------

So you want to submit a patch to `statsmodels` but aren't too familiar with
So you want to submit a patch to `statsmodels` but are not too familiar with
github? Here are the steps you need to take.

1. `Fork <https://help.github.com/articles/fork-a-repo/>`_ the
2 changes: 1 addition & 1 deletion docs/source/dev/maintainer_notes.rst
@@ -34,7 +34,7 @@ If there are only a few commits, you can rebase to keep a linear history::
git rebase upstream-rw/master

Rebasing will not automatically close the pull request however, if there is one,
so don't forget to do this.
so do not forget to do this.

.. _merging:

4 changes: 2 additions & 2 deletions docs/source/dev/naming_conventions.rst
@@ -41,7 +41,7 @@ Our directory tree stripped down looks something like::
The submodules are arranged by topic, `discrete` for discrete choice models, or `tsa` for time series
analysis. The submodules that can be import heavy contain an empty __init__.py, except for some testing
code for running tests for the submodules. The namespace to be imported is in `api.py`. That way, we
can import selectively and do not have to import a lot of code that we don't need. Helper functions are
can import selectively and do not have to import a lot of code that we do not need. Helper functions are
usually put in files named `tools.py` and statistical functions, such as statistical tests are placed
in `stattools.py`. Everything has directories for :ref:`tests <testing>`.

@@ -83,7 +83,7 @@ time-series ARMA model we have::
Options
~~~~~~~
We are using similar options in many classes, methods and functions. They
should follow a standardized pattern if they recurr frequently. ::
should follow a standardized pattern if they recur frequently. ::

`missing` ['none', 'drop', 'raise'] define whether inputs are checked for
nans, and how they are treated
4 changes: 2 additions & 2 deletions docs/source/diagnostic.rst
@@ -113,7 +113,7 @@ Unknown Change Point
:py:func:`recursive_olsresiduals <statsmodels.stats.diagnostic.recursive_olsresiduals>`
Calculate recursive ols with residuals and cusum test statistic. This is
currently mainly helper function for recursive residual based tests.
However, since it uses recursive updating and doesn't estimate separate
However, since it uses recursive updating and does not estimate separate
problems it should be also quite efficient as expanding OLS function.

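The `recursive_olsresiduals` description above concerns one-step-ahead prediction residuals from recursive updating. A naive sketch of the same quantity, refitting by plain least squares on an expanding window at each step rather than using the efficient rank-one recursive update — illustration only, not the statsmodels algorithm:

```python
import numpy as np

def naive_recursive_residuals(y, X, skip):
    """One-step-ahead OLS prediction errors on an expanding window.

    At each t >= skip, fit OLS on observations [0, t) and record the
    prediction error for observation t. A recursive implementation
    would update the parameters in O(k^2) per step instead of refitting.
    """
    resids = []
    for t in range(skip, len(y)):
        beta, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        resids.append(y[t] - X[t] @ beta)
    return np.array(resids)

# Hypothetical stable-parameter data: residuals should stay small.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=n)
r = naive_recursive_residuals(y, X, skip=5)
print(r.shape)  # (45,)
```

Under a structural break, these residuals drift away from zero, which is what the CUSUM statistic mentioned above accumulates.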
missing
@@ -122,7 +122,7 @@ missing
- test on recursive parameter estimates, which are there?


Mutlicollinearity Tests
Multicollinearity Tests
--------------------------------

conditionnum (statsmodels.stattools)
4 changes: 2 additions & 2 deletions docs/source/faq.rst
@@ -39,7 +39,7 @@ takes this keyword. You can find more information in the docstring of

.. _build-faq:

Why won't statsmodels build?
Why will not statsmodels build?
----------------------------

Remember that to build, you must have:
@@ -75,7 +75,7 @@ get involved. We accept Pull Requests on our GitHub page for bugfixes and
topics germane to statistics and statistical modeling. In addition, usability
and quality of life enhancements are greatly appreciated as well.

What if my question isn't answered here?
What if my question is not answered here?
----------------------------------------

You may find answers for questions that have not yet been added here on GitHub
8 changes: 8 additions & 0 deletions docs/source/names_wordlist.txt
@@ -89,3 +89,11 @@ Longley
Koenker
gliptak
Spector
Wes
statawriter
Nonparameteric
prerotated
uniq
exceedance
separatevar

2 changes: 1 addition & 1 deletion docs/source/nonparametric.rst
@@ -11,7 +11,7 @@ includes kernel density estimation for univariate and multivariate data,
kernel regression and locally weighted scatterplot smoothing (lowess).

sandbox.nonparametric contains additional functions that are work in progress
or don't have unit tests yet. We are planning to include here nonparametric
or do not have unit tests yet. We are planning to include here nonparametric
density estimators, especially based on kernel or orthogonal polynomials,
smoothers, and tools for nonparametric models and methods in other parts of
statsmodels.
2 changes: 1 addition & 1 deletion docs/source/plots/graphics_gofplots_qqplot_qqline.py
@@ -1,5 +1,5 @@
'''
Import the food expenditure dataset. Plot annual food expendeture on
Import the food expenditure dataset. Plot annual food expenditure on
x-axis and household income on y-axis. Use qqline to add regression line
into the plot.
'''
2 changes: 1 addition & 1 deletion docs/source/plots/graphics_mosaicplot_mosaic.py
@@ -25,7 +25,7 @@
mosaic(data, title='hierarchical index series')
plt.show()

# The third accepted data structureis the np array, for which a very simple
# The third accepted data structure is the np array, for which a very simple
# index will be created.
rand = np.random.random
data = 1+rand((2, 2))
2 changes: 1 addition & 1 deletion docs/source/plots/graphics_plot_fit_ex.py
@@ -8,7 +8,7 @@
"""

# Load the Statewide Crime data set and perform linear regression with
# 'poverty' and 'hs_grad' as variables and 'muder' as the response
# 'poverty' and 'hs_grad' as variables and 'murder' as the response


import statsmodels.api as sm
2 changes: 1 addition & 1 deletion docs/source/plots/graphics_regression_regress_exog.py
@@ -3,7 +3,7 @@
Load the Statewide Crime data set and build a model with regressors
including the rate of high school graduation (hs_grad), population in urban
areas (urban), households below poverty line (poverty), and single person
households (single). Outcome variable is the muder rate (murder).
households (single). Outcome variable is the murder rate (murder).
Build a 2 by 2 figure based on poverty showing fitted versus actual murder
rate, residuals versus the poverty rate, partial regression plot of poverty,
2 changes: 1 addition & 1 deletion docs/source/release/old_changes.rst
@@ -59,7 +59,7 @@ This is a bug-fix release, that affects mainly Big-Endian machines.
*Bug Fixes*

* discrete_model.MNLogit fix summary method
* tsa.filters.hp_filter don't use umfpack on Big-Endian machine (scipy bug)
* tsa.filters.hp_filter do not use umfpack on Big-Endian machine (scipy bug)
* the remaining fixes are in the test suite, either precision problems
on some machines or incorrect testing on Big-Endian machines.

