lots of changes, see changelog for 0.8.0

MatthewReid854 · Dec 15, 2021 · 9fafba7
1 parent f9fce70
commit 9fafba7

Showing 13 changed files with 2,754 additions and 1,752 deletions.
diff --git a/docs/Changelog.rst b/docs/Changelog.rst
@@ -10,16 +10,20 @@ Changelog
 
 **Summary of changes**
 
-This will be written closer to the release date.
+The major changes in this release are that confidence bounds can now be extracted programmatically from the distribution object, and that the reliability_growth function has been completely rewritten.
+There are also several minor changes, mainly to the documentation, and a minor bugfix.
 
 **New features**
 
--    New dataset called system_growth has been added to the Datasets module.
+-    Extracting confidence bounds on CDF, SF, and CHF for bounds on time or bounds on reliability can now be done directly from the distribution object that is created by the fitter, as shown `here <https://reliability.readthedocs.io/en/latest/Working%20with%20fitted%20distributions.html>`_. This required a large number of functions to be rewritten and resulted in several API changes (see below).
 -    Repairable_systems.reliability growth has been completely rewritten. This function now includes both the Duane and Crow-AMSAA reliability growth models. The parametrisation of the Duane model has been modified to match what reliasoft uses.
+-    New dataset called system_growth has been added to the Datasets module.
 
 **API Changes**
 
 -    Repairable_systems.reliability growth has been completely rewritten, so the inputs and outputs are completely different. The old version of the Duane model has been replaced without deprecation. Users needing to use the old version should use v0.7.1 of reliability.
+-    All references to percentiles have now been replaced by quantiles. The previous percentiles argument in the Fitters required values between 0 and 100, whereas the quantiles argument requires values between 0 and 1. The two are otherwise equivalent, differing only by a factor of 100. This has been changed without deprecation, so it may cause your code to break if you are using the percentiles argument.
+-    The subfunctions (.CDF(), .SF(), .CHF()) for each Distribution (that has confidence intervals) now have all relevant arguments visible as args rather than kwargs. This refers to plot_CI, CI_type, CI, CI_y, CI_x. Previously plot_CI, CI_type, and CI were kwargs so your IDE would not show you these. They have been converted to args for ease of use. The arguments CI_x and CI_y are new, and are used to extract the confidence bounds from the plot of a fitted distribution object.
 
 **Bug Fixes**
 
@@ -31,6 +35,7 @@ This will be written closer to the release date.
 -    The required version of matplotlib has been upgraded to 3.5.0 to enable the above bugfix for the computed_zorder in ALT life stress plots.
 -    Theory documents are finished for `censored data <https://reliability.readthedocs.io/en/latest/What%20is%20censored%20data.html>`_, `plotting positions <https://reliability.readthedocs.io/en/latest/How%20are%20the%20plotting%20positions%20calculated.html>`_, `Least Squares Estimation <https://reliability.readthedocs.io/en/latest/How%20does%20Least%20Squares%20Estimation%20work.html>`_, `Maximum Likelihood Estimation <https://reliability.readthedocs.io/en/latest/How%20does%20Maximum%20Likelihood%20Estimation%20work.html>`_, and `Confidence Intervals <https://reliability.readthedocs.io/en/latest/How%20are%20the%20confidence%20intervals%20calculated.html>`_.
 -    Updates pytests for new reliability_growth function.
+-    New document on `working with fitted distributions <https://reliability.readthedocs.io/en/latest/Working%20with%20fitted%20distributions.html>`_.
 
 **Version: 0.7.1 --- Released: 26 Oct 2021**
 ''''''''''''''''''''''''''''''''''''''''''''
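
The percentiles-to-quantiles change described in the changelog above amounts to dividing each value by 100. A minimal sketch of a migration helper follows; the name ``percentiles_to_quantiles`` is hypothetical and is not part of the reliability package.

```python
def percentiles_to_quantiles(percentiles):
    """Convert percentile values (0 to 100) into quantile values (0 to 1).

    Hypothetical helper for illustration only; not part of reliability.
    """
    quantiles = []
    for p in percentiles:
        if not 0 <= p <= 100:
            raise ValueError(f"percentile {p} is outside the range 0 to 100")
        quantiles.append(p / 100)
    return quantiles


# values previously passed as percentiles=[5, 50, 95]
old_percentiles = [5, 50, 95]
new_quantiles = percentiles_to_quantiles(old_percentiles)
print(new_quantiles)
```

The resulting list could then be passed to the quantiles argument of a Fitter in place of the old percentiles list.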

diff --git a/docs/Fitting a specific distribution to data.rst b/docs/Fitting a specific distribution to data.rst
@@ -78,13 +78,9 @@ To learn how we can fit a distribution, we will start by using a simple example
 
 The above probability plot is the typical way to visualise how the CDF (the blue line) models the failure data (the black points). If you would like to view the failure points alongside the PDF, CDF, SF, HF, or CHF without the axis being scaled then you can generate the scatter plot using the function plot_points which is available within reliability.Probability_plotting. In the example below we create some data, then fit a Weibull distribution to the data (ensuring we turn off the probability plot). From the fitted distribution object we plot the Survival Function (SF). We then use plot_points to generate a scatter plot of the plotting positions for the survival function.
 
-For the function plot_points the inputs are:
+.. admonition:: API Reference
 
--   failures - an array or list of failure data
--   right_censored - an array or list of right censored data. Optional input
--   func - the function to be plotted. Must be 'PDF', 'CDF', 'SF', 'HF', or 'CHF'. Default is 'CDF'. Note that the options for 'PDF' and 'HF' will look much more scattered as they are found using the integral of a non-continuous function.
--   a - this is the plotting heuristic. Default is 0.3. See `probability plotting <https://reliability.readthedocs.io/en/latest/Probability%20plots.html>`_ and `Wikipedia <https://en.wikipedia.org/wiki/Q%E2%80%93Q_plot#Heuristics>`_ for more details.
--   keywords for the scatter plot are also accepted.
+   For inputs and outputs of the plot_points function see the `API reference <https://reliability.readthedocs.io/en/latest/API/Probability_plotting/plot_points.html>`_.
 
 Example 2
 ---------
@@ -205,29 +201,33 @@ As another example, we will fit a Gamma_2P distribution to some partially right
 Example 5
 ---------
 
-To obtain details of the percentiles (lower estimate, point estimate, upper estimate), we can use the percentiles input for each Fitter. In this example, we will create some data and fit a Weibull_2P distribution. When percentiles are requested the results printed includes both the table of results and the table of percentiles. Setting percentiles as True will use a default list of percentiles (as shown in the first output). Alternatively we can specify the exact percentiles to use (as shown in the second output). The use of the `crosshairs <https://reliability.readthedocs.io/en/latest/Crosshairs.html>`_ function is also shown which was used to annotate the plot manually. Note that the percentiles provided are the percentiles of the confidence intervals on time. Percentiles for the confidence intervals on reliability are not implemented, but can be accessed manually from the plots using the crosshairs function when confidence intervals on reliability have been plotted.
+To obtain details of the quantiles (y-values from the CDF), which include the lower estimate, point estimate, and upper estimate, we can use the quantiles input for each Fitter. In this example, we will create some data and fit a Weibull_2P distribution. When quantiles is specified, the printed results include both the table of results and the table of quantiles. Setting quantiles as True will use a default list of quantiles (as shown in the first output). Alternatively we can specify the exact quantiles to use (as shown in the second output). The use of the `crosshairs <https://reliability.readthedocs.io/en/latest/Crosshairs.html>`_ function is also shown, which was used to annotate the plot manually. Note that the quantiles provided are the quantiles of the confidence bounds on time. You can extract the confidence bounds on reliability using the fitted distribution object as shown `here <https://reliability.readthedocs.io/en/latest/Working%20with%20fitted%20distributions.html>`_.
 
 .. code:: python
 
     from reliability.Distributions import Weibull_Distribution
     from reliability.Fitters import Fit_Weibull_2P
     from reliability.Other_functions import crosshairs
     import matplotlib.pyplot as plt
-
+    
     dist = Weibull_Distribution(alpha=500, beta=6)
-    data = dist.random_samples(50, seed=1) # generate some data
+    data = dist.random_samples(50, seed=1)  # generate some data
     # this will produce the large table of percentiles below the first table of results
-    Fit_Weibull_2P(failures=data, percentiles=True, CI=0.8, show_probability_plot=False)
+    Fit_Weibull_2P(failures=data, quantiles=True, CI=0.8, show_probability_plot=False)
     print('----------------------------------------------------------')
     # repeat the process but using specified percentiles.
-    output = Fit_Weibull_2P(failures=data, percentiles=[5, 50, 95], CI=0.8)
+    output = Fit_Weibull_2P(failures=data, quantiles=[0.05, 0.5, 0.95], CI=0.8)
     # these points have been manually annotated on the plot using crosshairs
     crosshairs()
     plt.show()
     
-    #the values from the percentiles dataframe can be extracted as follows:
-    lower_estimates = output.percentiles['Lower Estimate'].values
-    print('Lower estimates:',lower_estimates)
+    # the values from the quantiles dataframe can be extracted using pandas:
+    lower_estimates = output.quantiles['Lower Estimate'].values
+    print('Lower estimates:', lower_estimates)
+    
+    # alternatively, the bounds can be extracted from the distribution object
+    lower, point, upper = output.distribution.CDF(CI_y=[0.05, 0.5, 0.95], CI=0.8)
+    print('Upper estimates:', upper)
 
     '''
     Results from Fit_Weibull_2P (80% CI):
@@ -245,19 +245,19 @@ To obtain details of the percentiles (lower estimate, point estimate, upper esti
                 BIC   611.14
                  AD  0.48267 
     
-    Table of percentiles (80% CI bounds on time):
-     Percentile  Lower Estimate  Point Estimate  Upper Estimate
-              1         175.215         202.212         233.368
-              5         250.235         276.521         305.569
-             10         292.686         317.508         344.435
-             20         344.277         366.719         390.623
-             25         363.578          385.05          407.79
-             50          437.69         455.879         474.824
-             75          502.94         520.776         539.245
-             80         517.547         535.917         554.938
-             90         553.267         574.068         595.651
-             95         580.174          603.82          628.43
-             99         625.682          655.79         687.347 
+    Table of quantiles (80% CI bounds on time):
+     Quantile  Lower Estimate  Point Estimate  Upper Estimate
+         0.01         175.215         202.212         233.368
+         0.05         250.235         276.521         305.569
+          0.1         292.686         317.508         344.435
+          0.2         344.277         366.719         390.623
+         0.25         363.578          385.05          407.79
+          0.5          437.69         455.879         474.824
+         0.75          502.94         520.776         539.245
+          0.8         517.547         535.917         554.938
+          0.9         553.267         574.068         595.651
+         0.95         580.174          603.82          628.43
+         0.99         625.682          655.79         687.347 
     
     ----------------------------------------------------------
     Results from Fit_Weibull_2P (80% CI):
@@ -275,13 +275,14 @@ To obtain details of the percentiles (lower estimate, point estimate, upper esti
                 BIC   611.14
                  AD  0.48267 
     
-    Table of percentiles (80% CI bounds on time):
-     Percentile  Lower Estimate  Point Estimate  Upper Estimate
-              5         250.235         276.521         305.569
-             50          437.69         455.879         474.824
-             95         580.174          603.82          628.43 
+    Table of quantiles (80% CI bounds on time):
+     Quantile  Lower Estimate  Point Estimate  Upper Estimate
+         0.05         250.235         276.521         305.569
+          0.5          437.69         455.879         474.824
+         0.95         580.174          603.82          628.43 
     
     Lower estimates: [250.23461473 437.69015375 580.17421254]
+    Upper estimates: [305.56872227 474.82362169 628.43042835]
     '''
 
 .. image:: images/weibull_percentiles.png
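
The point estimates in the table of quantiles above are the times at which the fitted CDF equals each quantile. As a rough sketch of that relationship, the standard two-parameter Weibull inverse CDF can be written in plain Python. The function names here are hypothetical, and the parameters are the true values used to generate the data in Example 5 (alpha=500, beta=6), not the fitted values, so the times printed will not match the table. Confidence bounds are not computed in this sketch.

```python
import math


def weibull_cdf(t, alpha, beta):
    """CDF of a two-parameter Weibull distribution."""
    return 1 - math.exp(-((t / alpha) ** beta))


def weibull_quantile_time(q, alpha, beta):
    """Time at which the Weibull CDF equals q (the inverse CDF)."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    return alpha * (-math.log(1 - q)) ** (1 / beta)


# Assumed parameters: the true values used to generate the data,
# not the fitted values reported by Fit_Weibull_2P.
alpha, beta = 500, 6
for q in [0.05, 0.5, 0.95]:
    t = weibull_quantile_time(q, alpha, beta)
    # evaluating the CDF at the returned time recovers the quantile
    print(f"quantile={q}: time={t:.3f}, CDF(time)={weibull_cdf(t, alpha, beta):.4f}")
```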

diff --git a/docs/How are the confidence intervals calculated.rst b/docs/How are the confidence intervals calculated.rst
@@ -257,33 +257,41 @@ Some distributions (such as the Gamma Distribution) are particularly difficult a
 How can I extract the confidence bounds from the plot
 -----------------------------------------------------
 
-For bounds on time, this can be done programatically using the percentiles option in the fitter like this:
+For bounds on time, this can be done using the quantiles option in the fitter as shown below in option 1.
+Alternatively, once we have the fitted distribution object, we can extract values from bounds on time or reliability directly from the CDF, SF, or CHF. This is shown below in option 2.
+Multiple examples of the second method are provided in the document on `working with fitted distributions <https://reliability.readthedocs.io/en/latest/Working%20with%20fitted%20distributions.html>`_.
 
 .. code:: python
 
     from reliability.Fitters import Fit_Weibull_2P
     import matplotlib.pyplot as plt
-    import numpy as np
     
     data = [43, 81, 41, 44, 52, 99, 64, 25, 41, 7]
-    q = np.array([0.2,0.3,0.4])
+    q = [0.2, 0.3, 0.4]
     
-    fit = Fit_Weibull_2P(failures=data,show_probability_plot=False,print_results=False,CI=0.95,percentiles=q*100)
-    lower = fit.percentiles['Lower Estimate']
-    point = fit.percentiles['Point Estimate']
-    upper = fit.percentiles['Upper Estimate']
-
+    # option 1 using quantiles argument
+    fit = Fit_Weibull_2P(failures=data, show_probability_plot=False, print_results=False, CI=0.95, quantiles=q)
+    lower = fit.quantiles['Lower Estimate']
+    point = fit.quantiles['Point Estimate']
+    upper = fit.quantiles['Upper Estimate']
     fit.distribution.CDF()
-    plt.scatter(point,q,color='darkorange')
-    plt.scatter(lower,q,color='blue')
-    plt.scatter(upper,q,color='red')
+    
+    # option 2 extracting values directly from the CDF
+    lower2, point2, upper2 = fit.distribution.CDF(CI_y=q, show_plot=False)
+    
+    plt.scatter(lower, q, color='blue')
+    plt.scatter(point, q, color='purple')
+    plt.scatter(upper, q, color='red')
+    
+    plt.scatter(lower2, q, color='white', marker='x')
+    plt.scatter(point2, q, color='lime', marker='x')
+    plt.scatter(upper2, q, color='yellow', marker='x')
+    
     plt.show()
 
 .. image:: images/CI_example2.png
 
-For bounds on reliability, extracting the parameters programatically is not currently enabled. It will be part of a future release (likely in January 2021), and will be available directly from the plotting method (avoiding the complicated method shown above).
-
-For bounds on either time or reliability, the ``Other_functions.crosshairs`` function provides an interactive set of crosshairs which can be used to find the values using the mouse.
+Lastly, for bounds on either time or reliability, the ``Other_functions.crosshairs`` function provides an interactive set of crosshairs which can be used to find the values using the mouse.
 A demo of how this works is shown `here <https://reliability.readthedocs.io/en/latest/Crosshairs.html>`_.
 
 Further reading