In [23]:
using Distributions, StatsPlots;
using Plots, Interact, Blink;

In [24]:
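dplot = function(dd)
    # Plot over the central 98% of the distribution's probability mass
    lo, hi = quantile.(dd, [0.01, 0.99])
    x = range(lo, hi; length = 100)
    plot(x, pdf.(dd, x), lw=3, label = "f(t)")                         # density
    plot!(x, cdf.(dd, x), lw=3, label = "F(t)")                        # cumulative failure probability
    plot!(x, 1 .- cdf.(dd, x), lw=3, label = "C(t)")                   # survival (complementary CDF)
    plot!(x, pdf.(dd, x) ./ (1 .- cdf.(dd, x)), lw=3, label = "λ(t)")  # hazard rate f(t) / C(t)
end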

#53 (generic function with 1 method)

# Exponential

The exponential distribution is widely used in maintainability, maintenance, and reliability work, for two basic reasons: it is easy to handle analytically, and many engineering items, electronic ones in particular, exhibit a constant failure rate during their useful lives.

$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x > 0$$

$$\begin{aligned}
F(t) &=\int_{0}^{t} \lambda e^{-\lambda x} d x \\
&=1-e^{-\lambda t}
\end{aligned}$$
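As a quick numerical check of the formulas above, the sketch below compares the closed-form CDF with Distributions.jl and confirms that the hazard rate $f(t)/C(t)$ stays constant at $\lambda$. Note that `Exponential(θ)` in Distributions.jl takes the scale $\theta = 1/\lambda$ rather than the rate. The value $\lambda = 0.5$ and the time grid are arbitrary choices for illustration.

```julia
using Distributions

λ = 0.5                      # failure rate (illustrative value)
d = Exponential(1 / λ)       # Distributions.jl expects the scale θ = 1/λ

ts = 0.5:0.5:5.0             # arbitrary evaluation times
F_closed = 1 .- exp.(-λ .* ts)            # F(t) = 1 - e^{-λt}
hazard   = pdf.(d, ts) ./ ccdf.(d, ts)    # λ(t) = f(t) / C(t)

println(maximum(abs.(cdf.(d, ts) .- F_closed)))  # difference from the library CDF
println(all(h -> isapprox(h, λ), hazard))        # checks that the hazard equals λ everywhere
```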

In [25]:
ui = @manipulate for λ = 0.1:0.1:10
    # Distributions.jl parameterizes Exponential by its scale θ = 1/λ, not by the rate λ
    dplot(Exponential(1 / λ))
end;
ui #body!(Window(), ui)

# Rayleigh


This distribution is named after its originator, John William Strutt, Lord Rayleigh (1842-1919), and is sometimes used in reliability-related studies.

$$f(x; \sigma) = \frac{x}{\sigma^2} e^{-\frac{x^2}{2 \sigma^2}}, \quad x > 0$$

$$F(t; \sigma)=1-e^{-\frac{t^2}{2 \sigma^2}}$$
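
Integrating the density from $0$ to $t$ gives the survival function $C(t) = e^{-t^2/(2\sigma^2)}$, so the hazard rate follows directly:

$$\lambda(t) = \frac{f(t)}{C(t)} = \frac{\frac{t}{\sigma^2} e^{-\frac{t^2}{2 \sigma^2}}}{e^{-\frac{t^2}{2 \sigma^2}}} = \frac{t}{\sigma^2}$$

The Rayleigh hazard therefore grows linearly with time, which is why the $\lambda(t)$ curve in the plot is a straight line through the origin.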

In [26]:
ui = @manipulate for σ = 0.1:0.1:10
    dplot(Rayleigh(σ))
end;
ui #body!(Window(), ui)

# Weibull

This distribution was developed by W. Weibull in the early 1950s and it can be used to represent many different physical phenomena.

$$f(x; \beta, \eta, t_0) = \frac{\beta}{\eta} \left( \frac{x-t_0}{\eta} \right)^{\beta-1} e^{-(\frac{x-t_0}{\eta})^\beta}, \quad x \ge t_0$$
   
   
$$F(t; \beta, \eta, t_0=0)=1-e^{-(t / \eta)^{\beta}}$$
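
Dividing the density by the survival function $C(t) = 1 - F(t)$ gives the Weibull hazard rate in closed form (taking $t_0 = 0$, as in the CDF above, with $\beta$ the shape and $\eta$ the scale parameter):

$$\lambda(t) = \frac{f(t)}{1-F(t)} = \frac{\beta}{\eta} \left( \frac{t}{\eta} \right)^{\beta-1}$$

The hazard decreases with time for $\beta < 1$, is constant for $\beta = 1$ (the exponential case, with rate $1/\eta$), and increases for $\beta > 1$.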

![image.png](attachment:image.png)

In [27]:
ui = @manipulate for β = 0.1:0.1:10, η = 0.1:0.1:10
    dplot(Weibull(β,η))
end;
ui #body!(Window(), ui)

In [28]:
bathtub_plot = function(d1, d2, d3)
    lo, hi = quantile.(d1, [0.01, 0.99])
    x = range(lo, hi; length = 100)
    plot(x, pdf.(d1, x) ./ (1 .- cdf.(d1, x)), lw=3, label = "λ1(t)")
    lo, hi = quantile.(d2, [0.01, 0.99])
    x = range(lo, hi; length = 100)
    plot!(x, pdf.(d2, x) ./ (1 .- cdf.(d2, x)), lw=3, label = "λ2(t)")
    lo, hi = quantile.(d3, [0.01, 0.99])
    x = range(lo, hi; length = 100)
    plot!(x, pdf.(d3, x) ./ (1 .- cdf.(d3, x)), lw=3, label = "λ3(t)")
end

ui = @manipulate for β1 = 0.1:0.1:10, η1 = 0.1:0.1:10, 
    β2 = 0.1:0.1:10, η2 = 0.1:0.1:10, 
    β3 = 0.1:0.1:10, η3 = 0.1:0.1:10
    bathtub_plot(Weibull(β1,η1), Weibull(β2,η2), Weibull(β3,η3))
end;

ui #body!(Window(), ui)
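
For a non-interactive view of the same idea, the sketch below evaluates the three hazard rates at a few time points. The parameter values are assumptions chosen only to show one decreasing, one constant, and one increasing hazard; `hazard` is a helper introduced here, not part of Distributions.jl.

```julia
using Distributions

# Hazard rate λ(t) = f(t) / C(t), using ccdf for the survival function
hazard(d, t) = pdf(d, t) / ccdf(d, t)

# Illustrative (assumed) parameters: shape β below, equal to, and above 1
early  = Weibull(0.5, 1.0)   # β < 1: early failures
useful = Weibull(1.0, 1.0)   # β = 1: constant failure rate
wear   = Weibull(3.0, 1.0)   # β > 1: wear-out failures

ts = [0.5, 1.0, 2.0]
h_early  = hazard.(early, ts)
h_useful = hazard.(useful, ts)
h_wear   = hazard.(wear, ts)

println(issorted(h_early; rev = true))          # is the early-failure hazard decreasing?
println(all(h -> isapprox(h, 1.0), h_useful))   # is the β = 1 hazard constant at 1/η?
println(issorted(h_wear))                       # is the wear-out hazard increasing?
```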

# Normal

This is a widely used probability distribution and is also known as the Gaussian distribution after Carl Friedrich Gauss (1777-1855).

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}}
\exp \left( - \frac{(x - \mu)^2}{2 \sigma^2} \right)$$

$$F(t; \mu, \sigma)=\frac{1}{\sigma \sqrt{2 \pi}} \int_{-\infty}^{t} \exp \left[-\frac{(y-\mu)^{2}}{2 \sigma^{2}}\right] d y$$
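
This integral has no closed form, so $F$ is evaluated numerically in practice. A minimal sketch, with illustrative values of $\mu$, $\sigma$, and $t$, checks the standardization identity $F(t; \mu, \sigma) = \Phi\left((t-\mu)/\sigma\right)$, where $\Phi$ is the standard normal CDF:

```julia
using Distributions

μ, σ = 5.0, 2.0              # illustrative values
t = 6.5

F_direct       = cdf(Normal(μ, σ), t)
F_standardized = cdf(Normal(), (t - μ) / σ)   # Normal() is the standard normal N(0, 1)

println(F_direct ≈ F_standardized)
```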

In [29]:
ui = @manipulate for μ = 0:0.1:10, σ = 0.1:0.1:10
    dplot(Normal(μ,σ))
end;
ui #body!(Window(), ui)

# LogNormal

The lognormal distribution is often used to model the repair times of failed equipment.

$$f(x; \mu, \sigma) = \frac{1}{x \sqrt{2 \pi \sigma^2}}
\exp \left( - \frac{(\log(x) - \mu)^2}{2 \sigma^2} \right),
\quad x > 0$$

$$F(t; \mu, \sigma)=\frac{1}{\sigma \sqrt{2 \pi}} \int_{0}^{t} \frac{1}{x} \exp \left[-\frac{(\log x-\mu)^{2}}{2 \sigma^{2}}\right] d x$$
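
Because $\log X$ is normally distributed when $X$ is lognormal, the lognormal CDF at $t$ equals the normal CDF at $\log t$. The sketch below checks this with Distributions.jl; the parameter values are arbitrary and chosen only for illustration.

```julia
using Distributions

μ, σ = 1.0, 0.5              # illustrative values (parameters of log X)
t = 4.0

F_lognormal = cdf(LogNormal(μ, σ), t)
F_normal    = cdf(Normal(μ, σ), log(t))   # same probability on the log scale

println(F_lognormal ≈ F_normal)
println(median(LogNormal(μ, σ)) ≈ exp(μ))   # the median of X is e^μ
```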

In [30]:
ui = @manipulate for μ = 0:0.1:10, σ = 0.1:0.1:10
    dplot(LogNormal(μ,σ))
end;
ui #body!(Window(), ui)

Similar (and sometimes identical) statistical methods were being developed in other fields. A wide suite of statistical methods for handling censored data emerged from the 1960s to the 1980s under the rubric of survival analysis, where the longevity of a population was a common theme in many applications.

To illustrate how survival problems arise in diverse fields, consider the following regression problems:

1. An epidemiologist wishes to determine how human longevity, or survival time (the dependent variable), depends on the number of cigarettes smoked per day (the independent variable). The experiment lasts five years, during which some individuals die and others do not. The survival time of the living individuals is known only to exceed their age when the experiment ends; these lower limits to longevity are called right-censored data points. The technique of Cox regression is typically used to quantify the dependence of mortality on cigarette smoking.

2. An industrial manufacturing company wishes to know the average time between breakdowns of a new engine as a function of axle rotation speed, in order to determine the optimal operating range. Test engines are set running until 20% of them fail. The dependence of average lifetime on speed is then calculated with 80% of the data points right-censored. The technique of accelerated life testing is typically used, often based on an assumed Weibull distribution, a parametric family whose failure rate can either decrease or increase with time and so can represent early or late failures.

3. An economist seeks the relationship between education and income from a census survey. Household income is recorded down to the poverty level below which the value is not recorded. In some cases, even the number of impoverished households is unknown. In such econometric applications, censored and truncated variables are called limited dependent variables and methods like Tobit regression are used. 
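
To make right-censoring concrete, here is a minimal sketch for the simplest parametric case. It assumes exponentially distributed lifetimes, for which the maximum-likelihood estimate of the failure rate is the number of observed failures divided by the total time on test. The function name `censored_exp_rate` and the data are hypothetical, and this is not the Cox, accelerated life testing, or Tobit method named above.

```julia
# Maximum-likelihood failure rate for exponential lifetimes with right-censoring.
# times[i]    : observed time (failure time, or time at which observation stopped)
# observed[i] : true if the failure was seen, false if the unit was right-censored
function censored_exp_rate(times, observed)
    return sum(observed) / sum(times)   # failures / total time on test
end

# Hypothetical test: 4 engines, stopped at t = 5 with two still running
times    = [2.0, 3.0, 5.0, 5.0]
observed = [true, true, false, false]

rate_censored = censored_exp_rate(times, observed)             # 2 failures over 15 time units
rate_naive    = 1 / (sum(times[observed]) / count(observed))   # ignores the survivors entirely

println((rate_censored, rate_naive))
```

Discarding the censored units, as the naive estimate does, overstates the failure rate because the longest-lived engines are exactly the ones left out.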