<div class='alert alert-warning'>

SciPy's interactive examples with JupyterLite are experimental and may not always work as expected. Executing cells that contain imports may trigger large downloads (up to 60MB of content for the first import from SciPy). Importing from SciPy may take roughly 10-20 seconds. If you notice any problems, feel free to open an [issue](https://github.com/scipy/scipy/issues/new/choose).

</div>

**Uncensored Data**

As in the example from [1] page 79, five boys were selected at random from
those in a single high school. Their one-mile run times were recorded as
follows.


In [None]:
sample = [6.23, 5.58, 7.06, 6.42, 5.20]  # one-mile run times (minutes)

The empirical distribution function, which approximates the distribution
function of one-mile run times of the population from which the boys were
sampled, is calculated as follows.


In [None]:
from scipy import stats
res = stats.ecdf(sample)
res.cdf.quantiles

array([5.2 , 5.58, 6.23, 6.42, 7.06])

In [None]:
res.cdf.probabilities

array([0.2, 0.4, 0.6, 0.8, 1. ])
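
For uncensored data, these values can be reproduced by hand: the empirical CDF at each observed value is the fraction of observations less than or equal to it. The following is a minimal pure-Python sketch of that definition, not SciPy's implementation; the helper name `ecdf_by_hand` is hypothetical and exists only for illustration.

```python
sample = [6.23, 5.58, 7.06, 6.42, 5.20]  # one-mile run times (minutes)

def ecdf_by_hand(data):
    """Hypothetical helper: sorted unique values and the fraction of
    observations less than or equal to each of them."""
    n = len(data)
    quantiles = sorted(set(data))
    probabilities = [sum(x <= q for x in data) / n for q in quantiles]
    return quantiles, probabilities

quantiles, probabilities = ecdf_by_hand(sample)
print(quantiles)      # [5.2, 5.58, 6.23, 6.42, 7.06]
print(probabilities)  # [0.2, 0.4, 0.6, 0.8, 1.0]
```

With five distinct observations, each one contributes a jump of 1/5, which is why the probabilities increase in equal steps of 0.2.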

To plot the result as a step function:


In [None]:
import matplotlib.pyplot as plt
ax = plt.subplot()
res.cdf.plot(ax)
ax.set_xlabel('One-Mile Run Time (minutes)')
ax.set_ylabel('Empirical CDF')
plt.show()
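
Reading the empirical CDF as a step function means that its value at an arbitrary point is the probability attached to the largest observed value not exceeding that point, and zero to the left of the smallest observation. The sketch below illustrates this lookup in pure Python, using the quantiles and probabilities shown earlier; `step_cdf` is a hypothetical helper and is not part of SciPy.

```python
from bisect import bisect_right

# Values copied from the output of res.cdf.quantiles and res.cdf.probabilities
quantiles = [5.2, 5.58, 6.23, 6.42, 7.06]
probabilities = [0.2, 0.4, 0.6, 0.8, 1.0]

def step_cdf(x):
    """Hypothetical helper: evaluate the right-continuous step function at x."""
    i = bisect_right(quantiles, x)  # number of quantiles <= x
    return 0.0 if i == 0 else probabilities[i - 1]

print(step_cdf(5.0))   # 0.0 (below the fastest recorded time)
print(step_cdf(6.0))   # 0.4 (two of the five times are <= 6.0)
print(step_cdf(7.06))  # 1.0 (every time is <= the slowest one)
```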

**Right-censored Data**

As in the example from [1] page 91, the lives of ten car fanbelts were
tested. Five tests concluded because the fanbelt being tested broke, but
the remaining tests concluded for other reasons (e.g. the study ran out of
funding, but the fanbelt was still functional). The mileages driven
with the fanbelts were recorded as follows.


In [None]:
broken = [77, 47, 81, 56, 80]  # in thousands of miles driven
unbroken = [62, 60, 43, 71, 37]

Precise survival times of the fanbelts that were still functional at the
end of the tests are unknown, but they are known to exceed the values
recorded in ``unbroken``. Therefore, these observations are said to be
"right-censored", and the data is represented using
`scipy.stats.CensoredData`.


In [None]:
sample = stats.CensoredData(uncensored=broken, right=unbroken)

The empirical survival function is calculated as follows.


In [None]:
res = stats.ecdf(sample)
res.sf.quantiles

array([37., 43., 47., 56., 60., 62., 71., 77., 80., 81.])

In [None]:
res.sf.probabilities

array([1.   , 1.   , 0.875, 0.75 , 0.75 , 0.75 , 0.75 , 0.5  , 0.25 , 0.   ])
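
These probabilities are consistent with the Kaplan-Meier product-limit estimate: the survival probability is multiplied by `(at_risk - 1) / at_risk` at each observed failure and is left unchanged at each censored time, while the number of fanbelts still at risk shrinks by one either way. The following pure-Python sketch reproduces the values under the assumption that there are no tied times (true for this data set); it is an illustration rather than SciPy's implementation, and the helper name `kaplan_meier` is hypothetical.

```python
broken = [77, 47, 81, 56, 80]    # observed failures (thousands of miles)
unbroken = [62, 60, 43, 71, 37]  # right-censored observations

def kaplan_meier(failures, censored):
    """Hypothetical helper: product-limit survival estimate, assuming no ties."""
    # Pair each time with a flag that is True when a failure was observed.
    events = sorted([(t, True) for t in failures] +
                    [(t, False) for t in censored])
    at_risk = len(events)
    survival = 1.0
    times, probs = [], []
    for t, failed in events:
        if failed:
            survival *= (at_risk - 1) / at_risk
        at_risk -= 1  # the item leaves the risk set whether it failed or not
        times.append(t)
        probs.append(survival)
    return times, probs

times, probs = kaplan_meier(broken, unbroken)
print(times)
print([round(p, 3) for p in probs])
```

For example, eight fanbelts are still at risk when the first failure occurs at 47, so the estimate drops from 1 to 7/8 = 0.875 there.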

To plot the result as a step function:


In [None]:
ax = plt.subplot()
res.sf.plot(ax)
ax.set_xlabel('Fanbelt Survival Time (thousands of miles)')
ax.set_ylabel('Empirical SF')
plt.show()