Fix typos in documentation. #252

14 changes: 7 additions & 7 deletions docs/Intro to lifelines.rst
@@ -522,7 +522,7 @@ Smoothing the hazard curve

Interpretation of the cumulative hazard function can be difficult -- it
is not how we usually interpret functions. (On the other hand, most
-survival analysis is done using the cumulative hazard fuction, so understanding
+survival analysis is done using the cumulative hazard function, so understanding
it is recommended).

Alternatively, we can derive the more-interpretable hazard curve, but
@@ -544,7 +544,7 @@ intervals, similar to the traditional ``plot`` functionality.
ax = naf.plot_hazard(bandwidth=b)
naf.fit(T[~dem], event_observed=C[~dem], label="Non-democratic Regimes")
naf.plot_hazard(ax=ax, bandwidth=b)
-plt.title("Hazard function of different global regimes | bandwith=%.1f"%b);
+plt.title("Hazard function of different global regimes | bandwidth=%.1f"%b);
plt.ylim(0,0.4)
plt.xlim(0,25);

@@ -555,8 +555,8 @@ intervals, similar to the traditional ``plot`` functionality.
It is more clear here which group has the higher hazard, and like
hypothesized above, both hazard rates are close to being constant.

-There is no obvious way to choose a bandwith, and different
-bandwidth can produce different inferences, so best to be very careful
+There is no obvious way to choose a bandwidth, and different
+bandwidths can produce different inferences, so best to be very careful
here. (My advice: stick with the cumulative hazard function.)

.. code:: python
@@ -566,7 +566,7 @@ here. (My advice: stick with the cumulative hazard function.)
ax = naf.plot_hazard(bandwidth=b)
naf.fit(T[~dem], event_observed=C[~dem], label="Non-democratic Regimes")
naf.plot_hazard(ax=ax, bandwidth=b)
-plt.title("Hazard function of different global regimes | bandwith=%.1f"%b);
+plt.title("Hazard function of different global regimes | bandwidth=%.1f"%b);



@@ -607,7 +607,7 @@ of time to birth. This is available as the ``cumulative_density_`` property afte

.. code:: python

-kmf.cumulative_density_
+print kmf.cumulative_density_
kmf.plot() #will plot the CDF


@@ -617,7 +617,7 @@ Left Truncated Data
~~~~~~~~~~~~~~~~~~~~~~~~~~

Another form of bias that can be introduced into a dataset is called left-truncation. (Also a form of censorship).
-This occurs when individuals may die even before ever entrying into the study. Both ``KaplanMeierFitter`` and ``NelsonAalenFitter`` have an optional arugment for ``entry``, which is an array of equal size to the duration array.
+This occurs when individuals may die even before ever entering into the study. Both ``KaplanMeierFitter`` and ``NelsonAalenFitter`` have an optional arugment for ``entry``, which is an array of equal size to the duration array.
It describes the offset from birth to entering the study. This is also useful when subjects enter the study at different
points in their lifetime. For example, if you are measuring time to death of prisoners in
prison, the prisoners will enter the study at different ages.
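
The effect of an ``entry`` array can be sketched without the library. The block below is a minimal pure-Python illustration, not the lifelines implementation: the helper name ``km_with_entry`` and the toy data are assumptions made for this example. It assumes a subject counts as at risk at time ``t`` only when ``entry < t <= duration``.

```python
def km_with_entry(durations, observed, entry=None):
    """Kaplan-Meier survival estimate with optional left truncation.

    Hypothetical helper for illustration only. A subject is in the
    risk set at time t only if entry < t <= duration.
    """
    if entry is None:
        entry = [0.0] * len(durations)
    if any(s >= d for s, d in zip(entry, durations)):
        raise ValueError("each entry time must be smaller than its duration")

    # Distinct times at which at least one event was observed.
    event_times = sorted({d for d, e in zip(durations, observed) if e})

    survival = 1.0
    estimate = {}
    for t in event_times:
        at_risk = sum(1 for d, s in zip(durations, entry) if s < t <= d)
        deaths = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1.0 - deaths / at_risk
        estimate[t] = survival
    return estimate


# Toy data: three of the five subjects enter the study late.
durations = [2, 4, 4, 6, 8]
observed = [1, 1, 0, 1, 1]
entry = [0, 0, 3, 5, 5]

naive = km_with_entry(durations, observed)
truncated = km_with_entry(durations, observed, entry)
```

Ignoring the late entries inflates the early risk sets, so ``naive`` reports higher early survival than ``truncated`` on this toy data.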
4 changes: 2 additions & 2 deletions docs/Survival Analysis intro.rst
@@ -112,11 +112,11 @@ information at :math:`t=10`).
Survival analysis was originally developed to solve this type of
problem, that is, to deal with estimation when our data is
right-censored. Even in the case where all events have been
-observed, i.e. no censorship, survival analysis is still a very useful
+observed, i.e. no censorship, survival analysis is still a very useful tool
to understand durations.

The observations need not always start at zero, either. This was done
-only for understanding in the above example. Consider the example of
+only for understanding in the above example. Consider the example where
a customer entering a store is a birth: a customer can enter at
any time, and not necessarily at time zero. In survival analysis, durations
are relative: individuals may start at different times. (We actually only need the *duration* of the observation, and not
14 changes: 7 additions & 7 deletions docs/Survival Regression.rst
@@ -41,9 +41,9 @@ variables ``un_continent_name`` (eg: Asia, North America,...), the
``regime`` type (eg: monarchy, civilan,...) and the year the regime
started in, ``start_year``.

-Aalens additive model typically does not estimate the individual
+Aalen's additive model typically does not estimate the individual
:math:`b_i(t)` but instead estimates :math:`\int_0^t b_i(s) \; ds`
-(similar to estimate of the hazard rate using ``NelsonAalenFitter``
+(similar to the estimate of the hazard rate using ``NelsonAalenFitter``
above). This is important to keep in mind when analzying the output.

.. code:: python
@@ -156,7 +156,7 @@ above). This is important to keep in mind when analzying the output.



-I'm using the lovely library ``patsy`` <https://github.com/pydata/patsy>`__ here to create a
+I'm using the lovely library `patsy <https://github.com/pydata/patsy>`__ here to create a
covariance matrix from my original dataframe.

.. code:: python
@@ -391,9 +391,9 @@ Prime Minister Stephen Harper.
Cox's Proportional Hazard model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-New in 0.4.0 is the implementation of the Propotional Hazard's regression model (implemented in
-R under ``coxph``). It has a similar API to Aalen's Additive model. Like R, it has a ``print_summary``
-function that prints a tabuluar view of coefficients and related stats.
+New in 0.4.0 is the implementation of the Cox propotional hazards regression model (implemented in
+R under ``coxph``). It has a similar API to Aalen's additive model. Like R, it has a ``print_summary``
+function that prints a tabular view of coefficients and related stats.

This example data is from the paper `here <http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Cox-Regression.pdf>`_.

@@ -459,7 +459,7 @@ Model Selection in Survival Regression
With censorship, it's not correct to use a loss function like mean-squared-error or
mean-absolute-loss. Instead, one measure is the c-index, or concordance-index. This measure
evaluates the ordering of predicted times: how correct is the ordering? It is infact a generalization
-of AUC, another common loss function, and is interpretted similarly:
+of AUC, another common loss function, and is interpreted similarly:

* 0.5 is the expected result from random predictions,
* 1.0 is perfect concordance and,
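
The ordering idea behind the concordance index can be sketched in a few lines of plain Python. This is an illustrative O(n^2) version written for this note, not the library's implementation; the name ``c_index_sketch`` is an assumption. It assumes a pair is comparable only when the earlier of the two times is an observed event, and it skips pairs with tied event times.

```python
def c_index_sketch(event_times, predicted_times, event_observed):
    """Concordance index for right-censored data (illustrative sketch).

    Hypothetical helper: counts, over all comparable pairs, how often
    the predicted ordering agrees with the observed ordering.
    """
    concordant = 0
    tied = 0
    comparable = 0
    n = len(event_times)
    for i in range(n):
        # A censored subject cannot be the "earlier" member of a pair,
        # because its true event time is unknown.
        if not event_observed[i]:
            continue
        for j in range(n):
            if event_times[i] < event_times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1
                elif predicted_times[i] == predicted_times[j]:
                    tied += 1  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return (concordant + 0.5 * tied) / comparable


times = [1, 2, 3, 4]
observed = [1, 1, 1, 1]

perfect = c_index_sketch(times, [1, 2, 3, 4], observed)
constant = c_index_sketch(times, [5, 5, 5, 5], observed)
reversed_order = c_index_sketch(times, [4, 3, 2, 1], observed)
with_censoring = c_index_sketch([1, 2, 3], [3, 1, 2], [0, 1, 1])
```

On this toy data, predictions in the correct order score 1.0, constant predictions score 0.5, and fully reversed predictions score 0.0. In ``with_censoring`` the first subject is censored, so its badly ordered prediction never enters a comparable pair.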