# Cohort Study: Pre-eclampsia and hypertension in later life

## Summary

The goal of this exercise was to investigate the association between a woman having pre-eclampsia or eclampsia during her first pregnancy and her developing hypertension in later life.
Data from a cohort study were obtained from a published source and analysed using standard epidemiological methods.
We found strong evidence of a positive association between hypertensive disease (pre-eclampsia or eclampsia) during the first pregnancy and hypertension in later life.

## Introduction

Researchers followed **542** women who had pre-eclampsia or eclampsia during their first pregnancy and **277** women who did not, and recorded whether each woman developed hypertension in later life.
The question of interest was whether there was evidence of an association between hypertensive disease (pre-eclampsia or eclampsia) in a woman's first pregnancy and hypertension in later life.
The data for the analysis were taken from Wilson, B.J., Watson, M.S., Prescott, G.J. *et al* (*BMJ* 2003;326:845).
(See description for full reference.)

## Method

## Analysis

In [1]:
from opyn.generic.pandasloader import PandasLoader
from opyn.stats import epidemeology as epyn
from pandas.api.types import CategoricalDtype

### Prepare the data

In [2]:
pdloader = PandasLoader()
f = "preeclampsia"
pdloader.get_description(f)

title: Pre-eclampsia and Eclampsia and Hypertension in later life

description:
    Results of a cohort study looking at the association between
    pre-eclampsia and eclampsia during a woman's first pregnancy, and
    the development of hypertension in later life.

reference: https://www.bmj.com/content/326/7394/845

fields:
    count:
        type: int
        desc: Number of observations
    exposure:
        type: str
        desc: Pre-eclampsia and eclampsia during first pregnancy
        values:
            - pre-eclampsia
            - no pre-eclampsia
    outcome:
        type: str
        desc: Hypertension in later life
        values:
            - hypertension
            - no hypertension



In [3]:
dat = pdloader.get(f)
dat

| | count | exposure | outcome |
|---|---|---|---|
| 0 | 327 | pre-eclampsia | hypertension |
| 1 | 215 | pre-eclampsia | no hypertension |
| 2 | 76 | no pre-eclampsia | hypertension |
| 3 | 201 | no pre-eclampsia | no hypertension |


Copy the dataframe so that the loaded data are left unmodified, then recode the `exposure` and `outcome` columns as ordered categorical data. The reference level (no exposure, no outcome) is listed first in each case.

In [4]:
copieddat = dat.copy(deep=True)
# Ordered categories: the first level listed sorts first.
copieddat["exposure"] = copieddat["exposure"].astype(
    CategoricalDtype(["no pre-eclampsia", "pre-eclampsia"], ordered=True)
)
copieddat["outcome"] = copieddat["outcome"].astype(
    CategoricalDtype(["no hypertension", "hypertension"], ordered=True)
)

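Sort the dataframe by `exposure` and then `outcome`, so that the rows follow the category order defined above: unexposed before exposed, and no hypertension before hypertension within each exposure group.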

In [5]:
sorteddat = copieddat.sort_values(by=["exposure", "outcome"])

Extract the `count` column as a `2x2` `ndarray`.

In [6]:
resarr = sorteddat["count"].to_numpy().reshape((2, 2))
resarr

array([[201,  76],
       [215, 327]], dtype=int64)

It is this new reshaped `ndarray` that we will pass to the various functions for analysis.
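
As a quick sanity check on the layout, the sketch below rebuilds the array as a plain nested list (the name `table` is hypothetical) and computes the marginal totals. It assumes that rows index exposure and columns index outcome, as implied by the sort order above; under that assumption the row totals should equal the group sizes given in the Introduction.

```python
# Counts copied from the array above.
# Rows: exposure (no pre-eclampsia, pre-eclampsia).
# Columns: outcome (no hypertension, hypertension).
table = [[201, 76],
         [215, 327]]

row_totals = [sum(row) for row in table]        # size of each exposure group
col_totals = [sum(col) for col in zip(*table)]  # size of each outcome group
grand_total = sum(row_totals)

print("row totals:", row_totals)
print("column totals:", col_totals)
print("grand total:", grand_total)
```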

### Measures of Association

In [7]:
epyn.riskratio(resarr)

| | riskratio | stderr | lower | upper |
|---|---|---|---|---|
| Exposed1 (-) | 1.0 | 0.0 | | |
| Exposed2 (+) | 2.198946 | 0.103735 | 1.794385 | 2.69472 |
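
For reference, the risk ratio can be reproduced by hand. This is a minimal sketch using only the standard library, not the `opyn` implementation. It assumes the usual large-sample standard error of the log risk ratio and a normal-approximation 95% interval with `z = 1.96`; the reported `stderr` appears to be on the log scale, since the values agree. All variable names are illustrative.

```python
import math

# Counts copied from the 2x2 array (rows: exposure, columns: outcome).
table = [[201, 76],
         [215, 327]]
unexp_no, unexp_yes = table[0]  # no pre-eclampsia
exp_no, exp_yes = table[1]      # pre-eclampsia

n_unexp = unexp_no + unexp_yes
n_exp = exp_no + exp_yes

# Risk of hypertension in each exposure group, and their ratio.
risk_unexp = unexp_yes / n_unexp
risk_exp = exp_yes / n_exp
rr = risk_exp / risk_unexp

# Large-sample standard error of log(RR).
se_log_rr = math.sqrt(1 / exp_yes - 1 / n_exp + 1 / unexp_yes - 1 / n_unexp)

z = 1.96  # assumed: approximate 97.5th percentile of the standard normal
rr_lower = math.exp(math.log(rr) - z * se_log_rr)
rr_upper = math.exp(math.log(rr) + z * se_log_rr)

print(f"RR = {rr:.6f}, SE(log RR) = {se_log_rr:.6f}")
print(f"approx. 95% CI: ({rr_lower:.4f}, {rr_upper:.4f})")
```

The interval is built on the log scale and then exponentiated, which is why it is not symmetric about the point estimate.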


In [8]:
epyn.oddsratio(resarr)

| | oddsratio | stderr | lower | upper |
|---|---|---|---|---|
| Exposed1 (-) | 1.0 | 0.0 | | |
| Exposed2 (+) | 4.02246 | 0.160755 | 2.935327 | 5.512225 |
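
The odds ratio can be checked the same way. The sketch below is an independent hand calculation, not the `opyn` code: it uses the cross-product ratio and the standard large-sample standard error of the log odds ratio (the square root of the sum of reciprocal cell counts), again with an assumed `z = 1.96` for an approximate 95% interval.

```python
import math

# Counts copied from the 2x2 array (rows: exposure, columns: outcome).
table = [[201, 76],
         [215, 327]]
unexp_no, unexp_yes = table[0]  # no pre-eclampsia
exp_no, exp_yes = table[1]      # pre-eclampsia

# Odds of hypertension among the exposed divided by odds among the unexposed.
odds_ratio = (exp_yes * unexp_no) / (exp_no * unexp_yes)

# Large-sample standard error of log(OR).
se_log_or = math.sqrt(1 / exp_yes + 1 / exp_no + 1 / unexp_yes + 1 / unexp_no)

z = 1.96  # assumed: approximate 97.5th percentile of the standard normal
or_lower = math.exp(math.log(odds_ratio) - z * se_log_or)
or_upper = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"OR = {odds_ratio:.6f}, SE(log OR) = {se_log_or:.6f}")
print(f"approx. 95% CI: ({or_lower:.4f}, {or_upper:.4f})")
```

The odds ratio is noticeably larger than the risk ratio here because hypertension is common in both groups, so the odds ratio does not approximate the risk ratio.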


### Chi-squared test of no association

In [9]:
epyn.expectedfreq(resarr)

array([[140.6984127, 136.3015873],
       [275.3015873, 266.6984127]])
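
Under the null hypothesis of no association, the expected count in each cell is the row total multiplied by the column total, divided by the grand total. A minimal standard-library sketch of that calculation follows (variable names are illustrative, and this is not necessarily how `expectedfreq` is implemented):

```python
# Observed counts (rows: exposure, columns: outcome).
observed = [[201, 76],
            [215, 327]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for cell (i, j): row_total_i * col_total_j / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 4) for value in row])
```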

In [10]:
epyn.chisqcontribs(resarr)

array([[25.84450927, 26.67820312],
       [13.20835621, 13.63443222]])
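
Each cell's contribution to the chi-squared statistic is the squared difference between the observed and expected counts, divided by the expected count. The following sketch recomputes the contributions from the observed table using only the standard library (names are illustrative; no continuity correction is applied, which is consistent with the values shown above).

```python
# Observed counts (rows: exposure, columns: outcome).
observed = [[201, 76],
            [215, 327]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell: (observed - expected)^2 / expected.
contribs = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

for row in contribs:
    print([round(value, 4) for value in row])
```

The two cells for the unexposed group contribute most, because that group is smaller and so the same absolute deviation is larger relative to its expected counts.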

In [11]:
epyn.chisqtest(resarr)

| | chisq | pval | df |
|---|---|---|---|
| result | 79.365501 | 5.161965e-19 | 1.0 |
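
The test statistic is the sum of the four cell contributions, and for a 2x2 table it is referred to a chi-squared distribution with one degree of freedom. The sketch below is a standard-library cross-check rather than the `opyn` implementation. It relies on the identity that, for one degree of freedom only, the upper-tail probability equals `erfc(sqrt(x / 2))`, and it applies no continuity correction.

```python
import math

# Observed counts (rows: exposure, columns: outcome).
observed = [[201, 76],
            [215, 327]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum of (O - E)^2 / E over all cells.
chisq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df == 1 only: P(X > x) = erfc(sqrt(x / 2)).
pval = math.erfc(math.sqrt(chisq / 2))

print(f"chisq = {chisq:.6f}, df = {df}, p = {pval:.6e}")
```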


## Discussion