In [1]:
%load_ext autoreload
%autoreload 2
import lib

In [2]:
dataset = lib.get_dataset()

# Defining ethnicity in OpenSAFELY
Ethnicity is most commonly defined using primary care data, but it can also be defined using secondary care data, and the two sources can be combined as needed. Supplementing the primary care definition with secondary care data can be advantageous, because primary care data alone tends to leave 20-30% of people with a missing ethnicity. **However**, it is important to consider carefully what this means for your study. For example, the people whose ethnicity is still missing are likely to be extremely unrepresentative of the population as a whole, because they are much less likely to have frequently visited a GP or hospital.

## Codelists used

The ethnicity variable uses only [this codelist](https://www.opencodelists.org/codelist/opensafely/ethnicity-snomed-0removed/2e641f61/) for defining ethnicity in primary care. This was determined to be the best codelist to use in work published in [this paper](https://www.medrxiv.org/content/10.1101/2023.11.21.23298690v1). The codelist can be added to your study by adding this to your `codelists.txt` file:
```
opensafely/ethnicity-snomed-0removed/2e641f61
```
As described in the [codelist description](https://www.opencodelists.org/codelist/opensafely/ethnicity-snomed-0removed/2e641f61/), it contains 2 levels of categorisation: one with 5 categories and another with 16 categories.

Depending on the level of detail your study requires, you can use the 5 category grouping (stored in the codelist column named `Grouping_6`) by importing the codelist into your dataset definition like this:
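```
from ehrql import codelist_from_csv

# Map each SNOMED code to its category in the "Grouping_6" column.
ethnicity_6_category_codelist = codelist_from_csv(
    "codelists/opensafely-ethnicity-snomed-0removed.csv",
    column="snomedcode",
    category_column="Grouping_6",
)
```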
...or the 16 category grouping like this:
```
ethnicity_16_category_codelist = codelist_from_csv(
    "codelists/opensafely-ethnicity-snomed-0removed.csv",
    column="snomedcode",
    category_column="Grouping_16",
)
```
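
Conceptually, a codelist loaded with `category_column` behaves like a lookup from each code to its category. The sketch below illustrates that idea in plain Python rather than ehrQL. The CSV rows, the codes, and the helper `load_category_lookup` are all made up for illustration and are not taken from the real codelist or from ehrQL itself.

```python
import csv
import io

# Hypothetical CSV content; the codes are placeholders, not real SNOMED codes.
EXAMPLE_CSV = """snomedcode,Grouping_6,Grouping_16
111,1,1
222,1,3
333,3,8
"""


def load_category_lookup(csv_text, column, category_column):
    """Return a dict mapping each code in `column` to its category."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column]: row[category_column] for row in reader}


# The same codes can be grouped at either level of detail.
lookup_6 = load_category_lookup(EXAMPLE_CSV, "snomedcode", "Grouping_6")
lookup_16 = load_category_lookup(EXAMPLE_CSV, "snomedcode", "Grouping_16")
```

In this toy data the code `222` falls in category `1` of the coarser grouping and category `3` of the finer one, which mirrors how the two groupings nest.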

In [3]:
lib.count_by_category(dataset)


## 5 Category ethnicity
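The most recent ethnicity code recorded in primary care for each patient, grouped into 5 categories. Patients with no matching code have a null value.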
### It is defined using the following ehrQL:
```
ethnicity_6_category = (
    clinical_events.where(
        clinical_events.snomedct_code.is_in(ethnicity_6_category_codelist)
    )
    .sort_by(clinical_events.date)
    .last_for_patient()
    .snomedct_code.to_category(ethnicity_6_category_codelist)
)
```
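
To make the logic of this query concrete, here is a minimal plain-Python sketch of the same steps: keep only events whose code is in the codelist, order them by date, take the last one per patient, and map its code to a category. The events, codes, and the function `latest_category` are invented for illustration; this is not how ehrQL is implemented.

```python
from datetime import date

# Hypothetical code-to-category lookup and events; all values are invented.
CATEGORY_BY_CODE = {"111": "1", "333": "3"}
EVENTS = [
    {"patient_id": 1, "date": date(2015, 3, 1), "code": "111"},
    {"patient_id": 1, "date": date(2020, 6, 9), "code": "333"},
    {"patient_id": 1, "date": date(2021, 1, 5), "code": "999"},  # not in codelist
    {"patient_id": 2, "date": date(2018, 2, 2), "code": "999"},  # not in codelist
]


def latest_category(events, category_by_code):
    """Map each patient to the category of their most recent codelist event."""
    result = {}
    matching = [e for e in events if e["code"] in category_by_code]
    for event in sorted(matching, key=lambda e: e["date"]):
        # Later events overwrite earlier ones, so the last one by date wins.
        result[event["patient_id"]] = category_by_code[event["code"]]
    return result


categories = latest_category(EVENTS, CATEGORY_BY_CODE)
```

Patient 2 has no event in the codelist, so they do not appear in `categories`; in the ehrQL version the corresponding value would be null.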


| ethnicity_6_category | count |
| --- | --- |
| null | 61 |
| 5 | 244 |
| 4 | 171 |
| 3 | 170 |
| 2 | 74 |
| 1 | 280 |



## 16 Category ethnicity
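The most recent ethnicity code recorded in primary care for each patient, grouped into 16 categories. Patients with no matching code have a null value.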
### It is defined using the following ehrQL:
```
ethnicity_16_category = (
    clinical_events.where(
        clinical_events.snomedct_code.is_in(ethnicity_16_category_codelist)
    )
    .sort_by(clinical_events.date)
    .last_for_patient()
    .snomedct_code.to_category(ethnicity_16_category_codelist)
)
```


| ethnicity_16_category | count |
| --- | --- |
| null | 61 |
| 16 | 231 |
| 15 | 13 |
| 14 | 46 |
| 13 | 88 |
| 12 | 37 |
| 11 | 114 |
| 10 | 15 |
| 9 | 14 |
| 8 | 27 |



## Ethnicity from SUS
The ethnicity code recorded for each patient in SUS (secondary care) data. The values are single-letter codes.

### It is defined using the following ehrQL:
```
ethnicity_sus = ethnicity_from_sus.code
```


| ethnicity_sus | count |
| --- | --- |
| null | 504 |
| A | 23 |
| B | 37 |
| C | 21 |
| D | 30 |
| E | 32 |
| F | 31 |
| G | 30 |
| H | 29 |
| J | 44 |



## 5 Category ethnicity combined with SUS
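The 5 category primary care ethnicity, with the SUS code used to assign a category only where the primary care value is null. Patients with neither are labelled `Missing`.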
### It is defined using the following ehrQL (which includes some variables defined above):
```
dataset.ethnicity_gp_and_sus_5_category = case(
    when(
        (ethnicity_6_category == "1")
        | ((ethnicity_6_category.is_null()) & (ethnicity_sus.is_in(["A", "B", "C"])))
    ).then("White"),
    when(
        (ethnicity_6_category == "2")
        | (
            (ethnicity_6_category.is_null())
            & (ethnicity_sus.is_in(["D", "E", "F", "G"]))
        )
    ).then("Mixed"),
    when(
        (ethnicity_6_category == "3")
        | (
            (ethnicity_6_category.is_null())
            & (ethnicity_sus.is_in(["H", "J", "K", "L"]))
        )
    ).then("Asian or Asian British"),
    when(
        (ethnicity_6_category == "4")
        | ((ethnicity_6_category.is_null()) & (ethnicity_sus.is_in(["M", "N", "P"])))
    ).then("Black or Black British"),
    when(
        (ethnicity_6_category == "5")
        | ((ethnicity_6_category.is_null()) & (ethnicity_sus.is_in(["R", "S"])))
    ).then("Chinese or Other Ethnic Groups"),
    otherwise="Missing",
)
```
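
The `case` expression above amounts to a simple rule: use the primary care category when there is one, and fall back to the SUS letter code only when it is null. A minimal plain-Python sketch of that rule follows. The function name `combine_gp_and_sus` and the two dictionaries are illustrative only; the labels and letter groupings are copied from the ehrQL above.

```python
# Labels for the primary care categories, as used in the ehrQL above.
LABELS = {
    "1": "White",
    "2": "Mixed",
    "3": "Asian or Asian British",
    "4": "Black or Black British",
    "5": "Chinese or Other Ethnic Groups",
}

# SUS letter codes that fall into each category, as used in the ehrQL above.
SUS_LETTERS = {
    "1": ["A", "B", "C"],
    "2": ["D", "E", "F", "G"],
    "3": ["H", "J", "K", "L"],
    "4": ["M", "N", "P"],
    "5": ["R", "S"],
}


def combine_gp_and_sus(gp_category, sus_code):
    """Prefer the GP category; use the SUS code only when GP is None."""
    if gp_category is not None:
        return LABELS.get(gp_category, "Missing")
    for category, letters in SUS_LETTERS.items():
        if sus_code in letters:
            return LABELS[category]
    return "Missing"
```

For example, `combine_gp_and_sus("1", "S")` gives `"White"` because the primary care value takes priority, while `combine_gp_and_sus(None, "H")` gives `"Asian or Asian British"`.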


| ethnicity_gp_and_sus_5_category | count |
| --- | --- |
| White | 285 |
| Mixed | 77 |
| Asian or Asian British | 177 |
| Black or Black British | 178 |
| Chinese or Other Ethnic Groups | 250 |
| Missing | 33 |
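
As a quick check of how much the SUS data reduces missingness in this notebook's dataset, the counts from the two 5 category tables can be compared directly. The sketch below copies those counts by hand; the helper `percent_missing` is illustrative, and the key `"null"` stands for the unlabelled row of the primary care table.

```python
# Counts copied from the 5 category tables in this notebook.
gp_only_counts = {"null": 61, "5": 244, "4": 171, "3": 170, "2": 74, "1": 280}
combined_counts = {
    "White": 285,
    "Mixed": 77,
    "Asian or Asian British": 177,
    "Black or Black British": 178,
    "Chinese or Other Ethnic Groups": 250,
    "Missing": 33,
}


def percent_missing(counts, missing_key):
    """Return the percentage of patients that fall under `missing_key`."""
    total = sum(counts.values())
    return 100 * counts[missing_key] / total


gp_missing = percent_missing(gp_only_counts, "null")
combined_missing = percent_missing(combined_counts, "Missing")
print(f"GP only: {gp_missing:.1f}% missing")           # GP only: 6.1% missing
print(f"GP and SUS: {combined_missing:.1f}% missing")  # GP and SUS: 3.3% missing
```

Both tables sum to 1000 patients, so the two percentages are directly comparable.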



## 16 Category ethnicity combined with SUS
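The 16 category primary care ethnicity, with the SUS code used to assign a category only where the primary care value is null. Patients with neither are labelled `Missing`.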
### It is defined using the following ehrQL (which includes some variables defined above):
```
dataset.ethnicity_gp_and_sus_16_category = case(
    when(
        (ethnicity_16_category == "1")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["A"])))
    ).then("White - British"),
    when(
        (ethnicity_16_category == "2")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["B"])))
    ).then("White - Irish"),
    when(
        (ethnicity_16_category == "3")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["C"])))
    ).then("White - Any other White background"),
    when(
        (ethnicity_16_category == "4")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["D"])))
    ).then("Mixed - White and Black Caribbean"),
    when(
        (ethnicity_16_category == "5")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["E"])))
    ).then("Mixed - White and Black African"),
    when(
        (ethnicity_16_category == "6")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["F"])))
    ).then("Mixed - White and Asian"),
    when(
        (ethnicity_16_category == "7")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["G"])))
    ).then("Mixed - Any other mixed background"),
    when(
        (ethnicity_16_category == "8")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["H"])))
    ).then("Asian or Asian British - Indian"),
    when(
        (ethnicity_16_category == "9")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["J"])))
    ).then("Asian or Asian British - Pakistani"),
    when(
        (ethnicity_16_category == "10")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["K"])))
    ).then("Asian or Asian British - Bangladeshi"),
    when(
        (ethnicity_16_category == "11")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["L"])))
    ).then("Asian or Asian British - Any other Asian background"),
    when(
        (ethnicity_16_category == "12")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["M"])))
    ).then("Black or Black British - Caribbean"),
    when(
        (ethnicity_16_category == "13")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["N"])))
    ).then("Black or Black British - African"),
    when(
        (ethnicity_16_category == "14")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["P"])))
    ).then("Black or Black British - Any other Black background"),
    when(
        (ethnicity_16_category == "15")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["R"])))
    ).then("Other Ethnic Groups - Chinese"),
    when(
        (ethnicity_16_category == "16")
        | ((ethnicity_16_category.is_null()) & (ethnicity_sus.is_in(["S"])))
    ).then("Other Ethnic Groups - Any other ethnic group"),
    otherwise="Missing",
)
```
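
In the definition above, the 16 primary care categories and the 16 SUS letter codes line up one-to-one, in order. A table-driven sketch in plain Python makes that correspondence explicit. The names `SUS_CODES`, `LABELS_16`, and `combine_16` are illustrative; the labels and letters are copied from the ehrQL above.

```python
# The 16 SUS letter codes used above, in category order.
SUS_CODES = "ABCDEFGHJKLMNPRS"

LABELS_16 = [
    "White - British",
    "White - Irish",
    "White - Any other White background",
    "Mixed - White and Black Caribbean",
    "Mixed - White and Black African",
    "Mixed - White and Asian",
    "Mixed - Any other mixed background",
    "Asian or Asian British - Indian",
    "Asian or Asian British - Pakistani",
    "Asian or Asian British - Bangladeshi",
    "Asian or Asian British - Any other Asian background",
    "Black or Black British - Caribbean",
    "Black or Black British - African",
    "Black or Black British - Any other Black background",
    "Other Ethnic Groups - Chinese",
    "Other Ethnic Groups - Any other ethnic group",
]

# GP categories "1".."16" and SUS letters map onto the same labels.
LABEL_BY_GP = {str(i): label for i, label in enumerate(LABELS_16, start=1)}
LABEL_BY_SUS = dict(zip(SUS_CODES, LABELS_16))


def combine_16(gp_category, sus_code):
    """Prefer the GP category; use the SUS code only when GP is None."""
    if gp_category is not None:
        return LABEL_BY_GP.get(gp_category, "Missing")
    return LABEL_BY_SUS.get(sus_code, "Missing")
```

Keeping the labels in one list means a label only has to be typed once, which reduces the chance of the primary care and SUS branches drifting apart.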


| ethnicity_gp_and_sus_16_category | count |
| --- | --- |
| White - British | 55 |
| White - Irish | 29 |
| White - Any other White background | 201 |
| Mixed - White and Black Caribbean | 10 |
| Mixed - White and Black African | 12 |
| Mixed - White and Asian | 8 |
| Mixed - Any other mixed background | 47 |
| Asian or Asian British - Indian | 32 |
| Asian or Asian British - Pakistani | 16 |
| Asian or Asian British - Bangladeshi | 15 |


In [4]:
lib.do_all_crosstabs(dataset)
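
`lib.do_all_crosstabs` is a helper local to this notebook and its implementation is not shown here. As a rough illustration of what each cross-tabulation below computes, the following plain-Python sketch counts how often each pair of values occurs across two variables. The function `crosstab` and the patient records are invented for illustration.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((row[row_key], row[col_key]) for row in rows)


# Invented records; None stands for a missing value.
patients = [
    {"ethnicity_6_category": "1", "ethnicity_sus": "A"},
    {"ethnicity_6_category": "1", "ethnicity_sus": None},
    {"ethnicity_6_category": None, "ethnicity_sus": "H"},
    {"ethnicity_6_category": "1", "ethnicity_sus": "A"},
]

table = crosstab(patients, "ethnicity_6_category", "ethnicity_sus")
```

Pairs that never occur have a count of zero in a `Counter`, which corresponds to the empty cells in the tables below.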

## Comparing ethnicity_6_category with ethnicity_16_category

| ethnicity_6_category | 16 | 3 | 15 | null | 11 | 13 | 14 | 7 | 5 | 12 | 8 | 2 | 1 | 10 | 4 | 9 | 6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 231 | | 13 | | | | | | | | | | | | | | |
| 1 | | 201 | | | | | | | | | | 25 | 54 | | | | |
| null | | | | 61 | | | | | | | | | | | | | |
| 3 | | | | | 114 | | | | | | 27 | | | 15 | | 14 | |
| 4 | | | | | | 88 | 46 | | | 37 | | | | | | | |
| 2 | | | | | | | | 45 | 12 | | | | | | 9 | | 8 |


## Comparing ethnicity_6_category with ethnicity_sus

| ethnicity_6_category | null | J | A | D | M | F | E | B | N | R | G | C | K | H | S | L | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 129 | 7 | 3 | 6 | 9 | 4 | 8 | 7 | 6 | 11 | 9 | 6 | 6 | 9 | 12 | 6 | 6 |
| 1 | 141 | 13 | 8 | 5 | 10 | 10 | 15 | 8 | 10 | 13 | 11 | 5 | 4 | 5 | 7 | 8 | 7 |
| null | 33 | 2 | 1 | 1 | 1 | | | 4 | 2 | 4 | 2 | | | 5 | 2 | | 4 |
| 3 | 79 | 9 | 7 | 10 | 6 | 4 | 4 | 6 | 5 | 6 | 3 | 3 | 6 | 4 | 4 | 10 | 4 |
| 4 | 86 | 8 | 2 | 8 | 10 | 9 | 3 | 10 | 7 | 3 | 1 | 7 | 6 | 3 | 2 | 3 | 3 |
| 2 | 36 | 5 | 2 | | | 4 | 2 | 2 | 1 | 3 | 4 | | 3 | 3 | 1 | 4 | 4 |


## Comparing ethnicity_6_category with ethnicity_gp_and_sus_5_category

| ethnicity_6_category | Chinese or Other Ethnic Groups | White | Missing | Asian or Asian British | Black or Black British | Mixed |
| --- | --- | --- | --- | --- | --- | --- |
| 5 | 244 | | | | | |
| 1 | | 280 | | | | |
| null | 6 | 5 | 33 | 7 | 7 | 3 |
| 3 | | | | 170 | | |
| 4 | | | | | 171 | |
| 2 | | | | | | 74 |


## Comparing ethnicity_6_category with ethnicity_gp_and_sus_16_category

ethnicity_6_category,Other Ethnic Groups - Any other ethnic group,White - Any other White background,Other Ethnic Groups - Chinese,Missing,Asian or Asian British - Any other Asian background,Black or Black British - African,Black or Black British - Any other Black background,White - Irish,Mixed - Any other mixed background,Mixed - White and Black African,Black or Black British - Caribbean,Asian or Asian British - Indian,White - British,Asian or Asian British - Bangladeshi,Mixed - White and Black Caribbean,Asian or Asian British - Pakistani,Mixed - White and Asian
cat,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
"""5""",231.0,,13.0,,,,,,,,,,,,,,
"""1""",,201.0,,,,,,25.0,,,,,54.0,,,,
,2.0,,4.0,33.0,,2.0,4.0,4.0,2.0,,1.0,5.0,1.0,,1.0,2.0,
"""3""",,,,,114.0,,,,,,,27.0,,15.0,,14.0,
"""4""",,,,,,88.0,46.0,,,,37.0,,,,,,
"""2""",,,,,,,,,45.0,12.0,,,,,9.0,,8.0


## Comparing ethnicity_16_category with ethnicity_sus

| ethnicity_16_category | null | J | A | D | M | F | E | B | N | R | G | C | K | H | S | L | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | 121 | 7 | 3 | 4 | 9 | 4 | 8 | 6 | 6 | 11 | 9 | 6 | 6 | 9 | 10 | 6 | 6 |
| 3 | 101 | 9 | 4 | 5 | 9 | 7 | 13 | 6 | 9 | 8 | 9 | 3 | 1 | 3 | 5 | 5 | 4 |
| 15 | 8 | | | 2 | | | | 1 | | | | | | | 2 | | |
| null | 33 | 2 | 1 | 1 | 1 | | | 4 | 2 | 4 | 2 | | | 5 | 2 | | 4 |
| 11 | 52 | 5 | 5 | 6 | 4 | 2 | 3 | 3 | 3 | 5 | 2 | 2 | 5 | 3 | 4 | 7 | 3 |
| 13 | 47 | 4 | 2 | 1 | 4 | 4 | | 5 | 4 | 2 | | 6 | 4 | | 1 | 3 | 1 |
| 14 | 19 | 1 | | 3 | 6 | 1 | 2 | 3 | 1 | 1 | 1 | 1 | 2 | 3 | 1 | | 1 |
| 7 | 22 | 4 | 1 | | | 1 | 1 | 1 | | 1 | 3 | | 2 | 3 | | 3 | 3 |
| 5 | 7 | | | | | 1 | | 1 | 1 | | | | | | 1 | | 1 |
| 12 | 20 | 3 | | 4 | | 4 | 1 | 2 | 2 | | | | | | | | 1 |


## Comparing ethnicity_16_category with ethnicity_gp_and_sus_5_category

| ethnicity_16_category | Chinese or Other Ethnic Groups | White | Missing | Asian or Asian British | Black or Black British | Mixed |
| --- | --- | --- | --- | --- | --- | --- |
| 16 | 231 | | | | | |
| 3 | | 201 | | | | |
| 15 | 13 | | | | | |
| null | 6 | 5 | 33 | 7 | 7 | 3 |
| 11 | | | | 114 | | |
| 13 | | | | | 88 | |
| 14 | | | | | 46 | |
| 7 | | | | | | 45 |
| 5 | | | | | | 12 |
| 12 | | | | | 37 | |


## Comparing ethnicity_16_category with ethnicity_gp_and_sus_16_category

ethnicity_16_category,Other Ethnic Groups - Any other ethnic group,White - Any other White background,Other Ethnic Groups - Chinese,Missing,Asian or Asian British - Any other Asian background,Black or Black British - African,Black or Black British - Any other Black background,White - Irish,Mixed - Any other mixed background,Mixed - White and Black African,Black or Black British - Caribbean,Asian or Asian British - Indian,White - British,Asian or Asian British - Bangladeshi,Mixed - White and Black Caribbean,Asian or Asian British - Pakistani,Mixed - White and Asian
cat,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
"""16""",231.0,,,,,,,,,,,,,,,,
"""3""",,201.0,,,,,,,,,,,,,,,
"""15""",,,13.0,,,,,,,,,,,,,,
,2.0,,4.0,33.0,,2.0,4.0,4.0,2.0,,1.0,5.0,1.0,,1.0,2.0,
"""11""",,,,,114.0,,,,,,,,,,,,
"""13""",,,,,,88.0,,,,,,,,,,,
"""14""",,,,,,,46.0,,,,,,,,,,
"""7""",,,,,,,,,45.0,,,,,,,,
"""5""",,,,,,,,,,12.0,,,,,,,
"""12""",,,,,,,,,,,37.0,,,,,,


## Comparing ethnicity_sus with ethnicity_gp_and_sus_5_category

| ethnicity_sus | Chinese or Other Ethnic Groups | White | Missing | Asian or Asian British | Black or Black British | Mixed |
| --- | --- | --- | --- | --- | --- | --- |
| null | 129 | 141 | 33 | 79 | 86 | 36 |
| J | 7 | 13 | | 11 | 8 | 5 |
| A | 3 | 9 | | 7 | 2 | 2 |
| D | 6 | 5 | | 10 | 8 | 1 |
| M | 9 | 10 | | 6 | 11 | |
| F | 4 | 10 | | 4 | 9 | 4 |
| E | 8 | 15 | | 4 | 3 | 2 |
| B | 7 | 12 | | 6 | 10 | 2 |
| N | 6 | 10 | | 5 | 9 | 1 |
| R | 15 | 13 | | 6 | 3 | 3 |


## Comparing ethnicity_sus with ethnicity_gp_and_sus_16_category

ethnicity_sus,Other Ethnic Groups - Any other ethnic group,White - Any other White background,Other Ethnic Groups - Chinese,Missing,Asian or Asian British - Any other Asian background,Black or Black British - African,Black or Black British - Any other Black background,White - Irish,Mixed - Any other mixed background,Mixed - White and Black African,Black or Black British - Caribbean,Asian or Asian British - Indian,White - British,Asian or Asian British - Bangladeshi,Mixed - White and Black Caribbean,Asian or Asian British - Pakistani,Mixed - White and Asian
cat,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
,121,101,8.0,33.0,52,47.0,19.0,12.0,22.0,7.0,20.0,15.0,28.0,5.0,5.0,7.0,2.0
"""J""",7,9,,,5,4.0,1.0,,4.0,,3.0,,4.0,3.0,,3.0,1.0
"""A""",3,4,,,5,2.0,,1.0,1.0,,,1.0,4.0,1.0,1.0,,
"""D""",4,5,2.0,,6,1.0,3.0,,,,4.0,1.0,,2.0,1.0,1.0,
"""M""",9,9,,,4,4.0,6.0,,,,1.0,1.0,1.0,1.0,,,
"""F""",4,7,,,2,4.0,1.0,2.0,1.0,1.0,4.0,2.0,1.0,,1.0,,1.0
"""E""",8,13,,,3,,2.0,1.0,1.0,,1.0,1.0,1.0,,,,1.0
"""B""",6,6,1.0,,3,5.0,3.0,4.0,1.0,1.0,2.0,1.0,2.0,1.0,,1.0,
"""N""",6,9,,,3,6.0,1.0,1.0,,1.0,2.0,2.0,,,,,
"""R""",11,8,4.0,,5,2.0,1.0,1.0,1.0,,,,4.0,1.0,1.0,,1.0


## Comparing ethnicity_gp_and_sus_5_category with ethnicity_gp_and_sus_16_category

ethnicity_gp_and_sus_5_category,Other Ethnic Groups - Any other ethnic group,White - Any other White background,Other Ethnic Groups - Chinese,Missing,Asian or Asian British - Any other Asian background,Black or Black British - African,Black or Black British - Any other Black background,White - Irish,Mixed - Any other mixed background,Mixed - White and Black African,Black or Black British - Caribbean,Asian or Asian British - Indian,White - British,Asian or Asian British - Bangladeshi,Mixed - White and Black Caribbean,Asian or Asian British - Pakistani,Mixed - White and Asian
cat,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
"""Chinese or Other Ethnic Groups""",233.0,,17.0,,,,,,,,,,,,,,
"""White""",,201.0,,,,,,29.0,,,,,55.0,,,,
"""Missing""",,,,33.0,,,,,,,,,,,,,
"""Asian or Asian British""",,,,,114.0,,,,,,,32.0,,15.0,,16.0,
"""Black or Black British""",,,,,,90.0,50.0,,,,38.0,,,,,,
"""Mixed""",,,,,,,,,47.0,12.0,,,,,10.0,,8.0
