# COVID-19 Second Generation Surveillance System Dataset
## 1. Summary
The information below is retrieved from the Health Data Gateway API developed by NHS England, with additional fields added by UK LLC.  

In [1]:
# define target dataset to document
schema = 'nhsd'
table = 'COVIDSGSS'
version = 'v0002'
# make the helper script importable and import DocHelper
import sys
script_fp = "../../../../scripts/"
sys.path.insert(0, script_fp)
from data_doc_helper import DocHelper
# create a DocHelper instance for the target dataset
document = DocHelper(schema, table, version, script_fp)
# needed to render Markdown output from code cells
from IPython.display import display, Markdown

In [2]:
# retrieve the dataset metadata from the API, then render each field as Markdown
dataset = document.get_api_data()
display(Markdown("**NHS England title of dataset:** "+dataset['datasetfields']['metadataquality']['title']))
display(Markdown("**Dataset name in UK LLC TRE:** "+schema+"."+table))
display(Markdown("**Short abstract:** "+dataset['datasetfields']['abstract']))
display(Markdown("**Extended abstract:** "+dataset['datasetv2']['documentation']['description']))
display(Markdown("**Geographical coverage:** "+dataset['datasetfields']['geographicCoverage'][0]))
display(Markdown("**Temporal coverage:** "+dataset['datasetfields']['datasetStartDate']))
display(Markdown("**Data available from:** 06/04/2020 onwards"))
display(Markdown("**Typical age range:** "+dataset['datasetfields']['ageBand']))
display(Markdown("**Collection situation:** "+dataset['datasetv2']['provenance']['origin']['collectionSituation'][0]))
display(Markdown("**Purpose:** "+dataset['datasetv2']['provenance']['origin']['purpose'][0]))
display(Markdown("**Source:** "+dataset['datasetv2']['provenance']['origin']['source'][0]))
display(Markdown("**Pathway:** "+dataset['datasetv2']['coverage']['pathway']))
display(Markdown("**Information collected:** Demographic information about people who test positive for SARS-CoV-2"))  
display(Markdown("**Structure of dataset:** Each line represents one participant"))  
display(Markdown("**Update frequency in UK LLC TRE:** Quarterly"))  
display(Markdown("**Dataset version in UK LLC TRE:** TBC"))
display(Markdown("**Summary of changes between dataset versions:** TBC"))  
display(Markdown("**Data quality issues:** TBC"))  
display(Markdown("**Restrictions to data usage:** Medical purposes only"))  
display(Markdown("**Further information:** [https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data](https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data)"))

**NHS England title of dataset:** Covid-19 Second Generation Surveillance System

**Dataset name in UK LLC TRE:** nhsd.COVIDSGSS

**Short abstract:** Data forming the Covid-19 Second Generation Surveillance Systems data set relate to demographic and diagnostic information from Pillar 1 swab testing in PHE labs and NHS hospitals and Pillar 2 Swab testing in the community.

**Extended abstract:** Data forming the Covid-19 Second Generation Surveillance Systems data set relate to demographic and diagnostic information from Pillar 1 swab testing in PHE labs and NHS hospitals (for those with a clinical need, and for health and care workers) and Pillar 2 swab testing in the community (at drive-through test centres, walk-in centres, home kits returned by post, care homes, prisons, etc.).

Timescales for dissemination can be found under 'Our Service Levels' at the following link: https://digital.nhs.uk/services/data-access-request-service-dars/data-access-request-service-dars-process

**Geographical coverage:** United Kingdom, England

**Temporal coverage:** 06/04/2020

**Data available from:** 06/04/2020 onwards

**Typical age range:** 0-150

**Collection situation:** IN-PATIENTS

**Purpose:** CARE

**Source:** LIMS

**Pathway:** NOT APPLICABLE

**Information collected:** Demographic information about people who test positive for SARS-CoV-2

**Structure of dataset:** Each line represents one participant

**Update frequency in UK LLC TRE:** Quarterly

**Dataset version in UK LLC TRE:** TBC

**Summary of changes between dataset versions:** TBC

**Data quality issues:** TBC

**Restrictions to data usage:** Medical purposes only

**Further information:** [https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data](https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/sgss-and-sari-watch-data)
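
The summary cell above indexes directly into the nested API response, so a single missing key or empty list would stop the whole cell with a `KeyError` or `IndexError`. The following is a minimal sketch of a more defensive lookup. The helper name `get_nested`, the default text and the cut-down `dataset` dictionary are hypothetical illustrations and are not part of `DocHelper` or the API.

```python
def get_nested(data, path, default="Not available"):
    """Follow `path` (dict keys or list indices) through nested data.

    Returns `default` as soon as any step is missing.
    """
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hypothetical, cut-down stand-in for the API response (assumption)
dataset = {
    "datasetfields": {"abstract": "Example abstract"},
    "datasetv2": {"provenance": {"origin": {"purpose": ["CARE"], "source": []}}},
}

# Present field
print(get_nested(dataset, ["datasetfields", "abstract"]))
# First element of a populated list
print(get_nested(dataset, ["datasetv2", "provenance", "origin", "purpose", 0]))
# Empty list: falls back to the default
print(get_nested(dataset, ["datasetv2", "provenance", "origin", "source", 0]))
# Missing key: falls back to the default
print(get_nested(dataset, ["datasetv2", "coverage", "pathway"]))
```

With a helper like this, each `display(Markdown(...))` call could show a placeholder for an absent field rather than failing.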

## 2. Metrics
Below we include tables that summarise the COVIDSGSS dataset in the UK LLC TRE.

**Table 1** The number of participants from each LPS that are represented in the COVIDSGSS dataset in the UK LLC TRE  

**Note**: Numbers in Table 1 relate to the most recent extract of NHS England data and so may not correspond to the numbers of participants from each LPS in the data you were provisioned.

In [3]:
# count participants per cohort (LPS) in the most recent extract
gb_cohort = document.get_cohort_count()
print(gb_cohort.to_markdown(index=False, tablefmt="fancy_grid"))
#display(gb_cohort)

╒════════════════╤═════════╕
│ cohort         │   count │
╞════════════════╪═════════╡
│ ALSPAC         │    2516 │
├────────────────┼─────────┤
│ BCS70          │    2182 │
├────────────────┼─────────┤
│ BIB            │    9382 │
├────────────────┼─────────┤
│ ELSA           │    1707 │
├────────────────┼─────────┤
│ EPICN          │    2298 │
├────────────────┼─────────┤
│ EXCEED         │    2882 │
├────────────────┼─────────┤
│ FENLAND        │    3076 │
├────────────────┼─────────┤
│ GLAD           │   21185 │
├────────────────┼─────────┤
│ MCS            │    6743 │
├────────────────┼─────────┤
│ NCDS58         │    1654 │
├────────────────┼─────────┤
│ NEXTSTEP       │    1860 │
├────────────────┼─────────┤
│ NIHRBIO_COPING │    7005 │
├────────────────┼─────────┤
│ NSHD46         │     388 │
├────────────────┼─────────┤
│ TEDS           │       0 │
├────────────────┼─────────┤
│ TRACKC19       │    6281 │
├────────────────┼─────────┤
│ TWINSUK        │    3908 │
├─────────────

## 3. Helpful syntax
Below we will include syntax that may be helpful to other researchers in the UK LLC TRE. For longer scripts, we will include a snippet of the code plus a link to Git where you can find the full script. 
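
Because the numbers in Table 1 relate to the most recent extract, researchers may want to recompute per-cohort counts on the data they were actually provisioned. The sketch below shows one way to do this using only the Python standard library. The column names `cohort` and `participant_id` and the example rows are assumptions for illustration, so the real field names in a provisioned extract may differ.

```python
from collections import Counter


def cohort_counts(rows, cohort_key="cohort", id_key="participant_id"):
    """Count distinct participants per cohort from an iterable of row dicts."""
    seen = set()
    counts = Counter()
    for row in rows:
        pair = (row[cohort_key], row[id_key])
        if pair not in seen:  # count each participant once per cohort
            seen.add(pair)
            counts[row[cohort_key]] += 1
    return dict(sorted(counts.items()))


# Hypothetical rows standing in for a provisioned extract (assumption)
rows = [
    {"cohort": "ALSPAC", "participant_id": "a1"},
    {"cohort": "ALSPAC", "participant_id": "a2"},
    {"cohort": "BCS70", "participant_id": "b1"},
    {"cohort": "ALSPAC", "participant_id": "a1"},  # repeated participant
]

for cohort, count in cohort_counts(rows).items():
    print(f"{cohort:<10}{count:>6}")
```

The de-duplication step is a precaution: if each line in your extract already represents one participant, it has no effect on the counts.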