# Title: Collecting data using interactive Jupyter widgets  
**Author details:** *Author:* Shona McElroy. *Contact details:* s2272790@ed.ac.uk.  <br /> 
**Notebook and data info:** This Notebook forms part of the assessment for Working with Data Types and Structures in R and Python. It uses interactive Jupyter widgets to collect the NHS England mortality data (ons_mortality). The following widgets capture the fields required for a data capture tool. <br /> **Data:** The data consist of date, numerical and character data from the NHSRdatasets package. <br />
**Copyright statement:** This Notebook is the product of Shona McElroy.

In [1]:
#Load the 'pandas' package
import pandas as pd
testData=pd.read_csv("../Data/ons_mortality_ENG_1019_test.csv")
testData

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg
0,1798,30,2011,2011-07-29,8456,8737.4,0.967794
1,1803,35,2011,2011-09-02,7717,7984.6,0.966485
2,5324,19,2013,2013-05-10,8814,9096.0,0.968997
3,5326,21,2013,2013-05-24,9530,9311.2,1.023499
4,7121,48,2014,2014-11-28,9928,9398.8,1.056305
5,10722,27,2016,2016-07-08,9138,8872.2,1.029959
6,10742,47,2016,2016-11-25,10603,9572.0,1.10771
7,12473,10,2017,2017-03-10,11077,10816.0,1.024131
8,14259,28,2018,2018-07-13,9293,9018.0,1.030495
9,16021,22,2019,2019-05-31,8260,8125.0,1.016615


#### Data type

In [2]:
result = testData.dtypes
print("Output:")
print(result)

Output:
index                  int64
week_no                int64
year                   int64
date                  object
counts                 int64
mort_avg             float64
variance_from_avg    float64
dtype: object
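
The `date` column is read as `object` (plain strings) in the output above. The following is a minimal sketch of parsing it as a date at load time using the `parse_dates` argument of `pd.read_csv`. The inline CSV text is a stand-in made from two rows of the test data so that the snippet runs without the file; the names `csv_text` and `parsed` are illustrative only.

```python
import io

import pandas as pd

# Inline stand-in for the test CSV (two rows copied from the table above).
csv_text = """index,week_no,year,date,counts,mort_avg,variance_from_avg
1798,30,2011,2011-07-29,8456,8737.4,0.967794
1803,35,2011,2011-09-02,7717,7984.6,0.966485
"""

# parse_dates converts the listed columns to a datetime dtype instead of object.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(parsed.dtypes)

# True when the column holds datetimes rather than strings.
date_is_datetime = pd.api.types.is_datetime64_any_dtype(parsed["date"])
print(date_is_datetime)
```

With a datetime column, the stored dates can be compared directly with the value returned by a date widget.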


#### View a sample of the test data frame

In [3]:
testData.head(n=1)

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg
0,1798,30,2011,2011-07-29,8456,8737.4,0.967794


#### Set up empty data frame for data collection

In [4]:
dfTofill = pd.DataFrame({'index': [0],# Integer
                   'week_no': [0], # Integer
                   'year': [0], # Integer
                   'date': [pd.Timestamp('20000101')], # Date
                   'counts': [0], # Integer
                   'mort_avg': [0.0], # Float
                   'variance_from_avg': [0.0], # Float
                   'consent': [False]}) # Boolean 

dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,0,0,2000-01-01,0,0.0,0,False
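
A collection template is only useful if its column types agree with the data it will be appended to. The sketch below shows one possible way to compare the two. The helper `dtype_mismatches` is hypothetical, the reference row is copied from the test data, and the template deliberately stores `counts` as a float purely to show what a mismatch looks like.

```python
import pandas as pd

# One reference row copied from the test data shown earlier.
reference = pd.DataFrame({
    "index": [1798], "week_no": [30], "year": [2011],
    "date": [pd.Timestamp("2011-07-29")],
    "counts": [8456], "mort_avg": [8737.4], "variance_from_avg": [0.967794],
})

# Illustrative template; 'counts' is deliberately a float to trigger a mismatch.
template = pd.DataFrame({
    "index": [0], "week_no": [0], "year": [0],
    "date": [pd.Timestamp("2000-01-01")],
    "counts": [0.0], "mort_avg": [0.0], "variance_from_avg": [0.0],
    "consent": [False],
})


def dtype_mismatches(template, reference):
    """Map each shared column whose dtypes differ to (template dtype, reference dtype)."""
    shared = [col for col in template.columns if col in reference.columns]
    return {
        col: (str(template[col].dtype), str(reference[col].dtype))
        for col in shared
        if template[col].dtype != reference[col].dtype
    }


mismatches = dtype_mismatches(template, reference)
print(mismatches)
```

An empty dictionary would mean every shared column has the same dtype in both frames; `consent` is ignored because it exists only in the template.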


#### Save the empty data frame 

In [252]:
#dfTofill.to_csv('../Data/collected_data.csv', index=False)

In [5]:
CollectData=pd.read_csv("../Data/collected_data.csv")
CollectData

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,1798,30,2011,2011-07-29,0,8737.4,0.967794,True
1,1803,35,2011,2011-09-02,7717,7984.6,0.966485,True
2,5324,19,2013,2013-05-10,8814,9096.0,0.968997,True
3,5326,21,2013,2013-05-24,9530,9311.2,1.023499,True
4,7121,48,2014,2014-11-28,9928,9398.8,1.056305,True
5,10722,27,2016,2016-07-08,9138,8872.2,1.029959,True
6,10742,47,2016,2016-11-25,10603,9572.0,1.10771,True
7,12473,10,2017,2017-03-10,11077,10816.0,1.024131,True
8,14259,28,2018,2018-07-13,9293,9018.0,1.030495,True
9,16021,22,2019,2019-05-31,8260,8125.0,1.016615,True


## Index number for each record (to be changed for each entry) 

In [6]:
index_number=16029 #Remember to change for each record.

In [7]:
dfTofill.iloc[0,0]=index_number
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,16029,0,0,2000-01-01,0,0.0,0,False
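
Because the index number is edited by hand for each record, it is easy to reuse a value by accident. The following is a small sketch of a guard against that, assuming index numbers must be unique, non-negative integers. `validate_new_index` is a hypothetical helper, and the list `existing` repeats the index values shown in the collected data above.

```python
def validate_new_index(index_number, existing_indices):
    """Return index_number if it is a non-negative int not already in use."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(index_number, bool) or not isinstance(index_number, int):
        raise TypeError("index_number must be an int")
    if index_number < 0:
        raise ValueError("index_number must not be negative")
    if index_number in set(existing_indices):
        raise ValueError(f"index {index_number} is already used")
    return index_number


# Index values already present in the collected data.
existing = [1798, 1803, 5324, 5326, 7121, 10722, 10742, 12473, 14259, 16021]

checked = validate_new_index(16029, existing)
print(checked)
```

In the notebook, the second argument could be taken from the `index` column of `CollectData`.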


#### Load the widgets and display packages

In [1]:
#Load the 'ipywidgets' package
import ipywidgets as widgets
#Load functions from widgets
from ipywidgets import VBox, Label, Layout
#Load the 'IPython.display' package
from IPython.display import display

## Week Number

In [2]:
b = widgets.BoundedIntText(
    value=1,
    min=1,
    max=52,
    step=1,
    description='Week Number:',
    style={'description_width': 'initial'},
    layout={'width': 'max-content'},
    disabled=False
)
display(b)

BoundedIntText(value=1, description='Week Number:', layout=Layout(width='max-content'), max=52, min=1, style=D…

In [3]:
dfTofill.iloc[0,1]=b.value
dfTofill

NameError: name 'dfTofill' is not defined

## Year

In [4]:
c = widgets.Dropdown(
    options=['2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024'],
    value='2022',
    description='Year:',
    disabled=False,
    layout={'width': '200px'})
display(c)

Dropdown(description='Year:', index=12, layout=Layout(width='200px'), options=('2010', '2011', '2012', '2013',…

In [10]:
dfTofill.iloc[0,2]=int(c.value) # Dropdown options are strings, so convert to an integer
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,2000-01-01,0,0.0,0,False


## Date

In [5]:
d = widgets.DatePicker(
    description='Friday of given week:',
    disabled=False,
    style={'description_width': 'initial'},
    layout={'width': 'max-content'})
display(d)

DatePicker(value=None, description='Friday of given week:', layout=Layout(width='max-content'), style=Descript…

In [12]:
dfTofill.iloc[0,3]=d.value
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,NaT,0,0.0,0,False
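
The `NaT` above appears because no date was picked before the value was stored. The sketch below shows one way to check a picked date before saving it. It assumes the week numbers follow ISO 8601 numbering, which the sample rows in the test data are consistent with but which the source does not state. `check_week_ending` is a hypothetical helper.

```python
from datetime import date


def check_week_ending(day, week_no, year):
    """Return a list of problems with the entry; an empty list means it is consistent."""
    if day is None:
        # DatePicker.value is None until the user selects a date.
        return ["no date selected"]
    problems = []
    if day.weekday() != 4:  # Monday is 0, so Friday is 4
        problems.append(f"{day.isoformat()} is not a Friday")
    iso_year, iso_week, _ = day.isocalendar()
    if (iso_year, iso_week) != (year, week_no):
        problems.append(
            f"{day.isoformat()} falls in week {iso_week} of {iso_year}, "
            f"not week {week_no} of {year}"
        )
    return problems


# First row of the test data: week 30 of 2011, dated 2011-07-29.
print(check_week_ending(date(2011, 7, 29), 30, 2011))
# No date picked yet.
print(check_week_ending(None, 1, 2022))
```

In the notebook the call might look like `check_week_ending(d.value, b.value, int(c.value))`.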


## Counts

In [6]:
e = widgets.IntText(
        value=0,
        description='Number of deaths in the preceding week:',
        disabled=False,
        style={'description_width': 'initial'},
        layout={'width': 'max-content'})
display(e)

IntText(value=0, description='Number of deaths in the preceding week:', layout=Layout(width='max-content'), st…

In [14]:
dfTofill.iloc[0,4]=e.value
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,NaT,0,0.0,0,False


## Average mortality 

In [7]:
f = widgets.FloatText(
        value=0.0,
        description='Average deaths for this week (5-years preceding):',
        disabled=False,     
        style={'description_width': 'initial'},
        layout={'width': 'max-content'})
display(f)

FloatText(value=0.0, description='Average deaths for this week (5-years preceding):', layout=Layout(width='max…

In [16]:
dfTofill.iloc[0,5]=f.value
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,NaT,0,0.0,0,False


## Variance from Average

In [8]:
g=widgets.FloatText(
    value=0.0,
    description='Actual/Average mortality (in decimal):',
    disabled=False,
    style={'description_width': 'initial'},
    layout={'width': 'max-content'}
)
display(g)

FloatText(value=0.0, description='Actual/Average mortality (in decimal):', layout=Layout(width='max-content'),…

In [18]:
dfTofill.iloc[0,6]=g.value
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,NaT,0,0.0,0,False
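
The widget label describes this field as actual mortality divided by average mortality. The sketch below computes that ratio from the two earlier fields so it can be checked against what is typed in. `variance_from_average` is a hypothetical helper, and rounding to six decimal places is an assumption based on the precision shown in the test data.

```python
def variance_from_average(counts, mort_avg, ndigits=6):
    """Return counts divided by mort_avg, rounded to ndigits decimal places."""
    if mort_avg <= 0:
        raise ValueError("mort_avg must be positive")
    return round(counts / mort_avg, ndigits)


# First row of the test data: 8456 deaths against an average of 8737.4.
ratio = variance_from_average(8456, 8737.4)
print(ratio)
```

In the notebook, `variance_from_average(e.value, f.value)` could be compared with `g.value` before the record is stored.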


## Consent

In [9]:
h = widgets.Checkbox(
    value=False,
    description='I consent for the data I have provided to be processed and shared in accordance with data <br>  protection regulations with the purpose of improving care service provision across the UK.',
    disabled=False, 
    style={'description_width': 'initial'},
    layout={'width': 'max-content'}
)
display(h)

Checkbox(value=False, description='I consent for the data I have provided to be processed and shared in accord…

In [20]:
dfTofill.iloc[0,7]=h.value
dfTofill

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,0,1,2022,NaT,0,0.0,0,False


# Concatenate the collected data to the CollectData data frame.   
Use the `concat()` function from the Python *pandas* package to append the `dfTofill` data frame to the `CollectData` data frame. `concat()` joins *pandas* objects along an axis; by default it stacks them row-wise.

In [23]:
# CollectData is the first data frame
# dfTofill is the second data frame
CollectData  = pd.concat([CollectData, dfTofill])
display(CollectData)

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,1798,30,2011,2011-07-29,0,8737.4,0.967794,True
1,1803,35,2011,2011-09-02,7717,7984.6,0.966485,True
2,5324,19,2013,2013-05-10,8814,9096.0,0.968997,True
3,5326,21,2013,2013-05-24,9530,9311.2,1.023499,True
4,7121,48,2014,2014-11-28,9928,9398.8,1.056305,True
5,10722,27,2016,2016-07-08,9138,8872.2,1.029959,True
6,10742,47,2016,2016-11-25,10603,9572.0,1.10771,True
7,12473,10,2017,2017-03-10,11077,10816.0,1.024131,True
8,14259,28,2018,2018-07-13,9293,9018.0,1.030495,True
9,16021,22,2019,2019-05-31,8260,8125.0,1.016615,True
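
The append step described above can be sketched on two small frames. By default `pd.concat` keeps each frame's own row labels, so the appended row would also be labelled `0`; passing `ignore_index=True` renumbers the rows instead. The frames `collected` and `new_row` are illustrative stand-ins for `CollectData` and `dfTofill`, and the values in `new_row` are made up for the example.

```python
import pandas as pd

# Stand-in for CollectData (two rows, three of the columns).
collected = pd.DataFrame({
    "index": [1798, 1803],
    "counts": [8456, 7717],
    "consent": [True, True],
})

# Stand-in for dfTofill; the values are illustrative only.
new_row = pd.DataFrame({
    "index": [16029],
    "counts": [9000],
    "consent": [True],
})

# ignore_index=True gives the result a fresh 0..n-1 row index.
combined = pd.concat([collected, new_row], ignore_index=True)
print(combined)
```

Without `ignore_index=True` the row labels would be `0, 1, 0`, which makes label-based selection such as `combined.loc[0]` ambiguous.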


### Check consent has been given

In [24]:
CollectData=CollectData[CollectData['consent'] == True]
display(CollectData)

Unnamed: 0,index,week_no,year,date,counts,mort_avg,variance_from_avg,consent
0,1798,30,2011,2011-07-29,0,8737.4,0.967794,True
1,1803,35,2011,2011-09-02,7717,7984.6,0.966485,True
2,5324,19,2013,2013-05-10,8814,9096.0,0.968997,True
3,5326,21,2013,2013-05-24,9530,9311.2,1.023499,True
4,7121,48,2014,2014-11-28,9928,9398.8,1.056305,True
5,10722,27,2016,2016-07-08,9138,8872.2,1.029959,True
6,10742,47,2016,2016-11-25,10603,9572.0,1.10771,True
7,12473,10,2017,2017-03-10,11077,10816.0,1.024131,True
8,14259,28,2018,2018-07-13,9293,9018.0,1.030495,True
9,16021,22,2019,2019-05-31,8260,8125.0,1.016615,True


### Save the CollectData data frame

In [273]:
CollectData.to_csv('../Data/collected_data.csv', index=False)

### Save the completed CollectData file to RawData

In [274]:
CollectData.to_csv('../RawData/collected_data_final.csv', index=False)

# Create the form

In [10]:
label_layout = Layout()

form_item_layout = Layout(
    display='flex',
    flex_flow='row',
    justify_content='space-between'
)

#form=widgets.VBox([a,b,c,d,e,f,g,h])
form=widgets.VBox([b,c,d,e,f,g,h], layout=Layout(
    display='flex',
    flex_flow='column',
    min_height='280px',
    margin= '20px',
    border='solid 1px',
    align_items='stretch',
    width='70%'
))



In [106]:
display(form)

VBox(children=(BoundedIntText(value=1, description='Week Number:', layout=Layout(width='max-content'), max=52,…
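
Once the widgets are grouped in a form, their values can be read in one step rather than cell by cell. The following is a minimal sketch of that idea, relying only on the `.value` attribute the widgets expose. `read_form` is a hypothetical helper, and `SimpleNamespace` objects stand in for the real widgets so that the snippet runs without a notebook front end; the example values are the last row of the test data.

```python
from datetime import date
from types import SimpleNamespace


def read_form(index_number, week, year, day, counts, mort_avg, variance, consent):
    """Collect the current .value of each widget-like object into one record."""
    return {
        "index": int(index_number),
        "week_no": int(week.value),
        "year": int(year.value),  # the Dropdown options are strings
        "date": day.value,        # None until a date is picked
        "counts": int(counts.value),
        "mort_avg": float(mort_avg.value),
        "variance_from_avg": float(variance.value),
        "consent": bool(consent.value),
    }


# Stand-ins exposing only `.value`; in the notebook b, c, d, e, f, g, h would be passed.
record = read_form(
    16021,
    SimpleNamespace(value=22),
    SimpleNamespace(value="2019"),
    SimpleNamespace(value=date(2019, 5, 31)),
    SimpleNamespace(value=8260),
    SimpleNamespace(value=8125.0),
    SimpleNamespace(value=1.016615),
    SimpleNamespace(value=True),
)
print(record)
```

The resulting dictionary has the same keys, in the same order, as the columns of the collection data frame, so it could be turned into a one-row frame with `pd.DataFrame([record])`.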

# Data driven decisions for efficient resource use in the NHS
We aim to help resource planners make more informed decisions so that the NHS’s limited resources get to the places that need them most, when they need them. We’re building a predictive model to support their planning decisions.<br>
To do this, we are researching how well the average mortality for a given week predicts actual mortality over time. We would like to understand how the seasons affect this predictability, and we invite you to participate by contributing data to this analysis. <br>
### Why is that important?
Predicting mortality is important for planning and resource allocation across the NHS. It can support both short-term and long-term planning: for example, predictions may be used to plan capacity for hospitals and mortuaries, or to anticipate GP patient numbers. If we can understand how predictable mortality rates are from different indicators (such as the average for the preceding 5 years), then we can build predictive models. <br>
### Data collection
Most of the data we use will be collected during the normal operation of the NHS. The data you provide here will augment this analysis and enable us to create a more robust predictive model. <br>
### How will the results be used?
The results will be combined with other analyses to develop a model for predicting mortality. This model will be shared with resource planners, including hospital administrators, via an interactive dashboard. If you would like access to the data dashboard, please email info@theresearchstudy.com

In [11]:
display(form)

VBox(children=(BoundedIntText(value=1, description='Week Number:', layout=Layout(width='max-content'), max=52,…

Thank you for contributing your data and for giving us your consent to process it and share it with teams across England. We will add your data to our [open data resource](https://github.com/Exam-B210533/B210533_assessment).