# B202280_Assessment  
**Author details:** *Author:* B202280.   
**Notebook and data info:** This Notebook provides an example of using interactive Jupyter widgets to collect the NHS England accident and emergency attendances (ae_attendances) data (my test data), saving it to my working 'Data' folder, and finally saving all the captured test data to my 'RawData' folder.  
**Data:** The data consists of date and numerical variables from the NHSRdatasets package.  
**Copyright statement:** This Notebook is the product of The University of Edinburgh.  




### The *pandas* package
To import the data, I will load the *pandas* package. The Python *pandas* package is used for data manipulation and analysis.

In [1]:
#Load the 'pandas' package
import pandas as pd
testData=pd.read_csv("../Data/ae_attendances_test.csv")
testData

Unnamed: 0,index,period,attendances
0,1155,2016-12-01,200
1,2059,2016-10-01,6452
2,3468,2016-05-01,417
3,4153,2018-03-01,9376
4,4820,2018-02-01,245
5,7243,2017-07-01,5170
6,8057,2017-04-01,15957
7,8957,2019-02-01,7258
8,10214,2018-10-01,3197
9,10328,2018-10-01,2033


#### Data type
I will now check the data types in the testData data frame. I will use the `dtypes` attribute from the Python *pandas* package, which returns the data type of each column in the data frame.

In [2]:
result = testData.dtypes
print("Output:")
print(result)

Output:
index           int64
period         object
attendances     int64
dtype: object
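As a minimal sketch of what the `dtypes` check tells us, the example below builds a small stand-in data frame (the name `sample` and its values are only illustrative, copied from the first two rows shown above) and converts the string dates to a datetime type with `pd.to_datetime()`. This is one possible approach, not a required step in this Notebook.

```python
import pandas as pd

# Illustrative stand-in for the test data (not read from file)
sample = pd.DataFrame({
    "index": [1155, 2059],
    "period": ["2016-12-01", "2016-10-01"],
    "attendances": [200, 6452],
})

# `dtypes` is an attribute, so it is accessed without parentheses
print(sample.dtypes)

# Convert the string dates to pandas datetime values
sample["period"] = pd.to_datetime(sample["period"], format="%Y-%m-%d")
print(sample.dtypes["period"])
```

After the conversion, the period column holds datetime values rather than strings, which makes date comparisons and sorting more reliable.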


Now I will look at the rows I want to collect from the test data, 
using the `df.head()` function to see the first two rows in the data frame (df).


In [31]:
testData.head(n=2)

Unnamed: 0,index,period,attendances
0,1155,2016-12-01,200
1,2059,2016-10-01,6452


We need to set up an empty data frame in the working data folder to collect the data captured by the Jupyter widgets.

In [33]:
dfTofill = pd.DataFrame({'index': [0],# Integer
                   'period': [pd.Timestamp('20000101')], # Date
                   'attendances': [0], # Integer
                   'consent': [False]}) # Boolean 

dfTofill

Unnamed: 0,index,period,attendances,consent
0,0,2000-01-01,0,False


Save the empty data frame to your working 'Data' folder:

In [34]:
#dfTofill.to_csv('../Data/CollectedData.csv', index=False)

The empty data frame is now saved to the working 'Data' folder. Now make sure to comment out the last cell (Ctrl+/), as you only need to do this once. Now let's read in the empty data frame to collect the data from the Jupyter-widgets.

In [35]:
CollectData=pd.read_csv("../Data/CollectedData.csv")
CollectData

Unnamed: 0,index,period,attendances,consent
0,0,2000-01-01,0,False
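Instead of commenting the save cell in and out by hand, the create-once step could be guarded with a file check. The sketch below is only one possible approach: `ensure_collection_file` is a hypothetical helper name, and it is demonstrated in a temporary folder rather than in the real 'Data' folder.

```python
import os
import tempfile

import pandas as pd


def ensure_collection_file(path):
    """Create the empty collection CSV only if it does not exist yet."""
    if not os.path.exists(path):
        template = pd.DataFrame({
            "index": [0],                            # Integer
            "period": [pd.Timestamp("20000101")],    # Date
            "attendances": [0],                      # Integer
            "consent": [False],                      # Boolean
        })
        template.to_csv(path, index=False)
    return pd.read_csv(path)


# Demonstrated in a temporary folder so that no real data is touched
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "CollectedData.csv")
    first = ensure_collection_file(csv_path)   # creates the file
    second = ensure_collection_file(csv_path)  # reuses the existing file
    print(list(first.columns))
```

With a guard like this, re-running the Notebook would not overwrite records that have already been collected.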


Now let us collect a row of data from the test data. 
Use the `df.head()` function to see the first two rows in the data frame (df).

##### The `head()` function
The `head()` function lets you look at the top n rows of a data frame. By default, it shows the first five rows in a data frame. We can specify the number of rows we want to see in a data frame with the argument “n”. For example, look at the first two rows (n=2) of the test data:

In [36]:
testData.head(n=2)

Unnamed: 0,index,period,attendances
0,1155,2016-12-01,200
1,2059,2016-10-01,6452
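To illustrate how the “n” argument changes what `head()` returns, here is a small sketch on a stand-in data frame (`demo` is an illustrative name, and the values are taken from the attendances column shown earlier).

```python
import pandas as pd

# Illustrative stand-in with six rows
demo = pd.DataFrame({"attendances": [200, 6452, 417, 9376, 245, 5170]})

first_five = demo.head()     # default: the first five rows
first_row = demo.head(n=1)   # only the first row
first_two = demo.head(n=2)   # the first two rows

print(len(first_five), len(first_row), len(first_two))
```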


# Index variable 
The first variable contains the index number, which allows us to connect the test data to the original data set "../RawData/ae_attendances.csv". I will use indexing to add the index number to the 'dfTofill' data frame.


In [40]:
index_number=2059 #Remember to change for each record.
dfTofill.iloc[0,0]=index_number
dfTofill

Unnamed: 0,index,period,attendances,consent
0,2059,2000-01-01,0,False
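The index number is what links a collected record back to the original data. The sketch below shows one way such a lookup could work, assuming the raw data has an `index` column; `raw` is a hypothetical stand-in built in memory, and the real file is likely to contain more columns.

```python
import pandas as pd

# Hypothetical stand-in for the raw data (not read from file)
raw = pd.DataFrame({
    "index": [1155, 2059, 3468],
    "period": ["2016-12-01", "2016-10-01", "2016-05-01"],
    "attendances": [200, 6452, 417],
})

index_number = 2059

# Keep only the rows whose index value matches the record being collected
match = raw[raw["index"] == index_number]
print(match)
```

A lookup like this can be used to check that the values typed into the widgets agree with the original record.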


In [41]:
#Load the 'ipywidgets' package
import ipywidgets as widgets

### `display()`

The *IPython.display* package is used to display different objects in Jupyter. 
I can also explicitly display a widget using the `display()` function from the *IPython.display* package.

In [42]:
#Load the 'IPython.display' package
from IPython.display import display

# Consent
Consent is a vital area for data protection compliance. Consent means giving data subjects genuine choice and control over how you process their data. If the data subject has no real choice, consent is not freely given, and it will be invalid. The [General Data Protection Regulation](https://eu01.alma.exlibrisgroup.com/leganto/public/44UOE_INST/citation/37632538310002466?auth=SAML) sets a high standard for consent and contains significantly more detail than previous data protection legislation. Consent is defined in Article 4 as: “Consent of the data subject means any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her”.

Before we collect any data, we need to get consent from the end-user to process and share the data we will collect with the data capture tool.

## Boolean widgets
Boolean widgets are designed to display a boolean value.

### Checkbox widget

In [43]:
a = widgets.Checkbox(
    value=False,
    description='I consent for the data I have provided to be processed and shared in accordance with data protection regulations with the purpose of improving care service provision across the UK.',
    disabled=False
)

In [45]:
display(a)

Checkbox(value=True, description='I consent for the data I have provided to be processed and shared in accorda…

In [46]:
dfTofill.iloc[0,3]=a.value
dfTofill

Unnamed: 0,index,period,attendances,consent
0,2059,2000-01-01,0,True


# The period variable  
The period variable includes the month this activity relates to, stored as a date (1st of each month).  

#### Data type
I now need to check the data type of the period variable in the testData data frame. The `dtypes` attribute from the Python *pandas* package returns the data type of each column, and I stored its output in `result` earlier, so I can look up the entry for the period column.

In [19]:
print(result['period'])  # look up the dtype by column name rather than by position
#String data type

object


The `object` data type here means that the period values are stored as strings.

##### The `head()` function
The `head()` function lets me look at the top n rows of a data frame. By default, it shows the first five rows in a data frame. We can specify the number of rows we want to see in a data frame with the argument “n”. For example, look at the first two rows (n=2) of the test data:

In [47]:
testData.head(n=2)

Unnamed: 0,index,period,attendances
0,1155,2016-12-01,200
1,2059,2016-10-01,6452


### DatePicker widget 
I next need to set up a DatePicker widget to collect the period data.

In [49]:
b = widgets.DatePicker(
    description='Period',
    disabled=False
)
display(b)

DatePicker(value=None, description='Period')

In [22]:
dfTofill.iloc[0,1]=b.value
dfTofill

Unnamed: 0,index,period,attendances,consent
0,1155,2016-12-01,0,True
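The DatePicker's value is `None` until a date has been picked, and a Python `datetime.date` afterwards. A minimal sketch of handling both cases before storing the value is shown below; `to_period` is a hypothetical helper name, and `picked` stands in for `b.value`.

```python
import datetime

import pandas as pd


def to_period(picked):
    """Convert a picked date (or None) to a pandas Timestamp (or NaT)."""
    if picked is None:
        return pd.NaT  # nothing selected yet
    return pd.Timestamp(picked)


picked = datetime.date(2016, 10, 1)  # stand-in for `b.value`
print(to_period(picked))
print(to_period(None))
```

Converting to a `Timestamp` keeps the period column in a consistent date type, and the `NaT` case makes an unfilled DatePicker easy to spot before saving.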


# The attendances variable
The attendances variable includes the number of attendances for this department type at this organisation for this month.

#### Data type
We now need to check the data type of the attendances variable in the testData data frame. The `dtypes` attribute from the Python *pandas* package returns the data type of each column; the output earlier in this Notebook shows that attendances is stored as `int64`.

##### The `head()` function
The `head()` function lets you look at the top n rows of a data frame. By default, it shows the first five rows in a data frame. We can specify the number of rows we want to see in a data frame with the argument “n”. For example, look at the first two rows (n=2) of the test data:

In [50]:
testData.head(n=2)

Unnamed: 0,index,period,attendances
0,1155,2016-12-01,200
1,2059,2016-10-01,6452


## Numeric widgets
There are many widgets distributed with ipywidgets that are designed to display numeric values. Widgets exist for displaying integers and floats, both bounded and unbounded. The integer widgets share a similar naming scheme to their floating point counterparts. By replacing Float with Int in the widget name, you can find the Integer equivalent.

### IntText

In [53]:
e=widgets.IntText(
    value=0,
    description='Attendances:',
    disabled=False)
display(e)

IntText(value=0, description='Attendances:')

In [54]:
dfTofill.iloc[0,2]=e.value
dfTofill

Unnamed: 0,index,period,attendances,consent
0,2059,2000-01-01,6452,True


# Concatenating the collected data to the CollectData data frame.   
Let us use the `concat()` function from the Python *pandas* package to append the CollectData and dfTofill data frames. The concat() function is used to concatenate *pandas* objects.

In [55]:
# CollectData is the first data frame
# dfTofill is the second data frame
CollectData  = pd.concat([CollectData, dfTofill])
display(CollectData)

Unnamed: 0,index,period,attendances,consent
0,0,2000-01-01,0,False
0,2059,2000-01-01 00:00:00,6452,True
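Note that both rows in the output above carry the row label 0, because `concat()` keeps the original labels of each data frame by default. If unique row labels are preferred, the `ignore_index=True` argument renumbers them. The sketch below uses small stand-in data frames (`collected` and `new_record` are illustrative names).

```python
import pandas as pd

# Illustrative stand-ins for CollectData and dfTofill
collected = pd.DataFrame({"index": [0], "attendances": [0], "consent": [False]})
new_record = pd.DataFrame({"index": [2059], "attendances": [6452], "consent": [True]})

# ignore_index=True renumbers the rows 0, 1, ... in the combined data frame
combined = pd.concat([collected, new_record], ignore_index=True)
print(combined.index.tolist())  # [0, 1]
```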


## Do you have consent to process and share the data before you save it to the working data folder?

Before I save the data to file, I must make sure I have consent to do so. The following line of code keeps only the rows where consent has been given.

In [29]:
CollectData=CollectData[CollectData['consent'] == True]
display(CollectData)

Unnamed: 0,index,period,attendances,consent
0,1155,2016-12-01,200,True
0,1155,2016-12-01,200,True
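The consent filter works by building a boolean mask and keeping only the rows where the mask is `True`. The sketch below shows the idea on a stand-in data frame (`records` is an illustrative name) and assumes the consent column holds boolean values.

```python
import pandas as pd

# Illustrative stand-in: one placeholder row without consent, two with consent
records = pd.DataFrame({
    "index": [0, 2059, 1155],
    "consent": [False, True, True],
})

# The boolean column acts as a mask: only rows with consent are kept
consented = records[records["consent"]]
print(consented["index"].tolist())  # [2059, 1155]
```

The placeholder row from the empty data frame has `consent` set to `False`, so it is dropped by the same filter.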


### Saving the CollectData data frame
Saving the data collected by your data-capture tool to the working data folder:

In [28]:
CollectData.to_csv('../Data/CollectedData.csv', index=False)

That is the CollectData data frame saved to the working 'Data' folder. You need to iterate through this Notebook until you have collected all of your test data and then save the captured test data to your 'RawData' folder.

In [29]:
#CollectData.to_csv('../RawData/CollectedDataFinal.csv', index=False)

That is the final CollectData data frame saved to the 'RawData' folder. 

I hope these examples help you to improve your Python programming skills. Happy Coding!