# AI@University - Hacking Competition

For this hacking competition, you have **2 hours** to solve the **Maternity Ward** case study in teams of **4-5 students**. During that time you have to implement your **technical solution** and prepare a short **5-minute presentation** of your results, so it is recommended to split the work among your team members accordingly.

Before the workshop, please make sure that you have installed Python 3 and the Jupyter Notebook environment, preferably using the [Anaconda distribution](https://www.anaconda.com/download/#macos), as it already contains a set of useful data science libraries such as pandas, numpy and scikit-learn.

**Note**:
* When this file is downloaded on macOS, it is automatically converted to a text file. To open it as a Jupyter Notebook, select the file and press `command` + `i`. In the info window that opens, delete the file extension `.txt`.

# Case Study: Thomas J. Watson Hospital - Maternity Ward
After delivering a successful project for the oncology department of the **Thomas J. Watson Hospital** in **Berlin**, you have been hired as a consulting team of Data Scientists by the maternity ward to deliver another consulting project.

The hospital is facing challenges in treating the increasing number of pregnant women in Berlin, as the hospital's
staff is either decreasing (paramedical staff) or only slightly increasing (physicians). The hospital's board of directors fears that the increasing birth rate and the lack of personnel might reduce the time and care its employees can give to the young families. Its patients' health is its highest priority.

Therefore, the Thomas J. Watson hospital is looking for ways to support its personnel in ensuring that the best 
possible care is given to the soon-to-be mothers and their children.

You have been hired to find a solution that assists the physicians and paramedics in determining the health condition
of the fetus and its mother. To accomplish your goal, you have been provided with cardiotocography data that enables
you to predict whether the fetal health condition is **normal** or **pathologic** (Hint: the dataset might contain more classes on the health condition, which the hospital is not interested in).

You should present your results to the hospital board this coming Friday. Remember to present your findings in a way
that addresses both business and technical stakeholders.

### Data Dictionary
The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC) features on cardiotocograms classified by expert obstetricians.

Attribute|Description
---|---
LB|FHR baseline (beats per minute)
AC|Number of accelerations per second
FM|Number of fetal movements per second
UC|Number of uterine contractions per second
ASTV|Percentage of time with abnormal short term variability
mSTV|Mean value of short term variability 
ALTV|Percentage of time with abnormal long term variability 
mLTV|Mean value of long term variability
DL|Number of light decelerations per second
DS|Number of severe decelerations per second
DP|Number of prolonged decelerations per second
Width|Width of Fetal Heart Rate Histogram
Min|Minimum of Fetal Heart Rate Histogram
Max|Maximum of Fetal Heart Rate Histogram
Nmax|Number of Histogram peaks
Nzeros|Number of Histogram zeros
Mode|Mode of Fetal Heart Rate Histogram
Mean|Mean of Fetal Heart Rate Histogram
Median|Median of Fetal Heart Rate Histogram
Variance|Variance of Fetal Heart Rate Histogram
Tendency|Tendency of Fetal Heart Rate Histogram: -1=left asymmetric, 0=symmetric, 1=right asymmetric
NSP|Label: Normal=1, Suspect=2, Pathologic=3
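
The hint in the case study (the hospital only cares about **normal** versus **pathologic**) can be turned into a binary target as in the following minimal sketch. The small DataFrame is a made-up stand-in for the real data, and the column name `pathologic` is a hypothetical choice. Dropping the Suspect rows is one possible reading of the hint, not a prescribed step.

```python
import pandas as pd

# Hypothetical stand-in for the real cardiotocography data: two feature
# columns from the data dictionary plus the label, with made-up values.
ctg = pd.DataFrame({
    "LB":   [120, 132, 134, 128, 150],
    "ASTV": [73, 17, 26, 86, 61],
    "NSP":  [2, 1, 1, 3, 3],
})

# Keep only the classes the hospital is interested in:
# Normal (NSP == 1) and Pathologic (NSP == 3).
binary = ctg[ctg["NSP"].isin([1, 3])].copy()

# Recode the label: 1 means pathologic, 0 means normal.
binary["pathologic"] = (binary["NSP"] == 3).astype(int)

X = binary.drop(columns=["NSP", "pathologic"])  # feature matrix
y = binary["pathologic"]                        # binary target

print(y.value_counts().sort_index())
```

If your predictions have to be reported in the original NSP coding, remember to map the binary output back to `1` and `3` before submitting.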

### Medical Background
To ensure fetal and maternal health during pregnancy, **cardiotocography**, the measurement of **fetal heart rate** 
(FHR) and **uterine contractions** (UC), is used to identify pathologic health conditions early on. Currently, 
FHR and UC data are analyzed manually by a physician or paramedic, which leaves room for error. FHR and UC are 
strong indicators for certain critical health conditions, like a lack of oxygen, prematurity or growth restrictions, 
that can lead to impairment and even death of the fetus. Cardiotocography is used both during pregnancy and during 
delivery of the child.

# Evaluation
### Technical Solution (50%)
Your technical solution will be evaluated based on the **accuracy** of your predictions on the hold-out validation set, i.e. the fraction of predictions that match the true label.

![equation](https://latex.codecogs.com/png.latex?%5Cdpi%7B300%7D%20%5Ctiny%20%5Ctext%7Baccuracy%7D%28y%2C%20%5Chat%7By%7D%29%20%3D%20%5Cfrac%7B1%7D%7Bn%7D%20%5Csum_%7Bi%3D1%7D%5E%7Bn%7D%201%28%5Chat%7By%7D_i%20%3D%20y_i%29)
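
As a minimal sketch, the accuracy formula above can be computed with numpy as follows. The function name `accuracy` and the example labels are made up for illustration; `sklearn.metrics.accuracy_score` computes the same quantity.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # The indicator 1(y_hat_i == y_i) becomes a boolean array; its mean
    # is the share of correct predictions.
    return float(np.mean(y_true == y_pred))

# Made-up labels in the NSP coding (1 = normal, 3 = pathologic).
y_true = [1, 1, 3, 3, 1]
y_pred = [1, 3, 3, 3, 1]
print(accuracy(y_true, y_pred))  # 4 of 5 correct -> 0.8
```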

### Final Presentation (50%)
You will present your results to the key stakeholders of the hospital. Your presentation should cover
+ Understanding the business problem
+ Approach taken
+ Interpretation of the key results

# Submitting Your Results
Please store your predictions on the validation data as a comma-separated file, and name that file after your team (`YOUR_TEAM_NAME.csv`).

To create a CSV file from an array of predictions (`y_pred`), you can use the following `pandas` call:
```
import pandas
pandas.Series(y_pred).to_csv('YOUR_TEAM_NAME.csv', sep=',', index=False)
```
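
Before uploading, it can be worth reading the file back to check that it contains exactly one prediction per row. The sketch below uses made-up predictions and writes to a temporary directory. Note that, depending on the pandas version, `Series.to_csv` may write a header row by default; passing `header=False` here is an assumption, since this notebook does not state whether a header is expected.

```python
import os
import tempfile

import pandas as pd

# Made-up predictions; in the competition this would be your model's output.
y_pred = [1, 3, 1, 1, 3]

# A temporary directory keeps this sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "YOUR_TEAM_NAME.csv")

    # Assumption: no header row, so the file has one prediction per line.
    pd.Series(y_pred).to_csv(path, sep=",", index=False, header=False)

    # Read the file back and compare it with the original predictions.
    restored = pd.read_csv(path, header=None)[0].tolist()

print(restored == y_pred)
```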

Subsequently, please upload the file to the following cloud share: https://ibm.biz/aiatuni_submission

**Author**: Daniel Jaeck, Data Scientist at IBM (daniel.jaeck@de.ibm.com)

Copyright © IBM Corp. 2018. This notebook and its source code are released under the terms of the MIT License.