![](_fig/labeled.jpg)

# Studio 4: How to Use Neural Networks with Python
In this studio you will learn the basics of building Artificial Neural Networks in Python. You will also learn how to compare different types of modeling techniques and find the best tool for the question you are trying to answer.<br>
<br>
Each *Python for Healthcare* studio has six sections:
1. **Objectives**
2. **Readings and Videos**
3. **Discussion**
4. **Analysis**
5. **Conclusion**
6. **Reflection**

Refer to the video below to learn how to use the *Python for Healthcare* studios.

["How to use Py4HC Notebooks"](https://www.youtube.com/watch?v=5fzBGgflXk8&t=4s)  

---

## Objectives
By the end of this course, our goal is for you to learn how to start using the Python Programming Language in health science applications. This includes:

- Understanding how computer code allows humans to talk to computers
- Learning the concrete process for using computer code and Python
- Becoming familiar with the process of writing in Python
- Identifying healthcare questions that can be answered with open source data science tools
- Experiencing how Python can be used to answer questions related to healthcare
- Increasing awareness of how Python can be used in a future career in healthcare

Keep these goals in mind as you go through the studios and the course. If you don't understand every part, that is OK. Use these resources to get familiar with the concepts.


## Readings and Videos
Before starting the **Discussion**, watch the following:

["How to Talk to Computers: Part 4"]()   
["Neural Network in 5 Minutes"](https://www.youtube.com/watch?v=bfmFfD2RIcg)

## Discussion
After completing the readings and videos above, answer the following questions with your team. 

1. There are many different types of models with different use cases. What are the important concepts to consider when trying to find a model to answer your question?
2. In the videos and readings, you can see that the mathematical concepts for Data Science can be very advanced, but the code is much simpler to write. Since you can't learn everything, what things do you have to understand about the math in order to use a model appropriately?
3. While many like advanced techniques because they sound cool, it is not uncommon for them to be used inappropriately. What kind of applications would you like to use for these advanced tools that could not be answered using a simpler method?

Be sure that everyone answers each question and responds to at least one answer from another team member. 

## Analysis
In each module, you will complete a live data analysis with your team. The goal of the studio is to give you hands-on experience writing Python code and using Data Science tools that are important for health science.<br>
<br>
In each **Analysis** section, there are four steps:
1. Setup Workspace
2. Process Data
3. Create Model
4. Display Results

Within each step of the **Analysis**, a header describes what the lines of code below it are used to do.<br>
<br>
As you complete the **Analysis** component of the studio, follow this video below to understand each of the steps in the code.

### Step 1: Setup Workspace
In the first step, you will assemble libraries, set your working directory, and import data. You will do this same process before every analysis. This is just like setting up your brushes, canvas, and easel for painting or putting your tools, parts, and bike in the repair stand.

![](_fig/E1_1_1.jpg)

![](_fig/E1_1_2.jpg)

#### Import Standard Libraries
These libraries are imported for every data science related Python script.
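
The studio does not list its standard libraries at this point. The sketch below is an assumed set, chosen because later cells use the `np` alias and pandas DataFrames; the actual list in your notebook may differ.

```python
# Assumed standard imports; the studio's exact list is not shown here.
import os              # file paths and the working directory
import numpy as np     # numerical arrays (the `np` alias appears later in this studio)
import pandas as pd    # DataFrames for tabular data

print(np.__version__, pd.__version__)   # confirm the libraries loaded
```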

#### Import Specific Libraries
These libraries are used for specific components of the script.

```python
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_curve
from sklearn.metrics import auc
from keras.models import Sequential
from keras.layers import Dense
```

#### Set Working Directory
This is the location of the folder on your device that holds all of your files. Once you set this, any file can be accessed by the relative location. 
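
As a minimal sketch of this step, the example below sets the working directory with `os.chdir`. The folder is a temporary stand-in and the file name is hypothetical; replace both with your own project folder and files.

```python
import os
import tempfile

# Hypothetical project folder: a temporary folder stands in for it here
# so the example runs on any device. Replace it with your own folder path.
project_dir = tempfile.mkdtemp()

os.chdir(project_dir)    # set the working directory
print(os.getcwd())       # confirm the current working directory

# Files can now be referenced by their location relative to this folder.
relative_path = os.path.join("_data", "example.csv")   # hypothetical file
print(os.path.abspath(relative_path))
```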

#### Import Data
In order to use data, you will often import from a `".csv"` file located in your directory and save it with a useful name. After importing, you can use `.info()` and `.head()` to quickly view the data. 
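
The import step might look like the sketch below. The rows are made up for illustration and are read from an in-memory string so the example runs without a file; only the `df_doh` name and the "Diabetes Mortality" column come from this studio. With a real file you would pass its relative path to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Made-up rows for illustration only. In the studio, the data comes from
# a ".csv" file in your working directory rather than from a string.
csv_text = io.StringIO(
    "County,Diabetes Mortality,Population\n"
    "A,1.2,1000\n"
    "B,2.5,2500\n"
    "C,,1800\n"
)
df_doh = pd.read_csv(csv_text)   # read the CSV into a DataFrame

df_doh.info()                    # column names, types, and non-missing counts
print(df_doh.head())             # first rows of the data
```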

### Step 2: Process Data
In the second step, you will modify the data frames that you imported in order to get them ready for whatever model you wish to create. You will need to make sure the data is correctly subset or joined, uses the correct shape and type, and has missing values resolved. This is just like kneading clay to remove air bubbles or cleaning the bike before you install new parts. 

![](_fig/E3_2_1.jpg)

![](_fig/E3_2_2.jpg)

#### Filter DataFrame
This will remove all columns except those specified.
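
One way to do this in pandas is `DataFrame.filter`, sketched below on made-up data. Apart from "Diabetes Mortality", the column names are hypothetical.

```python
import pandas as pd

# Made-up data; only "Diabetes Mortality" is a column name from this studio.
df_doh = pd.DataFrame({
    "County": ["A", "B", "C"],
    "Diabetes Mortality": [1.2, 2.5, 1.9],
    "Population": [1000, 2500, 1800],
})

# Keep only the listed columns; every other column is removed.
df_doh = df_doh.filter(items=["County", "Diabetes Mortality"])
print(df_doh.columns.tolist())
```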

#### Rename Columns
This will rename the selected columns and leave all other columns unchanged.
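
A minimal sketch using `DataFrame.rename` on made-up data follows. The new name `outcome` is a hypothetical choice.

```python
import pandas as pd

# Made-up data for illustration only.
df_doh = pd.DataFrame({
    "County": ["A", "B", "C"],
    "Diabetes Mortality": [1.2, 2.5, 1.9],
})

# Map old names to new names; columns not listed keep their names.
df_doh = df_doh.rename(columns={"Diabetes Mortality": "outcome"})
print(df_doh.columns.tolist())
```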

#### Describe Columns
This will provide descriptive statistics on a given column of a DataFrame. 

```python
df_doh["Diabetes Mortality"].describe()
```

#### Create Columns Based on Conditions
This will create a new column in a DataFrame based on given conditions of another column. This is commonly used to create a categorical version of a quantitative outcome.

```python
df_doh["train"] = np.where(df_doh["Diabetes Mortality"] > 2.125, 1, 0)
df_doh["test"] = np.where(df_doh["Diabetes Mortality"] > 1.44, 1, 0)
```

#### Drop Columns
This will drop selected columns from the DataFrame.
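
A minimal sketch using `DataFrame.drop` on made-up data follows. The "Population" column is hypothetical.

```python
import pandas as pd

# Made-up data for illustration only.
df_doh = pd.DataFrame({
    "County": ["A", "B", "C"],
    "Diabetes Mortality": [1.2, 2.5, 1.9],
    "Population": [1000, 2500, 1800],
})

# Remove the listed columns; every other column is kept.
df_doh = df_doh.drop(columns=["Population"])
print(df_doh.columns.tolist())
```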

#### Join DataFrames
This will join two pandas DataFrames along a column they both share with an identical name and data type. Common options:<br>
`how = "inner"` Keeps rows that appear in both<br>
`how = "outer"` Keeps every row from both<br>
`how = "left"` Keeps every row from the first (left) DataFrame and adds any matching rows from the second (including duplicates).<br>
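
The three options can be compared side by side with `pd.merge`, as in the sketch below. The two small DataFrames are made up so that each shares only some of its "County" values with the other.

```python
import pandas as pd

# Made-up data: counties B and C appear in both DataFrames.
df_left = pd.DataFrame({
    "County": ["A", "B", "C"],
    "Diabetes Mortality": [1.2, 2.5, 1.9],
})
df_right = pd.DataFrame({
    "County": ["B", "C", "D"],
    "Population": [2500, 1800, 900],
})

df_inner = pd.merge(df_left, df_right, on="County", how="inner")      # B, C
df_outer = pd.merge(df_left, df_right, on="County", how="outer")      # A, B, C, D
df_left_join = pd.merge(df_left, df_right, on="County", how="left")   # A, B, C

print(len(df_inner), len(df_outer), len(df_left_join))
```

Rows without a match in the other DataFrame are filled with missing values, which is one reason the next steps deal with missing data.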

#### Drop NA Values (with threshold)
This will drop all columns whose share of missing values is above a given threshold. Requiring at least 75% non-missing data is a commonly used threshold for machine learning. 
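
A minimal sketch of this step uses the `thresh` argument of `DataFrame.dropna`, which sets the minimum number of non-missing values a column needs in order to be kept. The data and column names are made up.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "complete":       [1.0, 2.0, 3.0, 4.0],            # 100% non-missing
    "mostly_present": [1.0, np.nan, 3.0, 4.0],         # 75% non-missing
    "mostly_missing": [np.nan, np.nan, np.nan, 4.0],   # 25% non-missing
})

# Keep only columns with at least 75% non-missing values.
min_non_missing = int(0.75 * len(df))    # 3 of the 4 rows
df_kept = df.dropna(axis=1, thresh=min_non_missing)
print(df_kept.columns.tolist())
```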

#### Impute Missing Values
One option for handling missing values in large datasets is to impute values using a mathematical process. This will impute missing values with the "median" value for each column.
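
A minimal sketch with scikit-learn's `SimpleImputer`, which this studio imports, is shown below. The data is made up, and wrapping the result back into a DataFrame is one possible choice rather than a required step.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Made-up data for illustration only.
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, 10.0],   # median of the observed values is 3.0
    "y": [2.0, 4.0, np.nan, 6.0],    # median of the observed values is 4.0
})

# Replace each missing value with the median of its own column.
imputer = SimpleImputer(missing_values=np.nan, strategy="median")
df_imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(df_imputed)
```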

#### Standard Scaling
With large data and complex algorithms, standard scaling helps improve the modeling process and provide better results. This will create a new DataFrame where all columns have a mean of 0 and a standard deviation of 1.
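
A minimal sketch with scikit-learn's `StandardScaler`, which this studio imports, is shown below on made-up data. The column names are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up data with columns on very different scales.
df = pd.DataFrame({
    "age": [20.0, 40.0, 60.0, 80.0],
    "income": [10.0, 20.0, 30.0, 100.0],
})

# Subtract each column's mean and divide by its standard deviation.
scaler = StandardScaler()
df_scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

print(df_scaled.mean().round(6))          # each column is centered on 0
print(df_scaled.std(ddof=0).round(6))     # each column has a standard deviation of 1
```

`StandardScaler` uses the population standard deviation, so `ddof=0` is passed to pandas when checking the result.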

#### Verify
At the end of each step, use `.info()` and `.head()` to quickly view the data.

### Step 3: Create Model
In this step you will create a model that provides meaningful information about the data. This can include a statistical test, a machine learning algorithm, or a neural network. This is similar to drawing a still life or writing a poem. 


![](_fig/E4_3_1.jpg)

#### Multi-Layer Perceptron
A Multi-Layer Perceptron is a simple type of artificial neural network that uses "dense" layers to learn how to predict an outcome. Neural networks pass predictors through a layer of multiple neurons, where each neuron multiplies its inputs by "weight" values and applies a function to the result. Each predictor is passed to every neuron, and a final set of predictions is made. Those predictions are compared to the actual values and an error is calculated. The network then passes the error backwards through the layers, a process called "backpropagation," to work out how much each weight contributed to it, and updates the weights to improve the model. One full pass through the training data is called an "epoch," and the neural network repeats this a given number of times so that its predictive ability improves. 
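
To make the mechanics concrete, the sketch below builds a tiny multi-layer perceptron with NumPy only. It is not the Keras `Sequential` and `Dense` code this studio uses; the data, layer sizes, activation functions, learning rate, and number of epochs are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 rows, 3 predictors, and a binary outcome.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden "dense" layer with 4 neurons, then a single output neuron.
W1 = rng.normal(scale=0.5, size=(3, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros((1, 1))

learning_rate = 0.1
losses = []

for epoch in range(300):
    # Forward pass: every predictor feeds every hidden neuron.
    hidden = np.tanh(X @ W1 + b1)
    pred = sigmoid(hidden @ W2 + b2)

    # Error: binary cross-entropy between predictions and actual values.
    eps = 1e-9
    loss = -np.mean(y * np.log(pred + eps) + (1 - y) * np.log(1 - pred + eps))
    losses.append(loss)

    # Backpropagation: pass the error backwards to get a gradient per weight.
    d_out = (pred - y) / len(X)
    d_W2 = hidden.T @ d_out
    d_b2 = d_out.sum(axis=0, keepdims=True)
    d_hidden = (d_out @ W2.T) * (1 - hidden ** 2)   # derivative of tanh
    d_W1 = X.T @ d_hidden
    d_b1 = d_hidden.sum(axis=0, keepdims=True)

    # Update the weights a small step against the gradient.
    W1 -= learning_rate * d_W1
    b1 -= learning_rate * d_b1
    W2 -= learning_rate * d_W2
    b2 -= learning_rate * d_b2

print(f"loss at first epoch: {losses[0]:.3f}")
print(f"loss at last epoch:  {losses[-1]:.3f}")
```

Because this sketch uses all rows in every update, each pass through the loop is one epoch. Libraries such as Keras handle the backpropagation and weight updates for you, so the studio code only needs to describe the layers and the number of epochs.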

### Step 4: Display Results
In this step you will create an informative visual that displays the results for others to see. This is similar to framing your finished painting or posting an image to social media with an informative caption. 

![](_fig/E4_4_1.jpg)

#### ROC Test
The Receiver Operating Characteristic (ROC) curve plots the true positive rate against the false positive rate of a given test across a range of decision thresholds. The area under the curve (AUC) allows for easy comparison of how well a predictive test provides meaningful results.
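
A minimal sketch with the `roc_curve` and `auc` functions this studio imports is shown below. The four outcomes and predicted scores are made up; in the studio they would come from the test data and the model's predictions.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up actual outcomes and predicted scores for illustration only.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# False positive rate and true positive rate at each decision threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the ROC curve: 0.5 is no better than chance, 1.0 is perfect.
roc_auc = auc(fpr, tpr)
print(roc_auc)   # 0.75 for these made-up values
```

Plotting `fpr` on the x-axis against `tpr` on the y-axis gives the ROC curve itself.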

## Conclusion
In your own words write the following:

1. This topic is important because...

Identify two peer reviewed scientific articles that have findings related to your first statement. Then write the following:

2. Other studies have found that...

Using the results of the analysis above, craft a simple conclusion with the following items:

3. It was hypothesized that...
4. Data was collected from...
5. The study found that...
6. This provides evidence that...

After writing each of these statements, assemble them into a paragraph in the shown order. Then do the following:

- Edit the paragraph to be coherent
- Provide a simple title
- Add references in an appropriate style

Now you have an abstract!

## Reflection
After completing the studio session please pick one or two of the following questions to discuss with your team. 

1. What was something new that you learned in this module?
2. What was something that you knew previously but heard differently in this module?
3. What was something that you understand better after this module?
4. What was something that is confusing after this module?

Each team member may select a different question, but be sure that everyone provides a reflection and responds to another's reflection.

---
After you have completed the studio, print the page as a PDF and save it on your local computer. 