In [None]:
# Run this first! Be sure to watch the video to learn how. 

%matplotlib inline

from js import fetch

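async def get_csv(url):
    """Download the CSV file at `url` and save it locally as 'data.csv'."""
    # fetch comes from the browser's JavaScript runtime, so each call is awaited
    res = await fetch(url)
    text = await res.text()
    filename = 'data.csv'
    with open(filename, 'w') as f:
        f.write(text)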

# Visualizing Heart Failure

In this course you will learn how to interact with data using the Python programming language. 
The approach is a bit different from software engineering, development, or architecture. 
Instead, we work with programming languages interactively, usually one line at a time. 

Think of this process like any other task that you might need to do. 
Imagine you are making cookies; we can agree that there are multiple steps that you need to follow (in a specific order) to be successful. 

1. Gather ingredients: 
    - Flour
    - Water
    - Baking powder
    - Sugar
    - Butter
    - Chocolate chips
2. Preheat oven
3. Combine ingredients
4. Mix ingredients
5. Portion dough onto baking sheet
6. Place baking sheet in oven for 15 minutes
7. Remove from oven
8. Let the cookies cool down (optional)
9. Eat the cookies!

This simple recipe demonstrates the same concepts that we will use when interacting with data using code. 

- Order is important, but sometimes there is flexibility (e.g. you could have preheated the oven earlier or later)
- Subsequent steps often depend on the execution of previous steps
    - You need to combine and mix ingredients before they go on the baking sheet

Data scientists do the same thing!
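
The same dependency between steps shows up in code. Here is a minimal sketch (the variable names are made up for illustration and are not part of the assignment) in which each line uses the result of the line before it:

```python
# Step 1: gather ingredients (a plain Python list of names)
ingredients = ["flour", "sugar", "butter", "chocolate chips"]

# Step 2: combine them; this only works because step 1 already ran
dough = " + ".join(ingredients)

# Step 3: use the result of step 2
message = "Baking cookies made of: " + dough
print(message)
```

If you ran step 2 before step 1, Python would stop with a `NameError`, because `ingredients` would not exist yet.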

## This Environment

You are looking at a Jupyter notebook. 
Jupyter notebooks have two types of user input. 

1. Markdown (text that is intended for reading; you are reading text formatted with Markdown right now)
2. Code (Python in this case; when you run it, the computer reads it and does something)

Think of it as a way of talking to humans and computers in a single document. 
Pretty cool, right?

Assignments in this environment will have instructions and information, and eventually a callout that looks like this: 

<span style="color: blue; background-color: white">**TASK**: Do Something</span>

When you see that, follow the instructions underneath. 
You will be expected to download a PDF of your work (File > Download) when you're done. 
Watch the video that accompanies this assignment for a demo!

## Scenario

We want to learn about a cohort of patients with heart failure. 
Perhaps we want to see how age might relate to other clinical parameters, or how those parameters are correlated. 
Visualizing data can be one of the most powerful techniques for evaluating data quality and patterns. 

We will use Python to load and visualize a dataset with information about patients with heart failure. 
In this case, a dataset just means a text file with columns and rows (just like a spreadsheet).
Generally, we work with data where rows represent observations, and columns are variables. 

<span style="color: blue; background-color: white">**TASK**: Obtain the Data File</span>

The next cell is a "Code" cell. 
You can type Python code in the cell, press Shift + Enter, and it will tell your computer to run the code. 
To get started, you are going to tell your computer where the data file is located. 
In this case (and for this class in general), this will be a URL. 
We are going to pass a URL that points to the file we need to the `get_csv` function defined in the top cell. 
This function will then download that file to your computer and save it as `data.csv`, which will allow you to work with the data in that file. 

Go ahead and copy/paste the following code into the cell below, then press Shift + Enter.

```python

await get_csv("https://raw.githubusercontent.com/sadams-teaching/PGPM-503-ENV/main/data/heart_failure.csv")

```


## Reading Data with Pandas

Now that you've downloaded the data file, you need to read it into something called Pandas. 
In simplest terms, Pandas is a Python library (collection of functions) for reading and manipulating data. 

<span style="color: blue; background-color: white">**TASK**: Read "data.csv" into Pandas</span>

Copy and paste the following code into the code cell below: 

```python
import pandas as pd 

data = pd.read_csv("data.csv") 

data.head()
```
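
To make "rows are observations, columns are variables" concrete, here is a small sketch that builds a three-row data frame from text instead of from the downloaded file. `Sodium` and `Ejection.Fraction` match the column names used in this notebook; the `Age` column name and all of the numbers are invented for illustration and are not real patient data.

```python
import io
import pandas as pd

# Made-up CSV text: one header row, then one row per (hypothetical) patient
csv_text = """Age,Sodium,Ejection.Fraction
60,137,35
72,130,25
55,140,45
"""

# read_csv accepts a file path or a file-like object such as StringIO
example = pd.read_csv(io.StringIO(csv_text))

print(example.shape)          # (number of rows, number of columns)
print(list(example.columns))  # the variable names
print(example.head(2))        # the first two observations
```

The same three calls (`shape`, `columns`, `head`) work on `data` once you have read in the real file.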

## Scatterplots

A scatterplot is a fundamental method for visualizing the relationship between two continuous variables. 
From the cell above, we can see that our data frame has several variables from a cohort of patients with heart failure. 
Let's make a simple visualization for sodium and ejection fraction. 

<span style="color: blue; background-color: white">**TASK**: Make a Scatterplot with Sodium and Ejection Fraction</span>

Copy and paste the following code into the code cell below: 

```python

data.plot.scatter(x = "Sodium", y = "Ejection.Fraction")

```
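
`plot.scatter` returns a Matplotlib axes object, so you can add a title and clearer axis labels if you like. Here is a minimal sketch that uses a tiny invented data frame so it runs on its own (the values and the label text are assumptions, not taken from the real dataset):

```python
import pandas as pd

# Invented values for illustration only
example = pd.DataFrame({
    "Sodium": [137, 130, 140, 134],
    "Ejection.Fraction": [35, 25, 45, 30],
})

# The plot call returns the axes, which we can then customize
ax = example.plot.scatter(x="Sodium", y="Ejection.Fraction")
ax.set_title("Ejection fraction vs. sodium")
ax.set_xlabel("Sodium")
ax.set_ylabel("Ejection fraction")
```

To try this with the real data, replace `example` with `data`.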

## All Done!

When you are finished, save the ipynb file to your computer, then submit it through the Canvas assignment. 
