## Revisiting `diabetes_num.csv`
You've already worked a bit with the `diabetes` and `diabetes_num` CSV files, but what else can you do with them now that you've learned the magic known as `numpy`? That's what you're going to find out in this challenge!

#### Note:
The directions for this challenge aren't all at the top. They're written step by step throughout, to get you better acquainted with the different things that `numpy` can do (which will ultimately make your life a lot easier)!

### STEP 1: Importing the necessary tools
First things first, import the necessary packages. After importing what you need, read in the diabetes CSV file. <br><br>
__Hint 1:__ There are two different ways that you can read in the file.<br>
__Hint 2:__ Make sure to use the `diabetes_num` CSV file, which contains only numeric values.

In [3]:
import numpy as np  # Using the np abbreviation will be helpful when you're repeatedly using this package later on!

In [7]:
data_file = "diabetes_num.csv"    # file name

# When reading in the file this way, remember to skip the header row: it holds column names, which
# np.genfromtxt can't convert to numbers and would read in as nan.
# Give the variable a useful name; you'll be using it often.
diabetes = np.genfromtxt(data_file, delimiter=",", skip_header=1)

In [9]:
# What does the data we read in look like? It's an array of arrays! (Multi-dimensional array!)
diabetes

array([[  2778.,    443.,    185., ...,     98.,     43.,     48.],
       [ 20313.,    404.,    206., ...,     88.,     38.,     39.],
       [ 12769.,    347.,    197., ...,     86.,     51.,     49.],
       ..., 
       [ 21357.,    122.,     82., ...,     80.,     41.,     45.],
       [ 15017.,    118.,     95., ...,     76.,     30.,     36.],
       [  1003.,     78.,     93., ...,     50.,     33.,     38.]])
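Hint 1 mentions two ways to read the file but doesn't name the second one. Here is a minimal sketch, assuming the other way is `np.loadtxt`. It uses a small made-up CSV held in memory (the column names and values are hypothetical, not taken from `diabetes_num.csv`) so that it runs without the real file:

```python
import io
import numpy as np

# Hypothetical three-column sample standing in for diabetes_num.csv.
sample_csv = "id,chol,glucose\n1,200,90\n2,180,85\n3,220,110\n"

# Option 1: genfromtxt skips the header row with skip_header.
from_genfromtxt = np.genfromtxt(io.StringIO(sample_csv), delimiter=",", skip_header=1)

# Option 2: loadtxt does the same job, but the keyword is skiprows.
from_loadtxt = np.loadtxt(io.StringIO(sample_csv), delimiter=",", skiprows=1)

# shape is (number of rows, number of columns)
print(from_genfromtxt.shape)

# Both readers should give the same two-dimensional array of floats.
print(np.array_equal(from_genfromtxt, from_loadtxt))
```

Checking `.shape` right after loading is a quick way to confirm that every row and column made it in.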

### STEP 2: Back to basics
If I asked you to find the average cholesterol value for all patients listed in the CSV file, as well as the highest and lowest values, WITHOUT using loops or building your own functions, do you think you could do it?<br><br>
Well, you can with the power of numpy!<br><br>
Cholesterol is at index 1 in every row (if you need to double check, just open the CSV file on your computer), so we'll be doing the math on that index of every row.

In [12]:
# First up, let's find the average cholesterol value for this data set.
# We want to use index 1 from every row (every array) within the multi-dimensional array that we named `diabetes`.
# How would we slice that? Plug that slice of the diabetes multi-dimensional array into numpy's mean function,
# which finds the average value for the values given.
np.mean(diabetes[:,1])

207.75661375661375
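If the `[:,1]` slice feels abstract, it can help to try it on an array small enough to check by eye. A sketch with made-up numbers (the `toy` array is hypothetical, not real patient data):

```python
import numpy as np

# A tiny stand-in for the diabetes array: 3 rows, 3 columns.
toy = np.array([[1.0, 200.0,  90.0],
                [2.0, 180.0,  85.0],
                [3.0, 220.0, 110.0]])

# ":" means "every row", and 1 means "column index 1".
second_column = toy[:, 1]
print(second_column)

# The mean of 200, 180 and 220 is 600 / 3.
column_mean = np.mean(second_column)
print(column_mean)
```

The slice returns a one-dimensional array holding just that column, which is why it can go straight into `np.mean()`.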

In [13]:
# Does the value found by np.mean() make sense?
# You could double check the math by hand or by writing your own function, but let's "cheat" and find
# the max and min values for the cholesterol and use reasoning to see if this average makes sense.
# Hint: Use the same slicing from above and plug it into np.min() and np.max() to find the min and max values
np.min(diabetes[:,1])

78.0

In [14]:
np.max(diabetes[:,1])

443.0
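The average of 207.76 sits between the minimum of 78.0 and the maximum of 443.0, so it passes the reasoning check. The same check can be run on every column at once with the `axis` argument. A minimal sketch on a made-up array (hypothetical values, not the real data):

```python
import numpy as np

# Hypothetical stand-in for the diabetes array.
toy = np.array([[1.0, 200.0,  90.0],
                [2.0, 180.0,  85.0],
                [3.0, 220.0, 110.0]])

# axis=0 collapses the rows, leaving one result per column.
col_min = toy.min(axis=0)
col_mean = toy.mean(axis=0)
col_max = toy.max(axis=0)
print(col_min, col_mean, col_max)

# A mean can never fall outside its column's min and max.
means_in_range = bool(np.all((col_min <= col_mean) & (col_mean <= col_max)))
print(means_in_range)
```

Still no loops and no hand-built functions: the comparison operators work element by element on whole arrays.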

### STEP 3: On your own!
Told you that you could do it! Now go back to the [cheatsheet](https://www.dataquest.io/blog/large_files/numpy-cheat-sheet.pdf) given to you and pick out some other functions that you want to try. Try them out on other columns. Do the results make sense? If you don't remember what the columns are, just open up the CSV and check the headers. Get stuck or need help? Just ask! :)
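Not sure where to start? Two functions that follow the same pattern as `np.mean()` are `np.median()` and `np.std()`. These are just suggestions, and the array below is made up (hypothetical values) so you can verify the answers by hand before trying the real columns:

```python
import numpy as np

# Hypothetical column of values, small enough to check by hand.
column = np.array([200.0, 180.0, 220.0])

# Median: the middle value once the numbers are sorted (180, 200, 220).
column_median = np.median(column)
print(column_median)

# Standard deviation: how far the values typically sit from the mean.
column_std = np.std(column)
print(column_std)
```

On the real data, swap `column` for a slice such as `diabetes[:,1]`.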