# Exploratory Data Analysis with the National Immunization Survey

Vaccinations are important for the health of both individuals and society as a whole. A number of factors play into when children are immunized against a variety of diseases: doctor recommendations, limitations set by insurance companies, and in some cases, parental preferences and scheduling logistics.

How do these factors play into when children are vaccinated? And how does the timing of a vaccine recommended at a certain age, like the measles, mumps, and rubella (MMR) vaccine, compare to when children receive vaccines for seasonal diseases like the flu?

Let's find out!

## 1. The National Immunization Survey

The National Immunization Survey is conducted annually by the US Centers for Disease Control and Prevention (CDC), and both the raw data and reports are published [on their website](https://www.cdc.gov/vaccines/imz-managers/nis/datasets.html). Here is a taste of what the dataset looks like.

Import the 2016 National Immunization Survey from `nis_immunization_2016.csv` and view the first few records.

- Import `pandas` as `pd` and `matplotlib.pyplot` as `plt`.
- Use the pandas function `read_csv()` to read `nis_immunization_2016.csv` into a pandas dataframe called `vac`.
- Use `print()` and `head()` to show the first few rows of the dataframe.

<hr>

## Good to know

You can find (and print out -- maybe even laminate!) a helpful [pandas cheatsheet](https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf) to remind yourself of some of the basic commands and workflows we'll be exploring here.

Also, this project builds on skills and terms introduced in [Intro to Python for Data Science](https://www.datacamp.com/courses/intro-to-python-for-data-science) and [Intermediate Python for Data Science](https://www.datacamp.com/courses/intermediate-python-for-data-science). Feel free to review the slides and exercises as you go through this project.

The `import ____ as ____` syntax lets you import functions and libraries under custom (usually shorter) names like `pd` and `plt`.
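
As a quick illustration of aliasing, here is a minimal sketch that uses the standard library's `statistics` module as a stand-in for `pandas`. The alias `stats` and the numbers in `ages` are made up for this example and are not part of the project.

```python
# Import a library under a shorter, custom name
import statistics as stats

# Made-up ages in months, for illustration only
ages = [12, 15, 18]

# Call the library's function through the alias
mean_age = stats.mean(ages)
print(mean_age)
```

The same pattern applies to `pandas` and `matplotlib.pyplot`: once a library is imported under an alias, every call goes through that alias.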

When calling a `pandas` function, prefix the function name with the library alias, and put the filename in quotes (single or double will do):

```
vac = pd.read_csv('filename.csv')
```

Remember the syntax difference between functions (like `print()`) and methods (like `head()`). To print the head of a dataframe `df`, use:

```
print(df.head())
```
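
To see the function/method difference in action without the survey file, here is a small sketch that reads a CSV from an in-memory string. The column names and values are hypothetical, invented only for this example, and do not reflect the layout of `nis_immunization_2016.csv`; the names `csv_text` and `toy` are also just placeholders.

```python
import io
import pandas as pd

# Hypothetical data, invented for illustration only;
# this is NOT the structure of the real survey file.
csv_text = """child_id,age_months,vaccine
1,12,MMR
2,15,MMR
3,7,Flu
4,24,Flu
5,13,MMR
6,9,Flu
"""

# read_csv() is a function that lives in pandas, so it is called via the alias pd.
# It accepts a file-like object as well as a filename.
toy = pd.read_csv(io.StringIO(csv_text))

# head() is a method, so it is called on the dataframe itself
first_rows = toy.head()    # first 5 rows by default
first_two = toy.head(2)    # pass a number to ask for a different row count

# print() is a plain function that wraps the method call
print(first_rows)
print(first_two)
```

With a real file you would pass the quoted filename to `pd.read_csv()` instead of the `io.StringIO` object.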

In [12]:
# import pandas as pd and matplotlib.pyplot as plt
....

# read nis_immunization_2016.csv in with pandas: vac
vac = ....

# print the first 5 records of vac with head()
....

In [13]:
# import pandas as pd and matplotlib.pyplot as plt
import pandas as pd
import matplotlib.pyplot as plt

# read nis_immunization_2016.csv in with pandas: vac
vac = pd.read_csv('nis_immunization_2016.csv')

# print the first 5 records of vac with head()
print(vac.head())

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


In [14]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 2. Here goes the title of the second task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [15]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 2 ...

In [16]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [17]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 3. Here goes the title of the third task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [18]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 3 ...

In [19]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [20]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 4. Here goes the title of the fourth task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [21]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 4 ...

In [22]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [23]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 5. Here goes the title of the fifth task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [24]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 5 ...

In [25]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [26]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 6. Here goes the title of the sixth task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [27]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 6 ...

In [28]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [29]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 7. Here goes the title of the seventh task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [30]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 7 ...

In [31]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [32]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


## 8. Here goes the title of the eighth task

Context / background / story / etc. This will show up in the student's notebook. It should at most have 800 characters and/or 3 paragraphs.

The task instructions start with a brief sentence framing the task.

- The specific task instructions go in a bullet point list.
- One bullet per sub task.
- At most 4 bullets.

<hr>

Give more info, context, and links to external documentation under the horizontal ruler. The instructions should at most have 800 characters.

This is a hint the student will get if they click the hint button at the bottom of the instructions. It's the last help the student can get, so make it helpful. 

```
# Feel free to include links to documentation.
print("And code snippets in the hint.")
```

In [33]:
# This is the sample code the student will see. It should
# consist of up to 10 lines of code and comments, and the
# student should have to complete at most 5 lines of code.

# Rule of thumb: Each bullet point in @instructions should
# correspond to a comment in the @sample_code

# Indicate missing code with ...
like_this = ...
# or when a line or more is required, like this:
# ... YOUR CODE FOR TASK 8 ...

In [34]:
# Your solution code. This won't be shown to the student.

# The @solution should mirror the corresponding @sample_code,
# but with the missing parts filled in.
like_this = 'missing part filled in'

# It should consist of up to 10 lines of code and comments 
# and take at most 5 seconds to execute on an average laptop.

In [35]:
%%nose

# one or more tests of the student's code.
# The @solution should pass the tests.
# The purpose of the tests is to try to catch common errors and to 
# give the student a hint on how to resolve these errors.

def test_example():
    assert like_this == 'missing part filled in', \
    'The student will see this message if the test fails'

1/1 tests passed


*The recommended number of tasks in a DataCamp Project is between 8 and 10, so feel free to add more if necessary. You can't have more than 12 tasks.*