In this exercise, you will use your new knowledge to propose a solution to a real-world scenario. To succeed, you will need to import data into Python, answer questions using the data, and generate line plots to understand patterns in the data.

## Scenario

You have recently been hired to manage the museums in the City of Los Angeles. Your first project focuses on the four museums pictured in the images below.

<img src="images/ex1.png">

You will leverage data from the city's [data portal](https://data.lacity.org/) that tracks the number of visitors to each museum, by month.

<img src="images/ex1_xlsx.png">


## Setup

Run the next cell to import and configure the Python libraries that you need to complete the exercise.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

The questions below will give you feedback on your work. Run the following cell to set up the feedback system.

In [None]:
# Set up code checking
from learntools.core import binder
binder.bind(globals())
from learntools.data_viz_easy.ex1 import *
print("Setup Complete")

## Step 1: Load the data

Read the LA Museum Visitors data file into a DataFrame called `museum_data`.
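
As a rough illustration of loading a date-indexed CSV, the sketch below reads a small made-up table from an in-memory string. The column names and numbers are invented for the example and are not taken from the museum file; `index_col` and `parse_dates` are standard `pd.read_csv` parameters.

```python
import io
import pandas as pd

# Hypothetical toy CSV (made-up values, not the museum data)
toy_csv = io.StringIO(
    "date,site_a,site_b\n"
    "2020-01-01,10,20\n"
    "2020-02-01,15,25\n"
)

# Use the "date" column as the row labels and parse it as dates
toy_data = pd.read_csv(toy_csv, index_col="date", parse_dates=True)

print(toy_data.index.dtype)
print(toy_data)
```

Parsing the index as dates is what lets plotting libraries treat the x-axis as time rather than as plain text labels.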

In [None]:
# Path of the file to read
museum_file_path = '../input/museum_visitors.csv'

# Fill in the line below to read the file into a variable museum_data
museum_data = ____

# Run the line below with no changes to check that you've loaded the data correctly
step_1.check()

In [None]:
#%%RM_IF(PROD)%%
museum_data = pd.read_csv(museum_file_path, index_col="time", parse_dates=True)
step_1.assert_check_passed()

In [None]:
# Lines below will give you a hint or solution code
#_COMMENT_IF(PROD)_
step_1.hint()
#_COMMENT_IF(PROD)_
step_1.solution()

## Step 2: Review the data

Print the first 5 rows of the data.

In [None]:
# Print the first five rows of the data 

# REMOVE THIS COMMENT AND WRITE YOUR CODE HERE

Use the first 5 rows of the data to answer the questions below.
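
Reading individual values off a table can be sketched as follows. This is a minimal example on a made-up DataFrame (the `site_a` and `site_b` columns and their values are hypothetical), showing `head()` for a preview and `.loc` for looking up a cell by row label and column name.

```python
import pandas as pd

# Hypothetical toy table (made-up values, not the museum data)
toy_data = pd.DataFrame(
    {"site_a": [10, 15, 30], "site_b": [20, 25, 5]},
    index=pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
)

# Show the first rows (head() returns up to five by default)
print(toy_data.head())

# Look up one cell by row label and column name
site_a_feb = toy_data.loc[pd.Timestamp("2020-02-01"), "site_a"]

# Difference between two columns in the same row
march = toy_data.loc[pd.Timestamp("2020-03-01")]
a_over_b_mar = march["site_a"] - march["site_b"]

print(site_a_feb, a_over_b_mar)
```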

In [None]:
# Fill in the line below: How many visitors did the Chinese American Museum 
# receive in January 2014?
ca_museum_jan14 = ____ 

# Fill in the line below: In March 2014, how many more visitors did Avila 
# Adobe receive than the Firehouse Museum?
avila_over_firehouse_mar14 = ____

# Check your answers
step_2.check()

In [None]:
#%%RM_IF(PROD)%%
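# Look up the answers from the data; assumes one row per month in the date index
ca_museum_jan14 = int(museum_data.loc["2014-01", "chinese_american_museum"].iloc[0])
avila_over_firehouse_mar14 = int(museum_data.loc["2014-03", "avila_adobe"].iloc[0]
                                 - museum_data.loc["2014-03", "firehouse_museum"].iloc[0])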
step_2.assert_check_passed()

In [None]:
# Lines below will give you a hint or solution code
#_COMMENT_IF(PROD)_
step_2.hint()
#_COMMENT_IF(PROD)_
step_2.solution()

## Step 3: Visualize the data 

DB COMMENT: This is nice as an overview graphic. Worth pondering whether you prefer doing it this way vs something more embedded in the story/scenario. Will comment on GitHub.

Use the following code cell to create a line plot that shows how the number of visitors to each museum evolved over time.  Your plot should have four lines (one for each museum).

In [None]:
# Set width and height of figure 
plt.figure(figsize=(12,6))

# Line plot showing the number of visitors to each museum over time
____ # Your code here

# Check your answer
step_3.check()

In [None]:
#%%RM_IF(PROD)%%
plt.figure(figsize=(12,6))
sns.lineplot(data=museum_data)

step_3.assert_check_passed()

In [None]:
#%%RM_IF(PROD)%%
plt.figure(figsize=(12,6))
sns.lineplot(data=museum_data.avila_adobe, label="Avila Adobe")
sns.lineplot(data=museum_data.firehouse_museum, label="firehouse_museum")
sns.lineplot(data=museum_data.chinese_american_museum, label="chinese_american_museum")
sns.lineplot(data=museum_data.america_tropical_interpretive_center, label="america_tropical_interpretive_center")

step_3.check()

In [None]:
#%%RM_IF(PROD)%%
plt.figure(figsize=(12,6))
sns.lineplot(data=museum_data.avila_adobe)
sns.lineplot(data=museum_data.firehouse_museum)
sns.lineplot(data=museum_data.chinese_american_museum)
sns.lineplot(data=museum_data.america_tropical_interpretive_center)

# currently fails. need to fix (in instructions, perhaps -- fixed by requiring legend?)
step_3.check()

In [None]:
# Lines below will give you a hint or solution code
#_COMMENT_IF(PROD)_
step_3.hint()
#_COMMENT_IF(PROD)_
step_3.solution()

## Step 4: Visualize the data

DB COMMENT: Maybe motivate with scenario? Also, just to make sure I understand the problem, the stuff below has solutions filled in and will eventually have RM_IF(PROD) for any cell with lineplot. Right?

In this step, you will create four separate line plots, each corresponding to the number of visitors to a different museum.

Begin by creating a line plot that shows the number of visitors to Avila Adobe over time.  If you'd like a hint or the solution, look at the code cell at the end of this step!

In [None]:
# PLOT 1: line plot showing the number of visitors to Avila Adobe over time
plt.figure(figsize=(12,6))

sns.lineplot(data=museum_data.avila_adobe, 
             label='avila_adobe')

# This fails. not clear why ...
step_4.check()

Next, create a line plot that shows the number of visitors to the Firehouse Museum over time.

In [None]:
plt.figure(figsize=(12,6))

# PLOT 2: line plot showing the number of visitors 
# to the Firehouse Museum over time
sns.lineplot(data=museum_data.firehouse_museum, 
             label='firehouse_museum')

# Show PLOT 2 in the notebook
plt.show()

In the following code cell, create a line plot that shows the number of visitors to the Chinese American Museum over time.

In [None]:
plt.figure(figsize=(12,6))

# PLOT 3: line plot showing the number of visitors 
# to the Chinese American Museum over time
sns.lineplot(data=museum_data.chinese_american_museum, 
             label='chinese_american_museum')

# Show PLOT 3 in the notebook
plt.show()

Finally, create a line plot that shows the number of visitors to the America Tropical Interpretive Center over time.

In [None]:
plt.figure(figsize=(12,6))

# PLOT 4: line plot showing the number of visitors 
# to the America Tropical Interpretive Center over time
sns.lineplot(data=museum_data.america_tropical_interpretive_center, 
             label='america_tropical_interpretive_center')

# Show PLOT 4 in the notebook
plt.show()

In [None]:
# Lines below will give you a hint or solution code
# step_4.hint()
# step_4.solution()

## Step 5: Analyze the data

Use the plots you created in **Step 3** and **Step 4** to answer the questions below:

**Question 1**: One of your goals is to increase the number of visitors to the museums through a marketing campaign. To this end, you decide to first identify the most popular museum. **Use the plots to determine which museum gets the most visitors over time.** If you have to demonstrate the most popular museum in a presentation, do you think it is more effective to show:
- the single plot from **Step 3**, or
- all four plots in **Step 4**?

In [None]:
# hidden answer
# avila adobe 
# use single plot (will go into detail)

**Question 2**: You want the employees at each museum to be happy. When meeting with them, you hear that one major pain point is that the number of museum visitors varies greatly with the seasons: there are low seasons (when the museums are well staffed and the employees are happy) and high seasons (when the museums are understaffed and the employees are stressed). You realize that if you can predict these high and low seasons, you can plan ahead and hire additional seasonal employees to help with the extra work. **Use the plots to determine which museums have predictable high and low seasons each year.** If you have to demonstrate the high and low seasons in a presentation, do you think it is more effective to show:
- the single plot from **Step 3**, or
- all four plots in **Step 4**?

In [None]:
# DB COMMENT: I found this one really hard, and I'm not entirely convinced the other ones don't have seasonality.
# I made the following plot to check the answer.  Seems like Avila has most seasonality. 
# But I just find it hard to tell from original graphs that the user has seen
# Group by calendar month directly, without adding a column to museum_data
museum_data.groupby(museum_data.index.month).mean().plot()

In [None]:
# hidden answer
# all predictable except america tropical 
# should also mention that crazy spike in 2014 for firehouse but other than that, firehouse predictable
# use all four plots (will go into detail)

**Question 3**: You would like to have a general idea of overall trends in museum visits.  What do you think: for each museum, does the number of visitors seem to be following a general upward or downward trend over the years?
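
One possible way to separate a long-run trend from seasonal ups and downs is to average over a full year at a time. The sketch below uses a made-up monthly series (hypothetical values with a built-in summer bump and steady growth), not the museum data. The 12-month rolling window and the yearly totals are illustrative choices, not part of the exercise.

```python
import pandas as pd

# Hypothetical toy series (made-up values): 24 months with a
# repeating summer bump plus steady growth
months = pd.date_range("2020-01-01", periods=24, freq="MS")
values = [100 + 2 * i + (30 if i % 12 in (5, 6, 7) else 0) for i in range(24)]
toy_visitors = pd.Series(values, index=months, name="site_a")

# A 12-month rolling mean averages over one full seasonal cycle,
# so the seasonal bump cancels out and the trend remains
rolling_mean = toy_visitors.rolling(window=12).mean()

# Yearly totals give a coarser view of the same trend
yearly_total = toy_visitors.groupby(toy_visitors.index.year).sum()

print(rolling_mean.dropna().head())
print(yearly_total)
```

In this toy series the rolling mean rises steadily even though individual months jump up and down, which is the kind of pattern to look for when judging an upward or downward trend by eye.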

In [None]:
# hidden answer
# chinese american up, all others down