# Visualization and Plotting using Pandas

## Grocery Data Example

In [None]:
import pandas
groceries = pandas.read_csv('groceries.csv') 
groceries

<style>
    .jp-RenderedMarkdown {
  background-color: yellow;
}
</style>  

<div style="background-color:palegreen">
<h3>How to compute a new Pandas DataFrame column</h3>
    
Example:

```python
    mydataframe['newColumn'] = mydataframe.Column1 - mydataframe.Column2 / mydataframe.Column3
```
</div>
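
Because the expression above mixes subtraction and division, a small sketch of how operator precedence plays out on columns may help. The DataFrame and its column names (`a`, `b`, `c`) are made up for illustration and are not part of the grocery data.

```python
import pandas

# Hypothetical toy data; the column names are illustrative only
frame = pandas.DataFrame({'a': [10.0, 20.0], 'b': [4.0, 8.0], 'c': [2.0, 4.0]})

# Column arithmetic is applied row by row and follows normal Python precedence
frame['no_parens'] = frame.a - frame.b / frame.c      # b / c is evaluated first
frame['with_parens'] = (frame.a - frame.b) / frame.c  # the subtraction is evaluated first
frame
```

Add parentheses whenever the intended order differs from the default one.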

In [None]:
# Compute a new TotalCost column for each row (CostPerItem x Quantity)

# insert your code here

groceries

<div style="background-color:palegreen">

### How to compute a new Pandas column using a Python function

Example:
```python
    def my_function(row):
        if row.a == 7:
            return row.c + row.d
        else:
            return 42

    # axis=1 passes one row at a time to the function
    mydataframe['newCol'] = mydataframe.apply(my_function, axis=1)
```
</div>
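
The same pattern can be run end to end on a tiny DataFrame. This is only a sketch: the data is invented, and the vectorised `where` variant at the end is one possible alternative rather than the method this notebook requires.

```python
import pandas

# Hypothetical toy data using the same column names as the example above
frame = pandas.DataFrame({'a': [7, 3, 7], 'c': [1, 2, 3], 'd': [10, 20, 30]})

def my_function(row):
    # row is a Series holding the values of a single row
    if row.a == 7:
        return row.c + row.d
    else:
        return 42

frame['newCol'] = frame.apply(my_function, axis=1)

# Alternative sketch: keep c + d where a == 7, otherwise use 42
frame['vectorised'] = (frame.c + frame.d).where(frame.a == 7, 42)
frame
```

Both columns hold the same values; `apply` is usually easier to read, while the vectorised form tends to be faster on large tables.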

In [None]:
# Change the TotalCost calculation so that a 10% discount is applied to an item if the quantity purchased is more than 5.
    
# insert your code here

groceries

In [None]:
# What is the total cost of all groceries purchased?

# insert your code here

# Answer should be $7620.38

<div style="background-color:palegreen">

### How to use a group by query in Pandas

Examples:
```python
    mydataframe.groupby('my_column').other_column.mean()
    mydataframe.groupby('my_column').other_column.sum()
    mydataframe.groupby('my_column').other_column.min()
```
</div>
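
As a rough illustration of what a group by returns, the sketch below uses an invented `sales` table (the `store` and `amount` columns are assumptions, not columns of the grocery data).

```python
import pandas

# Hypothetical toy data; the column names are illustrative only
sales = pandas.DataFrame({
    'store': ['north', 'south', 'north', 'south', 'north'],
    'amount': [10.0, 5.0, 20.0, 15.0, 30.0],
})

# Each result is a Series indexed by the distinct values of the grouping column
totals = sales.groupby('store').amount.sum()
averages = sales.groupby('store').amount.mean()
totals
```

The grouping column becomes the index of the result, which is what later plotting calls use for the X-axis.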

In [None]:
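# Use a group by to compute the total cost of groceries purchased on each ShoppingDate
# Store the result in a variable named totals_by_date (the name used by the plotting hint in a later cell)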

# insert your code here

<div style="background-color:palegreen">

### How to Visualize a Pandas DataFrame as a line plot

Example:
```python
    ax = mydataframe.plot(figsize=(width, height)) # width and height specified in inches
    ax.set_title('My Title')
    ax.set_xlabel('My X axis label')
    ax.set_ylabel('My Y axis label')
```
</div>
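
A minimal runnable sketch of the calls above, assuming matplotlib is installed alongside pandas. The `readings` data and the label text are invented for illustration.

```python
import pandas

# Hypothetical toy data; the index values become the X-axis positions
readings = pandas.DataFrame({'value': [3, 5, 4, 6]},
                            index=['Mon', 'Tue', 'Wed', 'Thu'])

# plot() returns a matplotlib Axes object that can be customised further
ax = readings.plot(figsize=(6, 3))  # 6 inches wide, 3 inches high
ax.set_title('Daily readings')
ax.set_xlabel('Day')
ax.set_ylabel('Value')
```

Keeping a reference to `ax` is what makes the later title, label, and tick adjustments possible.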

In [None]:
# Plot this data as a line graph with an appropriate title and axis labels
# Hint: use the following code to display all 20 dates on the X-axis
#   date_labels = totals_by_date.index
#   ticks = ax.set_xticks(range(len(date_labels)), date_labels, rotation=90)

# insert your code here

<div style="background-color:palegreen">

### How to Visualize a Pandas DataFrame as a bar plot

Example:
```python
    ax = mydataframe.plot.bar(figsize=(width, height)) # width and height specified in inches
```
</div>

In [None]:
# Now visualize it using a bar graph instead

# insert your code here

In [None]:
# Use a group by to compute the total groceries purchased for each Category

# insert your code here

In [None]:
# Visualize category data using a bar graph

# insert your code here

<div style="background-color:palegreen">

### How to Visualize a Pandas DataFrame as a pie plot

Example:
```python
    ax = mydataframe.plot.pie(figsize=(width, height), autopct='%.1f%%') # width and height specified in inches
```
</div>
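
One detail worth noting: `plot.pie` works directly on a Series (such as the result of a single-column group by), whereas on a DataFrame it also needs a `y='column'` argument or `subplots=True`. The sketch below uses an invented Series named `shares`.

```python
import pandas

# Hypothetical toy data; a Series maps each label to one slice size
shares = pandas.Series([50, 30, 20], index=['a', 'b', 'c'], name='share')

# autopct formats the percentage printed on each slice (one decimal place here)
ax = shares.plot.pie(figsize=(4, 4), autopct='%.1f%%')
```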

In [None]:
# Visualize the category data using a pie chart with percentages shown for each slice. 

# insert your code here

<div style="background-color:palegreen">

### How to use a group by with more than one column in Pandas

Examples:
```python
    mydataframe.groupby(['column1', 'column2']).other_column.mean()
    mydataframe.groupby(['column1', 'column2']).other_column.sum()
    mydataframe.groupby(['column1', 'column2']).other_column.min()
```
</div>
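
Grouping by two columns produces a result with a two-level index. The sketch below shows this on an invented `sales` table; the `store`, `product`, and `amount` columns are assumptions for illustration only.

```python
import pandas

# Hypothetical toy data; the column names are illustrative only
sales = pandas.DataFrame({
    'store': ['north', 'north', 'south', 'south', 'north'],
    'product': ['fruit', 'dairy', 'fruit', 'fruit', 'fruit'],
    'amount': [10.0, 5.0, 20.0, 15.0, 30.0],
})

# One row per (store, product) combination that occurs in the data
totals = sales.groupby(['store', 'product']).amount.sum()

# Individual groups are looked up with a tuple of index values
north_fruit = totals.loc[('north', 'fruit')]
totals
```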

In [None]:
# Group by both the ShoppingDate and the Category and compute the total cost for each group

# insert your code here

In [None]:
# What happens if you apply the unstack() method to these results?

# insert your code here

In [None]:
# Plot the resulting unstacked data as a stacked bar graph (set parameter stacked=True)

# insert your code here

In [None]:
# Plot the total quantity of groceries purchased each week as a bar graph

# insert your code here

## Weather Data Example

In [None]:
import pandas
weather = pandas.read_csv('BrisbaneDailyWeather.csv', index_col=0, parse_dates=[0])
weather

In [None]:
# Plot the MinTemp and MaxTemp columns together on a simple line plot

# insert your code here

<div style="background-color:palegreen">

### How to access the parts of a DataFrame index of type date/time

Examples:
```python
    mydataframe.index.minute
    mydataframe.index.hour
    mydataframe.index.day
    mydataframe.index.day_name()
    mydataframe.index.day_of_week
    mydataframe.index.day_of_year
    mydataframe.index.isocalendar().week  # replaces weekofyear, which was removed in pandas 2.0
    mydataframe.index.month
    mydataframe.index.month_name()
    mydataframe.index.year
```
</div>
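
A short sketch of a few of these accessors on an invented three-day index. The dates and the `value` column are assumptions chosen only so the results are easy to check by hand.

```python
import pandas

# Hypothetical toy data: three consecutive days starting on 1 January 2024
index = pandas.date_range('2024-01-01', periods=3, freq='D')
frame = pandas.DataFrame({'value': [1, 2, 3]}, index=index)

day_names = frame.index.day_name()       # method call, note the parentheses
month_names = frame.index.month_name()   # method call, note the parentheses
days_of_year = frame.index.day_of_year   # plain attribute, no parentheses
weeks = frame.index.isocalendar().week   # ISO week number of each date
day_names
```

Any of these can be passed straight to `groupby`, for example `frame.groupby(frame.index.day_of_year)`.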

In [None]:
# Use a group by to find the average Minimum and Maximum Temp for each day of the year

# insert your code here

# Result should be 366 rows × 2 columns

In [None]:
# Plot these day averages together on a single line plot

# insert your code here

In [None]:
# Compute the statistical correlation between all variables in the weather data
# Which columns are closely correlated?

# insert your code here

<div style="background-color:palegreen">

### How to Visualize a Pandas DataFrame as a scatter plot

Example:
```python
    ax = mydataframe.plot.scatter(x='column1', y='column2')
```
</div>
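
A minimal runnable version of the call above, using an invented `points` table whose `x` and `y` columns are assumptions for illustration.

```python
import pandas

# Hypothetical toy data; each row becomes one point on the plot
points = pandas.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 4, 5, 8]})

# x and y name the columns to place on the horizontal and vertical axes
ax = points.plot.scatter(x='x', y='y')
```

The axis labels default to the column names, so descriptive column names save a relabelling step.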

In [None]:
# Create a scatter plot of the minimum temperature vs the maximum temperature
# What pattern do you notice?

# insert your code here

In [None]:
# Determine the minimum and maximum values of the MaxTemp column

# insert your code here

<div style="background-color:palegreen">

### How to group by using bins

Example:
```python
    bins = range(lower, upper, step) # can use either a list or a range
    # pandas.cut needs the column's values (not just its name) to assign each row to a bin
    groups = mydataframe.groupby(pandas.cut(mydataframe.columnName, bins)).SomeOtherColumnName
```
</div>
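
The sketch below shows binning followed by aggregation on an invented `scores` table. The `score` and `hours` columns are assumptions, and `observed=False` is passed explicitly only to keep empty bins in the result and avoid a warning in recent pandas versions.

```python
import pandas

# Hypothetical toy data; the column names are illustrative only
scores = pandas.DataFrame({
    'score': [3, 7, 12, 18, 14],
    'hours': [1.0, 2.0, 3.0, 5.0, 4.0],
})

# Bin edges 0, 5, 10, 15, 20 give the intervals (0, 5], (5, 10], (10, 15], (15, 20]
bins = list(range(0, 25, 5))

# Group the hours column by the bin that each score falls into
binned = scores.groupby(pandas.cut(scores.score, bins), observed=False).hours

# agg accepts a list of function names (or functions) and returns one column per entry
summary = binned.agg(['min', 'mean', 'max'])
summary
```

Each bin excludes its lower edge and includes its upper edge by default, so a value equal to the lowest edge would fall outside every bin.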

In [None]:
# Divide the range of MaxTemp values into bins so that each bin covers 5 degrees
# Use a group by over these MaxTemp bins to explore how the MinTemp is correlated to the MaxTemp.
# For each group, use the agg method to compute:
#     max (100th percentile),
#     75th percentile,
#     mean (typically close to the median, which is the 50th percentile),
#     25th percentile and
#     min (0th percentile)

def q25(x):
    # 25th percentile of the values in a group
    return x.quantile(0.25)

# insert your code here


In [None]:
# plot the above data as a line graph.

# insert your code here

<div style="background-color:palegreen">

### How to plot a line graph with error bars

Example:
```python
    # errors holds the size of each error bar, for example one standard deviation per point
    ax = mydataframe.plot(yerr=errors, ecolor='someColour', elinewidth=width, capsize=size, color='otherColour')
```
</div>
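
A small sketch of an error-bar plot, assuming matplotlib is installed. The `means` and `spread` Series are invented; in practice they would typically come from calling `mean()` and `std()` on the same group by.

```python
import pandas

# Hypothetical toy data: one mean and one spread value per X position
means = pandas.Series([10.0, 12.0, 11.0], index=[1, 2, 3])
spread = pandas.Series([1.0, 0.5, 2.0], index=[1, 2, 3])

# yerr must line up with the plotted data; each bar extends +/- spread around the mean
ax = means.plot(yerr=spread, ecolor='grey', elinewidth=1, capsize=3, color='blue')
```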

In [None]:
# Plot just the mean MinTemp, but use the standard deviation as the y error bar

# insert your code here

In [None]:
# Compute a new column named month that contains the month number of the index
weather['month'] = weather.index.month

In [None]:
# Create a box plot of the MaxTemp versus the month column

# insert your code here

In [None]:
# Add friendly xtick labels to the above box plot
# Hint, use:
# ax.set_xticks(range(1,13), ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])

In [None]:
# Use a group by to compute the average maximum temperature for each month of the year
# Hint: groupby weather.index.month

# insert your code here

In [None]:
# Plot the above average maximum temperature for each month of the year and use the standard deviation as the y error bar

# insert your code here

In [None]:
# Add fill_between to the above graph from min to max and also from the 25th percentile to the 75th percentile
# Hint, use:
#   means = max_temp_by_month.mean()
#   ax.fill_between(means.index, max_temp_by_month.min(), max_temp_by_month.max(), alpha=0.1)
#   ax.fill_between(means.index, max_temp_by_month.quantile(0.25), max_temp_by_month.quantile(0.75), alpha=0.2)

# insert your code here