# Presenting Structured Data
In this notebook we will be exploring the presentation of data with a range of tools. Create your Prac9 directory and copy the hospitals data file (govhack3_FOP.csv) into the directory.

In [None]:
#import matplotlib and make it inline for plotting
import matplotlib.pyplot as plt
%matplotlib inline

## Matplotlib
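Matplotlib is a third-party plotting library rather than part of the Python standard library, but it ships with distributions such as Anaconda, so it is usually available in a notebook environment.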

**Pros**
- widely available: bundled with most scientific Python distributions
- publication quality output
- built-in style sheets (See: https://matplotlib.org/users/style_sheets.html)

**Cons**
- other libraries produce prettier output by default
- doesn't provide csv or dataframe tools

**Example:**

In [None]:
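with open('govhack3_FOP.csv') as fileobj:
    lines = []
    for line in fileobj:
        # strip the trailing newline before splitting into fields
        lines.append(line.strip().split(','))
headers = lines[0].copy()   # first row holds the column names
del lines[0]                # keep only the data rows
print(lines[-3:])
print(headers)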
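
The manual parsing above can also be done with the standard library's `csv` module, which deals with line endings and quoted fields for you. The sketch below is only an illustration: it reads a small in-memory sample rather than the real file, and the sample values and the names `sample_headers` and `sample_rows` are made up.

```python
import csv
import io

# Hypothetical sample standing in for govhack3_FOP.csv; the values are
# invented for illustration only.
sample = io.StringIO(
    "Date,Month,RPH_Attendance\n"
    "1-Jul-2013,7,235\n"
    "2-Jul-2013,7,209\n"
)

reader = csv.reader(sample)
sample_headers = next(reader)   # first row holds the column names
sample_rows = list(reader)      # remaining rows are the data

# Convert the third column to integers, as the plotting cell does
attendance = [int(row[2]) for row in sample_rows]
print(sample_headers)
print(attendance)
```

To read the real file, pass the object returned by `open('govhack3_FOP.csv', newline='')` to `csv.reader` in place of `sample`.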

In [None]:
RPH_attendance = []
for line in lines:
    RPH_attendance.append(int(line[2]))
plt.plot(RPH_attendance)

**Styles**

Exploring the matplotlib styles. Start by listing what's available. You can also define your own.

Putting the style in a "with" statement restricts it to that block - otherwise it affects every plot that follows.

In [None]:
print(plt.style.available)

In [None]:
with plt.style.context('dark_background'):
    plt.plot(RPH_attendance)
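
A minimal sketch of how the with statement scopes a style, assuming only that matplotlib is installed. It reads one setting, `axes.facecolor`, before, inside and after the block; that setting is picked purely for illustration.

```python
import matplotlib.pyplot as plt

# Value of one style setting before the block
before = plt.rcParams["axes.facecolor"]

with plt.style.context("dark_background"):
    # Inside the block the style's settings are active
    inside = plt.rcParams["axes.facecolor"]

# On leaving the block the previous settings are restored
after = plt.rcParams["axes.facecolor"]

print(before, inside, after)
```

Calling `plt.style.use` instead would leave the new value in place for every later plot.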

## Pandas
Pandas is a third-party library, not part of core Python, though it is bundled with distributions such as Anaconda. It adds DataFrame objects and operations, along with a wrapper around matplotlib that makes plotting from DataFrames easy.

**Pros**
- widely available: bundled with most scientific Python distributions
- simpler methods for plotting
- pandas gives lots of useful support for tabular data
- styles available to easily apply an overall, consistent style to plots. **Note:** the styles used to be in pandas, but have now been moved back to matplotlib, so you will get a warning with the code from the lectures and some websites.

**Cons**
- you may still need to know matplotlib to fine-tune plots

**Example:**

In [None]:
import pandas as pd
hospitals = pd.read_csv("govhack3_FOP.csv", header=0) 


In [None]:
hospitals.describe()


In [None]:
hospitals.head()

In [None]:
my_plot = hospitals['RPH_Attendance'][:30].plot(kind='bar',legend=None, title="RPH Attendance") 
my_plot.set_xlabel("Dates") 
my_plot.set_ylabel("Numbers")

In [None]:
monthly = hospitals.groupby('Month')
# select the column before summing so non-numeric columns are not aggregated
monthly['RPH_Attendance'].sum()

In [None]:
month_plot = monthly['RPH_Attendance'].sum().plot(kind='bar', legend=None, title='RPH Monthly attendance')
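
The grouping step can be checked on a table small enough to add up by hand. This is a sketch on invented data: the DataFrame `sample` and its values are hypothetical, and only the column names follow the hospitals file as used above.

```python
import pandas as pd

# Hypothetical miniature of the hospitals table (values are made up)
sample = pd.DataFrame({
    "Date": ["1-Jul", "2-Jul", "1-Aug", "2-Aug"],
    "Month": [7, 7, 8, 8],
    "RPH_Attendance": [235, 209, 204, 199],
})

# Group the rows by month, pick one column, then total it per group
monthly_totals = sample.groupby("Month")["RPH_Attendance"].sum()
print(monthly_totals)
```

The result is a Series indexed by month, which is why `.plot(kind='bar')` draws one bar per month.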

**Styles**

Testing out styles with pandas. Note that plt.style.use affects every plot drawn after the call, so it must come before the plotting line. To make the style local, use the with statement shown in the comments.

In [None]:
plt.style.use('grayscale')  # set the style first; it applies to all later plots
month_plot = monthly['RPH_Attendance'].sum().plot(kind='bar', legend=None, title='RPH Monthly attendance')
#with plt.style.context('grayscale'):
#   month_plot = monthly['RPH_Attendance'].sum().plot(kind='bar', legend=None, title='RPH Monthly attendance')



## Seaborn
Seaborn is also based on matplotlib. It aims to make plotting simpler and more attractive.

**Pros**
- matplotlib provides a familiar basis
- Prettier output

**Cons**
- another package to load and use

**Example:**

In [None]:
import seaborn as sns

hospitals2 = pd.read_csv("govhack3_FOP.csv", header=0)

hospitals2.head()



**Styles**

There are five preset seaborn themes: darkgrid, whitegrid, dark, white, and ticks. They are each suited to different applications and personal preferences. The default theme is darkgrid. More information is at https://seaborn.pydata.org/tutorial/aesthetics.html

In [None]:
hosp_july = hospitals2[:31]
sns.set_style("darkgrid") 
bar_plot = sns.barplot(x=hosp_july["Date"], y=hosp_july["RPH_Attendance"], palette="muted") 
plt.xticks(rotation=90) 
plt.show() 

In [None]:
sns.set(style="white") 
bar_plot = sns.barplot(x=hosp_july["Date"], y=hosp_july["RPH_Attendance"], palette="muted") 
plt.xticks(rotation=90) 
plt.show()
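
Seaborn themes can also be inspected, or applied temporarily, through `sns.axes_style`. The sketch below assumes seaborn is installed and looks only at the `axes.grid` setting, chosen here for illustration, to show how the grid themes differ from the plain ones.

```python
import seaborn as sns

# axes_style returns the parameter dictionary for a named theme
darkgrid = sns.axes_style("darkgrid")
white = sns.axes_style("white")
print(darkgrid["axes.grid"], white["axes.grid"])

# Used in a with statement, the theme applies only inside the block
with sns.axes_style("whitegrid"):
    # With no argument, axes_style reports the settings currently in effect
    grid_inside = sns.axes_style()["axes.grid"]
print(grid_inside)
```

This mirrors matplotlib's `plt.style.context`: plots drawn inside the block use the theme, and later plots are unaffected.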

## Bokeh
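Not based on matplotlib, Bokeh is geared towards web visualisations. See http://bokeh.pydata.org/en/latest/docs/user_guide/charts.html for more. Note that the bokeh.charts interface has since been removed from Bokeh; current releases build the same plots with bokeh.plotting.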

**Pros**
- Has some amazingly beautiful plots
- Scales well to large datasets

**Cons**
- Another package to install

**Example:**

In [None]:
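# bokeh.charts has been removed from Bokeh, so build the bar chart with bokeh.plotting
from bokeh.plotting import figure, show, output_file

hospitals3 = pd.read_csv("govhack3_FOP.csv", header=0)

hosp_july = hospitals3[:31]
#print(hosp_july)

# a categorical x axis needs a list of unique strings (assumes each date appears once)
dates = hosp_july['Date'].astype(str).tolist()

p = figure(x_range=dates, title="July Attendance at RPH")
p.vbar(x=dates, top=hosp_july['RPH_Attendance'], width=0.8, color="indigo")
p.xaxis.major_label_orientation = "vertical"

output_file("bar.html")

show(p)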


## Basemap
Basemap is a package for plotting maps in Python. It may need additional packages to be installed (basemap, pillow). Visit the basemap documentation and tutorial to find out more - http://basemaptutorial.readthedocs.io/en/latest/. Then try the code below.

**Pros**
- builds on matplotlib
- lots of options for working with maps

**Cons**
- need to install basemap and pillow packages
- complex

**Example:**

In [None]:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt


# named aus_map to avoid shadowing Python's built-in map()
aus_map = Basemap(width=5000000, height=4000000,
                  rsphere=(6378137.00, 6356752.3142),
                  resolution='l', area_thresh=1000., projection='lcc',
                  lat_0=-27., lon_0=133)

aus_map.drawmapboundary(fill_color='aqua')
aus_map.fillcontinents(color='#ddaa66', lake_color='aqua')

aus_map.drawcountries()
aus_map.drawstates(color='0.5')
aus_map.drawcoastlines()
plt.title('Map of Australia')

plt.show()