# Data Visualization/Analysis

## Placeholder Notebook

This notebook is initially set up to allow you to quickly start examining the data you have. You should expect it to change a lot over the course of the week as you add features and explore the data.

The majority of your code is likely to be in your package (i.e. the `flood_tool` directory). You should import this package into this notebook and use it to explore and analyze the data.

Try to make sure that this work complements the predictive tools your team is building, for example by identifying high-risk areas near locations of heavy rainfall or high river/tide levels.
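
One way to approach the "near" part of this is to compute great-circle distances between measurement stations and the locations you care about. The block below is only a minimal sketch: the function names `haversine_m` and `locations_near`, the station and location tuples, and the 2000 m radius are all illustrative assumptions, not part of `flood_tool`.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def locations_near(stations, locations, radius_m):
    """Return the names of locations within radius_m of at least one station.

    Both arguments are lists of (name, lat, lon) tuples (a hypothetical layout).
    """
    return [
        name
        for name, lat, lon in locations
        if any(haversine_m(lat, lon, s_lat, s_lon) <= radius_m
               for _, s_lat, s_lon in stations)
    ]


# Made-up example data: one station with heavy rainfall and two locations.
wet_stations = [('station_1', 51.0, 0.0)]
places = [('place_A', 51.01, 0.0), ('place_B', 51.5, 0.0)]
nearby = locations_near(wet_stations, places, radius_m=2000.)
```

This brute-force loop is fine for a handful of stations. For the full postcode set, a vectorized or tree-based search would probably be a better fit.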

### A quick look at postcode data

In [None]:
import os
import sys
sys.path.append('..') # Add parent directory to path to always find flood_tool
                      # This is not best practice, but it works for this example

import pandas as pd
import matplotlib.pyplot as plt

import flood_tool as ft

# The unlabelled postcode data
df1 = pd.read_csv(os.path.join(ft._example_dir,
                               'postcodes_unlabelled.csv'))
df1.head()

In [None]:
# The labelled postcode data

df2 = pd.read_csv(os.path.join(ft._data_dir, 'postcodes_labelled.csv'))
df2.head()

In [None]:
df2['riskLabel'].value_counts().sort_index().plot(kind='bar', title='Risk Label Distribution', logy=True)

In [None]:
# The data on households/population per sector

df3 = pd.read_csv(os.path.join(ft._data_dir, 'sector_data.csv'))
df3.head()

In [None]:
# The data on measurement stations

df4 = pd.read_csv(os.path.join(ft._data_dir, 'stations.csv'))
df4.tail()

In [None]:
# The data for a wet day

df5 = pd.read_csv(os.path.join(ft._example_dir, 'wet_day.csv'))
df5.head()

In [None]:
rain = pd.to_numeric(df5.loc[df5.parameter == 'rainfall', 'value'], errors='coerce')

rain.loc[(rain>=0) & (rain<=20)].plot(kind='hist',
                                      title='Rainfall Distribution',
                                      logy=True)
plt.xlabel('Rainfall (mm/15 mins)')

In [None]:
# The data for a more typical day

df6 = pd.read_csv(os.path.join(ft._example_dir, 'typical_day.csv'))
df6.head()

In [None]:
rain = pd.to_numeric(df6.loc[df6.parameter == 'rainfall', 'value'], errors='coerce')

# Keep only plausible readings (0 to 20 mm per 15 minutes) before plotting
rain.loc[(rain>=0) & (rain<=20)].plot(kind='hist',
                                      title='Rainfall Distribution',
                                      logy=True)
plt.xlabel('Rainfall (mm/15 mins)')
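
The rainfall cells above coerce the `value` column to numbers and keep only readings between 0 and 20. As a possible next step, the same cleaning can feed a per-station summary. The sketch below runs on a small made-up table rather than the real files; the `stationReference` column name is an assumption about the data layout, and the values are invented for illustration.

```python
import pandas as pd

# Hypothetical readings. `parameter` and `value` mirror the columns used in
# the cells above; `stationReference` is an assumed column name.
readings = pd.DataFrame({
    'stationReference': ['A', 'A', 'B', 'B', 'B', 'C'],
    'parameter': ['rainfall', 'rainfall', 'rainfall', 'level', 'rainfall', 'rainfall'],
    'value': ['0.2', '1.4', 'bad', '3.1', '0.8', '250'],
})

# Select rainfall rows and turn unparseable entries into NaN
rain = readings.loc[readings['parameter'] == 'rainfall'].copy()
rain['value'] = pd.to_numeric(rain['value'], errors='coerce')

# Keep readings in the 0 to 20 range (inclusive); NaN rows are dropped too
rain = rain[rain['value'].between(0, 20)]

# Peak rainfall reading per station, wettest first
peak = rain.groupby('stationReference')['value'].max().sort_values(ascending=False)
```

Stations whose readings are all invalid or out of range (station `C` here) simply drop out of `peak`, which is worth remembering when joining the result back onto the station table.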

### Mapping your data

As one possible approach, we have provided a function to plot a circle on a map using the `folium` package. You can use `folium` and expand on this functionality, or you may prefer to use a different package. Please check with us that the mapping package you wish to use is permissible before you start.

In [None]:
# Mapping functionality
import folium

# Plot a circle of radius 2000 m centred at lat, lon = 51., 0.
# (named flood_map to avoid shadowing the built-in `map`)
flood_map = ft.plot_circle(51., 0, 2000.)

folium.Marker(location=(51, 0.1),
              popup='This is my popup',
              icon=folium.Icon(color='black', icon='info-sign')).add_to(flood_map)
flood_map