This is an exploratory analysis of animal complaints in Brisbane. There are a few interesting things I'd like to show - animal complaints per suburb of course, but animal complaints per square kilometre and per person (by pulling suburb info as well) will be equally interesting. Trends over time - which suburbs have improved or gotten worse over a few quarters. 

In [35]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt

In [16]:
apr_to_jun_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-apr-jun-2019.csv")

In [17]:
apr_to_jun_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-apr-to-jun-2018.csv")

In [18]:
jan_to_mar_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jan-mar-2019.csv")

In [19]:
jan_to_mar_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jan-to-mar-2018.csv")

In [20]:
jul_to_sep_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jul-to-sep-2018.csv")

In [21]:
jul_to_sep_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jul-to-sep-2019.csv")

In [22]:
oct_to_dec_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-oct-to-dec-2018.csv")

In [23]:
oct_to_dec_19 = pd.read_csv("/home/jack/AnimalC/cars-srsa-open-data-animal-related-complaints-oct-to-dec-2019.csv")

In [24]:
jan_to_mar_20 = pd.read_csv("/home/jack/AnimalC/cars-srsa-open-data-animal-related-complaints-jan-to-mar-2020.csv")

The first thing that needs doing is to make all this data one usable dataset. The data itself isn't dated, so the only way of knowing what date a complaint is from is by looking at the dataset it's from. 

In [42]:
apr_to_jun_19["Date"] = np.datetime64('2019-05-01')

For ease of charting later on a column is added to each dataframe with the date being the 1st of the middle month of the dataset in question. So Apr to June 19 has the date of each incident listed as 1 May 2019.

In [45]:
apr_to_jun_18["Date"] = np.datetime64('2018-05-01')
jan_to_mar_19["Date"] = np.datetime64('2019-02-01')
jan_to_mar_18["Date"] = np.datetime64('2018-02-01')
jul_to_sep_18["Date"] = np.datetime64('2018-08-01')
jul_to_sep_19["Date"] = np.datetime64('2019-08-01')
oct_to_dec_18["Date"] = np.datetime64('2018-11-01')
oct_to_dec_19["Date"] = np.datetime64('2019-11-01')
jan_to_mar_20["Date"] = np.datetime64('2020-02-01')

In [62]:
jan18_to_mar_20 = pd.concat([apr_to_jun_18, jan_to_mar_19, jan_to_mar_18, jul_to_sep_18, jul_to_sep_19, oct_to_dec_18, oct_to_dec_19, jan_to_mar_20])

In [63]:
jan18_to_mar_20

Unnamed: 0,Category: Nature,Category: Type,Category: Reporting Level,Location: Suburb,Date,Office: Responsible Office
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01,
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01,
2,Animal,Cat,,CARSELDINE,2018-05-01,
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01,
4,Animal,Dog,Odour,CAMP HILL,2018-05-01,
...,...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01,
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01,
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01,
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01,


In [67]:
# Responsible office only shows up in a single dataset here - not really useful for analysing them as a whole.
del jan18_to_mar_20['Office: Responsible Office']

In [68]:
jan18_to_mar_20

Unnamed: 0,Category: Nature,Category: Type,Category: Reporting Level,Location: Suburb,Date
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01
2,Animal,Cat,,CARSELDINE,2018-05-01
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01
4,Animal,Dog,Odour,CAMP HILL,2018-05-01
...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01


One notable thing stands out already. While a few Category: Reporting Level pieces are missing, a look at the data shows that this is only sometimes an issue - sometimes it's just that the Type has no Reporting Level below it - Cat Trapping is one example. However, given that this data is primarily useful and interesting on a *suburb* level, the 200 or so rows without suburb data are almost completely useless.

In [70]:
jan18_to_mar_20.to_csv("jan18_to_mar_20.csv", index=False)

<bound method Series.dropna of 0         MORNINGSIDE
1             GUMDALE
2          CARSELDINE
3            ZILLMERE
4           CAMP HILL
            ...      
2141         ENOGGERA
2142     EVERTON PARK
2143    BRACKEN RIDGE
2144            INALA
2145      GORDON PARK
Name: Location: Suburb, Length: 15274, dtype: object>