This is an exploratory analysis of animal complaints in Brisbane. There are a few things I'd like to show: animal complaints per suburb of course, but also complaints per square kilometre and per person (by pulling in suburb info as well), which should be equally interesting. I'd also like to look at trends over time, that is, which suburbs have improved or gotten worse over a few quarters. This is all being done in JupyterLab with Python 3.7.6.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt

In [3]:
apr_to_jun_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-apr-jun-2019.csv")

In [11]:
apr_to_jun_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-apr-to-jun-2018.csv")

In [12]:
jan_to_mar_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jan-mar-2019.csv")

In [13]:
jan_to_mar_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jan-to-mar-2018.csv")

In [14]:
jul_to_sep_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jul-to-sep-2018.csv")

In [15]:
jul_to_sep_19 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-jul-to-sep-2019.csv")

In [16]:
oct_to_dec_18 = pd.read_csv("/home/jack/AnimalC/cars-bis-open-data-animal-related-complaints-oct-to-dec-2018.csv")

In [17]:
oct_to_dec_19 = pd.read_csv("/home/jack/AnimalC/cars-srsa-open-data-animal-related-complaints-oct-to-dec-2019.csv")

In [18]:
jan_to_mar_20 = pd.read_csv("/home/jack/AnimalC/cars-srsa-open-data-animal-related-complaints-jan-to-mar-2020.csv")

The first thing that needs doing is to combine all this data into one usable dataset. The rows themselves aren't dated, so the only way of knowing when a complaint was made is by looking at which quarterly file it came from.

In [19]:
apr_to_jun_19["Date"] = np.datetime64('2019-05-01')

For ease of charting later on, a column is added to each dataframe holding the 1st of the middle month of the quarter in question. So every incident in the Apr to Jun 2019 file is dated 1 May 2019.
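The "1st of the middle month" rule can also be derived from a quarter label rather than typed by hand. This is only a minimal sketch: `mid_quarter_date` is a hypothetical helper, not something used in the cells here, and it assumes calendar quarters written like `2019Q2`.

```python
import pandas as pd

def mid_quarter_date(quarter: str) -> pd.Timestamp:
    """Return the 1st of the middle month of a calendar quarter, e.g. '2019Q2'."""
    start = pd.Period(quarter).start_time   # first day of the quarter
    return start + pd.DateOffset(months=1)  # step forward to the middle month

# Apr to Jun 2019 is the second quarter of 2019.
print(mid_quarter_date("2019Q2"))
```

The result can be assigned to a `Date` column in the same way as the `np.datetime64` values.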

In [20]:
apr_to_jun_18["Date"] = np.datetime64('2018-05-01')
jan_to_mar_19["Date"] = np.datetime64('2019-02-01')
jan_to_mar_18["Date"] = np.datetime64('2018-02-01')
jul_to_sep_18["Date"] = np.datetime64('2018-08-01')
jul_to_sep_19["Date"] = np.datetime64('2019-08-01')
oct_to_dec_18["Date"] = np.datetime64('2018-11-01')
oct_to_dec_19["Date"] = np.datetime64('2019-11-01')
jan_to_mar_20["Date"] = np.datetime64('2020-02-01')

In [21]:
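# Note: apr_to_jun_19 (loaded and dated above) is not in this list, so the combined
# frame and every output below exclude Apr-Jun 2019.
jan18_to_mar_20 = pd.concat([apr_to_jun_18, jan_to_mar_19, jan_to_mar_18, jul_to_sep_18, jul_to_sep_19, oct_to_dec_18, oct_to_dec_19, jan_to_mar_20])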

In [22]:
jan18_to_mar_20

Unnamed: 0,Category: Nature,Category: Type,Category: Reporting Level,Location: Suburb,Date,Office: Responsible Office
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01,
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01,
2,Animal,Cat,,CARSELDINE,2018-05-01,
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01,
4,Animal,Dog,Odour,CAMP HILL,2018-05-01,
...,...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01,
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01,
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01,
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01,


In [23]:
# Responsible office only shows up in a single dataset here - not really useful for analysing them as a whole.
del jan18_to_mar_20['Office: Responsible Office']

In [24]:
jan18_to_mar_20

Unnamed: 0,Category: Nature,Category: Type,Category: Reporting Level,Location: Suburb,Date
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01
2,Animal,Cat,,CARSELDINE,2018-05-01
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01
4,Animal,Dog,Odour,CAMP HILL,2018-05-01
...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01


One thing stands out already. A few Category: Reporting Level values are missing, but a look at the data shows that this is only sometimes a problem: in some cases the Type simply has no Reporting Level below it, Cat Trapping being one example. However, given that this data is primarily useful and interesting at the *suburb* level, the 200 or so rows without suburb data are almost completely useless.
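A rough way to check this kind of missingness is sketched below. It runs on a handful of made-up rows (the `toy` frame is illustrative, not the real data) and assumes the original column names shown in the output above.

```python
import pandas as pd

# Toy rows in the same shape as the combined data; NOT the real dataset.
toy = pd.DataFrame({
    "Category: Type": ["Dog", "Cat Trapping", "Cat", "Cat", "Dog"],
    "Category: Reporting Level": ["Odour", None, None, "Wandering", "Wandering"],
    "Location: Suburb": ["INALA", "NUNDAH", None, "WYNNUM", "INALA"],
})

# Number of missing values in each column.
missing_per_column = toy.isna().sum()

# Share of rows with no Reporting Level, split by Type. A share of 1.0
# suggests the Type simply has no Reporting Level beneath it.
missing_share_by_type = (
    toy["Category: Reporting Level"].isna().groupby(toy["Category: Type"]).mean()
)

print(missing_per_column)
print(missing_share_by_type)
```

On the real dataframe the same two expressions would show how many rows lack a suburb and which Types never carry a Reporting Level.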

In [25]:
# Save the combined data as a new CSV in case it's needed elsewhere in future.
jan18_to_mar_20.to_csv("jan18_to_mar_20.csv", index=False)

In [26]:
# The index numbers repeat, but they're not too important for what we're looking to do.
jan18_to_mar_20

Unnamed: 0,Category: Nature,Category: Type,Category: Reporting Level,Location: Suburb,Date
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01
2,Animal,Cat,,CARSELDINE,2018-05-01
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01
4,Animal,Dog,Odour,CAMP HILL,2018-05-01
...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01
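The repeated index numbers come from `pd.concat` keeping each file's own 0-based index. If a unique index were ever needed, one option is `ignore_index=True`. The sketch below uses two tiny made-up frames (`q1` and `q2` are illustrative names) rather than the real quarterly files.

```python
import pandas as pd

# Two toy "quarters", each with its own 0-based index.
q1 = pd.DataFrame({"Suburb": ["INALA", "NUNDAH"]})
q2 = pd.DataFrame({"Suburb": ["WYNNUM", "INALA"]})

kept = pd.concat([q1, q2])                      # each frame keeps its own index
fresh = pd.concat([q1, q2], ignore_index=True)  # index is renumbered from 0

print(kept.index.tolist())
print(fresh.index.tolist())
```

Calling `reset_index(drop=True)` on an already combined frame would have the same effect.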


In [27]:
jan18_to_mar_20.rename(columns={'Category: Nature' : 'Nature', 'Category: Type' : 'Type',
                               'Category: Reporting Level' : 'Reporting Level', 'Location: Suburb' : 'Suburb'}, inplace=True)

In [28]:
jan18_to_mar_20 = jan18_to_mar_20[jan18_to_mar_20["Suburb"].notnull()]

In [29]:
jan18_to_mar_20

Unnamed: 0,Nature,Type,Reporting Level,Suburb,Date
0,Animal,Other Animal,Unregistered,MORNINGSIDE,2018-05-01
1,Animal,Other Animal,Fox,GUMDALE,2018-05-01
2,Animal,Cat,,CARSELDINE,2018-05-01
3,Animal,Dog,Fencing Issues,ZILLMERE,2018-05-01
4,Animal,Dog,Odour,CAMP HILL,2018-05-01
...,...,...,...,...,...
2141,Animal,Attack,Attack On An Animal,ENOGGERA,2020-02-01
2142,Animal,Dog,Wandering,EVERTON PARK,2020-02-01
2143,Animal,Cat,Wandering,BRACKEN RIDGE,2020-02-01
2144,Animal,Dog,Fencing Issues,INALA,2020-02-01


In [30]:
suburb_counts = jan18_to_mar_20['Suburb'].value_counts()
print(suburb_counts["NUNDAH"])
print(suburb_counts["FORTITUDE VALLEY"])
suburb_counts

109
18


INALA                 500
BRACKEN RIDGE         284
FOREST LAKE           271
WYNNUM                260
ACACIA RIDGE          229
                     ... 
BULWER                  3
LAKE MANCHESTER         3
PORT OF BRISBANE        3
ENOGGERA RESERVOIR      1
MORETON ISLAND          1
Name: Suburb, Length: 189, dtype: int64

This is where the exploratory analysis starts to show some cracks. Several factors are important in explaining animal complaints. The first, of course, is population, both animal and human. We have no way of getting information on the former, though, so we'll need to use the human population as a proxy for it. Other factors include things like the density of the suburb: we can see above that Fortitude Valley has very few animal complaints (18), likely because it has few animals relative to its human population. Feral cats, wandering dogs, etc. are more likely to be seen in suburbia than in the CBD.

Population is the most important to address, though. Inala has had 500 complaints over this time period, while Nundah, a suburb of similar size, has had 109. Forest Lake has had two and a half times as many complaints as Nundah (271), but it also has twice the population. Without proportionality we are unlikely to get a clear picture of which suburbs suffer more animal complaints. After all, suburb boundaries are artificial: Forest Lake has ten times the population and six times the area of Seven Hills. If we compare on a raw suburb-to-suburb basis, we're not really getting actionable information.

The next step is going to be pulling census data from the ABS and using it to put the complaint counts for the various suburbs in proportion.
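As a rough sketch of what that comparison might look like, the complaint counts below are taken from the `value_counts` output above, but the populations are PLACEHOLDER round numbers chosen for illustration only. They are not ABS figures, and the column names are assumptions.

```python
import pandas as pd

# Complaint counts copied from the value_counts output above.
complaints = pd.Series(
    {"INALA": 500, "FOREST LAKE": 271, "NUNDAH": 109}, name="Complaints"
)

# PLACEHOLDER populations: illustrative round numbers, NOT census data.
population = pd.Series(
    {"INALA": 10_000, "FOREST LAKE": 20_000, "NUNDAH": 10_000}, name="Population"
)

# Align the two series on suburb name and compute complaints per 1,000 residents.
rates = pd.concat([complaints, population], axis=1)
rates["Per 1,000 residents"] = rates["Complaints"] / rates["Population"] * 1000
rates = rates.sort_values("Per 1,000 residents", ascending=False)

print(rates)
```

The same pattern would work for complaints per square kilometre once suburb areas are available.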

Unnamed: 0,Nature,Type,Reporting Level,Suburb,Date
56,Animal,Dog,Fencing Issues,INALA,2018-05-01
58,Animal,Dog,Fencing Issues,INALA,2018-05-01
123,Animal,Other Animal,Nuisance Animal,INALA,2018-05-01
142,Animal,Attack,Not An Attack,INALA,2018-05-01
143,Animal,Other Animal,Too Many Animals,INALA,2018-05-01
...,...,...,...,...,...
2103,Animal,Cat Trapping,,INALA,2020-02-01
2122,Animal,Other Animal,Pest / Feral Animal,INALA,2020-02-01
2125,Animal,Other Animal,Pest / Feral Animal,INALA,2020-02-01
2134,Animal,Attack,Dangerous,INALA,2020-02-01


Unnamed: 0,Nature,Type,Reporting Level,Suburb,Date
121,Animal,Dog,Too Many Animals,INALA,2018-02-01
135,Animal,Dog,Odour,INALA,2018-02-01
180,Animal,Other Animal,,INALA,2018-02-01
220,Animal,Dog,Fencing Issues,INALA,2018-02-01
226,Animal,Dog,Fencing Issues,INALA,2018-02-01
249,Animal,Other Animal,,INALA,2018-02-01
301,Animal,Cat Trapping,,INALA,2018-02-01
432,Animal,Other Animal,Odour,INALA,2018-02-01
569,Animal,Attack,Not An Attack,INALA,2018-02-01
590,Animal,Cat Trapping,,INALA,2018-02-01


Unnamed: 0,Nature,Type,Reporting Level,Suburb,Date
56,Animal,Dog,Fencing Issues,INALA,2018-05-01
58,Animal,Dog,Fencing Issues,INALA,2018-05-01
123,Animal,Other Animal,Nuisance Animal,INALA,2018-05-01
142,Animal,Attack,Not An Attack,INALA,2018-05-01
143,Animal,Other Animal,Too Many Animals,INALA,2018-05-01
...,...,...,...,...,...
1615,Animal,Dog,,INALA,2018-05-01
1616,Animal,Dog,Fencing Issues,INALA,2018-05-01
1617,Animal,Attack,Attack On An Animal,INALA,2018-05-01
1629,Animal,Attack,Not An Attack,INALA,2018-05-01
