# Breed differences of heritable behaviour traits in cats

## Intro

Since I come from a family who likes pets, I decided to analyze data related to dog or cat breeds. I used PMC in NCBI to find an interesting [article](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6538663/), where the authors have provided the [data](https://figshare.com/articles/Salonen_et_al_Breed_differences_of_heritable_behaviour_traits_in_cats_-_data/8143835). 

When doing a comprehensive animal breed comparison, the only way to gather a large amount of data is through questionnaires. The data is therefore not 100% reliable, since the answers cannot be validated independently. However, some researchers<sup>1, 2</sup> suggest that questionnaires are as valid as behavioral tests.

The data set consists of 5726 observations and 21 variables. There are some missing values, probably where no answer was given.
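
The missing values can be counted per column once a DataFrame is available. The following is a minimal sketch: the `count_missing` helper and the small `toy` frame are made up for illustration and do not come from the published data.

```python
import numpy as np
import pandas as pd


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Count missing values per column, keeping only columns that have any."""
    missing = df.isna().sum()
    return missing[missing > 0]


# Toy frame with hypothetical values, standing in for the real data set.
toy = pd.DataFrame({
    "Age": [4.0, 2.1, 7.7],
    "Grooming": [1.0, np.nan, 4.0],
    "Wool_sucking": [0.0, np.nan, np.nan],
})

missing_per_column = count_missing(toy)
print(missing_per_column)
```

On the real data, the same call would be `count_missing(data)` after the import step.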




In [2]:
import pandas as pd
print("Pandas imported successfully, version: "+pd.__version__)
import statsmodels as sm
print("Statsmodels imported successfully, version: "+sm.__version__)
import plotly
print("Plotly imported successfully, version: "+plotly.__version__)

Pandas imported successfully, version: 0.25.2
Statsmodels imported successfully, version: 0.10.1
Plotly imported successfully, version: 4.3.0


## Data import

In [31]:
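# Load the published spreadsheet
data = pd.read_excel("https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/15177200/publisheddatascientificreports.xlsx")
# Drop the first column and the last three (the authors' personality components)
data = data.iloc[:, 1:-3]
# Recode Gender from numeric codes to labels
data["Gender"] = data["Gender"].replace([1, 2], ["Male", "Female"])
data.head()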

Unnamed: 0,Age,Gender,Neuter_status,Breed_group,Weaning_age,Outdoors,Other_cats,Activity_level,Contact_people,Aggression_stranger,Aggression_owner,Aggression_cats,Shyness_novel,Shyness_strangers,Grooming,Wool_sucking,Behaviour_problem
0,4.0274,Female,1,BEN,8,0,1,4,5,1,1,1,2,1,1.0,0.0,1.0
1,2.1096,Female,1,BEN,8,0,1,5,4,1,1,1,3,3,1.0,0.0,1.0
2,7.6822,Male,1,BUR,4,0,1,4,5,1,1,1,2,1,4.0,3.0,2.0
3,5.0027,Male,1,BUR,4,4,0,5,5,1,1,2,1,1,1.0,0.0,1.0
4,5.0137,Male,1,EUR,4,5,1,4,5,1,1,1,2,1,1.0,0.0,1.0


We can see that the data imported successfully. The authors also derived 3 personality components from some of the traits listed. Since I wanted to work with the raw dataset, I removed them by dropping the last 3 columns.
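
The positional slice used for this can be sketched on a small example. This is an illustration only: the `keep_raw_columns` helper and the column names `ID`, `PC1`, `PC2` and `PC3` are placeholders I made up, not the names used in the published spreadsheet.

```python
import pandas as pd


def keep_raw_columns(df: pd.DataFrame, n_leading: int = 1, n_trailing: int = 3) -> pd.DataFrame:
    """Drop the first n_leading and the last n_trailing columns by position."""
    return df.iloc[:, n_leading:df.shape[1] - n_trailing]


# Toy frame with placeholder column names (hypothetical).
toy = pd.DataFrame({
    "ID": [1, 2],
    "Age": [4.0, 2.1],
    "Gender": [2, 2],
    "PC1": [0.1, 0.2],
    "PC2": [0.3, 0.4],
    "PC3": [0.5, 0.6],
})

raw = keep_raw_columns(toy)
print(list(raw.columns))
```

Slicing by position is short, but it silently drops the wrong columns if the column order ever changes. Dropping by name with `DataFrame.drop(columns=[...])` is a more defensive alternative when the names are known.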

## Descriptive statistics

There are 17 different variables in this data set. Table 1 gives some statistics for each variable, and Table 2 lists each variable's type, central tendency and dispersion.

**Table 1**

In [4]:
data.describe(include = "all") 

Unnamed: 0,Age,Gender,Neuter_status,Breed_group,Weaning_age,Outdoors,Other_cats,Activity_level,Contact_people,Aggression_stranger,Aggression_owner,Aggression_cats,Shyness_novel,Shyness_strangers,Grooming,Wool_sucking,Behaviour_problem
count,5726.0,5726.0,5726.0,5726,5726.0,5726.0,5726.0,5726.0,5726.0,5726.0,5726.0,5726.0,5726.0,5726.0,5683.0,5696.0,5719.0
unique,,,,19,,,,,,,,,,,,,
top,,,,HCS,,,,,,,,,,,,,
freq,,,,836,,,,,,,,,,,,,
mean,4.753083,1.538945,0.779776,,4.618407,2.546455,0.847363,3.771743,4.089067,1.116312,1.096577,1.584177,2.026546,1.884736,1.794123,0.912395,1.070467
std,3.769304,0.498525,0.414434,,1.576421,1.910538,0.359669,0.864301,0.878921,0.417632,0.368069,0.840766,0.996585,1.051672,0.998514,1.544499,0.38433
min,0.1671,1.0,0.0,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0
25%,1.789,1.0,1.0,,4.0,1.0,1.0,3.0,4.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0
50%,3.87945,2.0,1.0,,4.0,2.0,1.0,4.0,4.0,1.0,1.0,1.0,2.0,2.0,1.0,0.0,1.0
75%,6.7781,2.0,1.0,,5.0,5.0,1.0,4.0,5.0,1.0,1.0,2.0,3.0,2.0,3.0,2.0,1.0


In [55]:

# Continuous
print("%.3f" % data.Age.mean())
print("%.3f" % data.Age.std(), "\n")

# Nominal types
print(data.Gender.mode(), "\n")

print(data.Neuter_status.mode(), "\n")

print(data.Breed_group.mode(), "\n")

print(data.Other_cats.mode())


# All other variables are ordinal, so report the median and the range (max - min)
print(data.median(numeric_only=True))
print((data.quantile(1, numeric_only=True) - data.quantile(0, numeric_only=True)), "\n")



4.753
3.769 

0    Female
dtype: object 

0    1
dtype: int64 

0    HCS
dtype: object 

0    1
dtype: int64
Age                    3.87945
Neuter_status          1.00000
Weaning_age            4.00000
Outdoors               2.00000
Other_cats             1.00000
Activity_level         4.00000
Contact_people         4.00000
Aggression_stranger    1.00000
Aggression_owner       1.00000
Aggression_cats        1.00000
Shyness_novel          2.00000
Shyness_strangers      2.00000
Grooming               1.00000
Wool_sucking           0.00000
Behaviour_problem      1.00000
dtype: float64
Age                    24.6439
Neuter_status           1.0000
Weaning_age             7.0000
Outdoors                5.0000
Other_cats              1.0000
Activity_level          4.0000
Contact_people          4.0000
Aggression_stranger     4.0000
Aggression_owner        4.0000
Aggression_cats         4.0000
Shyness_novel           4.0000
Shyness_strangers       4.0000
Grooming                4.0000
Wool_suc

**Table 2**

| Variable | Type | Mean | Median | Mode | SD | Range |
| --- | --- | --- | --- | --- | --- | --- |
| Age | Continuous | 4.753 | - | - | 3.769 | - |
| Gender | Nominal | - | - | Female | - | - |
| Neuter status | Nominal | - | - | 1 | - | - |
| Breed group | Nominal | - | - | HCS | - | - |
| Weaning age | Ordinal | - | 4 | - | - | 7 |
| Outdoors | Ordinal | - | 2 | - | - | 5 |
| Other cats | Nominal | - | - | 1 | - | - |
| Activity level | Ordinal | - | 4 | - | - | 4 |
| Contact people | Ordinal | - | 4 | - | - | 4 |
| Aggression stranger | Ordinal | - | 1 | - | - | 4 |
| Aggression owner | Ordinal | - | 1 | - | - | 4 |
| Aggression cats | Ordinal | - | 1 | - | - | 4 |
| Shyness novel | Ordinal | - | 2 | - | - | 4 |
| Shyness strangers | Ordinal | - | 2 | - | - | 4 |
| Grooming | Ordinal | - | 1 | - | - | 4 |
| Wool sucking | Ordinal | - | 0 | - | - | 7 |
| Behaviour problem | Ordinal | - | 1 | - | - | 3 |
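
A table like Table 2 could also be built programmatically by choosing the statistic according to each variable's measurement level. The sketch below is one possible approach, not the method used above: the `summarise_by_level` helper, the `levels` mapping and the toy values are all assumptions made for illustration.

```python
import pandas as pd


def summarise_by_level(df: pd.DataFrame, levels: dict) -> pd.DataFrame:
    """Build one summary row per variable, with statistics suited to its level."""
    rows = []
    for column, level in levels.items():
        values = df[column].dropna()
        row = {"Variable": column, "Type": level}
        if level == "continuous":
            row["Mean"] = values.mean()
            row["SD"] = values.std()
        elif level == "ordinal":
            row["Median"] = values.median()
            row["Range"] = values.max() - values.min()
        elif level == "nominal":
            row["Mode"] = values.mode().iloc[0]
        else:
            raise ValueError(f"Unknown measurement level: {level}")
        rows.append(row)
    return pd.DataFrame(rows).set_index("Variable")


# Toy data with hypothetical values; the real frame has 17 variables.
toy = pd.DataFrame({
    "Age": [1.0, 3.0, 5.0],
    "Gender": ["Female", "Male", "Female"],
    "Activity_level": [2, 4, 5],
})
levels = {"Age": "continuous", "Gender": "nominal", "Activity_level": "ordinal"}

table = summarise_by_level(toy, levels)
print(table)
```

Statistics that do not apply to a variable's level are left as `NaN`, which corresponds to the `-` cells in Table 2.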




### References
1. Yukihide Momozawa et al. Assessment of equine temperament by a questionnaire survey to caretakers and evaluation of its reliability by simultaneous behavior test. https://doi.org/10.1016/j.applanim.2003.08.001
2. Erik Wilsson and David L. Sinn. Are there differences between behavioral measurement methods? A comparison of the predictive validity of two ratings methods in a working dog program. https://doi.org/10.1016/j.applanim.2012.08.012
