# CSE151A_UrbanAnimals
#### Unveiling patterns: predictive modeling of animal disposition in urban settings.

## Introduction

With the increasing volume of animals picked up by urban animal control centers, understanding and predicting the outcomes of these incidents has become crucial for effective management and resource allocation. In this study, we leverage a comprehensive 7-year dataset from the Baton Rouge Animal Control and Rescue Center (ACRC) to develop a supervised machine learning model that predicts the disposition of animals based on characteristics such as incident date, request type, location, species, breed, sex, size, age, and condition.

Our goal is to use this dataset to predict the behaviour of each animal species depending on the time of year and other factors such as location. These predictions could then inform how many animal controllers should attend each call, and possibly help them decide which tools to use and how to approach each task. This supervised predictive model can aid in prioritizing resources, optimizing intervention strategies, and ultimately improving the welfare of animals within urban communities.

The results of our study will not only contribute to a deeper understanding of the factors influencing animal outcomes but will also provide a practical tool for other animal control centers to anticipate the disposition of animals in their care.

## Method

#### 1. Data Exploration

For data exploration, we looked at missing value counts for each column, Male:Female ratios for each species, condition distributions for each species, and a pairplot and heatmap for each category. After identifying the features of interest and areas of concern such as missing values, we began the preprocessing for the data.

Here are the links to the related data exploration notebooks:
- https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/notebook.ipynb
- https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/Untitled.ipynb

#### 2. Preprocessing

To begin, our team identified the large number of missing values in the data. The columns of concern were primarily age, sex, and request type. We immediately decided to drop zip_code and request type: latitude and longitude are complete and determine the location of each entry, and request type is largely unrelated to our project. As the remaining missing values represented a small portion of our data, we dropped all rows with unknown values, except in the sex column, where unknown is denoted as "U".

After cleaning up the unknown values, we corrected typos and consolidated different spellings or naming conventions referring to a single breed, such as "Pitbull", "Pit Bull", and "Pit". Since our data largely consisted of dogs, we also grouped the breeds into several categories to simplify our data and prevent runtime issues when training models. Similarly, we grouped the date-time values and remapped them to the four seasons. After consolidating and simplifying, we one-hot encoded the categorical features and applied [0, 1] min-max scaling to the numerical features.

Here's the link to the preprocessing notebook:
- https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/notebook.ipynb
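
The preprocessing steps described above can be sketched with pandas. This is a minimal illustration on a made-up four-row frame, not the notebook's code: the column names (`incident_date`, `breed`, `sex`, `age`), the alias table for breed spellings, and the month-to-season boundaries are all assumptions.

```python
import pandas as pd

# Hypothetical rows; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "incident_date": pd.to_datetime(
        ["2018-01-15", "2019-04-02", "2020-07-21", "2021-10-30"]
    ),
    "breed": ["Pitbull", "Pit Bull", "Pit", "Labrador"],
    "sex": ["M", "F", "U", "F"],
    "age": [1.0, 4.0, 9.0, 2.0],
})

# Consolidate alternate spellings of one breed into a single label.
breed_aliases = {"Pitbull": "PIT BULL", "Pit Bull": "PIT BULL", "Pit": "PIT BULL"}
df["breed"] = df["breed"].replace(breed_aliases).str.upper()


def month_to_season(month: int) -> str:
    """Map a month number to a season (meteorological seasons assumed)."""
    if month in (12, 1, 2):
        return "WINTER"
    if month in (3, 4, 5):
        return "SPRING"
    if month in (6, 7, 8):
        return "SUMMER"
    return "FALL"


# Remap the date-time values to the four seasons, then drop the raw date.
df["season"] = df["incident_date"].dt.month.map(month_to_season)
df = df.drop(columns=["incident_date"])

# One-hot encode the categorical features.
encoded = pd.get_dummies(df, columns=["breed", "sex", "season"])

# Min-max scale the numerical feature to [0, 1].
age_min, age_max = encoded["age"].min(), encoded["age"].max()
encoded["age"] = (encoded["age"] - age_min) / (age_max - age_min)

print(encoded.columns.tolist())
```

In a real pipeline, the scaling minimum and maximum would typically be computed on the training split only and reused for the test split.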

#### 3. Our models

##### 3.a) Model 1: Perceptron

We started by working in separate groups, each developing a different first model simultaneously: decision tree, logistic regression, perceptron, and SVM.

After reaching some promising conclusions with decision trees, logistic regression, and perceptrons, we decided to present the perceptron as our first model.

Our model consisted of 7 layers: a pair of 96-node layers, a pair of 48-node layers, a pair of 16-node layers, and a final 5-node output layer with a sigmoid activation. Each pair has one layer with a ReLU activation function and another with a tanh activation function. We used stochastic gradient descent with a learning rate of 0.01 as our optimizer and MSE as our loss function. In total, we trained the model for 200 epochs using a batch size of 150 and a validation split of 0.1.

Here is the link to the perceptron notebook: https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/perceptron.ipynb
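
To make the layer stack described above concrete, here is a framework-agnostic forward pass in NumPy. It is a sketch under assumptions, not the notebook's implementation: the input width (`n_features = 40`), the weight initialization, and the ReLU-then-tanh order within each pair are all hypothetical, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


n_features = 40  # hypothetical input width after one-hot encoding

# (units, activation) per layer: three relu/tanh pairs, then the 5-node output.
# The relu-before-tanh order inside each pair is an assumption.
layer_specs = [
    (96, relu), (96, np.tanh),
    (48, relu), (48, np.tanh),
    (16, relu), (16, np.tanh),
    (5, sigmoid),
]

# Randomly initialized weights and zero biases (illustrative only).
weights = []
fan_in = n_features
for units, _ in layer_specs:
    weights.append((rng.normal(0.0, 0.1, size=(fan_in, units)), np.zeros(units)))
    fan_in = units


def forward(x):
    """Run a batch through every layer in order."""
    a = x
    for (w, b), (_, activation) in zip(weights, layer_specs):
        a = activation(a @ w + b)
    return a


def mse(y_true, y_pred):
    """Mean squared error, the loss named in the text."""
    return float(np.mean((y_true - y_pred) ** 2))


batch = rng.normal(size=(150, n_features))        # one batch of 150 rows
labels = np.eye(5)[rng.integers(0, 5, size=150)]  # random one-hot targets
probs = forward(batch)
print(probs.shape, mse(labels, probs))
```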

##### 3.b) Model 2: Logistic Regression

For our second model, we chose logistic regression and explored a binary approach. More on this in the following sections.

Here is the link to the logistic regression notebook: https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/logistical_regres_final.ipynb

##### 3.c) Model 3: Decision Tree

For our final model, we chose a decision tree. We first tried a simple decision tree and then applied oversampling. After getting better results than expected, we also tried a random forest and, finally, other ensembles. More about our process in the next sections.

Here is the link to our decision tree notebook: https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/decisiontree_notebook.ipynb

## Result

#### 1. Model 1: Perceptron Results/Figures

<img src="https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/graph_perceptron.png?raw=1" width="600" height="600" />

<img src="https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/graph_perceptron_loss.png?raw=1" width="650" height="650" />

Our final test and train accuracy and loss values were as follows:

```
Test loss: 0.11528602242469788
Test accuracy: 0.5306574106216431

Train loss: 0.11475052684545517
Train accuracy: 0.5378226637840271
```

The following confusion matrix is based on our model's predictions on the test data set.
![cm](https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/cm_perceptron.png?raw=1)

Finally, our classification report based on the same data used in the confusion matrix:

```
              precision    recall  f1-score   support

      NORMAL       0.75      0.55      0.63      3337
    FRIENDLY       0.42      0.92      0.58      2194
     NERVOUS       0.00      0.00      0.00      1504
   DANGEROUS       0.00      0.00      0.00       185
      SCARED       0.00      0.00      0.00         5

   micro avg       0.53      0.53      0.53      7225
   macro avg       0.23      0.29      0.24      7225
weighted avg       0.47      0.53      0.47      7225
 samples avg       0.53      0.53      0.53      7225
```

#### 2. Model 2: Logistic Regression Results/Figures

![cm](https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/lr.png?raw=1)

Before Binary Classification

```
Classification Report:  precision  recall   f1-score  support

                 0       0.85      0.97      0.91     28900
                 1       0.73      0.31      0.43     7225

    accuracy                           0.84     36125
   macro avg       0.79      0.64      0.67     36125
weighted avg       0.82      0.84      0.81     36125
```

After Binary Classification

```
Train Accuracy: 0.7788895040734003
Test Accuracy: 0.7810279597674633
```

After Tuning with Class Weights

```
Train Accuracy: 0.5945582535790556
Test Accuracy: 0.5913075574420965
```

For our final results, we tuned the model with different intercept settings, resulting in the graph above.


```
Train Accuracy: 0.7788895040734003
Test Accuracy: 0.7810279597674633
---
Train Accuracy: 0.778929051649134
Test Accuracy: 0.7807511303866383
```

#### 3. Model 3: Decision Tree Results/Figures

Fitting Graph:

![fitting](https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/dtree_oversample.png?raw=1)

Classification Report:

                  precision    recall  f1-score   support

               0       0.80      0.95      0.87      5024
               1       0.94      0.79      0.86      5515

        accuracy                           0.87     10539
       macro avg       0.87      0.87      0.87     10539
    weighted avg       0.88      0.87      0.87     10539

Confusion Matrix:

![cm](https://github.com/PaulaEsteban2000/CSE151A_UrbanAnimals/blob/main/dtree_cm.png?raw=1)

## Discussion

In this section, we discuss the reasoning behind our process and our interpretation of it. We also discuss our results and whether they could be improved for each model.

##### a) Model 1: Perceptron

As mentioned earlier, after reaching some promising conclusions with decision trees, logistic regression, and perceptrons, we decided to present the perceptron as our first model.

This is because the train and test accuracies are similar, which suggests the model generalizes to unseen data. The test and train accuracies are about 0.53 and 0.54, respectively, which is higher than we expected. However, the loss graph shows that by the end of training we were starting to overfit: our training loss was still slowly decreasing, while our testing loss had started to plateau and fluctuate after around epoch 115.
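
One common way to act on a plateau like this is patience-based early stopping. The sketch below is not something the notebook is known to use; the function name `early_stopping_epoch`, the `patience` value, and the loss values are all made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=10, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the loss has not improved by more than min_delta
    for `patience` consecutive epochs; stop_epoch is None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return None, best_epoch


# Made-up losses: steady improvement for 5 epochs, then bouncing up and down.
val_losses = [0.20, 0.16, 0.14, 0.13, 0.12, 0.125, 0.121, 0.126, 0.122, 0.127]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # 8 5
```

The weights from `best_epoch` would be the ones to keep, since later epochs only fit the training data more closely.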

Based on the confusion matrix and classification report, we can see that our model had zero precision and recall for the 'NERVOUS', 'DANGEROUS', and 'SCARED' temperaments. The 'SCARED' class in particular probably suffers from underrepresentation, given how few samples there are in the entire data set. However, the results for 'NERVOUS' and 'DANGEROUS' indicate that other issues with the model make it unable to predict these classes. This was most likely due to the output layer using a sigmoid activation, which is suited to binary classification, whereas our problem is a multiclass classification problem, for which softmax is the usual choice.
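
The difference between the two output activations can be seen on a single row of scores. This is an illustrative sketch with made-up logits, not output from our model: sigmoid scores each class independently, while softmax produces one probability distribution over the five classes and pairs naturally with categorical cross-entropy.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Subtract the row maximum for numerical stability.
    shifted = z - z.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    return float(-np.sum(y_true * np.log(y_prob + eps), axis=-1).mean())


# Made-up pre-activation scores for NORMAL, FRIENDLY, NERVOUS, DANGEROUS, SCARED.
logits = np.array([[2.0, 1.5, 0.5, -1.0, -3.0]])
one_hot = np.array([[0.0, 0.0, 1.0, 0.0, 0.0]])  # true class: NERVOUS

sig = sigmoid(logits)   # independent per-class scores; row need not sum to 1
soft = softmax(logits)  # one distribution over the classes; row sums to 1

print(sig.sum(), soft.sum())
print(categorical_cross_entropy(one_hot, soft))
```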

If we were to improve the model, we could try different activation functions, a different learning rate, a different number of nodes per layer, and a different total number of layers. Given the large number of possible combinations, we would use hyperparameter tuning to find the best parameters.


##### b) Model 2: Logistic Regression

After running our first model, we decided to explore logistic regression. Our first classification report showed 84% accuracy. The initial results seemed promising, so we delved deeper, trying to improve performance through binary classification and hyperparameter tuning.

After switching to binary classification, our accuracy dropped to 78%. We then tried a different approach: hyperparameter tuning with class weights. This decreased model performance further, to 59% accuracy. The decrease may have resulted from non-optimal hyperparameters.
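
For reference, a common way to choose class weights is the "balanced" heuristic, `n_samples / (n_classes * class_count)`. The write-up does not state which weights were used, so this is only an assumed option, shown on a made-up label vector.

```python
import numpy as np

# Made-up imbalanced binary labels: 80 of class 0, 20 of class 1.
y = np.array([0] * 80 + [1] * 20)

classes, counts = np.unique(y, return_counts=True)

# "Balanced" heuristic: rarer classes receive proportionally larger weights.
weights = len(y) / (len(classes) * counts)
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}

print(class_weight)  # {0: 0.625, 1: 2.5}
```

A dictionary like this can be passed as the `class_weight` argument of scikit-learn's `LogisticRegression`. Weighting of this kind tends to trade majority-class accuracy for minority-class recall, so a drop in overall accuracy does not by itself mean the model got worse on the minority class.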

We then tried different intercept settings in our model, which resulted in 78% test accuracy. These results were much better than with class weights, but still lower than our initial 84%.

Logistic regression seemed promising; however, our attempts at different tuning methods and classification setups did not improve performance. In hindsight, we might have achieved better model performance had we experimented with other classification setups or hyperparameters. However, after our failed attempts to get better results, we believed it best to begin exploring a decision tree model instead.

##### c) Model 3: Decision Tree

Our exploration began with a Decision Tree Classifier trained to predict animal outcomes based on various incident characteristics. We targeted two main classes by aggregating 'NORMAL' and 'FRIENDLY' into one class and considering 'NERVOUS', 'DANGEROUS', and 'SCARED' as another class. This simplification was necessitated by the challenges in distinguishing between the more nuanced classes (especially those with fewer samples, see explanation for perceptron above), and it led to an improvement in our model's accuracy.
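
The aggregation described above amounts to a simple label mapping. In this sketch the sample values are made up, and the numeric encoding (1 for the NORMAL/FRIENDLY group, 0 for the other group) is an assumption; the notebook may encode the two groups the other way around.

```python
import pandas as pd

# Made-up sample of the original five-class target.
temperament = pd.Series(
    ["NORMAL", "FRIENDLY", "NERVOUS", "DANGEROUS", "SCARED", "FRIENDLY"]
)

# Group NORMAL and FRIENDLY together; everything else forms the second class.
# Which group gets 1 and which gets 0 is an assumption made for this sketch.
calm_labels = {"NORMAL", "FRIENDLY"}
binary_target = temperament.isin(calm_labels).astype(int)

print(binary_target.tolist())  # [1, 1, 0, 0, 0, 1]
```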

A challenge we faced was the model's tendency to overfit, as evidenced by the gap between the initial training (98.83%) and test (72.02%) accuracies, a difference of about 27 percentage points. To combat this, we implemented oversampling, which not only reduced overfitting but also improved the model's test accuracy.
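
The write-up does not say which oversampling method was used. As one possibility, the sketch below shows plain random oversampling, which duplicates minority-class rows until every class is as large as the largest one. The helper name `random_oversample` and the toy arrays are hypothetical, and applying it to training rows only is an assumption of this sketch.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Duplicate rows of smaller classes at random until all classes are equal in size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing rows with replacement (zero draws for the largest class).
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(selected))
    return X[order], y[order]


# Toy training split: 7 rows of class 0 and 3 rows of class 1.
X_train = np.arange(20, dtype=float).reshape(10, 2)
y_train = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

X_bal, y_bal = random_oversample(X_train, y_train)
print(np.bincount(y_bal))  # [7 7]
```

Resampling only the training split keeps duplicated rows out of the test set, so the test accuracy still reflects unseen data.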

The classification report shows our model's strength in precisely identifying the 'sensitive' cases (class 0) while maintaining a high recall, thereby minimizing the risk of overlooking such critical instances. The balance in the f1-score across both classes confirms that our decision to implement oversampling effectively countered the initial class imbalance.

Further experiments were conducted with RandomForest, AdaBoost, and GradientBoosting classifiers. For RandomForest, we found only a negligible increase in accuracy compared to the baseline Decision Tree model. The other experiments did not yield satisfactory results and were not used.
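
A comparison of this kind can be set up as follows with scikit-learn. This is a sketch on synthetic data from `make_classification`, not our dataset, and it assumes default hyperparameters for every classifier, so the scores it prints say nothing about our actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for the real features (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}

# Record (train accuracy, test accuracy) so overfitting gaps are visible.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))

for name, (train_acc, test_acc) in scores.items():
    print(f"{name:>16}: train={train_acc:.3f} test={test_acc:.3f}")
```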

Of course, further improvements could be made via hyperparameter tuning for our DecisionTree model, alongside possible further testing for RandomForest as it too showed similar potential.

## Conclusion

Our project aimed to develop a predictive model to help better understand and anticipate animal disposition in urban settings using data sourced from Baton Rouge Animal Control and Rescue Center. Through data exploration, preprocessing, and development and analysis of a supervised machine learning model, we have gained better insight into the factors that are more likely to influence animal behavior, as well as uncovering the strengths and weaknesses of the three models that we decided to use.

The Decision Tree Classifier was the model with which we achieved the best results, though out of necessity we had to group our five classifications of animal behavior into “sensitive” (NORMAL or FRIENDLY) cases and “non-sensitive” (NERVOUS, DANGEROUS, or SCARED) cases, as well as implement oversampling. This model was able to identify “sensitive” cases accurately while maintaining a high recall. We experimented further with other classifiers in hopes of increasing accuracy; however, these showed little difference in accuracy compared to the baseline model. Given more time, further hyperparameter tuning and more experimentation with the RandomForest classifier might give more accurate results.

Our other models certainly had room for improvement. The Perceptron model had trouble accurately predicting the ‘NERVOUS’ and ‘DANGEROUS’ classes. While it achieved a reasonable test and train accuracy, it began to overfit after around epoch 115, where the training loss kept decreasing while the testing loss plateaued. To improve this model, we could have experimented with different activation functions and more thorough hyperparameter tuning, as the activation function seemed to be a root cause of our results. Additionally, a different learning rate, a different number of nodes per layer, and a different number of layers could have given us better results.

Lastly, our Logistic Regression model achieved higher accuracy than the Perceptron, but offered fewer hyperparameter tuning options. To develop this model further, we would explore more hyperparameter tuning, feature expansion, or K-fold cross-validation in hopes of improving its accuracy.

All things considered, our project achieved its goal of finding valuable insights into the factors affecting animal behavior in urban settings. These insights can serve as a tool for animal control centers to improve their resource allocation and intervention strategies, ultimately benefiting communities as well as the animals themselves.


## Collaboration

In this group project, we've been working together the following way:
 - Paula Esteban Carrillo: Project manager/Writer: Assumed a central role in facilitating effective organization and coordination within our group project. Additionally, collaborated with other team members in drafting most sections of the project documentation, as well as discussing feedback on each step of the project process. Was active in the group chat and attended all meetings.
 - Nick Ehsani: Writer: Worked on the writeup, analyzed results from the models, and helped make decisions concerning which factors were worth cutting from the dataset and how to organize and manage them.
 - Thomas Limperis: Data analyst: Performed data exploration with logistic regression and neural networks. Worked with team members to discuss best approaches and problems we faced. Attended most team meetings, was active in our group chat, and made sure our files were ready for final submission.
 - Kevin Liu: Data analyst: Helped with data exploration to change the dataset into a usable state. Additionally worked on the perceptron model and worked with teammates to decide our overall plan.
 - Arjun Suresh Kumar: Data analyst: Performed initial data exploration tasks and contributed heavily to data preprocessing tasks. Helped perform model exploration by preliminary testing of different model types. Worked with team members to discuss workload/model assignment. Contributed to finalization of the decision tree and logistic regression notebooks.
 - Rohan Meserve: Coder/Data analyst: Contributed heavily to EDA, logistic regression notebook, and decision tree notebook. Made minor contributions to preprocessing. Participated in discussions on project adjustments / improvements (such as swapping to a binary task and implementing oversampling), as well as their implementation. Actively communicated in team group chat, and had regular presence at group meetings.
 - Daniel Kong: Title: Contribution. 
 - Joshua Li: Title: Played a role in discussions centered around data preprocessing and exploration. Helped develop a strategy for data cleaning and transformation. Did the writeup for our decision tree analysis. Made edits to the notebook to assist with clarity in the final writeup.
