# 7. Conclusion and Future Considerations

[index](../Index.ipynb) | [prev](./06.AnomalyDetection.ipynb) | [next](./08.08.Acknowledgements.ipynb)

Below are the key reflections on the three research questions posed in the Introduction chapter:

1. How complex is it to build a fast and reliable object detection pipeline using *IoT* devices and *Computer Vision*?

Building a reliable data collection stage proved highly complex. Collecting images $24/7$ for $6$ months challenged my problem-solving skills and taught me many valuable lessons:

- Placing the camera in the right location is crucial. It required wiring the house with Ethernet cables and investing in Power over Ethernet adapters. Camera units (and *IoT* devices) placed outside the house need to be monitored for environmental effects: direct sunlight, humidity, dust, dirt, insects and even birds can degrade the quality of the collected data
- Performing multiple tasks on every camera frame can become very slow. Processing was made feasible by a motion detection approach with suitable parameters (*Background Subtraction*), combined with a fast and accurate object detection algorithm (*YOLO v2*)
- Forwarding *High Definition* images from Python to a web browser in real time is challenging. It was solved with *Socket Data Transmission* running in a separate thread
- Software services need to start automatically when devices are rebooted, and to recover when network connections are broken. Using the *Supervisor* Linux utility and a proper network setup can minimize the negative impact.
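
The motion-gating idea from the list above can be sketched as follows. This is a minimal, dependency-free illustration and not the pipeline's actual code: a running-average background model stands in for a real *Background Subtraction* implementation, frames are flat lists of grayscale values, and `MotionGate`, `process_stream`, `learning_rate`, `pixel_delta` and `min_changed_fraction` are hypothetical names and parameters chosen for this example.

```python
class MotionGate:
    """Decide whether a frame changed enough to justify running a detector."""

    def __init__(self, learning_rate=0.05, pixel_delta=25, min_changed_fraction=0.02):
        self.learning_rate = learning_rate                # how fast the background adapts
        self.pixel_delta = pixel_delta                    # per-pixel change that counts as "different"
        self.min_changed_fraction = min_changed_fraction  # share of changed pixels that counts as motion
        self.background = None

    def has_motion(self, frame):
        if not frame:
            return False
        if self.background is None:
            # The first frame only initialises the background model.
            self.background = [float(p) for p in frame]
            return False
        changed = sum(
            1 for p, b in zip(frame, self.background) if abs(p - b) > self.pixel_delta
        )
        # Blend the new frame into the background (running average).
        a = self.learning_rate
        self.background = [(1 - a) * b + a * p for p, b in zip(frame, self.background)]
        return changed / len(frame) >= self.min_changed_fraction


def process_stream(frames, detect, gate=None):
    """Run the (expensive) `detect` callable only on frames that show motion."""
    gate = gate or MotionGate()
    results = []
    for index, frame in enumerate(frames):
        if gate.has_motion(frame):
            results.append((index, detect(frame)))
    return results
```

In this sketch the object detector is passed in as a plain callable, so the gate can be tuned and tested without loading a detection model.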

2. Given the collected image data with object detections, can future object counts be accurately predicted using *Machine Learning*?

Object counts for a given category (*Person* or *Vehicle*) can be predicted with relatively low error rates using Machine Learning models.

The process required a significant amount of image data extraction, cleaning and pre-processing. Numerous models of different types and complexity were tested (ranging from *Linear Regression* to *Bi-Directional LSTM Neural Networks*).

Given the evidence gathered in Chapter 5, two types of models can be successfully applied to make predictions: a probabilistic model (*Gaussian Process*) and a point-estimate model (*Histogram-Based Gradient Boosting Regressor*).

Gaussian Processes have the advantage of quantifying the uncertainty of their predictions, whereas Gradient Boosting models are faster to train and more robust to the choice of object category.

3. Does object detection data contain anomalous patterns that can be recognized with *Anomaly Detection* algorithms?

Applying anomaly detection algorithms to the collected image data can generate useful results.

Estimating the maximum expected number of objects of a given category (*Person* or *Vehicle*) in a given hour makes it possible to flag an hour as anomalous when its count exceeds that threshold.

Of the four methods tested, the *probabilistic approach*, which used a *gamma* distribution and a *Poisson* likelihood function, produced the best result and classified $61$ out of $4140$ observations as anomalous.
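
One way such a threshold could be derived is sketched below. It rests on several assumptions that are not taken from this research: a gamma prior on the hourly Poisson rate is updated in closed form (conjugacy), the threshold is a high quantile of the resulting posterior predictive distribution (a negative binomial), and the prior values, the `0.99` quantile and all function names are illustrative.

```python
import math


def posterior_params(hourly_counts, prior_shape=1.0, prior_rate=1.0):
    """Gamma posterior (shape, rate) of a Poisson rate after observing the counts."""
    return prior_shape + sum(hourly_counts), prior_rate + len(hourly_counts)


def predictive_pmf(count, shape, rate):
    """Posterior predictive probability of `count` in one future hour (negative binomial)."""
    log_p = (
        math.lgamma(shape + count) - math.lgamma(shape) - math.lgamma(count + 1)
        + shape * math.log(rate / (rate + 1.0))
        - count * math.log(rate + 1.0)
    )
    return math.exp(log_p)


def anomaly_threshold(hourly_counts, quantile=0.99, max_count=10_000):
    """Smallest count whose cumulative predictive probability reaches `quantile`."""
    shape, rate = posterior_params(hourly_counts)
    cumulative = 0.0
    for count in range(max_count + 1):
        cumulative += predictive_pmf(count, shape, rate)
        if cumulative >= quantile:
            return count
    return max_count  # safety net if the quantile is never reached numerically


def is_anomalous(count, threshold):
    """An hour is flagged when its count is strictly above the threshold."""
    return count > threshold
```

The conjugate update is used here only because it needs no sampling. A model fitted with a probabilistic programming library would give a comparable predictive distribution by simulation.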

The second anomaly detection strategy was to apply an Auto Encoder Neural Network directly to raw image data. This is an *Unsupervised Machine Learning* technique, as the historical images are not labeled. In this case, a dataset with mixed object categories can be used.

The aim is to find the images that differ the most from the others and can therefore be flagged as anomalies. This method has several uses:

- If an incoming image's deviation (measured with *mean squared error*) exceeds a specified threshold, an alert can be triggered. The model successfully flagged many people gathered outside the house as an anomalous event
- Thousands of images collected throughout a single day can be sorted from the most to the least anomalous, which quickly highlights the most unusual scenes of that day
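
Both uses can be sketched with a few lines of scoring logic. The example below is a simplified illustration under stated assumptions: images are flat lists of pixel intensities in $[0, 1]$, the trained auto encoder is replaced by a toy `toy_reconstruct` stand-in that can only reproduce dark pixels, and all names and the threshold value are hypothetical.

```python
def mean_squared_error(image, reconstruction):
    """Average squared pixel difference between an image and its reconstruction."""
    return sum((p - r) ** 2 for p, r in zip(image, reconstruction)) / len(image)


def score_images(images, reconstruct):
    """Return (name, error) pairs sorted from the most to the least anomalous."""
    scored = [
        (name, mean_squared_error(pixels, reconstruct(pixels)))
        for name, pixels in images.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def alerts(scored, threshold):
    """Names of images whose error is above the alert threshold."""
    return [name for name, error in scored if error > threshold]


def toy_reconstruct(pixels):
    # Stand-in for a trained auto encoder (an assumption for this example):
    # it can only reproduce the dark pixel values seen in "normal" scenes.
    return [min(p, 0.3) for p in pixels]


images = {
    "empty_street": [0.1, 0.2, 0.1, 0.2],
    "one_car": [0.1, 0.4, 0.2, 0.1],
    "crowd": [0.9, 0.8, 0.9, 0.7],
}
ranking = score_images(images, toy_reconstruct)
print([name for name, _ in ranking])   # ['crowd', 'one_car', 'empty_street']
print(alerts(ranking, threshold=0.05)) # ['crowd']
```

With a real model, `reconstruct` would wrap the network's forward pass, and the threshold would be chosen from the error distribution of historical images.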

When tested on a hand-labeled dataset of $30$ images, the best model correctly classified $9$ out of $15$ anomalies. This yields a *Recall score* of $0.6$ while maintaining an *F1 score* of $0.72$.
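
The arithmetic behind these scores can be reproduced as follows. The $9$ detected and $6$ missed anomalies come from the text above. The single false positive is not reported in this chapter: it is inferred here as the only whole number consistent with both reported scores, and should be read as an assumption.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 9 of 15 anomalies found (from the text); 1 false positive is inferred, not reported.
precision, recall, f1 = precision_recall_f1(
    true_positives=9, false_positives=1, false_negatives=6
)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.9 0.6 0.72
```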

**Recommendations for future iterations**

Below is a condensed list of future considerations for this research:

- It would be very beneficial to integrate all the components into a working software product and run it for a longer period of time. This would expose further opportunities and gaps to address
- An experiment with a professional camera with a waterproof casing and night vision would show whether the improvement over the Raspberry Pi camera is significant enough to justify the purchase
- Recent trends in on-device learning (using devices like the *NVIDIA Jetson Nano* or *Google Coral*), if applied correctly, could significantly reduce the cost of the hardware
- An interesting research question is whether the same system, deployed in another household, would produce similar forecast accuracy and anomaly detection results. A positive result would be evidence that the models developed in this research generalize
- Modern AI systems should prioritize ethics. A privacy mode could blur people's silhouettes (or at least faces) in the detected images
- More collected data would make it possible to deploy more powerful and explainable forecasting models, such as ARIMA with periodicity and seasonality components
- The current approach of determining a maximum threshold of objects in a given hour to detect anomalies could be improved by incorporating the predictions from the Gaussian Process or the Gradient Boosting Regressor. These thresholds would then be tailored to the specific time of day and weather conditions
- The vanilla auto encoder used in this research for raw image anomaly detection could be replaced by a Variational auto encoder, which produces a much smoother latent space and could significantly improve the quality of predictions
- Newer versions of the libraries used (YOLO, PyMC) are already available, and offer more accurate results with greater performance and lower resource consumption
- To potentially improve the portability of this system, it would be interesting to replace Message Queues with RTSP and compare the frame rate (FPS) and the quality of the received picture
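
As an illustration of the forecast-based threshold suggested in the list above, the sketch below derives a separate threshold for each hour from a predicted mean and standard deviation (as a Gaussian Process would provide). It is only one possible design: the multiplier `z`, the function names and the numbers in the usage example are invented for illustration and are not results from this research.

```python
def hourly_thresholds(forecast, z=3.0):
    """Map each hour to `mean + z * std` of the forecast object count."""
    return {hour: mean + z * std for hour, (mean, std) in forecast.items()}


def flag_anomalies(observed, thresholds):
    """Hours whose observed count exceeds the threshold for that hour."""
    return [hour for hour, count in observed.items() if count > thresholds[hour]]


# Illustrative values only: hour of day -> (predicted mean, predicted std).
forecast = {3: (0.5, 0.5), 17: (12.0, 2.0)}
observed = {3: 4, 17: 15}

thresholds = hourly_thresholds(forecast)
print(flag_anomalies(observed, thresholds))  # [3]
```

In this toy example, four objects at 3 a.m. are flagged while fifteen at 5 p.m. are not, which is the behaviour a single global threshold cannot provide.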

**Final remark**

The findings and recommendations above, combined with the high velocity at which new research papers and libraries are released, show a plethora of opportunities for future research in this area.

The system proposed in this research can evolve and play an important role in generating interesting insights about the world around us.

[index](../Index.ipynb) | [prev](./06.AnomalyDetection.ipynb) | [next](./08.08.Acknowledgements.ipynb)