# 7. Conclusion and Future Considerations

[index](../Index.ipynb) | [prev](./06.AnomalyDetection.ipynb) | [next](./08.08.Acknowledgements.ipynb)

Below are the key conclusions to the three research questions in this study:

**1. How complex is it to build a fast and reliable object detection pipeline using *IoT* devices and *Computer Vision*?**

Building a reliable data collection stage proved highly complex. $6$ months of image capture posed various challenges and led to the following insights:

- It is crucial to place the camera in the right location. This may require wiring the house with Ethernet cables and investing in Power over Ethernet adapters. Camera units (and other *IoT* devices) placed outside the house need to be monitored for environmental effects: direct exposure to sunlight, humidity, dust, dirt, insects and even birds. All of these can degrade the picture quality
- Performing multiple tasks on each camera frame may introduce processing latency. Motion sensing (*Background Subtraction*) with suitable parameters, combined with a fast object detector (*YOLO v2*), can eliminate this problem
- Smooth transmission of *High Definition* images to a web browser can be achieved by using *web sockets* in a separate Python thread
- Software services need to restart automatically when devices are rebooted or when network connections are broken. Using the *Supervisor* Linux utility together with a proper network setup can minimize data loss
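The motion-gating idea from the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: a running-average background model stands in for the background subtractor, `detector` is a hypothetical callable standing in for the YOLO model, and all parameter values are arbitrary choices for the toy frames.

```python
import numpy as np


class RunningAverageMotionGate:
    """Simplified background subtraction: compare each frame to a slowly
    updated average of previous frames (an assumption, not the study's model)."""

    def __init__(self, learning_rate=0.05, pixel_threshold=25.0,
                 min_changed_fraction=0.01):
        self.learning_rate = learning_rate
        self.pixel_threshold = pixel_threshold
        self.min_changed_fraction = min_changed_fraction
        self.background = None

    def has_motion(self, gray_frame):
        frame = gray_frame.astype(np.float32)
        if self.background is None:
            # The first frame only initialises the background model.
            self.background = frame
            return False
        changed = np.abs(frame - self.background) > self.pixel_threshold
        # Blend the new frame into the background after the comparison.
        self.background = ((1.0 - self.learning_rate) * self.background
                           + self.learning_rate * frame)
        return bool(changed.mean() >= self.min_changed_fraction)


def process_stream(frames, detector, gate):
    """Run the (expensive) detector only on frames that contain motion."""
    return [detector(f) if gate.has_motion(f) else None for f in frames]


# Toy data: three empty frames and one frame with a bright square.
empty = np.zeros((48, 64), dtype=np.uint8)
moving = empty.copy()
moving[10:30, 10:30] = 255
frames = [empty, empty, moving, empty]

detector_calls = []


def detector(frame):
    # Hypothetical stand-in for an object detector such as YOLO.
    detector_calls.append(frame)
    return "detections"


results = process_stream(frames, detector, RunningAverageMotionGate())
print(results)
```

In this toy run only the third frame reaches the detector, which is the source of the latency saving: static frames are discarded by a cheap per-pixel comparison before any detection is attempted.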

**2. Given the dataset of collected images, can the future object counts be accurately predicted using *Machine Learning*?**

Object counts for a given category (*Person* or *Vehicle*) can be predicted with relatively low error rates using Machine Learning models.

This process requires a significant amount of image data extraction, cleaning and pre-processing. Numerous models of different types and complexity have been tested (ranging from *Linear Regression* to *Bi-Directional LSTM Neural Networks*).

Given the evidence gathered in Chapter 5, two types of models can be successfully applied to make predictions: a probabilistic model (*Gaussian Process*) and a point estimate model (*Histogram-Based Gradient Boosting Regressor*).

While Gaussian Processes have the advantage of quantifying the uncertainty of their predictions, Gradient Boosting models are faster to train and more robust to the choice of object category.

**3. Can *Anomaly Detection* algorithms assist in recognizing anomalous patterns in the object detection data?**

Applying anomaly detection algorithms to the collected image data can generate useful results, as the two approaches below show.

#### Hourly threshold estimation

Estimating a maximum number of objects per hour makes it possible to flag observations above that threshold as anomalies. Each object category, such as Person or Vehicle, is analyzed individually.

The *probabilistic approach*, which uses a *gamma* distribution and a *Poisson* likelihood function, produces the best result and classifies $61$ out of $4140$ observations (about $1.5\%$) as anomalous.
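One way to turn a gamma distribution and a Poisson likelihood into an hourly threshold is the conjugate closed form sketched below. This is an assumption made for illustration and may differ from the implementation in the study (which could, for example, sample the posterior instead). With a Gamma(shape, rate) prior on the Poisson rate, the posterior after $n$ hourly counts is Gamma(shape + sum of counts, rate + $n$), and the posterior predictive for a new count is negative binomial. The prior values, the $0.99$ quantile and the example counts are hypothetical.

```python
import math


def posterior_params(counts, prior_shape=1.0, prior_rate=1.0):
    """Gamma posterior for a Poisson rate after observing `counts`."""
    return prior_shape + sum(counts), prior_rate + len(counts)


def predictive_pmf(k, shape, rate):
    """Negative binomial posterior predictive probability of count k."""
    p = rate / (rate + 1.0)
    log_pmf = (math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
               + shape * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def hourly_threshold(counts, quantile=0.99, prior_shape=1.0, prior_rate=1.0):
    """Smallest count whose predictive cumulative probability reaches `quantile`."""
    shape, rate = posterior_params(counts, prior_shape, prior_rate)
    cumulative, k = 0.0, 0
    while True:
        cumulative += predictive_pmf(k, shape, rate)
        if cumulative >= quantile:
            return k
        k += 1


# Hypothetical hourly counts for one object category.
counts = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2, 15]
threshold = hourly_threshold(counts)
anomalies = [c for c in counts if c > threshold]
print(threshold, anomalies)
```

Running the same function separately on the counts of each category mirrors the per-category analysis described above.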

#### Raw image classification

The second methodology applies an Autoencoder Neural Network directly to raw image data. This technique is categorized as *Unsupervised Machine Learning*, as the historical images are not labeled. Here, multiple object classes are handled by a single model.

The aim is to find the images that differ most from the rest and can therefore be flagged as anomalies. This method has multiple uses:

- If the deviation of an incoming image (calculated using *mean squared error*) exceeds a specified threshold, an alert can be triggered. For example, the model successfully flagged a gathering of many people outside the house as an anomalous event
- Thousands of images collected throughout a single day can be sorted from most to least anomalous, which quickly highlights the most unusual scenes of that day
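Both uses reduce to computing one error value per image and then either thresholding or sorting it. The sketch below shows that logic on synthetic data. The `reconstruct` function is a placeholder for a trained autoencoder (here it simply returns the average training image), and the threshold rule of mean plus three standard deviations of the training errors is an assumption, not the rule used in the study.

```python
import numpy as np


def reconstruction_errors(images, reconstruct):
    """Per-image mean squared error between each image and its reconstruction."""
    originals = images.reshape(len(images), -1).astype(np.float64)
    rebuilt = reconstruct(images).reshape(len(images), -1).astype(np.float64)
    return np.mean((originals - rebuilt) ** 2, axis=1)


def flag_and_rank(errors, threshold):
    """Return indices above the threshold and all indices sorted by error."""
    flagged = np.flatnonzero(errors > threshold)
    ranking = np.argsort(errors)[::-1]  # most anomalous first
    return flagged, ranking


rng = np.random.default_rng(0)
# Synthetic 8x8 "images": flat grey scenes with a little noise.
train = 0.5 + 0.02 * rng.standard_normal((50, 8, 8))
test = 0.5 + 0.02 * rng.standard_normal((5, 8, 8))
test[4, 2:6, 2:6] = 1.0  # one unusual scene with a bright patch

mean_image = train.mean(axis=0)


def reconstruct(images):
    # Placeholder for a trained autoencoder's predict call: it always
    # returns the average training image.
    return np.broadcast_to(mean_image, images.shape)


train_errors = reconstruction_errors(train, reconstruct)
threshold = train_errors.mean() + 3 * train_errors.std()  # assumed rule
test_errors = reconstruction_errors(test, reconstruct)
flagged, ranking = flag_and_rank(test_errors, threshold)
print(flagged, ranking)
```

The `flagged` indices correspond to the alerting use case, while `ranking` gives the order in which a day's images would be reviewed.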

On a hand-labeled test set of $30$ images, the best model correctly classified $9$ out of $15$ anomalies. This yields a *Recall score* of $0.6$, while maintaining an *F1 score* of $0.72$.
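The two reported scores can be cross-checked with a few lines of arithmetic. Precision is not stated above, so the value below is derived, assuming both scores were computed on the same $30$ images and that the *F1 score* is the usual harmonic mean of precision and recall.

```python
true_positives = 9
actual_anomalies = 15
reported_f1 = 0.72

# Recall: share of real anomalies that the model found.
recall = true_positives / actual_anomalies

# Solve F1 = 2PR / (P + R) for the precision P implied by the reported scores.
implied_precision = reported_f1 * recall / (2 * recall - reported_f1)

# Number of images the model must have flagged to reach that precision.
implied_flagged = true_positives / implied_precision

# Recompute F1 from the derived precision as a consistency check.
f1_check = 2 * implied_precision * recall / (implied_precision + recall)

print(recall, implied_precision, implied_flagged, f1_check)
```

Under these assumptions the scores imply a precision of about $0.9$, which corresponds to roughly $10$ flagged images with a single false alarm.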

**Recommendations for future iterations**

Below is a condensed list of future considerations for this research:

- It would be very beneficial to integrate all the components into a working software product and run it for a longer period of time. This would expose further opportunities and gaps to address
- An experiment with a professional camera with a waterproof casing and night vision would show whether the improvement over the Raspberry Pi camera is significant enough to justify the purchase
- Recent trends in on-device learning (using devices like *NVIDIA Jetson Nano* or *Google Coral*), if applied correctly, could significantly reduce the cost of the hardware
- An interesting research question is whether the same system, deployed in another household, would produce similar forecast accuracy and anomaly detection results. A positive result would provide evidence that the models developed in this research generalize
- Modern AI systems should prioritize ethics. A privacy mode could blur people's silhouettes (or at least faces) in the detected images
- A higher volume of collected data would make it possible to deploy other forecasting models, such as ARIMA-family models with periodicity and seasonality components
- The current approach to anomaly detection, which determines a maximum threshold of objects in a given hour, could be improved by incorporating the predictions from the Gaussian Process or the Gradient Boosting Regressor. The thresholds would then be tailored to the specific time of day and weather conditions
- The vanilla autoencoder used in this research for raw image anomaly detection could be replaced with a Variational Autoencoder, which produces a smoother latent space and may improve the quality of predictions
- Newer versions of the libraries used (YOLO, PyMC) are already available and may offer more accurate results with better performance and lower resource consumption
- To potentially improve the portability of this system, it would be an interesting study to replace Message Queues with RTSP and compare the frame rate (FPS) and quality of the received picture

**Final remark**

The findings and recommendations above, combined with the rapid pace at which new research papers and libraries are released, show a plethora of opportunities for future research in this area.

The system proposed in this research can evolve and play an important role in generating interesting insights about the world around us.
