# 7. Conclusions and Future Considerations

[index](../Index.ipynb) | [prev](./06.AnomalyDetection.ipynb) | [next](./08.08.Acknowledgements.ipynb)

Below are the key conclusions to the three research questions in this study:

**1. What is the level of complexity required to build a fast and reliable object detection pipeline using *IOT devices* and *Computer Vision*?**

Building a reliable data collection stage proved highly complex. Capturing images over $6$ months posed various challenges and led to the following insights:

- Placing the camera in the right location is crucial. It may require wiring the house with ethernet cables and investing in Power over Ethernet adapters. Camera units (and *IOT* devices) placed outside the house need to be monitored for environmental effects: direct exposure to sunlight, humidity, dust, dirt, insects and even birds. All of these can degrade picture quality
- Performing multiple tasks on each camera frame may introduce processing latency. Motion sensing (*Background Subtraction*) with suitable parameters, combined with a fast object detector (*Yolo v2*), can eliminate this problem
- Smooth transmission of *High Definition* images to a web browser can be achieved by using *web sockets* in a separate Python thread
- Software services need to start automatically when devices are rebooted or when network connections are broken. Using the *Supervisor* Linux utility and a proper network setup can minimize data loss
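
The latency point above can be illustrated with a minimal sketch of motion gating: the expensive object detector only runs on frames where enough pixels differ from a slowly updated background. This is a simplified running-average stand-in for *Background Subtraction*, not the pipeline's actual implementation. The class name `MotionGate` and all parameter values are hypothetical.

```python
import numpy as np


class MotionGate:
    """Decide whether a frame is worth sending to the object detector.

    Simplified running-average background model (an assumption, used here
    only to illustrate the idea of motion gating).
    """

    def __init__(self, learning_rate=0.05, pixel_delta=25.0,
                 min_changed_fraction=0.01):
        self.learning_rate = learning_rate                # background update speed
        self.pixel_delta = pixel_delta                    # intensity change that counts as "changed"
        self.min_changed_fraction = min_changed_fraction  # share of changed pixels that counts as motion
        self.background = None

    def has_motion(self, gray_frame):
        frame = gray_frame.astype(np.float32)
        if self.background is None:
            # The first frame only initialises the background model.
            self.background = frame.copy()
            return False
        changed = np.abs(frame - self.background) > self.pixel_delta
        # Blend the new frame into the background so slow lighting changes are absorbed.
        self.background = ((1.0 - self.learning_rate) * self.background
                           + self.learning_rate * frame)
        return bool(changed.mean() >= self.min_changed_fraction)


# Usage: a static scene is skipped, a frame with a bright moving patch is not.
gate = MotionGate()
static_frame = np.zeros((48, 64), dtype=np.uint8)
moving_frame = static_frame.copy()
moving_frame[10:30, 10:30] = 255

first = gate.has_motion(static_frame)
second = gate.has_motion(static_frame)
third = gate.has_motion(moving_frame)
```

Only frames for which `has_motion` returns `True` would be passed on to the detector, which keeps the per-frame cost low when nothing is happening in the scene.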

**2. Given the dataset with collected images, can the future object counts be accurately predicted using *Machine Learning*?**

Object counts for a given category (*Person* or *Vehicle*) can be predicted with relatively low error rates using Machine Learning models.

This process requires a significant amount of image data extraction, cleaning and pre-processing. Numerous models of different types and complexity have been tested (ranging from *Linear Regression* to *Bi-Directional LSTM Neural Networks*).

Given the evidence gathered in Chapter 5, two types of models can be successfully applied to make predictions: a probabilistic model (*Gaussian Process*) and a point estimate model (*Histogram-Based Gradient Boosting Regressor*).

While Gaussian Processes have the advantage of quantifying the uncertainty of their predictions, Gradient Boosting models are faster to train and more robust to the object category selection.

**3. Can *Anomaly Detection* algorithms assist in recognizing anomalous patterns in the object detection data?**

Applying anomaly detection algorithms to the collected image data can generate useful results.

#### Hourly threshold estimation

Estimating a maximum number of objects per hour makes it possible to flag observations above that threshold as anomalies. Each object category, such as Person or Vehicle, is analyzed individually.

The *probabilistic approach*, which uses a *gamma* distribution and a *Poisson* likelihood function, produced the best result and classified $61$ out of $4140$ observations as anomalous.
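
One way to turn a gamma distribution and a Poisson likelihood into an hourly threshold is the conjugate Gamma-Poisson update, sketched below with the standard library only. This is an assumption about the mechanics, not a reproduction of the study's model: the prior parameters `alpha` and `beta`, the `quantile`, and the function names are all hypothetical.

```python
import math


def posterior_params(counts, alpha=1.0, beta=1.0):
    """Gamma(alpha, beta) prior (shape, rate) updated with Poisson counts."""
    return alpha + sum(counts), beta + len(counts)


def predictive_pmf(k, alpha, beta):
    """Posterior predictive of a Gamma-Poisson model (negative binomial)."""
    p = beta / (beta + 1.0)
    log_pmf = (math.lgamma(k + alpha) - math.lgamma(k + 1) - math.lgamma(alpha)
               + alpha * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def hourly_threshold(counts, quantile=0.99, alpha=1.0, beta=1.0, max_count=10_000):
    """Smallest count whose cumulative predictive probability reaches `quantile`."""
    a, b = posterior_params(counts, alpha, beta)
    cumulative = 0.0
    for k in range(max_count):
        cumulative += predictive_pmf(k, a, b)
        if cumulative >= quantile:
            return k
    return max_count


def is_anomalous(count, threshold):
    return count > threshold


# Usage with made-up counts observed for one hour of the day.
observed = [2, 3, 1, 4, 2, 3, 2, 1]
threshold = hourly_threshold(observed)
```

A new hourly count is then flagged with `is_anomalous(count, threshold)`. Fitting one threshold per object category and per hour mirrors the per-category analysis described above.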

#### Raw image classification

The second methodology applies an Autoencoder Neural Network directly to raw image data. This technique is categorized as *Unsupervised Machine Learning*, as the historical images are not labeled. In contrast with *hourly threshold estimation*, multiple object classes are considered inside a single model.

This method uses raw pixel data to search for the images that differ most from the others. It presents two opportunities:

- An alert can be triggered if the reconstruction error of an incoming image (calculated using *mean squared error*) exceeds a threshold. In an experiment, a gathering of people outside the house was successfully flagged as an anomalous event
- Time spent on manual image analysis can be significantly reduced by sorting an image collection by anomaly score in descending order. Additionally, this approach should lower the risk of missing an important event
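
Both opportunities rest on the same per-image score, which can be sketched as follows. The sketch covers only the scoring step: the trained autoencoder is replaced by a crude stand-in (the mean image of the batch), and the threshold rule (three times the median error) is a hypothetical choice, not the one used in the study.

```python
import numpy as np


def reconstruction_errors(images, reconstructions):
    """Mean squared error between each image and its reconstruction."""
    diff = images.astype(np.float64) - reconstructions.astype(np.float64)
    return (diff ** 2).reshape(len(images), -1).mean(axis=1)


def flag_anomalies(errors, threshold):
    """Boolean mask of images whose error exceeds the threshold."""
    return errors > threshold


def rank_most_anomalous(errors):
    """Image indices sorted by error, most anomalous first."""
    return np.argsort(errors)[::-1]


# Synthetic batch (assumption): five similar frames and one very different frame.
rng = np.random.default_rng(0)
images = rng.uniform(0.0, 0.1, size=(6, 8, 8))
images[4] += 0.8

# Stand-in for autoencoder output: every image is "reconstructed" as the batch mean.
reconstructions = np.broadcast_to(images.mean(axis=0), images.shape)

errors = reconstruction_errors(images, reconstructions)
threshold = 3.0 * np.median(errors)  # hypothetical threshold rule
flags = flag_anomalies(errors, threshold)
ranking = rank_most_anomalous(errors)
```

With a real model, `reconstructions` would come from the autoencoder's prediction on `images`. `flags` drives the alerting use case, and `ranking` gives the review order for manual analysis.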

In the model evaluation stage, a hand-labeled dataset of $30$ images was used. The best model correctly classified $9$ out of $15$ anomalies, obtaining a *Recall score* of $0.6$ while maintaining an *F1 score* of $0.72$.
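
The reported scores can be cross-checked with a few lines of arithmetic. Recall follows directly from the counts above. Precision is not stated in this section, so the value below is inferred from the reported F1 and Recall scores and should be read as an implication of those numbers, not as a reported result.

```python
true_positives = 9       # anomalies classified correctly
actual_anomalies = 15    # anomalies in the hand-labeled set
reported_f1 = 0.72

# Recall = TP / (TP + FN)
recall = true_positives / actual_anomalies

# F1 = 2PR / (P + R), solved for P: P = F1 * R / (2R - F1)
implied_precision = reported_f1 * recall / (2 * recall - reported_f1)

# Number of images the model would have flagged under that precision.
implied_flagged = true_positives / implied_precision

# Sanity check: recompute F1 from the implied precision and the recall.
f1_check = 2 * implied_precision * recall / (implied_precision + recall)
```

The implied precision is about $0.9$, which is consistent with roughly $10$ flagged images, $9$ of them true anomalies.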

**Recommendations for future work**

Developing a Minimum Viable Product would allow incorrect assumptions and potential weaknesses in the core features to be identified quickly. The *MVP* should also include a basic user interface with a good representation of forecast and anomaly data.

Further recommendations are summarized below:

- Modern AI systems should emphasize ethics and protect privacy. A privacy mode should at least blur people's faces, or even full silhouettes if required
- To prove that the system is truly generalizable, it should ideally be deployed in another household
- Anomaly detection based on hourly threshold estimation can be significantly enhanced by incorporating forecast data. A threshold estimated from forecast predictions would carry additional information, such as the day of the week and weather conditions
- Portability could be strengthened by allowing the system to consume an *RTSP* stream instead of only *Message Queues*
- Security can be enhanced by adding a waterproof casing, a camera with night vision mode, or even another camera looking at the same scene from a different angle
- The current strategy for counting objects is rather basic and uses *Euclidean Distance*. A *Kalman Filter* could be used to allow for more advanced object tracking
- In the raw image classification, a *Variational Autoencoder* could replace the vanilla version. It would help reduce overfitting and regularize the latent space so that it better supports the generative process
- New versions of Python libraries could improve performance and reduce resource consumption
- The overall hardware cost could be lowered significantly, provided that *on-device learning* alone can achieve accurate results and high performance
- A higher volume of collected data would open up the possibility of testing other forecasting models, which can use periodicity and seasonality components
- After AI is deployed into a household, it should be able to adapt over time to changes in the environment. This can be achieved by a careful selection of the training data, perhaps only the most recent subset
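
To make the object counting recommendation more concrete, the sketch below shows what a basic *Euclidean Distance* strategy can look like: each detection is matched to the nearest centroid from the previous frame, and unmatched detections are counted as new objects. This is a hypothetical greedy matcher written for illustration (the class name `CentroidCounter` and the `max_distance` value are assumptions), not the counting code used in this study.

```python
import math


class CentroidCounter:
    """Count distinct objects by matching centroids between consecutive frames."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # pixels; beyond this a detection is a new object
        self.tracked = []                 # centroids seen in the previous frame
        self.total = 0                    # distinct objects counted so far

    def update(self, centroids):
        unmatched = list(self.tracked)
        for centroid in centroids:
            nearest = min(unmatched,
                          key=lambda previous: math.dist(previous, centroid),
                          default=None)
            if nearest is not None and math.dist(nearest, centroid) <= self.max_distance:
                unmatched.remove(nearest)  # same object, slightly moved
            else:
                self.total += 1            # no close match: count a new object
        self.tracked = list(centroids)
        return self.total


# Usage: one object moves, a second appears, both leave, one returns.
counter = CentroidCounter(max_distance=50.0)
history = [
    counter.update([(100, 100)]),
    counter.update([(110, 105)]),
    counter.update([(120, 110), (400, 300)]),
    counter.update([]),
    counter.update([(120, 110)]),
]
```

The last step shows the weakness of this strategy: an object that disappears for a single frame is counted again when it returns. A *Kalman Filter* addresses this by matching detections against predicted positions and by keeping a track alive through short gaps.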

**Final remark**

AI is still underutilized in the Home Monitoring setting.

With the relatively low cost of modern hardware and the progress in Computer Vision and Machine Learning, the adoption of AI in this area should expand.

AI systems that are built around ethics, usability and explainability are on a good trajectory.

The conclusions from the experiments conducted in the previous chapters, together with the recommendations for future work listed above, should provide a good starting point for continuing this research.

The proposed system can play an important role in keeping households more secure. In addition, it can generate valuable insights about our surroundings.

[index](../Index.ipynb) | [prev](./06.AnomalyDetection.ipynb) | [next](./08.08.Acknowledgements.ipynb)