# 1. Introduction

[index](../Index.ipynb) | [next](./02.LiteratureReview.ipynb)

### Background

Computers and vision have been linked since the 1960s.

In his 1963 Ph.D. thesis, Larry Roberts (Roberts 1963) notes that machine understanding of "pictorial data" had already been a challenge for quite a while.

Research in the area has since grown enormously, with some ups and downs, but the major breakthrough, now widely seen as the beginning of the modern era, is associated with the 2012 paper by Alex Krizhevsky and colleagues: ImageNet Classification with Deep Convolutional Neural Networks (Krizhevsky 2012).

From 2012 onwards, progress has been rapid, with a vast body of research in Object Detection and Recognition.

The methods developed in recent years by Computer Vision researchers around the globe are now easy to use through library packages, online articles and tutorials.

These ground-breaking "software" ideas have been accompanied by substantial innovation in hardware and by growing data availability.

GPUs now accelerate computation, CPU power has kept growing, and highly affordable small-form-factor IoT devices (like the Raspberry Pi) have become widespread. Together with large volumes of image data, these advances in software and hardware have converged into some incredible opportunities.

### This Research

My research uses modern Computer Vision, Machine Learning, and both low- and high-resource computers to create a Home Monitoring System capable of showing a live video stream with real-time object detection and recognition.

Even though the spotlight will be on Machine Learning and its application in Camera Monitoring, I quickly understood how important it is to see the big picture. How should the camera output be shown to the user and to the developer? How should the hardware be chosen, the network topology organised and the house wired? How and where should the camera be mounted, and how should downtime be handled? Each of these questions taught me a lesson, which I will also briefly share in this dissertation.

The questions I will try to answer are: Can we use object detections to forecast how many objects of each class (Person, Car, Dog, Cat, etc.) will show up in a given time interval? And can an alerting system be developed, based on anomaly detection over the recognised objects and the image data collected in the process?

If the research is successful, it means that anyone can afford such a system and learn about their surroundings. Such a system could also be installed anywhere to predict traffic, tourist congestion or animal behaviour, and to detect anomalous events in the surrounding area.

A key element of the system is that it is entirely open source, which allows for extension and modification without having to break into proprietary software, as is the case with the majority of smart camera systems.

A great benefit of such a system is that it does not need to stream data outside of the location. An internet connection is therefore not required, and the data never leaves the premises, which greatly reduces the risk of a data breach, a big worry these days.

### Limitations

Like any system, the one I am proposing here has its limitations. These are mostly associated with the type of camera used for video capture. The default Raspberry Pi camera has no night-vision capability, which somewhat limits its use as a security device. Not everything is lost, however. Contrary to the common belief that crimes happen at night, FBI figures, as reported online by many home alarm companies (alarmnewengland 2020), indicate that most burglaries occur between 10AM and 3PM, when most adults are at work or school.

The next limitation is forecast accuracy. Due to the stochastic nature of the world around us, we will never be able to perfectly predict the number of cars expected in a given hour. But the question is: can we do better than just using averages by implementing a Machine Learning based solution?

### Conclusion

This dissertation will explore all the areas above, and more, in great detail. I hope it will convince the reader that such a system already has great capabilities and, at the same time, great potential for improvement, for new creative ideas, and for new ways of using the data to inform us better about the world around us.


<a id="intro_1.A"></a>
## Motivation

Home Monitoring systems are used to keep our households safe. Some so-called **"Smart"** solutions provide modern features such as motion sensing, alerts, and even object detection and recognition.

After several interviews with owners of such systems, I learned that they were dissatisfied with the number of False Positive alerts raised when motion sensors are triggered for no apparent reason. They would also like their system to provide real-time object detection and recognition while keeping costs low.

My idea is to explore this area in great detail and propose a new, Machine Learning driven system.

It will collect object detections captured from the camera and combine them with additional data sources (like weather data) to generate a forecast of the objects expected to be seen in a given time interval.

The system will then use Machine Learning to find irregularities in the data and classify observations as anomalies if they do not follow the normal trend.

## Thesis overview

This project required engineering five discrete components, which are briefly outlined below and fully explored in the following sections of this document:

### Hardware:
- Raspberry Pi 3 with Pi Camera module
- PC1 and PC2 (with NVIDIA GTX 1080 Ti GPUs)
- Router and switches
- Real-life challenges in setting up this kind of system
- Not in scope, but mentioned as next steps/potential improvements:
    - Upgrade R-Pi 3 to R-Pi 4 for improved performance
    - Experiment with on-device Machine Learning to bring the cost down and eliminate the need for an external PC


### Application Back-end / Object Detection Pipeline:
- **Framing the problem in ML context: How can we detect interesting objects in a real-time video stream?**
- Input: frames coming from R-Pi
- Image preprocessing (resizing)
- Computer Vision: motion detection (background subtraction and finding contours)
- YOLOv2 object detection
- Output: images stored on SSD and frames streamed through a socket server
- Not in scope, but mentioned as a next step:
    - An ethical approach would be to blur people in the images
    - Upgrade YOLO to the newest version (v3)
    - Break script apart into separate modules for easier maintenance
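
The motion-detection step listed above (background subtraction followed by finding contours) can be illustrated with a small NumPy sketch. This is not the pipeline's actual implementation: a running-average background model and a single bounding box around all changed pixels stand in for a proper background subtractor and contour finder. The function names, the threshold, the minimum pixel count and the blending factor `alpha` are all assumptions chosen for illustration.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model."""
    return (1.0 - alpha) * background + alpha * frame


def detect_motion(background, frame, threshold=25.0, min_pixels=20):
    """Return the bounding box (x, y, w, h) of changed pixels, or None.

    A single box around all changed pixels is a crude stand-in for
    contour finding, which would return one region per moving object.
    """
    diff = np.abs(frame.astype(np.float64) - background)
    mask = diff > threshold  # foreground = pixels that differ from background
    if mask.sum() < min_pixels:
        return None  # too few changed pixels: treat as noise
    ys, xs = np.nonzero(mask)
    x, y = int(xs.min()), int(ys.min())
    w, h = int(xs.max()) - x + 1, int(ys.max()) - y + 1
    return x, y, w, h


# Synthetic greyscale frames: a static scene, then a bright 10x10 "object".
background = np.full((120, 160), 50.0)
static_frame = np.full((120, 160), 50, dtype=np.uint8)
moving_frame = static_frame.copy()
moving_frame[40:50, 60:70] = 200

no_motion = detect_motion(background, static_frame)   # nothing changed
motion_box = detect_motion(background, moving_frame)  # box around the object
background = update_background(background, static_frame)
```

One plausible use, assumed here rather than taken from the outline, is to pass a frame to the object detector only when a box is returned, so that the more expensive detection step is skipped for static scenes.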
    

### Forecasting:
- **Framing the problem in ML context: How can we forecast the count of objects to expect during the day? The role of the forecast is to explain why an observation was called an anomaly and why an alert was triggered.**
- Input: images stored during Object Detection with outliers removed, Dark Sky weather data
- Data preprocessing: how do we actually calculate object counts? What constitutes a legitimate object count?
- Models to be tested:
    - Linear Regressor
    - Random Forest Regressor
    - Neural Networks with probabilistic loss functions:
        - Poisson
        - Negative Binomial
        - Zero-inflated Negative Binomial
- Best model selected and reasons
- Output: daily forecast (separate forecast for each object, generated during the night)
- Not in scope, but mentioning as next step/potential improvements:
    - Scan the local network for devices on Wi-Fi and add an owner-at-home binary feature to the dataset to potentially improve forecast accuracy (analysis required here)
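
To make the idea of a probabilistic loss for count data more concrete, the sketch below computes the Poisson negative log-likelihood, $\lambda - y \log \lambda + \log(y!)$, for a predicted rate $\lambda$ and an observed count $y$. It then compares a constant "average" forecast against a hypothetical model output. The hourly counts and the model rates are made up for illustration and are not results from this project; the function names are likewise assumptions.

```python
import math


def poisson_nll(y_true, rate):
    """Negative log-likelihood of count y_true under Poisson(rate)."""
    return rate - y_true * math.log(rate) + math.lgamma(y_true + 1)


def mean_poisson_nll(counts, rates):
    """Average Poisson NLL over paired observations and predicted rates."""
    return sum(poisson_nll(y, r) for y, r in zip(counts, rates)) / len(counts)


# Made-up hourly car counts (illustrative only, not data from this project).
counts = [0, 1, 0, 6, 8, 7, 1, 0]

# Baseline: predict the overall average for every hour.
overall_mean = sum(counts) / len(counts)
baseline_rates = [overall_mean] * len(counts)

# Hypothetical model output: higher rates in the busy hours.
model_rates = [0.5, 0.5, 0.5, 7.0, 7.0, 7.0, 0.5, 0.5]

baseline_loss = mean_poisson_nll(counts, baseline_rates)
model_loss = mean_poisson_nll(counts, model_rates)
print(f"baseline NLL: {baseline_loss:.3f}, model NLL: {model_loss:.3f}")
```

A lower value means the predicted rates explain the observed counts better, so the same measure gives one way to check whether a learned forecast beats simple averages.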


### Anomaly Detection:
- **Framing the problem in ML context: How can we detect that the number of objects is outside of the reasonable norm?**
- Input: images stored during Object Detection with outliers kept, Dark Sky weather data
- Data preprocessing: how do we actually calculate object counts? What constitutes a legitimate object count?
- Models to be tested:
    - Z-score
    - IQR
    - Auto-Encoder Neural Network
    - Variational Auto-Encoder Neural Network
- Best method selected and reason
- Output: anomaly detection model capable of predicting in real time whether an observation is an anomaly
- Not in scope, but mentioning as next step/potential improvements:
    - Scan local network for devices on wi-fi and add owner-at-home binary feature to the dataset to prevent alerts (unless instructed otherwise)
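
The two statistical methods listed above, Z-score and IQR, can be sketched with the Python standard library alone. The thresholds (3 standard deviations, 1.5 times the interquartile range) are common conventions rather than values taken from this project, and the hourly counts are made up for illustration.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical: nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up hourly object counts with one unusually busy hour.
hourly_counts = [2, 3, 2, 4, 3, 2, 3, 4, 3, 2, 3, 25]

z_flagged = zscore_outliers(hourly_counts)
iqr_flagged = iqr_outliers(hourly_counts)
```

Both functions flag the busy hour in this toy series. The Z-score is sensitive to the very outliers it is looking for, because they inflate the mean and standard deviation, whereas the IQR bounds depend only on the middle half of the data.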

### Application UI:
- Inputs:
    - Real time frames and detections
    - Object count forecast for each object type
- Outputs:
    - Allow users to see the camera output with detected objects in bounding boxes
    - Allow users to see the current forecast
- Not in scope, but mentioning as next step/potential improvements:
    - Allow user to adjust settings about alerts
    
Each of these components has plenty of potential for improvement as new research, technologies and capabilities become available in the open source Machine Learning, Big Data processing, and UX worlds.

<a id="test"></a>

## Note to the Dear Reader

Before going any deeper, I would like to stress to the Reader that this work has gone through a number of iterations:
- At first it was just a toy project using a camera with a Raspberry Pi for real-time object detection in video streams
- Then a breakthrough idea emerged: to collect these object detections and use them to forecast the objects expected to show up in a given time interval
- And finally, to make the system really useful, a concept for alerts triggered by anomaly detection models completed the scope

The lesson here is that diving into one area often leads to brand new ideas. I also found it priceless to discuss ideas with other people and to take their feedback on board with an open mind.

## TODO: State of the art

## Literature Review
- ...

## Arguments
- ...

## Conclusion
- ...

## Useful links:

### Forecasting Count Data:

This notebook is ideal for predicting count data using Neural Networks, and it links to some interesting theory as well. A great resource, and potentially the best found so far:
- https://github.com/gokceneraslan/neuralnet_countmodels/blob/master/Count%20models%20with%20neuralnets.ipynb. This is a NN with models:
    - Poisson loss
    - Negative binomial
    - Zero-inflated Negative binomial models

Another interesting post about using the Negative Binomial distribution as a loss function in a NN:
- https://stackoverflow.com/questions/55782674/how-should-i-write-the-loss-function-with-keras-and-tensorflow-for-negative-bi

### Anomaly detection:

Z-score and IQR (tried and tested, works well):
- https://towardsdatascience.com/ways-to-detect-and-remove-the-outliers-404d16608dba

Autoencoders:
- https://towardsdatascience.com/a-keras-based-autoencoder-for-anomaly-detection-in-sequences-75337eaed0e5

## Books owned:
- Bayes' Rule
- The Signal and the Noise
- A Student's guide to Bayesian Statistics
- Learning From Data
- Make Your Own Neural Network
- Deep Learning for Computer Vision with Python (Vol 1,2,3)

## Papers to cover:
- Yolo v2
- Background subtraction
- Auto-Encoders
- Find something for anomaly detection through Computer Vision