# 1. Introduction

[index](../Index.ipynb) | [next](./02.LiteratureReview.ipynb)

## 1.1. Background

Computers and vision have been linked since the 1960s.

In his 1963 Ph.D. thesis, Larry Roberts (Roberts 1963) notes that getting machines to understand *pictorial data* had already been a challenge for quite a while.

Since then, a vast amount of research has been done in the area, with some ups and downs. The most recent major breakthrough, currently seen as the beginning of the modern era, can be credited to the 2012 paper by Alex Krizhevsky and colleagues: *ImageNet Classification with Deep Convolutional Neural Networks* (Krizhevsky 2012).

From 2012 onwards progress has been rapid, with a vast body of research in the area of Object Detection and Recognition.

The methods developed by Computer Vision researchers around the globe are now easy to use through library packages, online articles and tutorials. To Larry Roberts in 1963, they would have looked like something from an alien civilization.

These ground-breaking "intelligent software" ideas have been accompanied by substantial innovation in hardware and by growing data availability.

With GPUs used to accelerate computations, continued growth in CPU power and a surge in highly affordable small form factor IoT devices (like the Raspberry Pi), the software and hardware disciplines, together with large volumes of image data, have converged into some incredible opportunities for solving interesting challenges.

## 1.2. This Research

My research uses modern Computer Vision, Machine Learning and both low- and high-resource computers to create a Camera Monitoring System capable of showing a live video stream with real-time object detection and recognition.

The questions I will try to answer are:
- How complex is it to build a smooth and reliable object detection pipeline using *Computer Vision*?
- Can future object counts be predicted using *Machine Learning*, given the collected object detection data?
- Does the object detection data contain anomalous signals that can be recognised with *Anomaly Detection* algorithms and used to alert users?

If the research goals are achieved, the final product should be generic enough to be applied in other households and to other use cases, such as predicting traffic, tourist congestion or animal behaviour, and finding unusual events or even security threats in the video stream.

From a data protection and security perspective, the aim is to keep the data local, which means that an internet connection should not be required and a data breach is much less likely.

Lastly, I would like this thesis to be distributed as an open source project, so that anyone curious can see how all these pieces are glued together, make their own improvements and build their own datasets.

## 1.3. Limitations

Like any software in the real world, the system I am proposing here has its limitations.

The camera used for video capture is very basic. The default Raspberry Pi camera ([PiCam](https://picamera.readthedocs.io/en/release-1.13/)) does not have Night Vision capability, which somewhat limits its usage as a security device. However, according to the FBI, and as reported by many home alarm companies in online sources (alarmnewengland 2020), most burglaries occur between 10AM and 3PM, when most adults are at work or school.

The next limitation is forecast accuracy. Due to the heavily stochastic nature of the world around us, the forecasts are not flawless. But the <span style="background-color: yellow;">main objective of this research is less about accuracy and metrics and more about usefulness.</span>

## 1.4. Guidelines for the reader

The whole dissertation has been written in Jupyter Notebooks. This means that code samples can be easily copied, and the project can be cloned and its code executed on another machine.

In general, the text material flows similarly to the data flow in the project, starting from an overall system design and hardware requirements, then moving to forecasting, and ending with anomaly detection.

I have tried to keep the code away from the main chapters, as it makes them difficult to follow.

I have extensively used two Jupyter extensions:
- [Diagram design](https://blog.jupyter.org/a-diagram-editor-for-jupyterlab-a254121ff919)
- [Table of contents](https://github.com/jupyterlab/jupyterlab-toc)

**Guidelines to be aware of:**
- Chapters contain links to the previous and next chapters at the top and bottom of each chapter
- <span style="background-color: yellow;">Key highlights</span> are marked with a yellow background
- Important concepts, areas or terminology will be written in *italic*
- Some chapters provide a reference to an in-depth study (called *Extras*) with well-documented code samples, additional commentary and plots
- Chapters are structured as a hierarchy with a maximum of two levels of depth (for example $6.$ -> $6.1.$ -> $6.1.1.$)
- There are often clickable [links](https://en.wikipedia.org/wiki/Artificial_intelligence) to create a better flow
- There is quite a lot of mathematical notation written in [LaTeX](https://en.wikibooks.org/wiki/LaTeX/Mathematics)
- Each reference to code or a function will be formatted like `this_function`
- Some paragraphs will be divided by a title in **bold font** to improve text spacing

The next chapter contains a **Literature Review**, where I study the theoretical framework related to this research.

[index](../Index.ipynb) | [next](./02.LiteratureReview.ipynb)