# 1 - Introduction

A.I, Artificial Intelligence, as we know it from Sci-Fi doesn’t exist yet. When the term A.I is used right now it’s more Algorithm Input. The Intelligence comes from the designer/developer. Even the leading A.I, in say self-driving cars, is still just data models and decision trees.

In this course we will take a look at how algorithms can be used without machine learning and then how machine learning can be used without algorithms. We will find the shortcomings of using these two in isolation and then the benefits of combining the two.

## 1.1 - Course outcomes

By the end of this course you should have a working knowledge of algorithms and machine learning and be able to combine them into a working A.I. We will work on some practical examples as we go, touching on the theory and terminology behind what is being worked on at the time.

We will take a look at a few examples of algorithms and A.I in the wild, as well as the services available for building an A.I and the fully built services that you can simply use in your application.

## 1.2 - Timetable

| **Day 1**                                                | **Day 2**                            |
| -------------------------------------------------------- | ------------------------------------ |
| Introduction                                             | Machine Learning: Step-by-step       |
| Jupyter Notebooks                                        |                                      |
| Algorithms                                               |                                      |
| Machine Learning: What is Machine Learning?              |                                      |
|                                                          |                                      |
| Exercise: Sentiment analysis in text                     | Exercise: Pulling text from images   |
| Discussion: Existing algorithms and databases at Media24 | Discussion: A.I use cases at Media24 |

## 1.3 - Exercises

### 1.3.1 - Sentiment analysis in text

This exercise will be done in JavaScript and will use a third-party code base to help introduce you to some of the terminology used in later sections.

We will be analysing sentences to try to identify whether they are positive or negative. We will use the third-party code base to save ourselves from developing something from scratch, and we will build on it to make it more useful in the real world.
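
To give a feel for the idea before the exercise, here is a minimal Python sketch of word-list sentiment scoring. It is not the third-party code base used in the exercise, and the word lists and the names `sentiment_score` and `classify` are made up for illustration.

```python
import re

# Hypothetical word lists, for illustration only. A real lexicon is far larger.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "awful"}


def sentiment_score(sentence: str) -> int:
    """Count positive words minus negative words in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = 0
    for word in words:
        if word in POSITIVE_WORDS:
            score += 1
        elif word in NEGATIVE_WORDS:
            score -= 1
    return score


def classify(sentence: str) -> str:
    """Turn the numeric score into a label."""
    score = sentiment_score(sentence)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("I love this great phone"))
print(classify("What a terrible, awful day"))
print(classify("The train leaves at noon"))
```

A scorer this simple rates "not good" as positive because it ignores negation, which is the kind of shortcoming we would need to build on to make it useful in the real world.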


### 1.3.2 - Pulling text from images

This exercise will be done in either JavaScript or Python, depending on your skillset, and will introduce you to A.I tools and frameworks.

We will be generating our own data using publicly available fonts to build and train a system that can read text from images.

## 1.4 - Your lecturers

### 1.4.1 - Carl Binneman

I have been working with object-oriented programming for about 10 years, education for 8 years and machine learning for the past 5 years.
I have mainly worked with interactive software on a variety of platforms, such as games and websites for mobile/desktop in 2D, 3D, AR & VR.

### 1.4.2 - Steven Eksteen

Steven has been in I.T for longer than he can remember. For the past few years he has been freelancing and consulting as a Solutions Architect and in Research and Development.

# 2 - Jupyter Notebooks

Jupyter is widely used in almost every field that works with data or mathematics. Its main benefit comes from integrating the runtime environment directly into an editor. This makes explaining complex ideas and examples easier, without the need to jump between platforms to generate charts or run example code.


## 2.1 - Using notebooks

# 3 - Algorithms

## 3.1 - What is an algorithm (in relation to Machine Learning)?

Algorithms are often thought of as complex mathematical equations, and sometimes they are. Most of them, though, are basically just decision trees.

Think of a self-driving car:

- If there is a car in front of me, slow down
- If there is no car in front of me and I’m below the speed limit, speed up
- If there is something approaching in front of me, slow down
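
The rules above can be sketched as a small decision function. This is only an illustration in Python: the function name, its parameters and the final "hold speed" fallback are assumptions added to make the example complete, not part of any real driving system.

```python
def choose_action(car_ahead: bool, object_approaching: bool,
                  speed_kmh: float, speed_limit_kmh: float) -> str:
    """Walk the decision tree from the list above, safety rules first."""
    # Rules 1 and 3: anything in front of us or approaching means slow down.
    if car_ahead or object_approaching:
        return "slow down"
    # Rule 2: the road is clear and we are below the limit.
    if speed_kmh < speed_limit_kmh:
        return "speed up"
    # Assumed fallback (not in the original list): nothing to react to.
    return "hold speed"


print(choose_action(car_ahead=True, object_approaching=False,
                    speed_kmh=50, speed_limit_kmh=60))
print(choose_action(car_ahead=False, object_approaching=False,
                    speed_kmh=50, speed_limit_kmh=60))
```

The order of the checks matters: putting the "slow down" rules first means a clear-road rule can never override a safety rule.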

The complex maths comes in where there are situations that need to be weighted, like [The Trolley Problem](https://en.wikipedia.org/wiki/Trolley_problem), as an extreme example.

We will be looking at something simpler. Don’t let that statement ruin your ideas of making a self-driving car or facial recognition for the NSA after this course though. We will cover how to identify objects in images in the Machine Learning section. Self-driving cars would just require video analysis, which is basically just image analysis done very quickly.

## 3.2 - Stand-alone Algorithms

## 3.3 - Practical use cases

### 3.3.1 - Sentiment analysis in text



# 4 - Machine Learning

## 4.1 - What is Machine Learning?

The statement made earlier, that Algorithms + Machine Learning = A.I, is not entirely true, but it still holds as a working model. Machine Learning has built-in algorithms of its own. The algorithms we covered earlier are the ones that process the data received back from the Machine Learning part of our A.I and combine it with other data to return sane results.

The algorithms built into Machine Learning systems are there to help process the raw data going into the system. There are a great many of these, and they can make a Machine Learning system very specific in its use. One example is OCR (Optical Character Recognition). We will be using it in the exercise for this section and, as you’ll see, even though it is processing images, it processes them in a very specific way in order to identify text accurately.

Besides OCR, you’ll find many other terms floating around that can make Machine Learning seem far more complicated than it is. Deep Learning and Neural Networks are the two that will pop up most often. Deep Learning in particular can be confusing because the term is often used interchangeably with Machine Learning. Technically, Deep Learning is a subset of Machine Learning, one that is built on neural networks with many layers. For the purposes of this course, think of Machine Learning as relying on a fixed dataset to make decisions: those decisions will always be made against that dataset, and when an incorrect decision is made, manual intervention is required to retrain the system on the corrected result.

Deep Learning also uses an initial dataset to start making decisions, but it is typically set up to keep improving itself as new data comes in. Of the two, Deep Learning is the more “human-like”, which is why it is used when making A.Is that should have human attributes or perform human-like tasks. Siri, Alexa and Ok Google are all examples of Deep Learning: they learn to understand your voice and accent over time.

https://playground.tensorflow.org

https://github.com/mapbox/pixelmatch

### 4.1.1 - Machine Learning vs Databases

With all the talk so far about storing data, classifying data, etc, it might be easy to think “This sounds like something a database can do”. There are many instances where that would be correct.

#### 4.1.1.1 - Recommendation lists

Many e-commerce sites like Takealot and Amazon, and media sites like YouTube and Netflix, use recommendations to direct users to new items and content. It may seem as though the system is learning about you, but this is generally just algorithms working mostly on analytics data.

Elasticsearch (from Elastic) is a popular search engine that lets product data, for example, be added, classified, weighted and then searched. Each search a user makes can be logged, and products can then be suggested based on those searches.

Extending with more data and algorithms:

By linking the data from searches with data from analytics, the recommendations can be made more accurate by including products in the same category as items the user has previously bought or added to a wishlist.
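
As a rough Python sketch of this idea, the function below combines logged searches with purchase and wishlist data to rank products. The catalogue, the weights (1 per search, 2 per wishlist item, 3 per purchase) and the name `recommend` are all assumptions chosen for illustration; this is plain algorithm code with no Machine Learning involved.

```python
from collections import Counter

# Hypothetical product catalogue: product name -> category.
CATALOGUE = {
    "kettle": "kitchen",
    "toaster": "kitchen",
    "blender": "kitchen",
    "tent": "outdoor",
    "sleeping bag": "outdoor",
    "novel": "books",
}


def recommend(searched_categories, purchased, wishlist, limit=3):
    """Rank products the user does not own by interest in their category."""
    weights = Counter(searched_categories)  # 1 point per logged search
    for product in wishlist:
        weights[CATALOGUE[product]] += 2    # assumed weight for wishlist items
    for product in purchased:
        weights[CATALOGUE[product]] += 3    # assumed weight for purchases
    owned = set(purchased)
    candidates = [
        product for product, category in CATALOGUE.items()
        if product not in owned and weights[category] > 0
    ]
    # Highest category weight first, then alphabetical for a stable order.
    candidates.sort(key=lambda product: (-weights[CATALOGUE[product]], product))
    return candidates[:limit]


print(recommend(searched_categories=["outdoor"], purchased=["kettle"], wishlist=[]))
```

The weights are where a designer's judgement comes in: changing them changes what "relevant" means without any learning taking place.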

Extending with Machine Learning:

Most e-commerce platforms probably start showing adverts manually at certain times of the year, such as Christmas and Black Friday. By using sales data with Machine Learning, buying trends could be presented to advertisers to show the best times to advertise certain products or brands.

#### 4.1.1.2 - Trending lists

Social media sites like Twitter and Facebook, and news sites like News24, use trending lists to show users popular posts and articles. In their most basic form these are examples of [Map-Reduce](https://en.wikipedia.org/wiki/MapReduce) functions in databases. The most common uses of these functions are game leaderboards and analytics, where metrics are aggregated for charts.

In a trending list, the views of each post or article are reduced to a total, and the items with the highest totals over the last 24 hours are shown as the top trending.
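
A minimal in-memory Python sketch of that map-reduce step is shown below. A real system would run this inside the database or analytics platform; the event format (article id plus view time) and the name `top_trending` are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_trending(view_events, now, window=timedelta(hours=24), limit=3):
    """Return (article_id, views) pairs for the most viewed articles in the window."""
    cutoff = now - window
    # Map: emit (article_id, 1) for every view inside the time window.
    mapped = (
        (article_id, 1)
        for article_id, viewed_at in view_events
        if viewed_at >= cutoff
    )
    # Reduce: sum the emitted counts per article.
    totals = Counter()
    for article_id, count in mapped:
        totals[article_id] += count
    return totals.most_common(limit)


now = datetime(2024, 1, 2, 12, 0)
events = [
    ("article-a", now - timedelta(hours=1)),
    ("article-a", now - timedelta(hours=2)),
    ("article-b", now - timedelta(hours=3)),
    ("article-c", now - timedelta(hours=30)),  # outside the 24 hour window
]
print(top_trending(events, now))
```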

Extending with more data and algorithms:

By using more analytics data, the map-reduce functions could include user data and create user-based trending lists.

Extending with Machine Learning:

Machine Learning has been used extensively for time-series predictions, in everything from stocks to server monitoring. Analytics data, even though it carries a lot of tagged context, is time-series data. Using Machine Learning and NLP (Natural Language Processing), which we spoke about earlier, we can identify the subjects, places, etc. of posts and articles that have previously trended and use them to predict which articles might trend next.
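
To make "time-series prediction" concrete, here is about the simplest learned model there is: a straight line fitted to past view counts with ordinary least squares and extended one step ahead. It is a Python sketch only, the function names are made up, and real trend prediction would use far richer models and features such as the NLP tags mentioned above.

```python
def fit_line(values):
    """Fit y = slope * x + intercept to values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(values):
    """Predict the value at the next time step from the fitted line."""
    slope, intercept = fit_line(values)
    return slope * len(values) + intercept


hourly_views = [10, 20, 30, 40]  # hypothetical view counts per hour
print(predict_next(hourly_views))
```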

### 4.1.2 - Discussion: Machine Learning vs Databases

Other examples of where Machine Learning is used to build upon features of databases.

## 4.2 - Creating a Machine Learning framework

As with almost every choice of technology in engineering, choosing the correct type of Machine Learning is essential. We will see in the OCR exercises how a short-term choice made to achieve an immediate goal can hinder the overall project.

https://blog.statsbot.co/neural-networks-for-beginners-d99f2235efca

https://ivrlwww.epfl.ch/research/topics/text_reading.html

[https://www.analyticsvidhya.com/blog/2020/04/build-your-own-object-detection-model-using-tensorflow-api/](https://www.analyticsvidhya.com/blog/2020/04/build-your-own-object-detection-model-using-tensorflow-api/?utm_source=blog&utm_medium=build-your-own-ocr-google-tesseract-opencv)

https://towardsdatascience.com/a-gentle-introduction-to-ocr-ee1469a201aa

https://matthewearl.github.io/2016/05/06/cnn-anpr/

### 4.2.1 - Data

Topics covered in Carl’s “Data” section

https://github.com/tkrkt/text2png

https://github.com/ankush-me/SynthText

https://github.com/matthewearl/deep-anpr

https://github.com/udacity/self-driving-car/tree/master/annotations

### 4.2.2 - Training

Topics covered in Carl’s “Networks + Training” section

https://supervise.ly/product

### 4.2.3 - Verification



## 4.3 - Practical use cases




### 4.3.1 - Google’s Captcha




### 4.3.2 - Pulling text from images

https://github.com/tkrkt/text2png

https://github.com/ankush-me/SynthText





https://github.com/matthewearl/deep-anpr


## 4.4 - Capabilities and limits

Topics covered in Carl’s “Knowing what ML can do” section

## 4.5 - Machine learning and Big Data

When we spoke about Machine Learning vs Databases (4.1.1) we covered two examples of where Machine Learning could be used to extend databases. If you search around the internet, you’ll find this is the topic most tutorials cover. These are generally predictive models, and the one thing you need for predictions is a lot of data.

Big Data is rarely just a large collection of the same type of data. The strength of Big Data platforms lies in their ability to process data as it comes in, rather than processing large chunks of data at scheduled intervals.
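
The difference between processing data as it comes in and processing it in scheduled chunks can be sketched in a few lines of Python. The class below is a hypothetical illustration, not the API of any Big Data platform: it keeps a running average that is updated per event, so the stored history never has to be re-read.

```python
class RunningAverage:
    """Keeps an average that is updated one event at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        """Fold a single new value into the average and return the new average."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


page_load_times = [10, 20, 30]  # hypothetical incoming measurements

# Streaming: the result is current after every event.
stream = RunningAverage()
for value in page_load_times:
    stream.update(value)

# Batch: the same answer, but only once the whole chunk has been collected.
batch_mean = sum(page_load_times) / len(page_load_times)

print(stream.mean, batch_mean)
```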

# 5 - Linking it all together

**Feature engineering**

# 6 - Discussion: How you can use A.I

https://quickdraw.withgoogle.com/#