# Artificial intelligence and machine learning 


**_Artificial intelligence (AI) is a technology with the ability to reason and make decisions given a set of information._** AI in pop culture is often depicted with cyborgs bringing on the singularity (the point when they are smarter than people). This is a sci-fi version of _artificial general intelligence_ (AGI), the idea of a technology that can understand and learn the whole range of tasks that a human can. AGI is not yet technically feasible. It is also a highly anthropomorphic idea: the notion that the pinnacle of knowledge is reasoning as we currently do.

I think of artificial intelligence as a complementary, not competitive, technology to human intelligence. Some AI technologies, like image recognition, are catching up to human levels of performance. In many areas dealing with high-dimensional data, AI is already better than humans, since we have trouble thinking in higher dimensions. The types of things that we consider artificial intelligence also tend to shift with technical accomplishments. We take for granted that computers can play chess better than us, spell check our documents, plan our travel routes, and find one document out of millions of petabytes on the internet in a few seconds. Once your car starts driving you around and your phone translates language into your ear in real time, we'll probably just think of those as normal things a car or phone does, and move our definitions of AI to the future again.

In contrast to artificial general intelligence, applied AI, also called weak AI, is what we currently practice. This conforms to the definition of AI above: "a technology with the ability to reason and make decisions given a set of information". Notice that learning is not necessarily required for this. AI can be further broken down into subdisciplines: 1) logic programming, 2) machine learning based AI, and 3) operations research based optimization and constraint satisfaction.



![AI-ML_overview.png](attachment:AI-ML_overview.png)

# Machine learning

It's easy to see why AI and ML are often conflated, but ML is the subset of AI that learns from data to make decisions.  With the amount of data collected increasing, the _learning_ component of machine learning has made it much more valuable. One way we can think about the field of machine learning is by the types of problems that it can solve. They fall into 4 main areas:

## Classification

>*The assignment of samples to a category.*  The simplest form of this is binary classification, e.g. a plant is classified as diseased or healthy, based on an image of the plant. A sample can also be assigned to one of many categories (multi-class classification) e.g.  healthy, nutrient deficient, or insect damaged.
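
As a minimal sketch of the difference between binary and multi-class classification, the snippet below turns model scores into a category. The function names, the 0.5 threshold, and the score values are assumptions made up for illustration; in practice the scores would come from a trained image model.

```python
# Hypothetical model outputs: the numbers below are made up for illustration
# and do not come from a real plant-disease model.

def classify_binary(p_diseased, threshold=0.5):
    """Binary classification: choose between two categories from one score."""
    return "diseased" if p_diseased >= threshold else "healthy"


def classify_multiclass(scores):
    """Multi-class classification: choose the category with the highest score."""
    return max(scores, key=scores.get)


binary_result = classify_binary(0.8)
multiclass_result = classify_multiclass(
    {"healthy": 0.1, "nutrient deficient": 0.7, "insect damaged": 0.2}
)

print(binary_result)      # diseased
print(multiclass_result)  # nutrient deficient
```

In both cases the sample ends up in exactly one category; the only difference is how many categories there are to choose from.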


## Regression

>*The prediction of a continuous value.*  The simplest form of this is ordinary least squares (OLS) regression on a single variable. Multivariate regression and generalized linear models (GLMs) are also examples of regression.

>_So how does the regression done in machine learning differ from the regression we have always done?_

>Machine learning often attempts to fit models with many more dimensions or independent variables than classical OLS regression. Using  all the variables naively would result in over-fitting. Instead, we use _**regularization**_ to reduce the weight of most variables. 
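
To make the effect of regularization concrete, here is a small sketch using ridge (L2) regularization, which is one common form and is chosen here as an assumption. It fits a single slope with no intercept, so the penalized solution can be written in closed form. The function name `fit_slope`, the toy data, and the penalty value are all illustrative.

```python
def fit_slope(x, y, penalty=0.0):
    """Slope w of y ~ w * x (no intercept).

    Minimizes sum((y_i - w * x_i)^2) + penalty * w^2.
    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)


# Toy data that lies exactly on y = 2x (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

w_ols = fit_slope(x, y)                  # penalty = 0 is ordinary least squares
w_ridge = fit_slope(x, y, penalty=10.0)  # the penalty shrinks the weight toward 0

print(w_ols)    # 2.0
print(w_ridge)  # 1.5
```

With many variables the same penalty is applied to every weight, so variables that contribute little to the fit end up with small weights instead of being fit to noise.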


## Clustering

>Clustering attempts to group data together by the similarity of their features. Some clustering methods require users to specify the number of clusters; others attempt to select the number by optimizing some criterion. 
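
As a rough sketch of a method where the user chooses the number of clusters, the code below runs a simplified one-dimensional k-means. The function name `kmeans_1d`, the deterministic starting centroids, and the fixed iteration count are assumptions for illustration. The values are the petal lengths of the six labeled iris examples shown in the terminology section of this notebook.

```python
def kmeans_1d(values, k, n_iter=20):
    """Very small 1-D k-means: returns (centroids, clusters)."""
    # Deterministic start: spread the initial centroids across the sorted data.
    ordered = sorted(values)
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    return centroids, clusters


petal_lengths = [1.4, 1.4, 4.7, 4.5, 5.4, 5.1]  # cm
centroids, clusters = kmeans_1d(petal_lengths, k=2)
print(clusters)  # [[1.4, 1.4], [4.7, 4.5, 5.4, 5.1]]
```

No species labels are used here: the algorithm only sees the measurements, and choosing `k=3` instead would give a different but equally legitimate grouping.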


## Dimensionality reduction

>Dimensionality reduction projects data with many variables or dimensions onto a lower dimensional space. Classical examples include principal components analysis (PCA) and other ordination techniques like PCoA, NMDS, CCA, and DDA. More information on classical ordination techniques can be found [here](http://ordination.okstate.edu/overview.htm).

>The machine learning community has developed new methods for dimensionality reduction, including [t-Distributed Stochastic Neighbor Embedding (t-SNE)](https://distill.pub/2016/misread-tsne/) and [autoencoders](https://www.jeremyjordan.me/autoencoders/).
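
The idea of projecting onto a lower dimensional space can be sketched with PCA on just two variables, where the first principal axis has a closed-form angle. The function name `pca_2d_to_1d` is an assumption for illustration, and the points are the petal length and petal width of the six labeled iris examples shown in the terminology section of this notebook. Real analyses would normally use a library implementation on many more variables.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample variances and covariance of the two variables.
    sxx = sum(dx * dx for dx, _ in centered) / (n - 1)
    syy = sum(dy * dy for _, dy in centered) / (n - 1)
    sxy = sum(dx * dy for dx, dy in centered) / (n - 1)

    # Direction of maximum variance for a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))

    # One score per point replaces the original two coordinates.
    scores = [dx * axis[0] + dy * axis[1] for dx, dy in centered]
    explained = (sum(s * s for s in scores) / (n - 1)) / (sxx + syy)
    return axis, scores, explained


petals = [(1.4, 0.2), (1.4, 0.2), (4.7, 1.4), (4.5, 1.5), (5.4, 2.3), (5.1, 1.8)]
axis, scores, explained = pca_2d_to_1d(petals)
print(f"fraction of variance kept in one dimension: {explained:.3f}")
```

Because petal length and width are strongly correlated in these six rows, a single score per flower retains most of the variance of the two original columns.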


![scikit_ml_map.png](attachment:scikit_ml_map.png "From Scikit learn")

# Supervised vs. Unsupervised

Classification and regression are supervised learning tasks. Dimensionality reduction and clustering are unsupervised.

Supervised | Unsupervised
----------|--------------
Classification | Clustering
Regression | Dimensionality reduction


Supervised problems have **_labeled_** data or **_response variables_** that can be used to train a model.  That model is then used to predict the label or value of a response variable on new data.


Supervised learning problems have answers.  Unsupervised learning problems don't.  There are often several equally valid ways of grouping objects together: maybe you group by color, or by shape, and neither way is objectively better.  Similarly, dimensionality reduction tries to maximize the amount of information that can be stored in lower dimensions, but there are different metrics of information content.

**_This course focuses on supervised learning_.**


# Key  machine learning terminology

This content is modified from the [Google machine learning crash course](https://developers.google.com/machine-learning/crash-course/framing/).

As an example, let's use the classic [Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set), which consists of 50 samples from each of three species of Iris (_Iris setosa_, _Iris virginica_ and _Iris versicolor_). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.

[Iris_dataset.csv](/assets/nb-datasets/iris-dataset.csv)

![iris-machinelearning.png](attachment:iris-machinelearning.png "From Machine Learning in R for beginners, on datacamp")

## Labels
> A label is the thing we're predicting—the `y` variable in simple linear regression. The label could be the species of a plant, the kind of animal shown in a picture, the meaning of an audio clip, or just about anything.

## Features
> A feature is an input variable—the `x` variable in simple linear regression. A simple machine learning project might use a single feature, while a more sophisticated machine learning project could use millions of features, specified as: $$x_1, x_2, \cdots, x_N$$

In the iris example, the features to learn from are:

1. sepal length
2. sepal width
3. petal length
4. petal width


## Examples

An example is a particular instance of data, **x**. (We put **x** in boldface to indicate that it is a vector.) We break examples into two categories:

* labeled examples
* unlabeled examples

A **labeled example** includes both feature(s) and the label. That is:

```
labeled examples: {features, label}: (x, y)
```
Use labeled examples to train the model. In our iris example, the labeled examples would be the size measurements we have collected from irises for which the species is definitively known.

For example, the following table shows 6 labeled examples from the iris dataset:

|sepal_length (feature) |sepal_width (feature) |petal_length (feature) |petal_width (feature) |species (label)  |
|------------|-----------|------------|-----------|----------|
|5.1         |3.5        |1.4         |0.2        |setosa    |
|4.9         |3.0        |1.4         |0.2        |setosa    |
|7.0         |3.2        |4.7         |1.4        |versicolor|
|6.4         |3.2        |4.5         |1.5        |versicolor|
|6.2         |3.4        |5.4         |2.3        |virginica |
|5.9         |3.0        |5.1         |1.8        |virginica |


An **unlabeled example** contains features but not the label. That is:

```
unlabeled examples: {features, ?}: (x, ?)
```

Here are 3 unlabeled examples from the same iris dataset, which exclude species:

|sepal_length (feature) |sepal_width (feature) |petal_length (feature) |petal_width (feature) |
|------------|-----------|------------|-----------|
|4.6         |3.4        |1.4         |0.3        |
|6.0         |2.9        |4.5         |1.5        |
|6.7         |3.1        |5.6         |2.4        |
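
One possible way to hold these examples in code is sketched below. The `Example` type and the variable names are assumptions for illustration; the values are copied from the two tables above. A labeled example carries both `features` and `label`, while an unlabeled example leaves `label` as `None`.

```python
from typing import NamedTuple, Optional


class Example(NamedTuple):
    # (sepal_length, sepal_width, petal_length, petal_width), all in cm
    features: tuple
    # Species name for labeled examples, None for unlabeled ones
    label: Optional[str] = None


labeled = [
    Example((5.1, 3.5, 1.4, 0.2), "setosa"),
    Example((4.9, 3.0, 1.4, 0.2), "setosa"),
    Example((7.0, 3.2, 4.7, 1.4), "versicolor"),
    Example((6.4, 3.2, 4.5, 1.5), "versicolor"),
    Example((6.2, 3.4, 5.4, 2.3), "virginica"),
    Example((5.9, 3.0, 5.1, 1.8), "virginica"),
]

unlabeled = [
    Example((4.6, 3.4, 1.4, 0.3)),
    Example((6.0, 2.9, 4.5, 1.5)),
    Example((6.7, 3.1, 5.6, 2.4)),
]

# Split labeled examples into the feature vectors (x) and the labels (y).
X_train = [ex.features for ex in labeled]
y_train = [ex.label for ex in labeled]
```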


Once we've trained our model with labeled examples, we use that model to predict the label on unlabeled examples. In the iris classifier, unlabeled examples are new irises that a botanist has not yet labeled.

## Models

A model defines the relationship between features and label. For example, an iris classification model might associate certain features like petal length strongly with a particular species. Let's highlight two phases of a model's life:

* **Training** means creating or **_learning_** the model. That is, you show the model labeled examples and enable the model to gradually learn the relationships between features and label.

* **Prediction**  (sometimes called inference) means applying the trained model to unlabeled examples. That is, you use the trained model to make useful predictions (`y'`). For example, during inference, you can predict species for new unlabeled examples.
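
The two phases can be sketched with a deliberately simple model. The code below uses a nearest-centroid classifier, which is a technique chosen here for brevity and not one prescribed by the crash course: training computes the mean feature vector of each species, and prediction assigns a new example to the species with the closest mean. The function names `train` and `predict` are illustrative, and the data are the labeled and unlabeled rows from the tables above.

```python
import math
from collections import defaultdict


def train(examples):
    """Training: learn one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Prediction: return the label whose centroid is nearest (Euclidean)."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Labeled examples: (features, label), with features in cm.
labeled = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.2, 3.4, 5.4, 2.3), "virginica"),
    ((5.9, 3.0, 5.1, 1.8), "virginica"),
]

# Unlabeled examples: features only.
unlabeled = [
    (4.6, 3.4, 1.4, 0.3),
    (6.0, 2.9, 4.5, 1.5),
    (6.7, 3.1, 5.6, 2.4),
]

model = train(labeled)
predictions = [predict(model, x) for x in unlabeled]
print(predictions)  # ['setosa', 'versicolor', 'virginica']
```

Six training rows are far too few for a real model; the point is only that `train` sees labels and `predict` does not.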
