<a href="https://colab.research.google.com/github/ainnoun/UTS_ML_2019_ID13317464/blob/master/NB01_NaiveLearning_PythonBasics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model of Learning Procedure

## Naive Learning

We implement the naive learning scheme. More specifically, we want to represent a map from $X$ to $y$ intuitively, by exhaustively tabulating all possibilities. To make this feasible, we limit $X$ to be a discrete 2D tuple (one of the dots in a 2D square array; you will see examples shortly) and $y$ to be 0 or 1.

- Build a `Python object` to represent all the possible relationships between $X$ and $y$.
- Given a training sample, i.e. a pair of $X$ and $y$, the learning-model object can eliminate all the possibilities that are incompatible with the observation.
- Given a test sample, i.e. an $X$ without $y$, the learning-model object can return all the possibilities and their respective $y$-values at the test $X$.
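
The three capabilities listed above can be sketched in a few lines for a tiny 2 by 2 grid. This is only an illustration under stated assumptions: the helper names `eliminate` and `predictions` are hypothetical, and every hypothesis is stored as a full dictionary, which is the brute-force approach that stops scaling once the grid grows.

```python
from itertools import product

N = 2
# All X-samples of a 2 x 2 grid
X_space = [(i, j) for i in range(N) for j in range(N)]

# Each hypothesis is one complete table mapping every X to a y in {0, 1}
hypotheses = [dict(zip(X_space, labels))
              for labels in product((0, 1), repeat=len(X_space))]


def eliminate(hypotheses, x, y):
    """Keep only the hypotheses that agree with the observation (x, y)."""
    return [h for h in hypotheses if h[x] == y]


def predictions(hypotheses, x):
    """Return the y-value each remaining hypothesis assigns to x."""
    return [h[x] for h in hypotheses]


remaining = eliminate(hypotheses, (0, 1), 1)
print(len(hypotheses), len(remaining))  # 16 8
print(predictions(remaining, (1, 1)))   # [0, 1, 0, 1, 0, 1, 0, 1]
```

Observing one sample halves the table, but the remaining hypotheses still split evenly on any unobserved point.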

### Represent All X-Samples

#### Attempt Round 1

In [0]:
def generate_all_X_space_samples():
    """
    As the function name shows,  here we want to return the 
    complete set of possible X values. The straightforward 
    implementation of the X-space is a list of tuples. Let us 
    consider a simple range: the integers from 0 to N-1, and 
    use this range for both dimensions. Say N=3, we want to 
    generate X-samples as
    [
        (0, 0),
        (0, 1),
        (0, 2),
        (1, 0),
        (1, 1),
        (1, 2),
        (2, 0),
        (2, 1),
        (2, 2),
    ]
    
    For small N, we can explicitly write out the list, but we need 
    a program to generate such a list for arbitrary N:
    """
    
    # Let's make an empty list
    X_space = []
    
    # Study the elements in the example list, and fill up our
    # X_space, e.g. by
    X_space.append((0, 0)) # A sample in X is a tuple, so we use 
    # a pair of parentheses, i.e. the input to the "append" function
    # is "(0, 0)", not "0, 0", which will be interpreted as 2 inputs.
    X_space.append((0, 1))
    X_space.append((0, 2))
    # ... you can complete the rest if you wish, but better read on.
    # we will use smarter methods.
    
    # Last but not least, 
    return X_space

<span style="color:#006000"><b>EXERCISE</b></span>
In the cell below, experiment with the function `generate_all_X_space_samples` we just defined. You can manipulate the definition of the function  and  observe the change of its behaviour. 

In [3]:
X_space = generate_all_X_space_samples()
print(X_space)

[(0, 0), (0, 1), (0, 2)]


#### Attempt Round 2

In [0]:
def generate_all_X_space_samples():
    """
    We will use loops to generate the tuples!
    """
    
    # Let's make an empty list
    X_space = []
    
    # Simple observation shows the first 3 tuples are (0, j)
    # and j is running from 0 to 3 (exclusive, Python convention)
    
    # This is the perfect case to use a for-loop, so we can write the
    # list building program this way:
    
    # for j in range(3):
    #     X_space.append((0, j))
    # for j in range(3):
    #     X_space.append((1, j))
    # for j in range(3):
    #     X_space.append((2, j))
    
    # You may have noticed, the first element in each tuple in those
    # loops runs from 0 to 3 (exclusive) as well, and can also be
    # managed by a loop
    for i in range(3):
        for j in range(3):
            X_space.append((i, j))
    return X_space
    

In [18]:
X_space = generate_all_X_space_samples()
print(X_space)

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]


<span style="color:#006000"><b>EXERCISE</b></span>
Experiment with the for-loops above. Try to generate X-spaces of different sizes. 

#### Attempt Round 3

We further adjust our implementation in two ways:

1. It is natural for the function to be flexible, so we can generate X-spaces of different sizes conveniently without rewriting the code every time.

2. Python provides a more natural way to write loops to generate object collections (e.g. list of objects). 

Let's try 2 in the cell below and then re-write our X-sample generator.

In [19]:
# 1. building list by appending one element each time
my_list_a = []
for i in range(5):
    my_list_a.append(i**2) # square
print("List-a of Sqr for [0, 5):", my_list_a)


List-a of Sqr for [0, 5): [0, 1, 4, 9, 16]


In [20]:

# 2. Express the same loop more naturally as a single Python expression
my_list_b = [i**2 for i in range(5)] # Bracket [..] to construct a list
print("List-b of Sqr for [0, 5):", my_list_b)

List-b of Sqr for [0, 5): [0, 1, 4, 9, 16]


<span style="color:#006000"><b>EXERCISE</b></span>
Try to generate a list of even numbers from 2 to 10 (exclusive)

In [21]:
# 3. Powerful list comprehensions
# The elements can be complex objects. 
# The loops inside [] can be nested.
# The generation can be conditional, too.

my_list_c = [(j, j + i**2) for i in range(10)
             if i % 2 == 0
             for j in range(100, 600, 100)
             if j != 300]
print(my_list_c)

[(100, 100), (200, 200), (400, 400), (500, 500), (100, 104), (200, 204), (400, 404), (500, 504), (100, 116), (200, 216), (400, 416), (500, 516), (100, 136), (200, 236), (400, 436), (500, 536), (100, 164), (200, 264), (400, 464), (500, 564)]


__CAVEAT__: Although it looks very neat, this kind of list-building expression (a list comprehension) does not change the time or space complexity compared with the explicit loop. It is mainly for readability, so use it only to IMPROVE the readability!

In [0]:
def generate_all_X_space_samples(N):
    """
    Generate complete sample of X-space
    :param N: Discrete X-space dimension size. The size is homogeneous
      in all dimensions.
    :type N: int
    """
    
    return [(i, j) for i in range(N)
            for j in range(N)]
  

<span style="color:#006000"><b>EXERCISE</b></span>
In the cell below, experiment with the new function `generate_all_X_space_samples` we just defined. Please try different X-space sizes and investigate different X-samples.

In [23]:
X = generate_all_X_space_samples(3)
print("There are {} samples in X-space.".format(len(X))) # {}-format
# is used to inject some information from variables to a string.
print("All samples:\n\t", X) # \n: new line, \t indent

# You can also investigate using multiple print's
for sample_id in range(len(X)): # Try to figure out the construction
    print("Sample {}: {}".format(sample_id, X[sample_id]))
    
# You can use [:] indexing to conveniently check a subset of data samples
print("Sample 1-5 (exc):", X[1:5])
# [:End] means start from 0
print("Sample 0-3 (exc):", X[:3])
# Similarly, [Start:] means until the end
print("Sample 3-Last (inc):", X[3:])
# You can use -i (<0 index) to represent "counting from the end"
print("Last Sample:", X[-1])
print("Sample 3-Last (exc):", X[3:-1])

There are 9 samples in X-space.
All samples:
	 [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
Sample 0: (0, 0)
Sample 1: (0, 1)
Sample 2: (0, 2)
Sample 3: (1, 0)
Sample 4: (1, 1)
Sample 5: (1, 2)
Sample 6: (2, 0)
Sample 7: (2, 1)
Sample 8: (2, 2)
Sample 1-5 (exc): [(0, 1), (0, 2), (1, 0), (1, 1)]
Sample 0-3 (exc): [(0, 0), (0, 1), (0, 2)]
Sample 3-Last (inc): [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
Last Sample: (2, 2)
Sample 3-Last (exc): [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1)]
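
As an aside, the standard library can build the same list without writing the nested loops by hand. The sketch below uses `itertools.product`; the function name `generate_all_X_space_samples_product` is hypothetical and only chosen to mirror the function above.

```python
from itertools import product


def generate_all_X_space_samples_product(N):
    """Same X-samples as the nested comprehension, via itertools.product."""
    # product(range(N), repeat=2) yields (i, j) with j varying fastest
    return list(product(range(N), repeat=2))


print(generate_all_X_space_samples_product(2))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

One possible advantage is that `repeat` can be changed to cover more than two dimensions without adding another loop.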


#### Attempt Round 4 -- Using Numpy Arrays

A Python list is convenient for storing and accessing data samples. When it comes to analysis or machine learning algorithms, however, it is more convenient if we can easily access individual attributes or perform computational operations on specified parts of the data. We will use the numpy library, which is designed to manage array data. Numpy arrays can also be easily converted to/from `data frames`, `GPU device arrays`, `images (pixel arrays)`, etc.

In [0]:
# Let us use the numpy library
import numpy as np # the "as" is optional and to save typing

def generate_all_X_space_samples_np(N):
    """
    :param N: X-space will be an N by N discrete-valued array
    """
    
    # Pre-allocate an array of N**2 rows and 2 columns, filled with zeros
    X_space = np.zeros((N**2, 2))
    
    # The loop is similar to that in Round 2,
    # except that all samples are created at the
    # beginning, and we now use an index to loop over them
    index = 0
    for i in range(N):
        for j in range(N):
            X_space[index][0] = i
            X_space[index][1] = j
            index += 1
    return X_space

In [0]:
X_np = generate_all_X_space_samples_np(3)
print(X_np)
print(type(X_np)) # Note the type is a np-array
# Check out a sample
i = 3
print("An X-Sample[{}]:{}".format(i, X_np[i]))
# Check attribute-0 for all samples
j = 0
print("X-Attribute[{}]:{}".format(j, X_np[:, j]))
# [:, 0]: take from all (:) samples, the attribute-0

<span style="color:#006000"><b>EXERCISE</b></span>
Please check (print out) the second (index=1) attribute for samples 1-5 (exclusive). 

Numpy arrays provide an interface to apply computations to all elements at once. E.g. we may want to scale all elements in $X$ to lie in $[0, 1]$. Using an ordinary Python list, we need to construct another list to store the result, and perform the computation element by element.

In [0]:
def scale_X_to_0_1(X, N):
    """
    Get a new list scaling the elements in X by 1/N.
    """
    new_list = []
    for x in X: # you can iterate over the elements of X (each x is a tuple)
        # now x is one data sample in X, such as (0, 2)
        new_list.append((x[0]/N, x[1]/N))
    return new_list

In [0]:
X = generate_all_X_space_samples(5)
X1 = scale_X_to_0_1(X, 5)
print(X1)

In [0]:
# On the other hand, operating on a numpy array is much easier
X_np = generate_all_X_space_samples_np(5)
X1_np = X_np/5
print(X1_np)

Not only is the code more concise; the computation is also done internally using a fast C implementation, and is therefore more efficient.

In [0]:
%timeit X1 = scale_X_to_0_1(X, 5)

In [0]:
%timeit X1_np = X_np/5

<span style="color:#006000"><b>EXERCISE</b></span>
Note the time units $\mu$s ($10^{-6}$ sec) / ns ($10^{-9}$ sec) used in the measurement above. You can make a larger matrix e.g. using `generate_all_X_space_samples(500)` and compare the difference. 
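
`%timeit` is a notebook "magic" and is not available in a plain Python script. A rough equivalent can be sketched with the standard `timeit` module, as below. The helper names `scale_list` and `scale_array` and the choice of 20 repetitions are assumptions made for this illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

N = 100
X_list = [(i, j) for i in range(N) for j in range(N)]
X_arr = np.array(X_list, dtype=float)


def scale_list(X, N):
    """Scale every tuple in a Python list, element by element."""
    return [(x[0] / N, x[1] / N) for x in X]


def scale_array(X, N):
    """Scale a whole numpy array in one vectorised operation."""
    return X / N


# Total time in seconds for 20 calls of each version
t_list = timeit.timeit(lambda: scale_list(X_list, N), number=20)
t_arr = timeit.timeit(lambda: scale_array(X_arr, N), number=20)
print("list : {:.6f} s for 20 runs".format(t_list))
print("numpy: {:.6f} s for 20 runs".format(t_arr))
```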

Finally, `numpy` provides an interface to generate this kind of X-samples by sampling a grid in a multidimensional space. `meshgrid` takes the grid positions along each dimension and returns the grid matrices. In our example, matrix-0 is for attribute-1 and matrix-1 is for attribute-0 (the order of attributes can be adjusted when we compose the final X, and is not essential). I will not go into details here; please find out more about the function in the [doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html).

Please study the following example for some basic array operations.

In [0]:
def generate_all_X_space_samples_np(N):
    """
    :param N: X-space will be an N by N discrete-valued array
    """
    X0, X1 = np.meshgrid(np.arange(N), np.arange(N))
    # We will have the following for N=3
    # X0:      X1:
    # 0 1 2    0 0 0
    # 0 1 2    1 1 1
    # 0 1 2    2 2 2
    
    # X0, if "flattened", becomes
    # 0 1 2 0 1 2 0 1 2
    
    # flattened X0 and X1 if "stacked" becomes
    # [[0 1 2 0 1 2 0 1 2]
    #  [0 0 0 1 1 1 2 2 2]]
    
    # The following matrix, 
    # [[a b c]
    #  [d e f]]
    # if "transposed" (numpy operator "T"), becomes
    # [[a d]
    #  [b e]
    #  [c f]]
    return np.stack([X0.flatten(), X1.flatten()]).T

In [0]:
print(generate_all_X_space_samples_np(3))

In [0]:
%timeit generate_all_X_space_samples_np(500)

In [0]:
%timeit generate_all_X_space_samples(500)

In [0]:
# Finally, we can make version that includes the normalisation 
# (1/N) in the construction
def generate_all_X_space_normalised_samples_np(N):
    """
    :param N: X-space will be an N by N discrete-valued array
    """
    X0, X1 = np.meshgrid(np.arange(N), np.arange(N))
    return np.stack([X0.flatten(), X1.flatten()]).T / N
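
Note that with the default `meshgrid` settings, attribute-0 varies fastest in the resulting rows, whereas in the earlier loop version attribute-1 varies fastest. If the loop ordering is preferred, one option is the `indexing="ij"` argument of `np.meshgrid`, sketched below. The function name `generate_all_X_space_samples_ij` is hypothetical, and code later in this notebook that computes a grid index as `x0 + N * x1` relies on the default ordering, so this variant is not a drop-in replacement there.

```python
import numpy as np


def generate_all_X_space_samples_ij(N):
    """
    Grid samples in the same row order as the nested-loop version:
    (0, 0), (0, 1), ..., (N-1, N-1), with the second attribute varying fastest.
    """
    # indexing="ij" makes X0[i, j] == i and X1[i, j] == j
    X0, X1 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return np.stack([X0.flatten(), X1.flatten()]).T


print(generate_all_X_space_samples_ij(2))
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]
```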

### Represent all possible X-y relations

We will create a template from which we can generate objects that represent a _generic_ relationship from all $X$-samples to binary $y$.

#### Initialise the framework

In [0]:
# Let us first prepare the X-space as discrete samples as above. 
# Before we start building all the possible X-y mappings, 
# it is sensible to have an idea of how many such 
# mappings we are going to consider.

# So here is our first attempt at making the object template 
# for the all-inclusive mapping representation.
class CompleteDiscrete2DBinaryMapping(object):
    """
    An exhaustive representation of 2D X space to binary targets.
    The 2D space is represented using discrete grid points.
    """
    def __init__(self, N):
        """
        Create an object representing all possible mappings from 
        2D grid points to {0, 1}. 
        :param N: X-space samples are N by N grid in [0, 1)**2
        """
        self.grid_x = generate_all_X_space_samples_np(N)
        self.h_size = 2 ** (N**2)
        
    def size(self):
        """
        Total number of possible mappings.
        
        Note this tends to be a really large number for any
        respectable N.
        """
        return self.h_size

In [0]:
complete_model = CompleteDiscrete2DBinaryMapping(10)
print("We are going to build {} different mappings."
       .format(complete_model.size()))
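
To get a feel for how quickly this count grows, the short sketch below tabulates it for a few small grid sizes. The helper name `count_mappings` is hypothetical; it simply repeats the formula used in `__init__` above.

```python
def count_mappings(N):
    """Number of binary labelings of an N x N grid: 2 ** (N ** 2)."""
    return 2 ** (N ** 2)


for N in range(1, 8):
    n = count_mappings(N)
    print("N={}: {} grid points, {} mappings ({} decimal digits)"
          .format(N, N ** 2, n, len(str(n))))
```

Python integers have arbitrary precision, so computing the count is cheap; it is enumerating that many mappings that becomes infeasible.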

<span style="color:#006000"><b>EXERCISE</b></span>
Please review our discussion in class and figure out why we compute the size of the possible mappings to be $2^{N^2}$? 

So for any respectable problem size, exhaustively considering all possibilities exceeds the capability of a computer. Can we possibly implement such an object?

#### Focus on prediction

Yes and no. We can implement an object representing all possible mappings, but we cannot wait for it to make any useful predictions, because it would take a very long time to work. The point is that we will adopt the _duck-typing_ / interface-oriented programming protocol to get an object that works. This way of building programs is widely used in Python (and is particularly useful in data science, where the storage demand can be very large).

Duck-typing: 

> If something walks like a duck and quacks like a duck, then it is probably a duck.

That is, we focus on how the object will be used, and maintain only the information that is necessary for it to work. 

We are implementing a data model _family_. At any particular call, we only need to specify the $y$ value (0 or 1) for some $X = (X_1 \in [0, 1), X_2 \in [0, 1))$ according to a _particular member_ of this family. That is, we do not need to worry about storing all possible mappings at one time.

In [0]:
# To be specific, we just need to implement such a function
def predict_according_to_hypothesis(X, hypothesis_id):
    """
    :param X: a data sample with 2 attributes
    :param hypothesis_id: a number in 0..2**(N**2) - 1, i.e. up to
        1267650600228229401496703205375 for N=10
    NOTE: for a stand-alone function (one not belonging to any class),
        we don't have "self" in the first place of the input 
        argument list.    
    """
    y = 0 # or 1
    return y
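
To make the duck-typing idea concrete, here is a minimal sketch with two hypothetical classes, `StoredHypothesis` and `OnDemandHypothesis`. One keeps a full table in memory, the other keeps only an integer id and derives each value when asked. The rule used to derive a value from the id (reading a binary digit) is an assumption made for this illustration. The caller `describe` never checks the type; it only relies on a `predict_at` method being present.

```python
class StoredHypothesis:
    """Keeps the full table of y-values in memory."""

    def __init__(self, table):
        self.table = list(table)

    def predict_at(self, point_index):
        return self.table[point_index]


class OnDemandHypothesis:
    """Stores only an integer id; derives each y-value when asked."""

    def __init__(self, hypothesis_id, dof):
        self.hypothesis_id = hypothesis_id
        self.dof = dof  # number of grid points

    def predict_at(self, point_index):
        # Illustrative rule: read one digit of the zero-padded binary id
        bits = format(self.hypothesis_id, "0{}b".format(self.dof))
        return int(bits[point_index])


def describe(hypothesis, dof):
    """Works with any object that offers predict_at(point_index)."""
    return [hypothesis.predict_at(k) for k in range(dof)]


stored = StoredHypothesis([0, 1, 0, 1, 1, 0, 0, 1, 0])
on_demand = OnDemandHypothesis(178, dof=9)
print(describe(stored, 9) == describe(on_demand, 9))  # True
```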

#### Predict using one assigned hypothesis

Now we need to solve two problems:
1. We need to verify that the input X is one of the 2D grid points according to our problem setting. If it is not, quantise it to one of them.
2. Figure out what the corresponding y value is according to the particular mapping specified by `hypothesis_id` (the technical term for such a hypothetical mapping is a _hypothesis_).

The first problem can be solved by finding the nearest neighbour of the input X among all the 2D grid points. This, of course, will remind us of the nearest neighbour classifier. There is one essential difference though: there is no training data for our nearest neighbour classifier to refer to, so we have to assign some hypotheses, which leads us to the second problem.

In [0]:
# To compute the nearest neighbour, 
# please experiment with the following code.
X_np = generate_all_X_space_samples_np(2)
print("2D points")
print(X_np)
Xin = np.array((1, 2))
print("Input X")
print(Xin)
print("Difference")
print(X_np - Xin)

Amazingly, we have computed the difference between the input $X$ and __each one of the grid points using just one operation__. This seemingly incompatible subtraction is implemented in numpy using a mechanism called _broadcasting_. It allows binary operators to work between one array $A$ of shape $n_1 \times n_2$ and another array $B$ of shape $n_2$, by treating the larger $A$ as containing $n_1$ small arrays and applying the operation between each of the $n_1$ small arrays and $B$. 

It also generalises to the case when A is of $n_1 \times n_2 \times n_3 \times n_4$ and B is of $n_3 \times n_4$. Then we view $A$ as $n_1\times n_2$ cells and each cell is an $n_3 \times n_4$ array.
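
The four-dimensional case can be checked with a small sketch. The shapes $2 \times 3 \times 4 \times 5$ and $4 \times 5$ are arbitrary choices for illustration; the explicit loop spells out what broadcasting does implicitly.

```python
import numpy as np

A = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)  # 2 x 3 cells, each 4 x 5
B = np.ones((4, 5), dtype=int)                    # one 4 x 5 array

# Broadcasting: B is subtracted from every 4 x 5 cell of A
C = A - B
print(C.shape)  # (2, 3, 4, 5)

# The same computation written as an explicit loop over the cells
C_loop = np.empty_like(A)
for i in range(2):
    for j in range(3):
        C_loop[i, j] = A[i, j] - B

print(np.array_equal(C, C_loop))  # True
```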


In [0]:
# To compute the nearest neighbour
diff = X_np - Xin
diff_square = diff ** 2 # each element
diff_norm2 = diff_square.sum(axis=1) # summing up every row, so now we have 
# N**2 distances (same number of X-rows) and need only to find the 
# smallest one.

In [0]:
print("The index of the nearest x-grid point is {}"
      .format(np.argmin(diff_norm2))) # argmin returns the index of the 
# smallest element in an array (take care and read doc for multi-dim arrays)

Then we consider what the hypothesis would say about the y value at that particular x-grid point, such as point-3. As you may have already guessed, we have $N^2$ x-grid points in total, and the total number of possible hypotheses is $2^{N^2}$: we are exploring all possible binary combinations of $N^2$ bits. Say $N=3$, $N^2=9$; we just count 9-bit binary numbers. And if you ask what hypothesis-178's prediction on the 3rd x-grid point is, we can just check the 3rd bit of the binary number corresponding to 178.

In [0]:
# to convert a number to binary format
print("{:b}".format(35))
# to specify the number of bits
print("{:9b}".format(35))
# to specify the number of bits and fill unused bits with 0
print("{:09b}".format(35))
# to specify the number of bits and fill unused bits with 0
# and finally take out the 3rd bit
print("{:09b}".format(35) [2])

# given N, build the "formatting" string (a meta string you use to 
# format other strings)
N=3
print("{:0" + str(N**2) + "b}") # "+" concatenates strings
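
The string-formatting approach above is easy to read. As an alternative sketch, the same bit can be extracted with integer shifts, which avoids building a string for every query. The function names `bit_by_formatting` and `bit_by_shifting` are hypothetical; both count bits from the left (most significant) end, starting at index 0, to match the string indexing above.

```python
def bit_by_formatting(hypothesis_id, bit_id, dof):
    """Read bit `bit_id` from the zero-padded binary string of the id."""
    return int(format(hypothesis_id, "0{}b".format(dof))[bit_id])


def bit_by_shifting(hypothesis_id, bit_id, dof):
    """Same bit, obtained by shifting it to the lowest position and masking."""
    # Position bit_id from the left is position (dof - 1 - bit_id) from the right
    return (hypothesis_id >> (dof - 1 - bit_id)) & 1


# Hypothesis 178 with 9 grid points: binary 010110010, 3rd bit is index 2
print(bit_by_formatting(178, 2, 9), bit_by_shifting(178, 2, 9))  # 0 0
```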

#### Put everything together

In [0]:
class CompleteDiscrete2DBinaryMapping(object):
    """
    An exhaustive representation of 2D X space to binary targets.
    The 2D space is represented using discrete grid points.
    """
    def __init__(self, N):
        """
        Create an object representing all possible mappings from 
        2D grid points to {0, 1}. 
        :param N: X-space samples are N by N grid in [0, 1)**2
        """
        self.grid_x = generate_all_X_space_samples_np(N)
        self.dof = N ** 2 # the degrees of freedom is equal to the number
        # of grid points at which you can freely choose {0/1} as the 
        # target value. DoF reduces as you start observing data (when you
        # observe the target value at a point, you lose the freedom of
        # setting it to arbitrary values)
        self.h_size = 2 ** self.dof
        
    def size(self):
        """
        Total number of possible mappings.
        
        Note this tends to be a really large number for any
        respectable N.
        """
        return self.h_size
    
    def predict_according_to_hypothesis(self, X, hypothesis_id):
        """
        Note: when implemented as a class method, don't forget "self"
        :param X: a data sample with 2 attributes
        :param hypothesis_id: a number in 0..2**(N**2) - 1, i.e. up to
            1267650600228229401496703205375 for N=10
        """
        X = np.array(X) # make the input format more flexible, e.g.
        # you can use [0, 2] (Python list), or (0, 1) (Python tuple)
        d = ((self.grid_x - X)**2).sum(axis=1)
        bit_id = np.argmin(d)
        format_string = "{:0" + str(self.dof) + "b}"
        y = int(format_string.format(hypothesis_id) [bit_id])
        return y

In [0]:

complete_model = CompleteDiscrete2DBinaryMapping(10)
print("We are going to build {} different mappings."
       .format(complete_model.size()))

In [0]:
# This won't stop!
for hypothesis_id in range(complete_model.size()):
    print(complete_model
          .predict_according_to_hypothesis((8, 7), hypothesis_id))

### Fit to Training Data

(We will start moving faster from here.) Now suppose we are given training samples in the following format: $\{x_1 = \langle(0, 1), 1\rangle, x_2 = \langle(3, 4), 0\rangle\}$. How would the information affect our belief about the $X$-$y$ mapping?

We will introduce a method `fit`, which checks consistency between every hypothesis and the observed data and removes those hypotheses that disagree with the data.

In [0]:
class CompleteDiscrete2DBinaryMapping(object):
    def __init__(self, N):
        self.grid_x = generate_all_X_space_samples_np(N)
        self.dof = N ** 2
        self.h_size = 2 ** self.dof
        self.inconsistent_hypotheses = []
        
    def size(self):
        return self.h_size
    
    def predict_according_to_hypothesis(self, X, hypothesis_id):
        X = np.array(X)
        d = ((self.grid_x - X)**2).sum(axis=1)
        bit_id = np.argmin(d)
        format_string = "{:0" + str(self.dof) + "b}"
        y = int(format_string.format(hypothesis_id) [bit_id])
        return y
    
    # Let us add a `fit` method
    def fit(self, X, Y): 
        """
        :param X: [M x 2] training data
        :param Y: [M] labels
        """
        # Let's check consistency for each training data and each hypothesis 
        for hid in range(self.h_size):
            for x_, y_ in zip(X, Y): 
                # be careful if the training set contains only 1 sample!
                # zip literally zips two "iterables" together, so the zipped
                # object yields a pair of elements (x_, y_) in each iteration.
                pred = self.predict_according_to_hypothesis(x_, hid)
                if pred != y_:
                    if hid not in self.inconsistent_hypotheses:
                        self.inconsistent_hypotheses.append(hid)
                    break # we have determined this hid is bad and no need
                    # to continue
        
        
    def predict_trained(self, X):
        return [
            self.predict_according_to_hypothesis(X, hid)
            for hid in range(self.h_size)
            if hid not in self.inconsistent_hypotheses
        ]

In [0]:
# Test using the example we have seen in class
complete_model = CompleteDiscrete2DBinaryMapping(3)
X_trn = [
    (0, 2),
    (1, 2),
    (1, 0),
    (1, 1),
    (2, 0),
    (2, 1),
]
Y_trn = [0, 0, 1, 1, 1, 1]
complete_model.fit(X_trn, Y_trn)

In [0]:
complete_model.predict_according_to_hypothesis((0,2), 3)

In [0]:
# Let us use the model to predict
complete_model.predict_trained((0, 2))

<span style="color:#006000"><b>EXERCISE</b></span>
Interpret how `predict_trained` works. 

__Improvement Idea 1__

Let's apply the "duck-typing" principle again -- we don't need to explicitly find all inconsistent hypotheses and exclude them when testing. We can directly construct the set of hypotheses that is consistent with the data. 
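
One possible way to do this is sketched below, under several assumptions: the training inputs lie exactly on integer grid points, the labels do not contradict each other, the grid index of a point $(x_0, x_1)$ is $x_0 + N x_1$ (the row order produced by the `meshgrid`-based generator), and bits are read from the left as in `predict_according_to_hypothesis`. The function name `consistent_hypothesis_ids` is hypothetical.

```python
from itertools import product


def consistent_hypothesis_ids(N, X_trn, Y_trn):
    """Enumerate only the hypothesis ids that agree with the training data."""
    dof = N ** 2
    # Bits fixed by the observations: grid index -> observed label
    observed = {x0 + N * x1: y for (x0, x1), y in zip(X_trn, Y_trn)}
    # Grid points we have not observed; their bits are still free
    free = [k for k in range(dof) if k not in observed]
    # Bit k (counted from the left) has weight 2 ** (dof - 1 - k)
    base = sum(y << (dof - 1 - k) for k, y in observed.items())
    ids = []
    for bits in product((0, 1), repeat=len(free)):
        hid = base
        for k, b in zip(free, bits):
            hid |= b << (dof - 1 - k)
        ids.append(hid)
    return ids


X_trn = [(0, 2), (1, 2), (1, 0), (1, 1), (2, 0), (2, 1)]
Y_trn = [0, 0, 1, 1, 1, 1]
ids = consistent_hypothesis_ids(3, X_trn, Y_trn)
print(len(ids))  # 8: three unobserved grid points, 2 ** 3 ways to label them
```

The work now scales with the number of consistent hypotheses rather than with all $2^{N^2}$ of them, although that number is still exponential in the count of unobserved points.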


__Improvement Idea 2__

Try to increase the quantisation number $N$ to $4$ (or $5$ if you are in a more adventurous mood) and see how the model works. Next we will introduce limitations on the possible hypotheses. See below.

## Data Models (Preview)

<span style="color:#006000"><b>PREVIEW EXERCISE</b></span>
Figure out how the "linear" model below works. Try to introduce a non-trivial threshold when the hypotheses make predictions (see `PREDICTION BY INDIVIDUAL HYPOTHESIS`). 

In [0]:
# A Simple Linear Model Family
import numpy as np
class LinearHypothesisSpace:
    def __init__(self, quant_num=3):
        self.quant_num = quant_num
        grid_x0, grid_x1 = np.meshgrid(np.arange(self.quant_num),
                                       np.arange(self.quant_num))
        grid_x0 = grid_x0.flatten()
        grid_x1 = grid_x1.flatten()
        self.grid_x = np.stack([grid_x0, grid_x1]).T

        eps_angle = np.pi / 18
        angles = np.arange(0, np.pi, eps_angle)
        self.hypotheses = np.zeros((2 * len(angles), self.quant_num ** 2),
                                   dtype=int)  # np.int was removed from NumPy
        x0 = grid_x0 - (quant_num - 1) / 2
        x1 = grid_x1 - (quant_num - 1) / 2
        for i, th in enumerate(angles):
            w = min(np.tan(th), 9999)
            # ** PREDICTION BY INDIVIDUAL HYPOTHESIS **
            ya = (x0 * w - x1 > 0).astype(int)
            yb = 1 - ya
            self.hypotheses[2 * i, :] = ya
            self.hypotheses[2 * i + 1, :] = yb
        self.sele_hypothesis_id = None

    def fit(self, x, y):
        x_ind = x[:, 0] + self.quant_num * x[:, 1]
        pred_trn = self.hypotheses[:, x_ind]
        accu_trn = pred_trn == y[np.newaxis, :]  # type: np.ndarray
        accu_trn_n = accu_trn.astype(float).sum(axis=1)
        self.sele_hypothesis_id = np.argmax(accu_trn_n)

    def predict_all_X(self, hypothesis_id=-1):
        h = self.hypotheses[self.sele_hypothesis_id] \
            if hypothesis_id == -1 \
            else self.hypotheses[hypothesis_id]
        return self.grid_x, h

In [0]:
linear_model0 = LinearHypothesisSpace(3)
X_trn = np.array([
    (0, 2),
    (1, 2),
    (1, 0),
    (1, 1),
    (2, 0),
    (2, 1),
])
Y_trn = np.array([0, 0, 1, 1, 1, 1])
linear_model0.fit(X_trn, Y_trn)

In [0]:
X_all, y_all = linear_model0.predict_all_X()

### Visualising the model behaviour

In [0]:
# Finally, let us visualise the model behaviour. We will use an interactive 
# visualisation tool.

# NOTE drawing graphs is one noticeable difference between running your
# Python notebook on the cloud (where the computers don't have screens and
# have to deliver graphics objects to your browser to render on YOUR screen),
# and on a local computer (where graphics display natively using the graphics
# interface provided by your local OS). So we do a bit of configuration here. 
#
# If the graphs don't work on your computer, try colab, or you can 
# change to the classical matplotlib library, which is easier to get working.


I_AM_RUNNING_THIS_NOTEBOOK_ON_MY_OWN_COMPUTER = True
COLAB = not I_AM_RUNNING_THIS_NOTEBOOK_ON_MY_OWN_COMPUTER

In [0]:
if COLAB: # We need to upgrade plotly to 4.0 for it to work with colab
    # [as of July 2019] this will become obsolete when Google upgrades colab
    !pip install plotly --upgrade
    # Perform the same on your own computer if you encounter issues, but
    # you only need to do it once. colab is a virtual machine, so you need
    # to perform the upgrade each time you restart a session.

In [0]:
import plotly.graph_objects as go
fig = go.Figure(
    data=[go.Scatter(
        x=X_all[:, 0], 
        y=X_all[:, 1], 
        marker_color=y_all,
        marker_size=12,
        marker_line_width=2,
        mode="markers")],
    layout_title_text="Prediction on a Discretised 2D X-Space"
)
if COLAB:
    fig.show(renderer="colab")
else:
    fig.show()

In [0]:
# Now we can handle decently sized (2D discrete) data space
hypothesis_id = 21 # we haven't trained the model, so need to specify which hypo
# we want to check
linear_model1 = LinearHypothesisSpace(50)
X_all, y_all = linear_model1.predict_all_X(hypothesis_id)
fig = go.Figure(
    data=[go.Scatter(
        x=X_all[:, 0], 
        y=X_all[:, 1], 
        marker_color=y_all,
        marker_size=12,
        marker_line_width=2,
        mode="markers")],
    layout_title_text="Prediction on a Discretised 2D X-Space"
)
if COLAB:
    fig.show(renderer="colab")
else:
    fig.show()

# Summarise

- We have built an omnipotently useless 2D classifier!
- We tried out a linear modeller.
- We have learned some useful Python and numpy skills.
- We have made nice pictures!