# Artificial Intelligence: Concepts, Challenges, and Opportunities (2020), exercises


## General instructions for all exercises

Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Follow the instructions and fill in your solution under the line marked by the tag

> YOUR CODE HERE
  
Having written the answer, execute the code cell by pressing the `Shift-Enter` key combination. The code is run, and it may print some information under the code cell. The focus automatically moves to the next cell, which you can execute by pressing `Shift-Enter` again, until you reach the code cell that tests your solution. Execute that cell and follow the feedback. Usually it either says that the solution seems acceptable or reports some errors. You can go back to your solution, modify it, and repeat the steps until you are satisfied. Then proceed to the next task.
   
Repeat the process for all tasks.

The notebook may also contain manually graded answers. Write your manually graded answer under the line marked by the tag:

> YOUR ANSWER HERE

Manually graded tasks may be text, pseudocode, or mathematical formulas. You can write formulas in $\LaTeX$ syntax by enclosing the formula in dollar signs (`$`); for example, `$f(x)=2 \pi / \alpha$` produces $f(x)=2 \pi / \alpha$.

When you have passed the tests in the notebook and are ready to submit your solutions, download the whole notebook using the menu `File -> Download as -> Notebook (.ipynb)`. Save the file on your hard disk and submit it in [Moodle](https://moodle.uwasa.fi) under the corresponding exercise.

Your solution should be executable Python code. Use the existing code as an example of Python programming, and consult the numerous Python programming materials on the Internet if necessary.


In [None]:
NAME = ""
Student_number = ""

---

# Probabilistic reasoning and machine learning


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()


## Task 1: Graph search 

Find the shortest path on the graph below from node 0 to node 6, when the weight of each edge is 1. You may use the `nx.astar_path()` function or solve it manually. (Notice that if no heuristic function is given, the search is effectively Dijkstra's shortest path algorithm, which finds an equally short path but usually explores more nodes.)

Hint: If the graph below looks messy, evaluate the next cell again (by pressing Shift-Enter) and the graph is redrawn. It will look different, but it still has the same topology. Repeat until you can see the graph clearly.
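
As an illustration of the remark about Dijkstra's algorithm, the following sketch finds a shortest path in a small, hypothetical graph (not the one drawn below) using only the Python standard library. The adjacency dictionary `graph`, the edge list, and the function name `dijkstra_path` are assumptions made for this example; with `networkx`, a call such as `nx.astar_path(G, source, target)` serves the same purpose.

```python
import heapq

def dijkstra_path(graph, source, target):
    """Return a shortest path from source to target as a list of nodes.

    graph maps each node to a dict {neighbour: edge_weight}.
    """
    # Priority queue of (distance so far, node, path taken to reach node)
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        dist, node, path_so_far = heapq.heappop(queue)
        if node == target:
            return path_so_far
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (dist + weight, neighbour, path_so_far + [neighbour])
                )
    return None  # target is not reachable from source

# Hypothetical undirected graph in which every edge has weight 1
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")]
graph = {}
for u, v in edges:
    graph.setdefault(u, {})[v] = 1
    graph.setdefault(v, {})[u] = 1

example_path = dijkstra_path(graph, "a", "d")
print(example_path)  # ['a', 'e', 'd']
```

Because every edge has weight 1 here, the shortest path is simply the one with the fewest edges.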

In [None]:
import networkx as nx

#rg=nx.random_tree(n=15, seed=1)
rg=nx.random_geometric_graph(n=15, radius=0.5, seed=1)
nx.draw(rg, with_labels=True, font_weight='normal', node_color='orange', node_size=500)


In [None]:
path=[0,]

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
if len(path)<2:
    print("Please assign your path as a Python list called path, e.g. path=[1,2,3,4,5,6]")
    

## Task 2: Probabilistic reasoning

Apples and onions have been accidentally mixed in a grocery store. A shopkeeper developed a robot that separates onions from apples simply by measuring their weights, since onions are usually, but not always, lighter than apples. The following figure shows the weight distributions of apples and onions.

First run the following cell once to install the pomegranate library. It will take a minute to run. Then you can continue as normal.

In [None]:
conda install pomegranate

In [None]:
from numpy import mean , linspace
import pomegranate as pg

onions=pg.NormalDistribution(110,30)
apples=pg.NormalDistribution(170,25)

# Combined mixture of apples and onions
combined = pg.GeneralMixtureModel([apples, onions])

i=linspace(0,250)
plt.plot(i,onions.probability(i), label='Onions')
plt.plot(i,apples.probability(i), label='Apples')
plt.plot(i,combined.probability(i), label='Combined')
plt.xlabel('Weight (g)')
plt.title('Weight of Onions and Apples')
plt.legend()

The separation of apples from onions can be implemented using a Bayesian classifier. Use Bayes' rule to find the probability that an item weighing 150 g is an apple.

- You can make exact calculations using the Pomegranate distribution objects. In this case, make sure that the final calculation is on the last line of the cell, so that it becomes the output value of the cell.
- Or you can read approximate values from the picture and write just the resulting value in the following cell.
- Either way, the value should be a probability between 0 and 1.
- Please make sure the assert in the next cell passes.
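
As a reminder of how Bayes' rule works for two classes, the sketch below computes a posterior probability from two normal densities. All numbers are hypothetical and deliberately differ from the apple and onion distributions above; the function names `normal_pdf` and `posterior_a` are assumptions for this example, and equal prior probabilities are assumed unless `prior_a` is given.

```python
import math

def normal_pdf(x, mean, std):
    """Probability density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior_a(x, mean_a, std_a, mean_b, std_b, prior_a=0.5):
    """P(class A | x) by Bayes' rule for two normally distributed classes."""
    # Numerator terms: likelihood times prior for each class
    joint_a = normal_pdf(x, mean_a, std_a) * prior_a
    joint_b = normal_pdf(x, mean_b, std_b) * (1.0 - prior_a)
    # The denominator is the total density of x over both classes
    return joint_a / (joint_a + joint_b)

# Hypothetical classes: A ~ N(200, 20) and B ~ N(100, 20), equal priors
p_mid = posterior_a(150, 200, 20, 100, 20)   # halfway between the means
p_high = posterior_a(180, 200, 20, 100, 20)  # closer to the mean of class A
print(p_mid, p_high)
```

With equal priors and equal standard deviations, a measurement halfway between the two means is equally likely to belong to either class, and the posterior moves towards 1 as the measurement approaches the mean of class A.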

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
ans = _
print("Your answer is %f" % (ans))
if (ans<=0) or (ans>=1):
    print("Warning! The probability must be between 0 and 1!" )
assert((ans>0) and (ans<1)), "%f is not in acceptable range" % (ans)


## Task 3: Regression

The following code cell loads a one-dimensional data set and makes a scatter plot of it.


### 3 a)

Fit the regression model 

$$
    y = \beta_0 + \beta_1 x
$$

to the data by optimising the coefficients $\beta_0$ and $\beta_1$ with the interactive tool below, so that the prediction error $\sum(y-\hat{y})^2$ is minimised. The $R^2$ score should be better than 0.6.
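
The two quantities mentioned above can be checked by hand. The sketch below computes the $R^2$ score, $R^2 = 1 - \sum(y-\hat{y})^2 / \sum(y-\bar{y})^2$, for a candidate line on a small hypothetical data set; the function name `r2_score_line` and the data points are assumptions for illustration and are not taken from `regressiondata.csv`.

```python
def r2_score_line(xs, ys, b0, b1):
    """R^2 of the line y_hat = b0 + b1 * x on the points (xs, ys)."""
    y_mean = sum(ys) / len(ys)
    # Residual sum of squares: the prediction error of the candidate line
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: the error of always predicting the mean of y
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical points lying exactly on y = 1 + 2x
xs_demo = [0, 1, 2, 3, 4]
ys_demo = [1, 3, 5, 7, 9]

r2_exact = r2_score_line(xs_demo, ys_demo, b0=1, b1=2)   # perfect fit
r2_offset = r2_score_line(xs_demo, ys_demo, b0=0, b1=2)  # intercept off by 1
print(r2_exact, r2_offset)  # 1.0 0.875
```

A score of 1 means the line passes through every point, and the score decreases as the squared prediction error grows.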

In [None]:
import ipywidgets as widgets
from IPython.display import display
from ipywidgets import interact, interactive, fixed, interact_manual
import pandas as pd

b0, b1 = (0,1)

D=pd.read_csv('regressiondata.csv')
x=np.array(D.iloc[:,0:1])
y=np.array(D.iloc[:,1:2])

@interact(b1w=widgets.FloatSlider(value=0.1, min=0, max=4, step=0.1, description=r'$\beta_1$'), 
          b0w=widgets.FloatSlider(value=0.1, min=-2, max=3, step=0.1, description=r'$\beta_0$'))
def fit(b1w,b0w):
    global x, ax, b0, b1
    fig,ax=plt.subplots(figsize=(5,5))
    ax.plot(x, b1w*x+b0w, c='b')
    ax.scatter(x,y, c='r')
    (b0, b1) = (b0w, b1w)

In [None]:
# You do not need to write anything here, just remove the `raise NotImplementedError()` line
print("y = %3.2f + %3.2f * x" % (b0,b1))
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
if (b0<-1.5) or (b0>4):
    print("The intercept of the regression curve is out of range")
if (b1<0.1) or (b1>3):
    print("The slope of the regression curve is out of range")
    


### 3 b)

Repeat the same task using the `LinearRegression` object from the scikit-learn library. Tip: use the `fit_intercept=True` parameter for the `LinearRegression` object. Name your linear regressor object `predictor`.
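
For a single input variable, ordinary least squares has a closed-form solution, and a least-squares regressor with an intercept estimates the same two numbers. The sketch below computes them in plain Python on hypothetical data; the function name `fit_line` and the data points are assumptions for illustration. In scikit-learn, the corresponding fitted values are available as `intercept_` and `coef_` after calling `fit(X, y)`.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical noisy measurements scattered around y = 1 + 2x
demo_intercept, demo_slope = fit_line([0, 1, 2, 3], [1.0, 2.9, 5.1, 7.0])
print("y = %3.2f + %3.2f * x" % (demo_intercept, demo_slope))
```

Fitting without an intercept would force the line through the origin, which is why `fit_intercept=True` matters when the data do not pass through zero.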

In [None]:
from sklearn.linear_model import LinearRegression
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
if 'predictor' not in locals():
    print("Error! You forgot to use name predictor for your regressor object!")
if predictor.fit_intercept==False:
    print("Error! Set fit_intercept=True")
if (predictor.intercept_<-0.9) or (predictor.intercept_>-0.1):
    print("Error! Intercept is not properly defined. Perhaps you forgot fit_intercept=True?")


## Task 4: Classification by Neural Network

Fill in the code below to train a multilayer perceptron network that predicts whether a person survived the Titanic.

 - First create a predictor using the `MLPClassifier` class and name it `mlp`
    - One hidden layer is enough; try the parameter `hidden_layer_sizes=(n,)`, where `n` is the number of neurons in the hidden layer.
    - Also define `max_iter=1000` to allow the algorithm to train long enough
 - Fit the predictor to the data: `mlp.fit()`
 - Predict the classes 
 - Calculate and print the `accuracy_score`
 - Print the `confusion_matrix`
 - See the example from the lecture notes
 - Predict the probability that personX survives the Titanic. Use the `predict_proba()` method of `mlp`. It returns the probability of not surviving and the probability of surviving.
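
To make the two evaluation steps in the list above concrete, the following sketch computes an accuracy and a confusion matrix by hand for hypothetical labels. The function names `accuracy` and `confusion` and the label lists are assumptions for illustration; the matrix layout follows the convention used by scikit-learn's `confusion_matrix`, with rows for true classes and columns for predicted classes.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def confusion(y_true, y_pred, labels=(0, 1)):
    """Count matrix with rows = true class and columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical labels: 0 = did not survive, 1 = survived
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

demo_accuracy = accuracy(y_true_demo, y_pred_demo)
demo_matrix = confusion(y_true_demo, y_pred_demo)
print(demo_accuracy)
print(demo_matrix)  # [[2, 1], [1, 2]]
```

The diagonal of the matrix counts the correct predictions, so the accuracy equals the sum of the diagonal divided by the number of samples.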

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neural_network import MLPClassifier

# Read the data
titanic = pd.read_csv("titanic_train.csv")

# Fill missing cabin names with empty strings
titanic['Cabin'] = titanic['Cabin'].fillna('')

# Fill missing ages with mean value (=29)
titanic['Age'] = titanic['Age'].fillna(29)

# Select useful (numerical) columns for input
X=titanic.iloc[:,[2,5,6,7,9]].copy()
X['Female'] = (titanic.Sex=='female').astype('int')

# Define a sample person profile to be predicted
personX=X.iloc[1:2,:].copy()
personX.Pclass=2
personX.Age=32
personX.SibSp=0
personX.Parch=0
personX.Fare=200
personX.Female=0

# Select survivability as output
y=titanic.iloc[:,1]


# YOUR CODE HERE
raise NotImplementedError()

In [None]:
if 'mlp' not in locals():
    print("Error! You forgot to use name mlp for your classifier object!")


## Task 5: Machine learning

Explain how machine learning is different from programming.

YOUR ANSWER HERE