# Introductory applied machine learning (INFR10069)
# Assignment 3 (Part B): Mini-Challenge [25%]

## Important Instructions

**It is important that you follow the instructions below to the letter - we will not be responsible for incorrect marking due to non-standard practices.**

1. <font color='red'>We have split Assignment 3 into two parts to make it easier for you to work on them separately and for the markers to give you feedback. This is part B of Assignment 3 - Part A is an introduction to Object Recognition. Both Assignments together are still worth 50% of CourseWork 2. **Remember to submit both notebooks (you can submit them separately).**</font>

1. You *MUST* have your environment set up as in the [README](https://github.com/michael-camilleri/IAML2018) and you *must activate this environment before running this notebook*:
```
source activate py3iaml
cd [DIRECTORY CONTAINING GIT REPOSITORY]
jupyter notebook
# Navigate to this file
```

1. Read the instructions carefully, especially where asked to name variables with a specific name. Wherever you are required to produce code you should use code cells, otherwise you should use markdown cells to report results and explain answers. In most cases we indicate the nature of answer we are expecting (code/text), and also provide the code/markdown cell where to put it

1. This part of the Assignment is the same for all students i.e. irrespective of whether you are taking the Level 10 version (INFR10069) or the Level-11 version of the course (INFR11182 and INFR11152).

1. The .csv files that you will be using are located at `./datasets` (i.e. use the `datasets` directory **adjacent** to this file).

1. In the textual answer, you are given a word-count limit of 600 words: exceeding this will lead to penalisation.

1. Make sure to distinguish between **attributes** (columns of the data) and **features** (typically referring only to the independent variables).

1. Make sure to show **all** your code/working. 

1. Write readable code. While we do not expect you to follow [PEP8](https://www.python.org/dev/peps/pep-0008/) to the letter, the code should be adequately understandable, with plots/visualisations correctly labelled. **Do** use inline comments when doing something non-standard. When asked to present numerical values, make sure to represent real numbers in the appropriate precision to exemplify your answer. Marks *WILL* be deducted if the marker cannot understand your logic/results.

1. **Collaboration:** You may discuss the assignment with your colleagues, provided that the writing that you submit is entirely your own. That is, you must NOT borrow actual text or code from other students. We ask that you provide a list of the people who you've had discussions with (if any). Please refer to the [Academic Misconduct](http://web.inf.ed.ac.uk/infweb/admin/policies/academic-misconduct) page for what consistutes a breach of the above.

### SUBMISSION Mechanics

**IMPORTANT:** You must submit this assignment by **Thursday 15/11/2018 at 16:00**. 

**Late submissions:** The policy stated in the School of Informatics is that normally you will not be allowed to submit coursework late. See the [ITO webpage](http://web.inf.ed.ac.uk/infweb/student-services/ito/admin/coursework-projects/late-coursework-extension-requests) for exceptions to this, e.g. in case of serious medical illness or serious personal problems.

**Resubmission:** If you submit your file(s) again, the previous submission is **overwritten**. We will mark the version that is in the submission folder at the deadline.

**N.B.**: This Assignment requires submitting **two files (electronically as described below)**:
 1. This Jupyter Notebook (Part B), *and*
 1. The Jupyter Notebook for Part A
 
All submissions happen electronically. To submit:

1. Fill out this notebook (as well as Part A), making sure to:
   1. save it with **all code/text and visualisations**: markers are NOT expected to run any cells,
   1. keep the name of the file **UNCHANGED**, *and*
   1. **sticking to the submission structure** (see below). This is especially true for the submission of your predictions and your textual answer.

1. Submit it using the `submit` functionality. To do this, you must be on a DICE environment. Open a Terminal, and:
   1. **On-Campus Students**: navigate to the location of this notebook and execute the following command:
   
      ```submit iaml cw2 03_A_ObjectRecognition.ipynb 03_B_MiniChallenge.ipynb```
      
   1. **Distance Learners:** These instructions also apply to those students who work on their own computer. First you need to copy your work onto DICE (so that you can use the `submit` command). For this, you can use `scp` or `rsync` (you may need to install these yourself). You can copy files to `student.ssh.inf.ed.ac.uk`, then ssh into it in order to submit. The following is an example. Replace entries in `[square brackets]` with your specific details: i.e. if your student number is for example s1234567, then `[YOUR USERNAME]` becomes `s1234567`.
   
    ```
    scp -r [FULL PATH TO 03_A_ObjectRecognition.ipynb] [YOUR USERNAME]@student.ssh.inf.ed.ac.uk:03_A_ObjectRecognition.ipynb
    scp -r [FULL PATH TO 03_B_MiniChallenge.ipynb] [YOUR USERNAME]@student.ssh.inf.ed.ac.uk:03_B_MiniChallenge.ipynb
    ssh [YOUR USERNAME]@student.ssh.inf.ed.ac.uk
    ssh student.login
    submit iaml cw2 03_A_ObjectRecognition.ipynb 03_B_MiniChallenge.ipynb
    ```
    
   What actually happens in the background is that your file is placed in a folder available to markers. If you submit a file with the same name into the same location, **it will *overwrite* your previous submission**. You should receive an automatic email confirmation after submission.
  


### Marking Breakdown

The Level 10 and Level 11 points are marked out of different totals, however these are all normalised to 100%. Note that Part A (Object Recognition) is worth 75% of the total Mark for Assignment 3, while Part B (this notebook) is worth 25%. Keep this in mind when allocating time for this assignment.

**70-100%** results/answer correct plus extra achievement at understanding or analysis of results. Clear explanations, evidence of creative or deeper thought will contribute to a higher grade.

**60-69%** results/answer correct or nearly correct and well explained.

**50-59%** results/answer in right direction but significant errors.

**40-49%** some evidence that the student has gained some understanding, but not answered the questions
properly.

**0-39%** serious error or slack work.

Note that while this is not a programming assignment, in questions which involve visualisation of results and/or long cold snippets, some marks may be deducted if the code is not adequately readable.

## Imports

Use the cell below to include any imports you deem necessary.

In [1]:
# Nice Formatting within Jupyter Notebook
%matplotlib inline
from IPython.display import display # Allows multiple displays from a single code-cell

# System functionality
import sys
sys.path.append('..')

# Import Here any Additional modules you use. To import utilities we provide, use something like:
#   from utils.plotter import plot_hinton

# Your Code goes here:
import os
import numpy as np
import pandas as pd
import scipy.stats.stats as stats
import matplotlib.pylab as plt
import seaborn as sns
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn import preprocessing
from utils.plotter import scatter_jitter, plot_confusion_matrix,plot_SVM_DecisionBoundary
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestClassifier
from sklearn import svm
from sklearn.metrics import log_loss
from pandas import DataFrame
from sklearn.tree import DecisionTreeClassifier

  from collections import Sequence
  from numpy.core.umath_tests import inner1d


# Mini challenge

In this second part of the assignment we will have a mini object-recognition challenge. Using the same type of data as in Part A, you are asked to find the best classifier for the person/no person classification task. You can apply any preprocessing steps to the data that you think fit and employ any classifier you like (with the provision that you can explain what the classifier is/preprocessing steps are doing). You can also employ any lessons learnt during the course, either from previous Assignments, the Labs or the lecture material to try and squeeze out as much performance as you possibly can. The only restriction is that all steps must be performed in `Python` by using the `numpy`, `pandas` and `sklearn` packages. You can also make use of `matplotlib` and `seaborn` for visualisation.

### DataSet Description

The datasets we use here are similar in composition but not the same as the ones used in Part A: *it will be useful to revise the description in that notebook*. Specifically, you have access to three new datasets: a training set (`Images_C_Train.csv`), a validation set (`Images_C_Validate.csv`), and a test set (`Images_C_Test.csv`). You must use the former two for training and evaluating your models (as you see fit). As before, the full data-set has 520 attributes (dimensions). Of these you only have access to the 500 features (`dim1` through `dim500`) to test your model on: i.e. the test set does not have any of the class labels.

### Model Evaluation

Your results will be evaluated in terms of the logarithmic loss metric, specifically the [logloss](http://scikit-learn.org/0.19/modules/model_evaluation.html#log-loss) function from SKLearn. You should familiarise yourself with this. To estimate this metric you will need to provide probability outputs, as opposed to discrete predictions which we have used so far to compute classification accuracies. Most models in `sklearn` implement a `predict_proba()` method which returns the probabilities for each class. For instance, if your test set consists of `N` datapoints and there are `K` class-labels, the method will return an `N` x `K` matrix (with rows summing to 1).

### Submission and Scoring

This part of Assignment 3 carries 25% of the total marks. Within this, you will be scored on two criteria:
 1. 80% of the mark will depend on the thoroughness of the exploration of various approaches. This will be assessed through your code, as well as a brief description (<600 words) justifying the approaches you considered, your exploration pattern and your suggested final approach (and why you chose it).
 1. 20% of the mark will depend on the quality of your predictions: this will be evaluated based on the logarithmic loss metric.
Note here that just getting exceptional performance is not enough: in fact, you should focus more on analysing your results that just getting the best score!

You have to submit the following:
 1. **All Code-Cells** which show your **working** with necessary output/plots already generated.
 1. In **TEXT** cell `#ANSWER_TEXT#` you are to write your explanation (<600 words) as described above. Keep this brief and to the point. **Make sure** to keep the token `#ANSWER_TEXT#` as the first line of the cell!
 1. In **CODE** cell `#ANSWER_PROB#` you are to submit your predictions. To do this:
    1. Once you have chosen your favourite model (and pre-processing steps) apply it to the test-set and estimate the posterior proabilities for the data points in the test set.
    1. Store these probabilities in a 2D numpy array named `pred_probabilities`, with predictions along the rows i.e. each row should be a complete probability distribution over whether the image contains a person or not. Note that due to the encoding of the `is_person` class, the negative case (i.e. there is no person) comes first.
    1. Execute the `#ANSWER_PROB#` code cell, making sure to not change anything. This cell will do some checks to ensure that you are submitting the right shape of array.

You may create as many code cells as you need (within reason) for training your models, evaluating the data etc: however, the text cell `#ANSWER_TEXT#` and code-cell `#ANSWER_PROB#` showing your answers must be the last two cells in the notebook.

# Data Process


In [2]:
# This is where your working code should start. Fell free to add as many code-cells as necessary.
#  Make sure however that all working code cells come BEFORE the #ANSWER_TEXT# and #ANSWER_PROB#
#  cells below.

# Your Code goes here:
#data processing#
data_path = os.path.join(os.getcwd(), 'datasets', 'Images_C_Train.csv')
image_c_train = pd.read_csv(data_path, delimiter = ',')
data_path = os.path.join(os.getcwd(), 'datasets', 'Images_C_Validate.csv')
image_c_valid = pd.read_csv(data_path, delimiter = ',')
data_path = os.path.join(os.getcwd(), 'datasets', 'Images_C_Test.csv')
image_c_test = pd.read_csv(data_path, delimiter = ',')

colu = image_c_train.columns.values.tolist()
x_c = colu[1:501]
y_c = colu[501:520]

X_tr = image_c_train.copy(deep=True)
for i in image_c_train.columns.values:
    if i in y_c:
        X_tr = X_tr.drop(i,axis=1) 
X_tr = X_tr.drop("imgId",axis=1)
y_tr = image_c_train.copy(deep=True)
for i in image_c_train.columns.values:
    if i in x_c:
        y_tr = y_tr.drop(i,axis=1)
y_tr = y_tr.drop("imgId",axis=1)       
X_tst = image_c_valid.copy(deep=True)

for j in image_c_valid.columns.values:
    if j in y_c:
        X_tst = X_tst.drop(j,axis=1) 
X_tst = X_tst.drop("imgId",axis=1)
y_tst = image_c_valid.copy(deep=True)
for j in image_c_valid.columns.values:
    if j  in x_c:
        y_tst =y_tst.drop(j,axis=1)
y_tst = y_tst.drop("imgId",axis=1)

print("X_tr: ",X_tr.shape[0],X_tr.shape[1])
print("y_tr: ",y_tr.shape[0],y_tr.shape[1])
print("X_valid: ",X_tst.shape[0],X_tst.shape[1])
print("y_valid: ",y_tst.shape[0],y_tst.shape[1])
#X_train = preprocessing.scale(X_tr)
X_train = X_tr
X_valid = X_tst
y_train = y_tr["is_person"]
y_valid = y_tst["is_person"]
test =image_c_test.drop("is_person",axis=1)


X_tr:  2113 500
y_tr:  2113 19
X_valid:  1113 500
y_valid:  1113 19


# Logistic Regression

In [3]:
#Logistic Regression with C FROM 1E-5 TO 1E8 #
'''
C =[]
step_length = 10
num_steps = 5
ini = 1e-1
for i in range(num_steps):
    C.append(ini*step_length**i)
'''
C = [0.00001,0.0001,0.001,0.01,0.1,1,10,100,1000,10000,100000,1000000,10000000,100000000,1000000000,10000000000]
los= []
for i in C: 
    lr= LogisticRegression(solver='lbfgs',C = i)
    lr.fit(X_train, y_train)
    ts_pred = lr.predict_proba(X_valid)
    loss=log_loss(y_valid,ts_pred)
    los.append(loss)
    print("C: ",i,"log loss:",loss)

index = np.argmin(los)
print("best c ", C[index]," log loss:" ,los[index])

lr= LogisticRegression(solver='lbfgs',C = C[index])
lr.fit(X_train, y_train)
ts_pred = lr.predict(X=X_valid)
t_pred = lr.predict_proba(X_valid)
ac =  lr.score(X_valid,y_valid)
print("accuracy:" ,ac)
print("log loss:",log_loss(y_valid,t_pred))
print("probability:",lr.predict_proba(test))


C:  1e-05 log loss: 0.6929976197936568
C:  0.0001 log loss: 0.6929499444636662
C:  0.001 log loss: 0.6929135228299935
C:  0.01 log loss: 0.6927187379790026
C:  0.1 log loss: 0.6909841103098261
C:  1 log loss: 0.6795215327264736
C:  10 log loss: 0.6415220959252403
C:  100 log loss: 0.5957157266914838
C:  1000 log loss: 0.5950515993088755
C:  10000 log loss: 0.6049464519540135
C:  100000 log loss: 0.6417776199184826
C:  1000000 log loss: 0.6076827262778323
C:  10000000 log loss: 0.6191261351515056
C:  100000000 log loss: 0.613938390856783
C:  1000000000 log loss: 0.6214977956412134
C:  10000000000 log loss: 0.6132137790169467
best c  1000  log loss: 0.5950515993088755
accuracy: 0.6900269541778976
log loss: 0.5950515993088755
probability: [[0.92590165 0.07409835]
 [0.90311802 0.09688198]
 [0.44387253 0.55612747]
 ...
 [0.41205117 0.58794883]
 [0.5955104  0.4044896 ]
 [0.12991919 0.87008081]]


# SVM with RBF kernel

In [4]:
# SVM with rbf kernel#
C = []
init = 1e2
step = 10
n =init
i = 0
C.append(n)
while n <=1e8:
    n = n*step
    i += 1
    if n <= 1e8:
        C.append(n)
los = []
for i in C: 
    svm_rbf = svm.SVC(kernel= 'rbf',gamma='auto',C=i,probability=True)
    svm_rbf.fit(X_train,y_train)
    t_pred = svm_rbf.predict_proba(X_valid)
    loss=log_loss(y_valid,t_pred)
    los.append(loss)
    print("C: ",i,"log loss:",loss)

index = np.argmin(los)
print("best c ", C[index]," log loss:" ,los[index])

svm_rbf = svm.SVC(kernel= 'rbf',gamma='auto',C=C[index],probability=True)
svm_rbf.fit(X_train,y_train)
ts_pred = svm_rbf.predict(X=X_valid)
t_pred = svm_rbf.predict_proba(X_valid)
ac = accuracy_score(y_valid,ts_pred)
print("accuracy:" ,ac)
print("log loss:",log_loss(y_valid,t_pred))
print("probability:",svm_rbf.predict_proba(test))
pred_probabilities = lr.predict_proba(test)

C:  100.0 log loss: 0.6803830415578842
C:  1000.0 log loss: 0.6141217693108101
C:  10000.0 log loss: 0.5944466031890102
C:  100000.0 log loss: 0.5864223197642955
C:  1000000.0 log loss: 0.62841607565592
C:  10000000.0 log loss: 0.6719942207669298
C:  100000000.0 log loss: 0.6862598757585264
best c  100000.0  log loss: 0.5864223197642955
accuracy: 0.683737646001797
log loss: 0.585955345105448
probability: [[0.87250906 0.12749094]
 [0.8999548  0.1000452 ]
 [0.41764679 0.58235321]
 ...
 [0.52683327 0.47316673]
 [0.617614   0.382386  ]
 [0.30254016 0.69745984]]


# SVM with poly kernel

In [5]:
# SVM with poly kernel#
C = []
init = 1e3
step = 10
n =init
i = 0
C.append(n)
while n <= 1e8:
    n = n*step
    i += 1
    if n <= 1e8:
        C.append(n)

degree = [1,5,10,15]
result = [[]]
los = []
for j in degree:
    print("degree",j)
    for i in C: 
        svm_poly = svm.SVC(kernel ='poly',degree =j,C=i,probability=True)
        svm_poly.fit(X_train, y_train)
        t_pred = svm_poly.predict_proba(X_valid)
        loss=log_loss(y_valid,t_pred)
        los.append(loss)
        result.append([j,i,los])
        print("C: ",i,"log loss:",loss)


degree 1
C:  1000.0 log loss: 0.6920665545890028
C:  10000.0 log loss: 0.6922709057682884
C:  100000.0 log loss: 0.6925186469443444
C:  1000000.0 log loss: 0.692849912381527
C:  10000000.0 log loss: 0.6929677150942555
C:  100000000.0 log loss: 0.6929911215847199
degree 5
C:  1000.0 log loss: 0.6929434882074487
C:  10000.0 log loss: 0.6929350874440406
C:  100000.0 log loss: 0.6929044263935302
C:  1000000.0 log loss: 0.6929476015567289
C:  10000000.0 log loss: 0.6929429790039205
C:  100000000.0 log loss: 0.6929490320580552
degree 10
C:  1000.0 log loss: 0.692981819038966
C:  10000.0 log loss: 0.6929484465533592
C:  100000.0 log loss: 0.6929396021278943
C:  1000000.0 log loss: 0.6929697608464853
C:  10000000.0 log loss: 0.6929692070712153
C:  100000000.0 log loss: 0.6929202589499875
degree 15
C:  1000.0 log loss: 0.6930062186973881
C:  10000.0 log loss: 0.6930000701244242
C:  100000.0 log loss: 0.6930006777599447
C:  1000000.0 log loss: 0.6930048048468985
C:  10000000.0 log loss: 0.693004

In [6]:
accura = DataFrame(result,columns=['degree','C','loss'])
accura = accura.iloc[1:30]
tem = accura["loss"].values
indext =np.argmin(tem)
d = accura.iloc[indext][0]
c = accura.iloc[indext][1]
print(d,c)

svm_poly = svm.SVC(kernel= 'poly',gamma='auto',C=c,degree=d,probability=True)
svm_poly.fit(X_train,y_train)
ts_pred = svm_poly.predict(X=X_valid)
t_pred = svm_poly.predict_proba(X_valid)
ac = accuracy_score(y_valid,ts_pred)
print("accuracy:" ,ac)
print("log loss:",log_loss(y_valid,t_pred))
print("probability:",svm_poly.predict_proba(test))

1.0 1000.0
accuracy: 0.5265049415992812
log loss: 0.6922388525088963
probability: [[0.55331099 0.44668901]
 [0.55201653 0.44798347]
 [0.55063354 0.44936646]
 ...
 [0.55079828 0.44920172]
 [0.55118389 0.44881611]
 [0.55022249 0.44977751]]


# Decision Tree

In [7]:
#decision tree #
max_depth = [10,13,18,20,50,60]
max_features =[100,200,300,400,450,500]
C = []
init = 1e-5
step = 10
n =init
i = 0
C.append(n)
while n <= 1e5:
    n = n*step
    i += 1
    if n <= 1e5:
        C.append(n)
result = [[]]
los = []
for j in max_depth:
    print("max_depth",j)
    for i in max_features: 
        dtc =DecisionTreeClassifier(max_depth = j,max_features =i,random_state = 0)
        dtc.fit(X_train, y_train)       
        t_pred = dtc.predict_proba(X_valid)
        loss=log_loss(y_valid,t_pred)
        los.append(loss)
        result.append([j,i,los])
        print("max_features: ",i,"log loss:",loss)

max_depth 10
max_features:  100 log loss: 8.090071595851184
max_features:  200 log loss: 7.674007007667829
max_features:  300 log loss: 8.20489007429301
max_features:  400 log loss: 6.0702291326481905
max_features:  450 log loss: 8.223281524216379
max_features:  500 log loss: 7.581903121195889
max_depth 13
max_features:  100 log loss: 11.192986912285408
max_features:  200 log loss: 11.805917983390536
max_features:  300 log loss: 11.601801652024534
max_features:  400 log loss: 10.736457862938062
max_features:  450 log loss: 11.683111110415812
max_features:  500 log loss: 10.493478674643733
max_depth 18
max_features:  100 log loss: 14.221867659678908
max_features:  200 log loss: 14.252374145274816
max_features:  300 log loss: 14.2127220025778
max_features:  400 log loss: 12.939569331791963
max_features:  450 log loss: 14.585107731902985
max_features:  500 log loss: 13.27193877759329
max_depth 20
max_features:  100 log loss: 13.890658563233893
max_features:  200 log loss: 15.0509498164506

In [8]:
accura = DataFrame(result,columns=['max_depth','max_features','loss'])
accura = accura.iloc[1:30]
tem = accura["loss"].values
indext =np.argmin(tem)
d = accura.iloc[indext][0]
f = accura.iloc[indext][1]
print(d,f)

dtc =DecisionTreeClassifier(max_depth = d,max_features =400,random_state = 0)
dtc.fit(X_train, y_train)
ts_pred = dtc.predict(X=X_valid)
t_pred = dtc.predict_proba(X_valid)
ac = accuracy_score(y_valid,ts_pred)
print("accuracy:" ,ac)
print("log loss:",log_loss(y_valid,t_pred))
print("probability:",dtc.predict_proba(test))

10.0 100.0
accuracy: 0.5893980233602875
log loss: 6.0702291326481905
probability: [[0.5        0.5       ]
 [0.98630137 0.01369863]
 [0.01851852 0.98148148]
 ...
 [0.97368421 0.02631579]
 [0.86046512 0.13953488]
 [0.11940299 0.88059701]]


#ANSWER_TEXT#

***Your answer goes here:***
For data preprocess,I didn't use the scale function(preprocessing.scale) to preprocessing the data set.Becaues it would leads to poor performance compared to the one without preprocessing.So for this part,I just divide the data into x_train,y_train,x_valid,y_valid and x_test.

I chosen the logistic regression, SVM with rbf kernel,SVM with polynomial kernel and decision tree model to fit the dataset. The C parameter in this model means the inverse of regularization strength and smaller values specify the stronger regularisation.

For the logistic regression model,I chosen the C parameter range from 1e-5 to 1e10 to find a better model.Through the experiments,when C is 1000,the log loss is 0.5952643162897312 which is lower than the log loss of the model with other C parameter.

For SVM with rbf kernel,I chosen the C parameter range from 1e2 to 1e8 to find a better model.Through the experiments,when C is 100000,the log loss is 0.5860065172417601 which is lower than the log loss of the model with other C parameter.

For SVM with polynomial kernel,I chosen the C parameter range from 1e3 to 1e8 and the degree from 1 to 15 to find a better model.From the experiments, we can find the degree don't have a large infulence on the model for this dataset.Through the experiments,when C is 100000 and the degree is 1,the log loss is 0.6922942322230741 which is lower than the log loss of the model with other C parameter and other degree parameters.

For decision tree model ,I chosen the max feature parameter range from 100 to 500 and the max depth range from 10 to 60 to find a better model.The max depth is the maximum depth of the decision tree and the number of features to conside when looking for the best split.Through the experiments,when max depth is 10 and max_feature is 100,the log loss is 6.0702291326481905 which is lower than the log loss of the model with other parameters.

In Summary,the SVM with rbf kernel when c parameter is set as 100000 perform best than other model in this experiment. And the experiment shows that decision tree model is not suitable for the classfication of this dataset.It has a much higher log loss than other models,which means it has lower accuracy than other models.


In [9]:
#ANSWER_PROB#
# Run this cell when you are ready to submit your test-set probabilities. This cell will generate some
# warning messages if something is not right: make sure to address them!
if pred_probabilities.shape != (1114, 2):
    print('Array is of incorrect shape. Rectify this before submitting.')
elif (pred_probabilities.sum(axis=1) != 1.0).all():
    print('Submitted values are not correct probabilities. Rectify this before submitting.')
else:
    for _prob in pred_probabilities:
        print('{:.8f}, {:.8f}'.format(_prob[0], _prob[1]))

0.92590165, 0.07409835
0.90311802, 0.09688198
0.44387253, 0.55612747
0.51831624, 0.48168376
0.75912179, 0.24087821
0.18934032, 0.81065968
0.34078238, 0.65921762
0.21456746, 0.78543254
0.98125152, 0.01874848
0.69436613, 0.30563387
0.40016234, 0.59983766
0.66855325, 0.33144675
0.48108785, 0.51891215
0.66359645, 0.33640355
0.08640746, 0.91359254
0.60718063, 0.39281937
0.09561121, 0.90438879
0.50569017, 0.49430983
0.78661987, 0.21338013
0.35545714, 0.64454286
0.88576424, 0.11423576
0.57336151, 0.42663849
0.50493642, 0.49506358
0.43845669, 0.56154331
0.79898324, 0.20101676
0.16468158, 0.83531842
0.34950848, 0.65049152
0.74892542, 0.25107458
0.39680863, 0.60319137
0.51079316, 0.48920684
0.70388125, 0.29611875
0.55314214, 0.44685786
0.88614715, 0.11385285
0.65548882, 0.34451118
0.69821537, 0.30178463
0.45157267, 0.54842733
0.19101881, 0.80898119
0.76740768, 0.23259232
0.80107782, 0.19892218
0.66968563, 0.33031437
0.80147661, 0.19852339
0.55246158, 0.44753842
0.51662381, 0.48337619
0.87315502,