## Example 1: Taxi trips within and outside Manhattan
### (Naive Bayes classifier with discrete-valued inputs)

Below we apply the Naive Bayes Classifier to predict whether a taxi trip happened within or outside Manhattan. We are given a sample of workday daytime taxi trips with speed and distance information, as well as the number of passengers and the size of the tip.  Speed, distance, and tip information were encoded as discrete categorical variables, with larger values corresponding to faster speeds, longer distances, and higher tips respectively.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from sklearn.model_selection import train_test_split

data1 = pd.read_csv("NYC_taxi_sample.csv")
data1_X = data1.iloc[:,1:] # tip, distance, speed, and number of passengers
data1_y = data1.iloc[:,0] # binary output: 1 if in Manhattan, 0 if outside
X_train, X_test, y_train, y_test = train_test_split(data1_X, data1_y, test_size=0.25, random_state=42)

Here is a simple implementation of the Naive Bayes estimator for the training sample statistics $P(x=x^*\:|\:y=b)$ and $P(y=b)$.
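
For reference, the test function below combines these statistics through Bayes' rule under the naive assumption that the inputs are conditionally independent given the class. Written as posterior odds (the quantity the code accumulates before converting to a probability), with $j$ indexing the input attributes:

$$
\frac{P(y=1\:|\:x=x^*)}{P(y=0\:|\:x=x^*)} = \frac{P(y=1)}{P(y=0)} \prod_{j} \frac{P(x_j=x_j^*\:|\:y=1)}{P(x_j=x_j^*\:|\:y=0)},
\qquad
P(y=1\:|\:x=x^*) = \frac{\text{odds}}{1+\text{odds}}
$$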

In [2]:
# Training a binary Naive Bayes Classifier with discrete input attributes.
# Assume that the binary output variable takes on values 0 or 1. 
def trainNaiveBayesDiscrete(X,y):
    prior = 1.*y.sum()/y.count()
    nbc = {'prior':prior}
    X_1 = X[y==1]
    X_0 = X[y==0]
    for j in X.columns:
        nbc[j+'_1'] = X_1[j].value_counts(normalize=True)
        nbc[j+'_0'] = X_0[j].value_counts(normalize=True)
    return nbc

def testNaiveBayesDiscrete(X,nbc):
    y_pred = pd.Series(index=X.index, dtype=float)
    for i in X.index:
        # compute odds of y=1
        y_pred[i] = nbc['prior']/(1-nbc['prior']) # prior odds
        for j in X.columns:
            thevalue = X.loc[i,j]
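            # value never seen for a class in training: apply a fixed penalty
            # factor instead of multiplying by a zero probability
            if thevalue not in nbc[j+'_1']:
                y_pred[i] = y_pred[i]*1E-3
            if thevalue not in nbc[j+'_0']:
                y_pred[i] = y_pred[i]*1E3
            # value seen for both classes: multiply by the lightly smoothed likelihood ratio
            if (thevalue in nbc[j+'_1']) and (thevalue in nbc[j+'_0']):
                y_pred[i] = y_pred[i]*(nbc[j+'_1'][thevalue]+1E-3)/(nbc[j+'_0'][thevalue]+1E-3)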
        # convert odds to probability of y=1
        y_pred[i] = y_pred[i]/(1.0+y_pred[i])
    return y_pred

In [3]:
naive_bayes_classifier = trainNaiveBayesDiscrete(X_train,y_train)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_train = testNaiveBayesDiscrete(X_train,naive_bayes_classifier)
y_pred_test = testNaiveBayesDiscrete(X_test,naive_bayes_classifier)

# measure accuracy for the binary prediction task
print('In sample prediction accuracy: {}'.format(((y_pred_train>0.5)==y_train).mean()))
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

# measure accuracy of the predicted probabilities
print('Log-likelihood (train): {}'.format(np.log(y_pred_train*y_train+(1-y_pred_train)*(1-y_train)).sum()))
print('Log-likelihood (test): {}'.format(np.log(y_pred_test*y_test+(1-y_pred_test)*(1-y_test)).sum()))


tip_1
1    0.368285
2    0.175103
3    0.169938
4    0.166322
6    0.062500
5    0.057851
Name: tip, dtype: float64

tip_0
1    0.558252
4    0.166505
3    0.097573
2    0.092718
5    0.050485
6    0.034466
Name: tip, dtype: float64

pass_0
1    0.704369
2    0.130583
4    0.109223
3    0.055825
Name: pass, dtype: float64

pass_1
1    0.645661
2    0.196281
3    0.094525
4    0.063533
Name: pass, dtype: float64

speed_1
2    0.411157
1    0.310434
3    0.158058
4    0.072314
5    0.033574
6    0.014463
Name: speed, dtype: float64

dist_0
6    0.241748
1    0.208252
2    0.177184
5    0.141748
3    0.119417
4    0.111650
Name: dist, dtype: float64

prior
0.4844844844844845

speed_0
2    0.225243
3    0.204854
4    0.199029
5    0.135437
6    0.118932
1    0.116505
Name: speed, dtype: float64

dist_1
1    0.440083
2    0.262397
3    0.136364
4    0.097624
6    0.037190
5    0.026343
Name: dist, dtype: float64

In sample prediction accuracy: 0.714964964965
Out of sample prediction accurac
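
To make the tables above concrete, here is a small worked example that scores a single trip by hand. It is only a sketch: the function name `posterior_from_tables` is introduced here for illustration, the probabilities are copied from the printed training output for a trip with `tip=1`, `pass=1`, `speed=1`, `dist=1`, and the `1E-3` smoothing term mirrors the one used in `testNaiveBayesDiscrete`.

```python
# Conditional probabilities copied from the training output above
# for a trip with tip=1, pass=1, speed=1, dist=1.
prior = 0.4844844844844845
p_given_1 = {'tip': 0.368285, 'pass': 0.645661, 'speed': 0.310434, 'dist': 0.440083}
p_given_0 = {'tip': 0.558252, 'pass': 0.704369, 'speed': 0.116505, 'dist': 0.208252}

def posterior_from_tables(prior, p1, p0, eps=1e-3):
    """Hypothetical helper: prior odds times smoothed likelihood ratios."""
    odds = prior / (1.0 - prior)              # prior odds of y=1
    for name in p1:
        odds *= (p1[name] + eps) / (p0[name] + eps)
    return odds / (1.0 + odds)                # convert odds to probability

p_manhattan = posterior_from_tables(prior, p_given_1, p_given_0)
print('P(y=1 | tip=1, pass=1, speed=1, dist=1) = {:.3f}'.format(p_manhattan))
```

The low tip and single passenger pull the odds slightly toward "outside Manhattan", but the slow speed and short distance outweigh them, giving a probability of about 0.76 that the trip was within Manhattan.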

## Example 2. Classification of individual vs. commercial properties
### (Gaussian Naive Bayes classifier with real-valued inputs)

This example uses the same dataset as last week's. Given a sample of the characteristics and sale prices of single-unit residential and commercial properties sold in zip code 11201 (downtown Brooklyn) between 2009 and 2012, we build a classifier that predicts whether a sold property was residential or commercial.

In [4]:
data2 = pd.read_csv("NYC_individual_commercial.csv")
data2_X = data2.iloc[:,:4].copy() # copy, so the column transformations below modify an independent frame
data2_y = data2.iloc[:,4]

# reduce correlation between attributes when possible
# land --> ratio of land area to inside area; price --> price per square foot
data2_X['land']=data2_X['land']/data2_X['area']
data2_X['price']=data2_X['price']/data2_X['area']

X_train, X_test, y_train, y_test = train_test_split(data2_X, data2_y, test_size=0.33, random_state=90)

Here is a simple implementation of the Gaussian Naive Bayes estimator for the training sample statistics $\mu(x\:|\:y=b)$, $\sigma(x\:|\:y=b)$, and $P(y=b)$.

In [5]:
# Training a binary Gaussian Naive Bayes Classifier with real-valued input attributes.
# Assume that the binary output variable takes on values 0 or 1. 
def trainGaussianNaiveBayes(X,y):
    prior = 1.*y.sum()/y.count()
    nbc = {'prior':prior}
    X_1 = X[y==1]
    X_0 = X[y==0]
    for j in X.columns:
        nbc[j+'_mu1'] = X_1[j].mean()
        nbc[j+'_sigma1'] = X_1[j].std()
        nbc[j+'_mu0'] = X_0[j].mean()
        nbc[j+'_sigma0'] = X_0[j].std()
    return nbc

def testGaussianNaiveBayes(X,nbc):
    y_pred = pd.Series(index=X.index, dtype=float)
    for i in X.index:
        # compute odds of y=1
        y_pred[i] = nbc['prior']/(1-nbc['prior']) # prior odds
        for j in X.columns:
            thevalue = X.loc[i,j]
            pdf1 = stats.norm.pdf(thevalue,loc=nbc[j+'_mu1'],scale=nbc[j+'_sigma1'])
            pdf0 = stats.norm.pdf(thevalue,loc=nbc[j+'_mu0'],scale=nbc[j+'_sigma0'])
            # if the class-0 density underflows to zero, cap the odds at a large value
            y_pred[i] = (y_pred[i]*pdf1/pdf0) if pdf0 > 0 else 1E10
        # convert odds to probability of y=1
        y_pred[i] = y_pred[i]/(1.0+y_pred[i])
    return y_pred

In [6]:
naive_bayes_classifier = trainGaussianNaiveBayes(X_train,y_train)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_train = testGaussianNaiveBayes(X_train,naive_bayes_classifier)
y_pred_test = testGaussianNaiveBayes(X_test,naive_bayes_classifier)

# measure accuracy for the binary prediction task
print('In sample prediction accuracy: {}'.format(((y_pred_train>0.5)==y_train).mean()))
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

year_mu0
1916.07894737

year_mu1
1935.54166667

year_sigma0
51.0693125071

year_sigma1
27.7190550794

land_sigma0
0.289267846218

area_mu1
16960.5

land_sigma1
0.842190437851

area_mu0
2902.39473684

prior
0.3870967741935484

area_sigma0
1252.30446042

area_sigma1
25506.8382391

price_mu1
379.154952494

price_mu0
901.807993187

price_sigma1
313.206564718

price_sigma0
352.921030077

land_mu0
0.608817069219

land_mu1
0.842132095044

In sample prediction accuracy: 0.887096774194
Out of sample prediction accuracy: 0.8125
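
Multiplying density ratios, as `testGaussianNaiveBayes` does, can underflow when a value lies many standard deviations from a class mean. A common alternative is to accumulate log-odds instead. The sketch below shows the idea for a single observation. The function names are introduced here for illustration, the parameters are copied from the printed output above for `year` and `area` only, and the example property (built in 1930, area 20000) is a made-up input.

```python
import numpy as np

def gaussian_log_pdf(x, mu, sigma):
    """Log of the normal density, computed directly to avoid underflow."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def gaussian_nb_log_odds(x, prior, params):
    """x: feature -> value; params: feature -> (mu1, sigma1, mu0, sigma0)."""
    log_odds = np.log(prior) - np.log(1.0 - prior)        # prior log-odds
    for name, (mu1, sigma1, mu0, sigma0) in params.items():
        log_odds += gaussian_log_pdf(x[name], mu1, sigma1)
        log_odds -= gaussian_log_pdf(x[name], mu0, sigma0)
    return log_odds

def log_odds_to_probability(log_odds):
    """Numerically stable logistic function for a scalar log-odds value."""
    if log_odds >= 0:
        return 1.0 / (1.0 + np.exp(-log_odds))
    z = np.exp(log_odds)
    return z / (1.0 + z)

prior = 0.3870967741935484
params = {
    'year': (1935.54166667, 27.7190550794, 1916.07894737, 51.0693125071),
    'area': (16960.5, 25506.8382391, 2902.39473684, 1252.30446042),
}
example = {'year': 1930.0, 'area': 20000.0}   # hypothetical property

log_odds = gaussian_nb_log_odds(example, prior, params)
probability = log_odds_to_probability(log_odds)
print('log-odds of y=1: {:.1f}'.format(log_odds))
```

Because an area of 20000 is more than 13 standard deviations above the class-0 mean, the log-odds are large and positive. In log space this is just a big number rather than a ratio of vanishingly small densities.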


## Using the GaussianNB implementation from scikit-learn

http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html#sklearn.naive_bayes.GaussianNB

http://scikit-learn.org/stable/modules/naive_bayes.html

In [7]:
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
trained_model = gnb.fit(X_train,y_train)
y_pred_train = trained_model.predict_proba(X_train)[:,1]
y_pred_test = trained_model.predict_proba(X_test)[:,1]

# measure accuracy for the binary prediction task
print('In sample prediction accuracy: {}'.format(((y_pred_train>0.5)==y_train).mean()))
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))


In sample prediction accuracy: 0.887096774194
Out of sample prediction accuracy: 0.78125
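
The in-sample accuracy matches the hand-written classifier, while the out-of-sample accuracy differs slightly (0.78125 vs. 0.8125). One plausible contributor, offered here as an assumption rather than a verified diagnosis, is the variance estimate. pandas `Series.std()` defaults to the sample standard deviation (`ddof=1`), whereas `GaussianNB` uses a population-style variance plus a small smoothing term. The sketch below only illustrates how far the two estimators can differ on a small, made-up sample.

```python
import numpy as np

# made-up construction years, for illustration only
sample = np.array([1900.0, 1920.0, 1931.0, 1955.0, 2005.0])

sigma_sample = np.std(sample, ddof=1)      # what pandas Series.std() computes by default
sigma_population = np.std(sample, ddof=0)  # divides by n instead of n-1

print('sample std:     {:.3f}'.format(sigma_sample))
print('population std: {:.3f}'.format(sigma_population))
print('ratio:          {:.4f}'.format(sigma_sample / sigma_population))
```

With only a few dozen observations per class, a difference of this kind can move a handful of borderline predictions across the 0.5 threshold.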


# Practice: Spam classification

1. Title:  SPAM E-mail Database

2. Sources:
   (a) Creators: Mark Hopkins, Erik Reeber, George Forman, Jaap Suermondt
        Hewlett-Packard Labs, 1501 Page Mill Rd., Palo Alto, CA 94304
   (b) Donor: George Forman (gforman at nospam hpl.hp.com)  650-857-7835
   (c) Generated: June-July 1999

3. Past Usage:
   (a) Hewlett-Packard Internal-only Technical Report. External forthcoming.
   (b) Determine whether a given email is spam or not.
   (c) ~7% misclassification error.
       False positives (marking good mail as spam) are very undesirable.
       If we insist on zero false positives in the training/testing set,
       20-25% of the spam passed through the filter.

4. Relevant Information:
        The "spam" concept is diverse: advertisements for products/web
        sites, make money fast schemes, chain letters, pornography...
        The collection of spam e-mails came from the postmaster and 
        individuals who had filed spam.  The collection of non-spam 
        e-mails came from filed work and personal e-mails, and hence
        the word 'george' and the area code '650' are indicators of 
        non-spam.  These are useful when constructing a personalized 
        spam filter.  One would either have to blind such non-spam 
        indicators or get a very wide collection of non-spam to 
        generate a general purpose spam filter.

        For background on spam:
        Cranor, Lorrie F., LaMacchia, Brian A.  Spam! 
        Communications of the ACM, 41(8):74-83, 1998.

5. Number of Instances: 4601 (1813 Spam = 39.4%)

6. Number of Attributes: 58 (57 continuous, 1 nominal class label)

7. Attribute Information:
The last column of 'spambase.data' denotes whether the e-mail was 
considered spam (1) or not (0), i.e. unsolicited commercial e-mail.  
Most of the attributes indicate whether a particular word or
character was frequently occurring in the e-mail.  The run-length
attributes (55-57) measure the length of sequences of consecutive 
capital letters.  For the statistical measures of each attribute, 
see the end of this file.  Here are the definitions of the attributes:

48 continuous real [0,100] attributes of type word_freq_WORD 
= percentage of words in the e-mail that match WORD,
i.e. 100 * (number of times the WORD appears in the e-mail) / 
total number of words in e-mail.  A "word" in this case is any 
string of alphanumeric characters bounded by non-alphanumeric 
characters or end-of-string.

6 continuous real [0,100] attributes of type char_freq_CHAR
= percentage of characters in the e-mail that match CHAR,
i.e. 100 * (number of CHAR occurrences) / total characters in e-mail

1 continuous real [1,...] attribute of type capital_run_length_average
= average length of uninterrupted sequences of capital letters

1 continuous integer [1,...] attribute of type capital_run_length_longest
= length of longest uninterrupted sequence of capital letters

1 continuous integer [1,...] attribute of type capital_run_length_total
= sum of length of uninterrupted sequences of capital letters
= total number of capital letters in the e-mail

1 nominal {0,1} class attribute of type spam
= denotes whether the e-mail was considered spam (1) or not (0), 
i.e. unsolicited commercial e-mail.  


8. Missing Attribute Values: None

9. Class Distribution:
	Spam	  1813  (39.4%)
	Non-Spam  2788  (60.6%)



In [8]:
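try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2
base_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/spambase/"
data = urlopen(base_url + "spambase.data").read().decode("utf-8", "replace")
data_name = urlopen(base_url + "spambase.names").read().decode("utf-8", "replace")
# Read the data (splitlines handles both "\n" and "\r\n" line endings)
data=data.splitlines()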
# skip empty lines and convert every comma-separated field from str to float
data_spam=[[float(v) for v in line.split(",")] for line in data if len(line)>0]

#Read the column names
temp=data_name.splitlines()
column_names=[]
for i in temp:
    if (i.startswith('word') or i.startswith('char') or i.startswith('capital')):
        column_names.append(i.split(":")[0])
column_names.append("spam") 
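
The column-name extraction above can be checked without downloading anything. The sketch below applies the same rule (keep the text before the colon on lines starting with `word`, `char`, or `capital`) to a few sample lines. The sample text only imitates the layout of a `.names` file and is an assumption for illustration, as is the helper name `parse_column_names`.

```python
# Illustrative text imitating the layout of a .names file (not the real file).
sample_names = "\n".join([
    "| comment lines start with a vertical bar",
    "1, 0.    | spam, non-spam classes",
    "",
    "word_freq_make:         continuous.",
    "char_freq_!:            continuous.",
    "capital_run_length_total: continuous.",
])

def parse_column_names(names_text):
    """Hypothetical helper: attribute names are the text before the colon."""
    prefixes = ('word', 'char', 'capital')
    columns = [line.split(":")[0]
               for line in names_text.splitlines()
               if line.startswith(prefixes)]
    columns.append("spam")   # the class label is the last column of the data file
    return columns

print(parse_column_names(sample_names))
```

Passing a tuple to `str.startswith` tests all three prefixes in one call, which keeps the filter on a single line.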

In [9]:
data_spam=pd.DataFrame(data_spam)
data_spam.columns=column_names
data_spam_X=data_spam.iloc[:,0:-1]
data_spam_y=data_spam.iloc[:,-1]

X_train, X_test, y_train, y_test = train_test_split(data_spam_X, data_spam_y, test_size=0.2, random_state=2015)

### (1) Use the Gaussian naive Bayes code that we provided to build your model on the training data, and report the out-of-sample accuracy on your testing data.

In [10]:
naive_bayes_classifier = trainGaussianNaiveBayes(X_train,y_train)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_train = testGaussianNaiveBayes(X_train,naive_bayes_classifier)
y_pred_test = testGaussianNaiveBayes(X_test,naive_bayes_classifier)

# measure accuracy for the binary prediction task
print('In sample prediction accuracy: {}'.format(((y_pred_train>0.5)==y_train).mean()))
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

capital_run_length_longest_mu1
105.362369338

word_freq_order_mu0
0.0403697104677

word_freq_order_mu1
0.163881533101

word_freq_technology_sigma1
0.153492100953

word_freq_technology_sigma0
0.493084088594

capital_run_length_total_sigma1
840.181296349

capital_run_length_total_sigma0
342.372301885

word_freq_over_sigma1
0.308615828551

word_freq_over_sigma0
0.204060966394

word_freq_our_sigma0
0.624885344842

word_freq_our_sigma1
0.699794475793

word_freq_cs_sigma1
0.00263981838674

word_freq_cs_sigma0
0.458638883556

word_freq_data_mu1
0.0163135888502

word_freq_data_mu0
0.156672605791

char_freq_!_mu0
0.103965256125

word_freq_credit_sigma0
0.104193139862

word_freq_credit_sigma1
0.817201281856

word_freq_parts_sigma1
0.0545213282898

char_freq_#_sigma0
0.216995866924

word_freq_all_sigma1
0.470737730849

word_freq_all_sigma0
0.504677497056

word_freq_hpl_sigma0
1.05290980736

word_freq_hpl_sigma1
0.0906897432444

char_freq_(_sigma0
0.265129768708

word_freq_remove_sigma1
0.59965951



In sample prediction accuracy: 0.811684782609
Out of sample prediction accuracy: 0.831704668838


### (2) Use the Sklearn package to double check your solution. 

In [11]:
y_pred = gnb.fit(X_train,y_train).predict(X_test)
print((y_pred==y_test).mean())

0.8371335504885994


# Semi-supervised EM classifier

## Example 3. Taxi trip classification with partially missing labels

In [12]:
# same dataset as before
data1 = pd.read_csv("NYC_taxi_sample.csv")
data1_X = data1.iloc[:,1:] # tip, distance, speed, and number of passengers
data1_y = data1.iloc[:,0] # binary output: 1 if in Manhattan, 0 if outside
X_train, X_test, y_train, y_test = train_test_split(data1_X, data1_y, test_size=0.25, random_state=42)

In [13]:
import random

# now let's delete 99% of the labels from the training dataset and see what happens
random.seed(2015)
Label_index=random.sample(list(range(len(X_train))),int(len(X_train)*0.01))
Unlabel_index=[x for x in list(range(len(X_train))) if x not in Label_index]

X_train_Labeled=X_train.iloc[Label_index,:]
X_train_Unlabeled=X_train.iloc[Unlabel_index,:]   
y_train_Labeled=y_train.iloc[Label_index]

Let's see how well our Naive Bayes Classifier does using only the small sample of labeled training examples.

In [15]:
print(len(X_train_Labeled.index))
print(len(X_train_Unlabeled.index))

39
3957


In [16]:
naive_bayes_classifier = trainNaiveBayesDiscrete(X_train_Labeled,y_train_Labeled)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_test = testNaiveBayesDiscrete(X_test,naive_bayes_classifier)

# measure accuracy for the binary prediction task
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

tip_1
2    0.333333
1    0.333333
4    0.238095
3    0.095238
Name: tip, dtype: float64

tip_0
1    0.500000
3    0.222222
4    0.166667
6    0.055556
5    0.055556
Name: tip, dtype: float64

pass_0
1    0.555556
4    0.333333
3    0.055556
2    0.055556
Name: pass, dtype: float64

pass_1
1    0.809524
2    0.142857
4    0.047619
Name: pass, dtype: float64

speed_1
1    0.380952
2    0.333333
5    0.095238
4    0.095238
6    0.047619
3    0.047619
Name: speed, dtype: float64

dist_0
1    0.277778
5    0.222222
4    0.166667
2    0.166667
6    0.111111
3    0.055556
Name: dist, dtype: float64

prior
0.5384615384615384

speed_0
2    0.333333
4    0.277778
5    0.111111
3    0.111111
1    0.111111
6    0.055556
Name: speed, dtype: float64

dist_1
1    0.476190
2    0.190476
4    0.142857
3    0.142857
5    0.047619
Name: dist, dtype: float64

Out of sample prediction accuracy: 0.642160540135


Now let's see how well we can do using both labeled and unlabeled data.

In [17]:
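def initializeNaiveBayesRandom(X_Unlabeled):
    # Random starting point for EM when no labeled examples are available.
    # The values are not normalized; the first E step only uses their ratios.
    nbc = {'prior':0.5}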
    for j in X_Unlabeled.columns:
        thevalues = X_Unlabeled[j].unique()
        nbc[j+'_1'] = {}
        nbc[j+'_0'] = {}
        for jj in thevalues:
            nbc[j+'_1'][jj] = np.random.rand()
            nbc[j+'_0'][jj] = np.random.rand()
    return nbc
    
def EM(X_Labeled,y_Labeled,X_Unlabeled,num_iters):

    # initialize
    
    t = 0
    
    if len(y_Labeled) > 0:
        nbc = trainNaiveBayesDiscrete(X_Labeled,y_Labeled)
    else:
        nbc = initializeNaiveBayesRandom(X_Unlabeled)
    
    while True:
        t = t + 1
        print('Iteration {} of {}'.format(t, num_iters))
        
        # E step - classify with nbc for unlabeled data only
        y_pred_Unlabeled = testNaiveBayesDiscrete(X_Unlabeled,nbc)
        
        # M step
        X_for_M_step = pd.concat([X_Labeled,X_Unlabeled]) 
        y_for_M_step = pd.concat([y_Labeled,y_pred_Unlabeled])
        prior = 1.*y_for_M_step.sum()/y_for_M_step.count()
        nbc = {'prior':prior}
        for j in X_for_M_step.columns:
            nbc[j+'_1'] = {}
            nbc[j+'_0'] = {}
            for theindex in X_for_M_step.index:
                current_X = X_for_M_step.loc[theindex,j]
                current_y = y_for_M_step.loc[theindex]
                if current_X in nbc[j+'_1']:
                    nbc[j+'_1'][current_X] += current_y
                else:
                    nbc[j+'_1'][current_X] = current_y
                if current_X in nbc[j+'_0']:
                    nbc[j+'_0'][current_X] += (1.0-current_y)
                else:
                    nbc[j+'_0'][current_X] = 1.0-current_y
            # normalize probabilities
            tempsum = 0.0
            for k in nbc[j+'_1']:
                tempsum += nbc[j+'_1'][k]
            for k in nbc[j+'_1']:
                nbc[j+'_1'][k] /= tempsum
            tempsum = 0.0
            for k in nbc[j+'_0']:
                tempsum += nbc[j+'_0'][k]
            for k in nbc[j+'_0']:
                nbc[j+'_0'][k] /= tempsum            
                       
        if t==num_iters:
            break
            
    return nbc
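
The M step in `EM` is a Naive Bayes fit with soft counts. A labeled row contributes 1 to its own class, while an unlabeled row contributes its predicted probability to class 1 and the remainder to class 0. For a single attribute, the same computation can be sketched with weighted counts in NumPy. The function name `soft_m_step` and the toy inputs are assumptions for illustration, and the attribute is assumed to be coded as integers from 1 to `num_values`.

```python
import numpy as np

def soft_m_step(x, w, num_values):
    """x: integer codes in 1..num_values; w: P(y=1) per row (0 or 1 for labeled rows)."""
    x = np.asarray(x)
    w = np.asarray(w, dtype=float)
    # weighted counts of each attribute value for class 1 and class 0
    counts_1 = np.bincount(x, weights=w, minlength=num_values + 1)[1:]
    counts_0 = np.bincount(x, weights=1.0 - w, minlength=num_values + 1)[1:]
    prior = w.mean()                          # estimate of P(y=1)
    return prior, counts_1 / counts_1.sum(), counts_0 / counts_0.sum()

# toy data: one labeled row (w=1.0), one labeled row (w=0.0), two soft rows
x = [1, 1, 2, 3]
w = [1.0, 0.5, 0.0, 0.5]
prior, p_x_given_1, p_x_given_0 = soft_m_step(x, w, num_values=3)
print(prior)          # 0.5
print(p_x_given_1)    # [0.75 0.   0.25]
print(p_x_given_0)    # [0.25 0.5  0.25]
```

This produces the same normalized tables as the dictionary-based loop in `EM`, but in a single pass per attribute.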

In [18]:
naive_bayes_classifier=EM(X_train_Labeled,y_train_Labeled,X_train_Unlabeled,num_iters=50)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_test = testNaiveBayesDiscrete(X_test,naive_bayes_classifier)

# measure accuracy for the binary prediction task
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

Iteration 1 of 50
Iteration 2 of 50
Iteration 3 of 50
Iteration 4 of 50
Iteration 5 of 50
Iteration 6 of 50
Iteration 7 of 50
Iteration 8 of 50
Iteration 9 of 50
Iteration 10 of 50
Iteration 11 of 50
Iteration 12 of 50
Iteration 13 of 50
Iteration 14 of 50
Iteration 15 of 50
Iteration 16 of 50
Iteration 17 of 50
Iteration 18 of 50
Iteration 19 of 50
Iteration 20 of 50
Iteration 21 of 50
Iteration 22 of 50
Iteration 23 of 50
Iteration 24 of 50
Iteration 25 of 50
Iteration 26 of 50
Iteration 27 of 50
Iteration 28 of 50
Iteration 29 of 50
Iteration 30 of 50
Iteration 31 of 50
Iteration 32 of 50
Iteration 33 of 50
Iteration 34 of 50
Iteration 35 of 50
Iteration 36 of 50
Iteration 37 of 50
Iteration 38 of 50
Iteration 39 of 50
Iteration 40 of 50
Iteration 41 of 50
Iteration 42 of 50
Iteration 43 of 50
Iteration 44 of 50
Iteration 45 of 50
Iteration 46 of 50
Iteration 47 of 50
Iteration 48 of 50
Iteration 49 of 50
Iteration 50 of 50
tip_1
{1: 0.42453937648732804, 2: 0.15221500735592802, 3: 0

# Unsupervised EM clustering

## Example 4. Taxi trip clustering with no labels

In [19]:
X_train_Unlabeled=X_train # assume all observations are unlabeled
X_train_Labeled=X_train.iloc[[],:] # empty
y_train_Labeled=y_train.iloc[[]] # empty

In [20]:
naive_bayes_classifier=EM(X_train_Labeled,y_train_Labeled,X_train_Unlabeled,num_iters=50)
for name, value in naive_bayes_classifier.items():
    print(name)
    print(value)
    print('')
y_pred_test = testNaiveBayesDiscrete(X_test,naive_bayes_classifier)

# cluster names are arbitrary without labels: check if labels switched
if ((y_pred_test>0.5)==y_test).mean() < 0.5:
    y_pred_test = 1.0-y_pred_test
print('Out of sample prediction accuracy: {}'.format(((y_pred_test>0.5)==y_test).mean()))

Iteration 1 of 50
Iteration 2 of 50
Iteration 3 of 50
Iteration 4 of 50
Iteration 5 of 50
Iteration 6 of 50
Iteration 7 of 50
Iteration 8 of 50
Iteration 9 of 50
Iteration 10 of 50
Iteration 11 of 50
Iteration 12 of 50
Iteration 13 of 50
Iteration 14 of 50
Iteration 15 of 50
Iteration 16 of 50
Iteration 17 of 50
Iteration 18 of 50
Iteration 19 of 50
Iteration 20 of 50
Iteration 21 of 50
Iteration 22 of 50
Iteration 23 of 50
Iteration 24 of 50
Iteration 25 of 50
Iteration 26 of 50
Iteration 27 of 50
Iteration 28 of 50
Iteration 29 of 50
Iteration 30 of 50
Iteration 31 of 50
Iteration 32 of 50
Iteration 33 of 50
Iteration 34 of 50
Iteration 35 of 50
Iteration 36 of 50
Iteration 37 of 50
Iteration 38 of 50
Iteration 39 of 50
Iteration 40 of 50
Iteration 41 of 50
Iteration 42 of 50
Iteration 43 of 50
Iteration 44 of 50
Iteration 45 of 50
Iteration 46 of 50
Iteration 47 of 50
Iteration 48 of 50
Iteration 49 of 50
Iteration 50 of 50
tip_1
{1: 0.42631581330989854, 2: 0.1500659844989672, 3: 0.
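
Without any labels, EM has no way of knowing which cluster should be called `1`, which is why the cell above flips the predictions when accuracy falls below 0.5. The check can be written as a small evaluation helper. This is a sketch: the name `clustering_accuracy` and the toy inputs are assumptions, and the true labels are used only for scoring, never for fitting.

```python
import numpy as np

def clustering_accuracy(p_cluster1, y_true):
    """Accuracy of a two-cluster assignment, up to swapping the cluster names."""
    p = np.asarray(p_cluster1, dtype=float)
    y = np.asarray(y_true)
    acc = np.mean((p > 0.5) == (y == 1))
    return max(acc, 1.0 - acc)   # the better of the two possible namings

# clusters perfectly separated but named the "wrong" way round
print(clustering_accuracy([0.9, 0.8, 0.1, 0.2], [0, 0, 1, 1]))   # 1.0
# one mistake out of four
print(clustering_accuracy([0.9, 0.2, 0.1, 0.8], [1, 0, 0, 0]))   # 0.75
```

With two clusters the score can never drop below 0.5, so values close to 0.5 indicate that the clusters carry little information about the true classes.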