In [0]:
# section 1

# **Breast Cancer Data Description**

Citation Request:
   This breast cancer domain was obtained from the University Medical Centre,
   Institute of Oncology, Ljubljana, Yugoslavia.  Thanks go to M. Zwitter and 
   M. Soklic for providing the data.  Please include this citation if you plan
   to use this database.

1. Title: Breast cancer data (Michalski has used this)

2. Sources: 
   -- Matjaz Zwitter & Milan Soklic (physicians)
      Institute of Oncology 
      University Medical Center
      Ljubljana, Yugoslavia
   -- Donors: Ming Tan and Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu)
   -- Date: 11 July 1988

3. Past Usage: (Several: here are some)
     -- Michalski,R.S., Mozetic,I., Hong,J., & Lavrac,N. (1986). The 
      Multi-Purpose Incremental Learning System AQ15 and its Testing 
      Application to Three Medical Domains.  In Proceedings of the 
      Fifth National Conference on Artificial Intelligence, 1041-1045,
      Philadelphia, PA: Morgan Kaufmann.
      -- accuracy range: 66%-72%
     -- Clark,P. & Niblett,T. (1987). Induction in Noisy Domains.  In 
      Progress in Machine Learning (from the Proceedings of the 2nd
      European Working Session on Learning), 11-30, Bled, 
      Yugoslavia: Sigma Press.
      -- 8 test results given: 65%-72% accuracy range
     -- Tan, M., & Eshelman, L. (1988). Using weighted networks to 
      represent classification knowledge in noisy domains.  Proceedings 
      of the Fifth International Conference on Machine Learning, 121-134,
      Ann Arbor, MI.
      -- 4 systems tested: accuracy range was 68%-73.5%
    -- Cestnik,G., Kononenko,I., & Bratko,I. (1987). Assistant-86: A
      Knowledge-Elicitation Tool for Sophisticated Users.  In I.Bratko
      & N.Lavrac (Eds.) Progress in Machine Learning, 31-45, Sigma Press.
      -- Assistant-86: 78% accuracy

4. Relevant Information:
     This is one of three domains provided by the Oncology Institute
     that has repeatedly appeared in the machine learning literature.
     (See also lymphography and primary-tumor.)

     This data set includes 201 instances of one class and 85 instances of
     another class.  The instances are described by 9 attributes, some of
     which are linear and some are nominal.

5. Number of Instances: 286

6. Number of Attributes: 9 + the class attribute

7. Attribute Information:
   1. Class: no-recurrence-events, recurrence-events
   2. age: 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99.
   3. menopause: lt40, ge40, premeno.
   4. tumor-size: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59.
   5. inv-nodes: 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39.
   6. node-caps: yes, no.
   7. deg-malig: 1, 2, 3.
   8. breast: left, right.
   9. breast-quad: left-up, left-low, right-up, right-low, central.
  10. irradiat: yes, no.

8. Missing Attribute Values: (denoted by "?")
   Attribute #:      Number of instances with missing values:
   6 (node-caps)     8
   9 (breast-quad)   1

9. Class Distribution:
    1. no-recurrence-events: 201 instances
    2. recurrence-events: 85 instances
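
The class distribution above implies a majority-class baseline against which the accuracies quoted under "Past Usage" can be read. The short sketch below only redoes that arithmetic from the two counts listed in the description; the variable names are illustrative and not part of the original notebook.

```python
# Class counts taken from the dataset description above.
class_counts = {"no-recurrence-events": 201, "recurrence-events": 85}

total = sum(class_counts.values())  # 286 instances in total
majority_class = max(class_counts, key=class_counts.get)

# Accuracy of a trivial classifier that always predicts the majority class.
baseline_accuracy = class_counts[majority_class] / total
print(f"Always predicting '{majority_class}' gives {baseline_accuracy:.1%} accuracy")
```

A classifier has to beat roughly 70% accuracy on this data before it is doing better than always predicting no recurrence.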

In [0]:
# section 2

In [3]:
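import pandas as pd
import numpy as np

# Load the UCI breast cancer data. The file has no header row, so the column names are set explicitly.
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer/breast-cancer.data'
columns = ["Class","age","menopause","tumor-size","inv-nodes","node-caps","deg-malig","breast","breast-quad","irradiat"]
df = pd.read_csv(url, header=None, names=columns)
# Note: missing values are encoded as "?" and are read here as ordinary strings.

# Print every row and column instead of the truncated default view
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df)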

                    Class    age menopause tumor-size inv-nodes node-caps  \
0    no-recurrence-events  30-39   premeno      30-34       0-2        no   
1    no-recurrence-events  40-49   premeno      20-24       0-2        no   
2    no-recurrence-events  40-49   premeno      20-24       0-2        no   
3    no-recurrence-events  60-69      ge40      15-19       0-2        no   
4    no-recurrence-events  40-49   premeno        0-4       0-2        no   
5    no-recurrence-events  60-69      ge40      15-19       0-2        no   
6    no-recurrence-events  50-59   premeno      25-29       0-2        no   
7    no-recurrence-events  60-69      ge40      20-24       0-2        no   
8    no-recurrence-events  40-49   premeno      50-54       0-2        no   
9    no-recurrence-events  40-49   premeno      20-24       0-2        no   
10   no-recurrence-events  40-49   premeno        0-4       0-2        no   
11   no-recurrence-events  50-59      ge40      25-29       0-2        no   
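
The dataset description states that missing values are denoted by "?". By default `pd.read_csv` does not treat "?" as missing, so those entries stay in the frame as a regular category. A minimal sketch of the alternative, using the `na_values` parameter of `read_csv`, is shown below. The two sample rows are made up for illustration and are not taken from the real file.

```python
import io
import pandas as pd

columns = ["Class", "age", "menopause", "tumor-size", "inv-nodes",
           "node-caps", "deg-malig", "breast", "breast-quad", "irradiat"]

# Two made-up rows in the same layout as breast-cancer.data (illustrative only).
sample_text = (
    "no-recurrence-events,30-39,premeno,30-34,0-2,no,3,left,left_low,no\n"
    "recurrence-events,50-59,ge40,30-34,9-11,?,3,left,left_up,yes\n"
)

# Default parsing keeps "?" as a string value.
raw = pd.read_csv(io.StringIO(sample_text), header=None, names=columns)

# With na_values="?", the same entries are parsed as NaN.
clean = pd.read_csv(io.StringIO(sample_text), header=None, names=columns, na_values="?")

raw_missing = int(raw["node-caps"].isna().sum())
clean_missing = int(clean["node-caps"].isna().sum())
print(raw_missing, clean_missing)
```

Whether "?" should be dropped or kept as its own category is a modelling choice. The point is only that `dropna()` has an effect on these rows only when they were parsed as NaN.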

In [0]:
# section 3

In [6]:
from scipy.stats import chi2_contingency
categorical_features = ["Class","age","menopause","tumor-size","inv-nodes","node-caps","deg-malig","breast","breast-quad","irradiat"]
feature_num = len(categorical_features)

result = pd.DataFrame(index=categorical_features, columns=categorical_features)

for i in range(feature_num-1):
  for j in range(i+1,feature_num):
    print("======================================================================================")
    print("==========================>Comparing ",categorical_features[i]," vs. ",categorical_features[j])
    print("======================================================================================")
    
    # Names of the two columns being compared
    col_of_interest = [categorical_features[i], categorical_features[j]]

    # Sub-dataframe with only the two selected columns.
    # Note: dropna() removes NaN only; values recorded as "?" are strings and are kept.
    sub_df = df[col_of_interest].dropna()

    # display(sub_df)

    # Display groupby count for one feature, no use here
    # stats_feature1_sum = sub_df[[categorical_features[i]]].groupby([categorical_features[i]]).size().reset_index(name='counts')
    # display(stats_feature1_sum)
    # stats_feature1_sum = sub_df[[categorical_features[j]]].groupby([categorical_features[j]]).size().reset_index(name='counts')
    # display(stats_feature1_sum)

    # Count the rows for each combination of the two selected features
    stats_duo_features_sum = sub_df[col_of_interest].groupby(col_of_interest).size().reset_index(name='counts')
    # display(stats_duo_features_sum)

    # Pivot so that the values of the second feature become the columns of the table
    stats_duo_features_sum_inverted = stats_duo_features_sum.pivot(index=categorical_features[i], columns=categorical_features[j])['counts'].fillna(0)

    # Cast all counts to int
    stats_duo_features_sum_inverted = stats_duo_features_sum_inverted.astype('int32')

    print("\n\n" + "Summation Table" + "\n")
    display(stats_duo_features_sum_inverted)

    # State the null and alternative hypotheses
    H0 = categorical_features[i] + ' and ' +  categorical_features[j] + ' are independent.'
    Ha = categorical_features[i] + ' and ' +  categorical_features[j] + ' are dependent.'

    #get nparray data from the summation table
    table = np.array(stats_duo_features_sum_inverted.values)

    # display(table)

    # Significance level
    alpha = 0.05

    # chi2_contingency(table) returns:
    #   stat:      the test statistic
    #   pval:      the p-value of the test
    #   dof:       degrees of freedom
    #   exp_table: the expected frequencies, based on the marginal sums of the table
    stat, pval, dof, exp_table = chi2_contingency(table)

    # Display result. "Accept" here means only that H0 is not rejected at level alpha.
    print("\n\n")
    if pval > alpha:
      print('Accept null hypothesis.', H0)
      result.iloc[i,j] = "independent"
    else:
      print('Reject null hypothesis.', Ha)
      result.iloc[i,j] = "dependent"
    print("\n\n")

    

# Summary of all pairwise test results
display(result)






Summation Table



Class \ age,20-29,30-39,40-49,50-59,60-69,70-79
no-recurrence-events,1,21,63,71,40,5
recurrence-events,0,15,27,25,17,1





Accept null hypothesis. Class and age are independent.
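
The loop reports only a binary verdict for each pair. As a hedged sketch, the same test can be rerun on the Class vs. age counts printed above to also show the statistic, degrees of freedom, and p-value. The array is copied by hand from that table, and this is an illustration rather than part of the original run.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Counts copied from the Class vs. age table above.
# Rows: no-recurrence-events, recurrence-events; columns: 20-29 ... 70-79.
table = np.array([
    [1, 21, 63, 71, 40, 5],
    [0, 15, 27, 25, 17, 1],
])

stat, pval, dof, expected = chi2_contingency(table)
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {pval:.4f}")
```

A p-value above the 0.05 threshold means the data give no evidence against independence at that level. It does not prove that the two variables are independent.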





Summation Table



Class \ menopause,ge40,lt40,premeno
no-recurrence-events,94,5,102
recurrence-events,35,2,48





Accept null hypothesis. Class and menopause are independent.





Summation Table



Class \ tumor-size,0-4,10-14,15-19,20-24,25-29,30-34,35-39,40-44,45-49,5-9,50-54
no-recurrence-events,7,27,23,34,36,35,12,16,2,4,5
recurrence-events,1,1,7,16,18,25,7,6,1,0,3





Accept null hypothesis. Class and tumor-size are independent.
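
The tumor-size columns above appear in string order, which places 5-9 after 45-49. The chi-square result does not depend on column order, but the table is easier to read when intervals are sorted by their lower bound. The helper below is a hypothetical addition, not part of the original notebook.

```python
def lower_bound(label: str) -> int:
    """Return the lower end of an interval label such as '10-14'."""
    return int(label.split("-")[0])

# Column labels in the order they appear in the table above.
tumor_size_labels = ["0-4", "10-14", "15-19", "20-24", "25-29", "30-34",
                     "35-39", "40-44", "45-49", "5-9", "50-54"]

ordered = sorted(tumor_size_labels, key=lower_bound)
print(ordered)
```

Under the same assumption, the pivoted table could be reordered with `stats_duo_features_sum_inverted.reindex(columns=ordered)` before it is displayed.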





Summation Table



Class \ inv-nodes,0-2,12-14,15-17,24-26,3-5,6-8,9-11
no-recurrence-events,167,1,3,0,19,7,4
recurrence-events,46,2,3,1,17,10,6





Reject null hypothesis. Class and inv-nodes are dependent.
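
Several columns in the table above hold very few observations. The chi-square approximation is usually considered reliable only when expected cell counts are not too small. A common rule of thumb asks for an expected count of at least 5 in most cells; that rule is an assumption brought in here, not something the notebook checks. The sketch below computes the expected counts from the margins of the Class vs. inv-nodes table and reports the share of small cells.

```python
import numpy as np

# Counts copied from the Class vs. inv-nodes table above.
observed = np.array([
    [167, 1, 3, 0, 19, 7, 4],
    [46, 2, 3, 1, 17, 10, 6],
])

# Expected count per cell under independence: row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals @ col_totals / observed.sum()

share_small = (expected < 5).mean()
print(f"{share_small:.0%} of cells have an expected count below 5")
```

When many cells fall below the threshold, merging sparse categories or using an exact or permutation test may be worth considering before relying on the verdict.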





Summation Table



Class \ node-caps,?,no,yes
no-recurrence-events,5,171,25
recurrence-events,3,51,31





Reject null hypothesis. Class and node-caps are dependent.





Summation Table



Class \ deg-malig,1,2,3
no-recurrence-events,59,102,40
recurrence-events,12,28,45





Reject null hypothesis. Class and deg-malig are dependent.
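
A rejected null hypothesis says that an association exists, not how strong it is. One common effect-size measure for contingency tables is Cramér's V. It is not used in the original notebook and is added here only as an illustration, computed with plain NumPy on the Class vs. deg-malig counts shown above.

```python
import numpy as np

# Counts copied from the Class vs. deg-malig table above.
observed = np.array([
    [59, 102, 40],
    [12, 28, 45],
])

n = observed.sum()
# Expected counts under independence, from the row and column totals.
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

# Pearson chi-square statistic (no continuity correction).
chi2 = ((observed - expected) ** 2 / expected).sum()

# Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1))), ranging from 0 to 1.
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))
print(f"chi2 = {chi2:.2f}, Cramér's V = {cramers_v:.3f}")
```

Values near 0 indicate a weak association and values near 1 a strong one, so V can be used to rank the "dependent" pairs rather than treating them all alike.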





Summation Table



Class \ breast,left,right
no-recurrence-events,103,98
recurrence-events,49,36





Accept null hypothesis. Class and breast are independent.





Summation Table



Class \ breast-quad,?,central,left_low,left_up,right_low,right_up
no-recurrence-events,0,17,75,71,18,20
recurrence-events,1,4,35,26,6,13





Accept null hypothesis. Class and breast-quad are independent.





Summation Table



Class \ irradiat,no,yes
no-recurrence-events,164,37
recurrence-events,54,31





Reject null hypothesis. Class and irradiat are dependent.





Summation Table



age \ menopause,ge40,lt40,premeno
20-29,0,0,1
30-39,0,1,35
40-49,9,0,81
50-59,59,4,33
60-69,55,2,0
70-79,6,0,0





Reject null hypothesis. age and menopause are dependent.





Summation Table



age \ tumor-size,0-4,10-14,15-19,20-24,25-29,30-34,35-39,40-44,45-49,5-9,50-54
20-29,0,0,0,0,0,0,1,0,0,0,0
30-39,2,2,5,6,6,7,3,4,0,1,0
40-49,2,8,5,21,18,20,7,5,1,1,2
50-59,3,9,10,14,21,20,7,8,0,1,3
60-69,0,8,9,8,9,13,1,3,2,1,3
70-79,1,1,1,1,0,0,0,2,0,0,0





Accept null hypothesis. age and tumor-size are independent.





Summation Table



age \ inv-nodes,0-2,12-14,15-17,24-26,3-5,6-8,9-11
20-29,1,0,0,0,0,0,0
30-39,24,0,0,0,6,4,2
40-49,68,3,3,0,10,3,3
50-59,71,0,3,0,12,6,4
60-69,44,0,0,1,8,4,0
70-79,5,0,0,0,0,0,1





Accept null hypothesis. age and inv-nodes are independent.





Summation Table



age \ node-caps,?,no,yes
20-29,0,1,0
30-39,0,28,8
40-49,1,71,18
50-59,4,71,21
60-69,2,46,9
70-79,1,5,0





Accept null hypothesis. age and node-caps are independent.





Summation Table



age \ deg-malig,1,2,3
20-29,0,1,0
30-39,7,17,12
40-49,18,48,24
50-59,25,39,32
60-69,17,24,16
70-79,4,1,1





Accept null hypothesis. age and deg-malig are independent.





Summation Table



age \ breast,left,right
20-29,0,1
30-39,21,15
40-49,41,49
50-59,56,40
60-69,30,27
70-79,4,2





Accept null hypothesis. age and breast are independent.





Summation Table



age \ breast-quad,?,central,left_low,left_up,right_low,right_up
20-29,0,0,0,0,0,1
30-39,0,5,14,10,4,3
40-49,0,3,33,31,11,12
50-59,1,8,38,32,6,11
60-69,0,4,24,22,2,5
70-79,0,1,1,2,1,1





Accept null hypothesis. age and breast-quad are independent.





Summation Table



age \ irradiat,no,yes
20-29,1,0
30-39,26,10
40-49,64,26
50-59,81,15
60-69,41,16
70-79,5,1





Accept null hypothesis. age and irradiat are independent.





Summation Table



menopause \ tumor-size,0-4,10-14,15-19,20-24,25-29,30-34,35-39,40-44,45-49,5-9,50-54
ge40,4,13,15,23,19,28,6,13,2,2,4
lt40,0,1,2,2,0,2,0,0,0,0,0
premeno,4,14,13,25,35,30,13,9,1,2,4





Accept null hypothesis. menopause and tumor-size are independent.





Summation Table



menopause \ inv-nodes,0-2,12-14,15-17,24-26,3-5,6-8,9-11
ge40,94,1,3,1,16,10,4
lt40,7,0,0,0,0,0,0
premeno,112,2,3,0,20,7,6





Accept null hypothesis. menopause and inv-nodes are independent.





Summation Table



menopause \ node-caps,?,no,yes
ge40,5,100,24
lt40,2,5,0
premeno,1,117,32





Reject null hypothesis. menopause and node-caps are dependent.





Summation Table



menopause \ deg-malig,1,2,3
ge40,34,49,46
lt40,4,1,2
premeno,33,80,37





Reject null hypothesis. menopause and deg-malig are dependent.





Summation Table



menopause \ breast,left,right
ge40,72,57
lt40,5,2
premeno,75,75





Accept null hypothesis. menopause and breast are independent.





Summation Table



menopause \ breast-quad,?,central,left_low,left_up,right_low,right_up
ge40,1,12,49,47,8,12
lt40,0,0,3,3,0,1
premeno,0,9,58,47,16,20





Accept null hypothesis. menopause and breast-quad are independent.





Summation Table



menopause \ irradiat,no,yes
ge40,100,29
lt40,7,0
premeno,111,39





Accept null hypothesis. menopause and irradiat are independent.





Summation Table



tumor-size \ inv-nodes,0-2,12-14,15-17,24-26,3-5,6-8,9-11
0-4,8,0,0,0,0,0,0
10-14,26,0,0,0,1,1,0
15-19,26,1,1,0,0,1,1
20-24,36,0,0,1,12,1,0
25-29,40,1,1,0,8,3,1
30-34,37,1,1,0,10,7,4
35-39,13,0,2,0,0,1,3
40-44,14,0,1,0,5,2,0
45-49,2,0,0,0,0,1,0
5-9,4,0,0,0,0,0,0





Accept null hypothesis. tumor-size and inv-nodes are independent.





Summation Table



tumor-size \ node-caps,?,no,yes
0-4,0,8,0
10-14,0,27,1
15-19,1,25,4
20-24,2,39,9
25-29,3,41,10
30-34,2,42,16
35-39,0,12,7
40-44,0,16,6
45-49,0,2,1
5-9,0,4,0





Accept null hypothesis. tumor-size and node-caps are independent.





Summation Table



tumor-size \ deg-malig,1,2,3
0-4,3,4,1
10-14,14,12,2
15-19,9,15,6
20-24,10,27,13
25-29,11,26,17
30-34,13,21,26
35-39,2,7,10
40-44,5,9,8
45-49,1,1,1
5-9,2,2,0





Reject null hypothesis. tumor-size and deg-malig are dependent.





Summation Table



tumor-size \ breast,left,right
0-4,4,4
10-14,16,12
15-19,15,15
20-24,26,24
25-29,29,25
30-34,35,25
35-39,10,9
40-44,10,12
45-49,2,1
5-9,3,1





Accept null hypothesis. tumor-size and breast are independent.





Summation Table



tumor-size \ breast-quad,?,central,left_low,left_up,right_low,right_up
0-4,0,5,1,0,2,0
10-14,0,1,12,12,2,1
15-19,0,3,15,8,3,1
20-24,0,4,19,20,3,4
25-29,0,2,22,17,8,5
30-34,1,4,18,21,5,11
35-39,0,0,9,6,0,4
40-44,0,0,9,9,0,4
45-49,0,1,1,0,0,1
5-9,0,1,1,0,1,1





Reject null hypothesis. tumor-size and breast-quad are dependent.





Summation Table



tumor-size \ irradiat,no,yes
0-4,8,0
10-14,25,3
15-19,24,6
20-24,41,9
25-29,37,17
30-34,44,16
35-39,15,4
40-44,15,7
45-49,1,2
5-9,3,1





Accept null hypothesis. tumor-size and irradiat are independent.





Summation Table



inv-nodes \ node-caps,?,no,yes
0-2,3,201,9
12-14,0,1,2
15-17,0,1,5
24-26,0,0,1
3-5,2,15,19
6-8,0,3,14
9-11,3,1,6





Reject null hypothesis. inv-nodes and node-caps are dependent.





Summation Table



inv-nodes \ deg-malig,1,2,3
0-2,67,98,48
12-14,0,0,3
15-17,0,1,5
24-26,0,0,1
3-5,3,20,13
6-8,0,7,10
9-11,1,4,5





Reject null hypothesis. inv-nodes and deg-malig are dependent.





Summation Table



inv-nodes \ breast,left,right
0-2,115,98
12-14,2,1
15-17,3,3
24-26,1,0
3-5,14,22
6-8,12,5
9-11,5,5





Accept null hypothesis. inv-nodes and breast are independent.





Summation Table



inv-nodes \ breast-quad,?,central,left_low,left_up,right_low,right_up
0-2,1,18,79,73,18,24
12-14,0,0,0,1,2,0
15-17,0,0,3,2,0,1
24-26,0,0,1,0,0,0
3-5,0,2,17,12,0,5
6-8,0,1,7,4,3,2
9-11,0,0,3,5,1,1





Accept null hypothesis. inv-nodes and breast-quad are independent.





Summation Table



inv-nodes \ irradiat,no,yes
0-2,183,30
12-14,0,3
15-17,5,1
24-26,0,1
3-5,19,17
6-8,8,9
9-11,3,7





Reject null hypothesis. inv-nodes and irradiat are dependent.





Summation Table



node-caps \ deg-malig,1,2,3
?,5,1,2
no,66,103,53
yes,0,26,30





Reject null hypothesis. node-caps and deg-malig are dependent.





Summation Table



node-caps \ breast,left,right
?,6,2
no,116,106
yes,30,26





Accept null hypothesis. node-caps and breast are independent.





Summation Table



node-caps \ breast-quad,?,central,left_low,left_up,right_low,right_up
?,0,0,4,3,1,0
no,1,18,85,76,18,24
yes,0,3,21,18,5,9





Accept null hypothesis. node-caps and breast-quad are independent.





Summation Table



node-caps \ irradiat,no,yes
?,2,6
no,188,34
yes,28,28





Reject null hypothesis. node-caps and irradiat are dependent.





Summation Table



deg-malig \ breast,left,right
1,37,34
2,65,65
3,50,35





Accept null hypothesis. deg-malig and breast are independent.





Summation Table



deg-malig \ breast-quad,?,central,left_low,left_up,right_low,right_up
1,0,7,28,22,7,7
2,0,10,50,43,11,16
3,1,4,32,32,6,10





Accept null hypothesis. deg-malig and breast-quad are independent.





Summation Table



deg-malig \ irradiat,no,yes
1,64,7
2,98,32
3,56,29





Reject null hypothesis. deg-malig and irradiat are dependent.





Summation Table



breast \ breast-quad,?,central,left_low,left_up,right_low,right_up
left,1,11,78,36,17,9
right,0,10,32,61,7,24





Reject null hypothesis. breast and breast-quad are dependent.





Summation Table



breast \ irradiat,no,yes
left,117,35
right,101,33





Accept null hypothesis. breast and irradiat are independent.





Summation Table



breast-quad \ irradiat,no,yes
?,1,0
central,19,2
left_low,82,28
left_up,72,25
right_low,17,7
right_up,27,6





Accept null hypothesis. breast-quad and irradiat are independent.





,Class,age,menopause,tumor-size,inv-nodes,node-caps,deg-malig,breast,breast-quad,irradiat
Class,,independent,independent,independent,dependent,dependent,dependent,independent,independent,dependent
age,,,dependent,independent,independent,independent,independent,independent,independent,independent
menopause,,,,independent,independent,dependent,dependent,independent,independent,independent
tumor-size,,,,,independent,independent,dependent,independent,dependent,independent
inv-nodes,,,,,,dependent,dependent,independent,independent,dependent
node-caps,,,,,,,dependent,independent,independent,dependent
deg-malig,,,,,,,,independent,independent,dependent
breast,,,,,,,,,dependent,independent
breast-quad,,,,,,,,,,independent
irradiat,,,,,,,,,,
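
The summary above comes from testing every pair of the ten columns at the same 0.05 level. With that many tests, some "dependent" verdicts can be expected by chance alone. The sketch below counts the tests and shows a Bonferroni-adjusted threshold. The correction is an assumption brought in for illustration and was not applied in the original analysis.

```python
from itertools import combinations

features = ["Class", "age", "menopause", "tumor-size", "inv-nodes",
            "node-caps", "deg-malig", "breast", "breast-quad", "irradiat"]
alpha = 0.05

# Every unordered pair of features is tested once.
pairs = list(combinations(features, 2))
n_tests = len(pairs)

# Expected number of false rejections if every null hypothesis were true.
expected_false_positives = alpha * n_tests

# Bonferroni: compare each p-value against alpha divided by the number of tests.
bonferroni_alpha = alpha / n_tests
print(n_tests, expected_false_positives, bonferroni_alpha)
```

Applying such a threshold would require keeping the p-values from the loop, for example by storing `pval` in a second results table alongside the verdicts.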
