In [1]:
# Load the dataset
mushrooms <- read.csv("mushrooms.csv", stringsAsFactors = TRUE)

# Display the structure of the dataset
str(mushrooms)

'data.frame':	8124 obs. of  23 variables:
 $ type                    : Factor w/ 2 levels "edible","poisonous": 2 1 1 2 1 1 1 1 2 1 ...
 $ cap_shape               : Factor w/ 6 levels "bell","conical",..: 3 3 1 3 3 3 1 1 3 1 ...
 $ cap_surface             : Factor w/ 4 levels "fibrous","grooves",..: 4 4 4 3 4 3 4 3 3 4 ...
 $ cap_color               : Factor w/ 10 levels "brown","buff",..: 1 10 9 9 4 10 9 9 9 10 ...
 $ bruises                 : Factor w/ 2 levels "no","yes": 2 2 2 2 1 2 2 2 2 2 ...
 $ odor                    : Factor w/ 9 levels "almond","anise",..: 8 1 2 8 7 1 1 2 8 1 ...
 $ gill_attachment         : Factor w/ 2 levels "attached","free": 2 2 2 2 2 2 2 2 2 2 ...
 $ gill_spacing            : Factor w/ 2 levels "close","crowded": 1 1 1 1 2 1 1 1 1 1 ...
 $ gill_size               : Factor w/ 2 levels "broad","narrow": 2 1 1 2 1 1 1 1 2 1 ...
 $ gill_color              : Factor w/ 12 levels "black","brown",..: 1 1 2 2 1 2 5 2 8 5 ...
 $ stalk_shape             : Factor w/

In [2]:
# Remove the one-level feature "veil_type"
mushrooms$veil_type <- NULL

# Check the result
str(mushrooms)

'data.frame':	8124 obs. of  22 variables:
 $ type                    : Factor w/ 2 levels "edible","poisonous": 2 1 1 2 1 1 1 1 2 1 ...
 $ cap_shape               : Factor w/ 6 levels "bell","conical",..: 3 3 1 3 3 3 1 1 3 1 ...
 $ cap_surface             : Factor w/ 4 levels "fibrous","grooves",..: 4 4 4 3 4 3 4 3 3 4 ...
 $ cap_color               : Factor w/ 10 levels "brown","buff",..: 1 10 9 9 4 10 9 9 9 10 ...
 $ bruises                 : Factor w/ 2 levels "no","yes": 2 2 2 2 1 2 2 2 2 2 ...
 $ odor                    : Factor w/ 9 levels "almond","anise",..: 8 1 2 8 7 1 1 2 8 1 ...
 $ gill_attachment         : Factor w/ 2 levels "attached","free": 2 2 2 2 2 2 2 2 2 2 ...
 $ gill_spacing            : Factor w/ 2 levels "close","crowded": 1 1 1 1 2 1 1 1 1 1 ...
 $ gill_size               : Factor w/ 2 levels "broad","narrow": 2 1 1 2 1 1 1 1 2 1 ...
 $ gill_color              : Factor w/ 12 levels "black","brown",..: 1 1 2 2 1 2 5 2 8 5 ...
 $ stalk_shape             : Factor w/
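
The cell above drops `veil_type` by name because it has only one level and so carries no information for a classifier. A minimal sketch of how such columns could be found programmatically, using a small made-up data frame (`toy` and its values are hypothetical stand-ins, not the mushroom data):

```r
# Hypothetical toy data standing in for the mushroom data frame
toy <- data.frame(
  type      = factor(c("edible", "poisonous", "edible")),
  veil_type = factor(c("partial", "partial", "partial")),
  odor      = factor(c("none", "foul", "almond"))
)

# Count the factor levels in each column and flag columns with fewer than two
n_levels <- sapply(toy, nlevels)
constant_cols <- names(n_levels)[n_levels < 2]
constant_cols

# Drop every constant column in one step
toy_clean <- toy[, setdiff(names(toy), constant_cols), drop = FALSE]
names(toy_clean)
```

The same two lines should work on `mushrooms` as long as every column is a factor, which is the case here because of `stringsAsFactors = TRUE`.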

In [3]:
# Display counts and proportions of edible vs. poisonous mushrooms
table(mushrooms$type)
round(proportions(table(mushrooms$type)) * 100, 1)


   edible poisonous 
     4208      3916 


   edible poisonous 
     51.8      48.2 
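
These class proportions give a baseline for the models that follow: a classifier that always predicts the majority class would be right about 51.8% of the time. A short sketch of that calculation, with the counts copied from the table above:

```r
# Class counts copied from the table() output above
counts <- c(edible = 4208, poisonous = 3916)

# A majority-class baseline always predicts the most frequent class
baseline_class <- names(which.max(counts))
baseline_accuracy <- max(counts) / sum(counts)

baseline_class
round(baseline_accuracy * 100, 1)
```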

In [4]:
# install.packages("OneR")
library(OneR)

# Train the model on the whole dataset, modeling "type" as a function of all other variables
mushrooms_1R <- OneR(type ~ ., data = mushrooms)

# Display the contents of the trained model
mushrooms_1R


Call:
OneR.formula(formula = type ~ ., data = mushrooms)

Rules:
If odor = almond   then type = edible
If odor = anise    then type = edible
If odor = creosote then type = poisonous
If odor = fishy    then type = poisonous
If odor = foul     then type = poisonous
If odor = musty    then type = poisonous
If odor = none     then type = edible
If odor = pungent  then type = poisonous
If odor = spicy    then type = poisonous

Accuracy:
8004 of 8124 instances classified correctly (98.52%)
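
The model above uses a single feature, `odor`, and maps each of its values to one class. The idea behind 1R can be sketched in base R: for each feature, let every value predict its majority class, count the errors, and keep the feature with the fewest. This is a simplified illustration on made-up data, not the `OneR` package's implementation; `one_rule()` and `toy` are hypothetical names, and ties and numeric features are not handled.

```r
# Simplified 1R sketch (hypothetical helper, not the OneR package internals)
one_rule <- function(data, target) {
  predictors <- setdiff(names(data), target)
  errors <- sapply(predictors, function(p) {
    tab <- table(data[[p]], data[[target]])
    # Each feature value predicts its majority class; everything else is an error
    sum(tab) - sum(apply(tab, 1, max))
  })
  best <- names(which.min(errors))
  tab <- table(data[[best]], data[[target]])
  rules <- colnames(tab)[apply(tab, 1, which.max)]
  names(rules) <- rownames(tab)
  list(feature = best, rules = rules, errors = errors)
}

# Made-up toy data for illustration only
toy <- data.frame(
  type = c("edible", "edible", "poisonous", "poisonous", "edible", "poisonous"),
  odor = c("none", "almond", "foul", "foul", "none", "none"),
  cap  = c("flat", "bell", "flat", "bell", "flat", "bell"),
  stringsAsFactors = TRUE
)

fit <- one_rule(toy, "type")
fit$feature   # feature with the fewest training errors
fit$rules     # one rule per value of that feature
fit$errors    # training errors per candidate feature
```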


In [5]:
# Generate predictions using the 1R model
mushrooms_1R_pred <- predict(mushrooms_1R, mushrooms)

library(gmodels)
CrossTable(mushrooms$type, mushrooms_1R_pred,
           dnn = c("Actual", "Predicted"),
           prop.chisq = FALSE, prop.r = FALSE, prop.c = FALSE)


 
   Cell Contents
|-------------------------|
|                       N |
|         N / Table Total |
|-------------------------|

 
Total Observations in Table:  8124 

 
             | Predicted 
      Actual |    edible | poisonous | Row Total | 
-------------|-----------|-----------|-----------|
      edible |      4208 |         0 |      4208 | 
             |     0.518 |     0.000 |           | 
-------------|-----------|-----------|-----------|
   poisonous |       120 |      3796 |      3916 | 
             |     0.015 |     0.467 |           | 
-------------|-----------|-----------|-----------|
Column Total |      4328 |      3796 |      8124 | 
-------------|-----------|-----------|-----------|
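
All 120 errors in the table above fall in one cell: poisonous mushrooms predicted as edible, which is the costly kind of mistake here. A small sketch that rebuilds the table from the printed counts and separates overall accuracy from recall on the poisonous class (the object names are illustrative):

```r
# Confusion matrix rebuilt from the CrossTable output above
conf <- matrix(
  c(4208,    0,
     120, 3796),
  nrow = 2, byrow = TRUE,
  dimnames = list(Actual    = c("edible", "poisonous"),
                  Predicted = c("edible", "poisonous"))
)

# Overall accuracy: correct predictions over all observations
accuracy <- sum(diag(conf)) / sum(conf)

# Poisonous mushrooms wrongly labeled edible
missed_poisonous <- conf["poisonous", "edible"]

# Recall for the poisonous class: share of poisonous mushrooms caught
recall_poisonous <- conf["poisonous", "poisonous"] / sum(conf["poisonous", ])

round(accuracy * 100, 2)
missed_poisonous
round(recall_poisonous * 100, 2)
```

The accuracy agrees with the 98.52% reported by `OneR`, while recall on the poisonous class is lower, at roughly 96.9%.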

 


In [7]:
# install.packages("RWeka")
library(RWeka)

# Train the model on the whole dataset, modeling "type" as a function of all other variables
mushrooms_JRip <- JRip(type ~ ., data = mushrooms)

# Display the contents of the trained model
mushrooms_JRip

# Display the summary of the trained model
summary(mushrooms_JRip)

JRIP rules:

(odor = foul) => type=poisonous (2160.0/0.0)
(gill_size = narrow) and (gill_color = buff) => type=poisonous (1152.0/0.0)
(gill_size = narrow) and (odor = pungent) => type=poisonous (256.0/0.0)
(odor = creosote) => type=poisonous (192.0/0.0)
(spore_print_color = green) => type=poisonous (72.0/0.0)
(stalk_surface_below_ring = scaly) and (stalk_surface_above_ring = silky) => type=poisonous (68.0/0.0)
(habitat = leaves) and (cap_color = white) => type=poisonous (8.0/0.0)
(stalk_color_above_ring = yellow) => type=poisonous (8.0/0.0)
 => type=edible (4208.0/0.0)

Number of Rules : 9



=== Summary ===

Correctly Classified Instances        8124              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1     
Mean absolute error                      0     
Root mean squared error                  0     
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances             8124     

=== Confusion Matrix ===

    a    b   <-- classified as
 4208    0 |    a = edible
    0 3916 |    b = poisonous
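
In the JRip output, the pair in parentheses after each rule is (instances covered / instances misclassified). Because the rules are applied in order, the eight poisonous rules should together account for every poisonous mushroom before the default rule labels the rest as edible. A quick check with the coverage counts copied from the output above:

```r
# Coverage counts of the eight "poisonous" rules, in the order printed above
covered <- c(2160, 1152, 256, 192, 72, 68, 8, 8)

# Total poisonous mushrooms according to the class counts
n_poisonous <- 3916

# The rules together cover every poisonous instance
sum(covered) == n_poisonous

# Cumulative share of poisonous mushrooms caught after each rule
round(cumsum(covered) / n_poisonous, 3)
```

The first rule alone, `odor = foul`, covers more than half of the poisonous mushrooms.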

In [9]:
library(C50)

# Generate the C5.0 model using "rules = TRUE"
mushrooms_c50 <- C5.0(type ~ ., data = mushrooms, rules = TRUE)

# Display the summary of the trained model
summary(mushrooms_c50)


Call:
C5.0.formula(formula = type ~ ., data = mushrooms, rules = TRUE)


C5.0 [Release 2.07 GPL Edition]  	Mon Feb 13 21:37:45 2023
-------------------------------

Class specified by attribute `outcome'

Read 8124 cases (22 attributes) from undefined.data

Rules:

Rule 1: (4148/4, lift 1.9)
	cap_surface in {fibrous, scaly, smooth}
	odor in {almond, anise, none}
	stalk_color_below_ring in {gray, orange, pink, red, white}
	spore_print_color in {black, brown, buff, chocolate, orange, purple,
                              white, yellow}
	->  class edible  [0.999]

Rule 2: (3500/12, lift 1.9)
	cap_surface in {fibrous, scaly, smooth}
	odor in {almond, anise, none}
	stalk_root in {bulbous, club, equal, rooted}
	spore_print_color in {black, brown, purple, white}
	->  class edible  [0.996]

Rule 3: (3796, lift 2.1)
	odor in {creosote, fishy, foul, musty, pungent, spicy}
	->  class poisonous  [1.000]

Rule 4: (72, lift 2.0)
	spore_print_color = green
	->  class poisonous  [0.986]

Rule 5: (24,
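
Each C5.0 rule header reads (n/m, lift x): n training cases covered by the rule, m of them misclassified, and the lift. The value in square brackets is the rule's confidence. Going by the C5.0 documentation, confidence is the Laplace ratio (n - m + 1) / (n + 2), and lift is that confidence divided by the relative frequency of the predicted class. A sketch that recomputes both for the first four rules, assuming those definitions (`laplace()` and `rules` are illustrative names):

```r
# Laplace-corrected confidence, as described in the C5.0 documentation
laplace <- function(n, m = 0) (n - m + 1) / (n + 2)

# n, m and predicted class copied from rules 1 to 4 above
rules <- data.frame(
  rule  = 1:4,
  n     = c(4148, 3500, 3796, 72),
  m     = c(4, 12, 0, 0),
  class = c("edible", "edible", "poisonous", "poisonous"),
  stringsAsFactors = FALSE
)

# Class priors from the counts of edible and poisonous mushrooms
prior <- c(edible = 4208, poisonous = 3916) / 8124

confidence <- laplace(rules$n, rules$m)
rules$confidence <- round(confidence, 3)
rules$lift <- round(unname(confidence / prior[rules$class]), 1)

rules
```

The recomputed values agree with the printed confidences (0.999, 0.996, 1.000, 0.986) and lifts (1.9, 1.9, 2.1, 2.0).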

In [10]:
# Generate the boosted C5.0 model using "rules = TRUE" and "trials = 3" (three boosting iterations)
mushrooms_c50_boost3 <- C5.0(type ~ ., data = mushrooms, rules = TRUE, trials = 3)

# Display the summary of the trained model
summary(mushrooms_c50_boost3)


Call:
C5.0.formula(formula = type ~ ., data = mushrooms, rules = TRUE, trials = 3)


C5.0 [Release 2.07 GPL Edition]  	Mon Feb 13 21:38:37 2023
-------------------------------

Class specified by attribute `outcome'

Read 8124 cases (22 attributes) from undefined.data

-----  Trial 0:  -----

Rules:

Rule 0/1: (4148/4, lift 1.9)
	cap_surface in {fibrous, scaly, smooth}
	odor in {almond, anise, none}
	stalk_color_below_ring in {gray, orange, pink, red, white}
	spore_print_color in {black, brown, buff, chocolate, orange, purple,
                              white, yellow}
	->  class edible  [0.999]

Rule 0/2: (3500/12, lift 1.9)
	cap_surface in {fibrous, scaly, smooth}
	odor in {almond, anise, none}
	stalk_root in {bulbous, club, equal, rooted}
	spore_print_color in {black, brown, purple, white}
	->  class edible  [0.996]

Rule 0/3: (3796, lift 2.1)
	odor in {creosote, fishy, foul, musty, pungent, spicy}
	->  class poisonous  [1.000]

Rule 0/4: (72, lift 2.0)
	spore_print_color = green