# CANDIDATE-ELIMINATION Learning Algorithm in Concept Learning

Subject Code: 18AIL66

Program No.: 2

_Implements the Candidate-Elimination algorithm for concept learning over the given training data and returns the boundary sets of specific and general hypotheses._

In [1]:
# imports required packages

import pandas as pd
import numpy as np

## Preparing Data

In [2]:
# Reads relevant data

data = pd.read_csv("../../Data/enjoysport.csv")

In [3]:
# Views the data

display(data)

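|   | Sky | AirTemp | Humidity | Wind | Water | Forecast | EnjoySport |
|---|-----|---------|----------|------|-------|----------|------------|
| 0 | Sunny | Warm | Normal | Strong | Warm | Same | Yes |
| 1 | Sunny | Warm | High | Strong | Warm | Same | Yes |
| 2 | Rainy | Cold | High | Strong | Warm | Change | No |
| 3 | Sunny | Warm | High | Strong | Cool | Change | Yes |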


In [4]:
# X holds the set of instances over which the target concept is defined

X = np.array(data.iloc[:,0:-1])

In [5]:
# Shows training examples (without target)

display(X)

array([['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'],
       ['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'],
       ['Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'],
       ['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change']],
      dtype=object)

In [6]:
# Stores the target in a separate array
target = np.array(data.iloc[:,-1])

In [7]:
display(target)

array(['Yes', 'Yes', 'No', 'Yes'], dtype=object)

## Applying the Candidate-Elimination Algorithm to Get the Specific & General Hypothesis Boundary Sets

_**Pseudocode for the Candidate-Elimination algorithm**_

1. Initialize G to the set of maximally general hypotheses in H
2. Initialize S to the set of maximally specific hypotheses in H
3. For each training example d, do
    - If d is a positive example
        - Remove from G any hypothesis inconsistent with d
        - For each hypothesis s in S that is not consistent with d
            - Remove s from S
            - Add to S all minimal generalizations h of s such that
                - h is consistent with d, and some member of G is more general than h
            - Remove from S any hypothesis that is more general than another hypothesis in S
    - If d is a negative example
        - Remove from S any hypothesis inconsistent with d
        - For each hypothesis g in G that is not consistent with d
            - Remove g from G
            - Add to G all minimal specializations h of g such that
                - h is consistent with d, and some member of S is more specific than h
            - Remove from G any hypothesis that is less general than another hypothesis in G
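
The relations used in the pseudocode above ("consistent with d", "more general than", "minimal generalization", "minimal specialization") can be sketched as small helper functions. This is a minimal sketch, assuming hypotheses are lists of attribute values in which `"?"` accepts any value and the empty constraint is not modeled. All function names and the `domains` argument are hypothetical and are not part of the `train` function in this notebook.

```python
def matches(hypothesis, instance):
    """True if the hypothesis covers the instance ('?' accepts any value)."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))


def is_consistent(hypothesis, instance, label):
    """A hypothesis is consistent with d if it covers d exactly when d is positive."""
    return matches(hypothesis, instance) == (label == "Yes")


def more_general_or_equal(h1, h2):
    """True if every instance covered by h2 is also covered by h1."""
    return all(a == "?" or a == b for a, b in zip(h1, h2))


def minimal_generalization(s, instance):
    """Smallest generalization of s that covers a positive instance."""
    return [a if a == v else "?" for a, v in zip(s, instance)]


def minimal_specializations(g, instance, domains):
    """Specializations of g that exclude a negative instance by fixing one '?'.

    `domains[i]` is the assumed list of possible values for attribute i.
    """
    result = []
    for i, a in enumerate(g):
        if a == "?":
            for value in domains[i]:
                if value != instance[i]:
                    result.append(g[:i] + [value] + g[i + 1:])
    return result


# First two positive examples from the training data
s = ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"]
d = ["Sunny", "Warm", "High", "Strong", "Warm", "Same"]

print(is_consistent(s, d, "Yes"))    # False
print(minimal_generalization(s, d))  # ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
```

The second positive example is not covered by the initial specific hypothesis, so the only change needed is to relax `Humidity` to `"?"`, which matches the specific boundary printed after instance 2 in the training output.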

In [8]:
def train(X, target):
    """
    Runs a simplified Candidate-Elimination pass over the training data.

    Parameters
    ----------
    X: numpy.ndarray
        training instances, one row per example
    target: numpy.ndarray
        the label ("Yes" or "No") for each instance

    Returns
    -------
    tuple
        the specific boundary and the general boundary
    """

    # Initializes the specific boundary with the first instance
    # (this assumes the first training example is positive)
    specific_h = X[0].copy()

    # Initializes the general boundary with one all-"?" row per attribute
    n_attrs = len(specific_h)
    general_h = [["?" for _ in range(n_attrs)] for _ in range(n_attrs)]

    print("\nInitialization:\nSpecific Boundary: {}\nGeneral Boundary: {}".format(
        specific_h, general_h))

    # Iterates through the example instances
    for i, h in enumerate(X):
        print("\nAfter Instance #", i+1 , ":", h, "[POSITIVE]" if target[i] == "Yes" else "[NEGATIVE]")

        if target[i] == "Yes":
            # Positive example: generalizes every attribute that disagrees
            for x in range(n_attrs):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'

        elif target[i] == "No":
            # Negative example: specializes the general boundary on every
            # attribute where the instance differs from the specific boundary
            for x in range(n_attrs):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'

        print("Specific Boundary:", specific_h)
        print("General Boundary:", general_h)

    # Removes the rows that are still fully general from the general boundary set
    general_h = [g for g in general_h if any(value != '?' for value in g)]

    return specific_h, general_h

In [9]:
# Calls training function passing training data and target

specific_boundry, general_boundry = train(X, target)


Initialization:
Specific Boundary: ['Sunny' 'Warm' 'Normal' 'Strong' 'Warm' 'Same']
General Boundary: [['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

After Instance # 1 : ['Sunny' 'Warm' 'Normal' 'Strong' 'Warm' 'Same'] [POSITIVE]
Specific Boundary: ['Sunny' 'Warm' 'Normal' 'Strong' 'Warm' 'Same']
General Boundary: [['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

After Instance # 2 : ['Sunny' 'Warm' 'High' 'Strong' 'Warm' 'Same'] [POSITIVE]
Specific Boundary: ['Sunny' 'Warm' '?' 'Strong' 'Warm' 'Same']
General Boundary: [['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?',

In [10]:
# Shows the boundary sets

print("Hypothesis in Specific Boundary:", specific_boundry, "\n")
print("Hypotheses in General Boundary:", general_boundry)

Hypothesis in Specific Boundary: ['Sunny' 'Warm' '?' 'Strong' '?' '?'] 

Hypotheses in General Boundary: [['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?']]
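
As a sanity check, the boundary sets printed above can be tested against the four training examples: every hypothesis in either boundary should cover each positive example and reject the negative one. The sketch below is self-contained and copies the data and results from this notebook by hand. The names `covers`, `consistent_with_all`, `S`, and `G` are hypothetical helpers introduced only for this check.

```python
# Training examples as (instance, label) pairs, copied from the data above
training = [
    (["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"], "Yes"),
    (["Sunny", "Warm", "High", "Strong", "Warm", "Same"], "Yes"),
    (["Rainy", "Cold", "High", "Strong", "Warm", "Change"], "No"),
    (["Sunny", "Warm", "High", "Strong", "Cool", "Change"], "Yes"),
]

# Boundary sets as printed by the notebook
S = ["Sunny", "Warm", "?", "Strong", "?", "?"]
G = [
    ["Sunny", "?", "?", "?", "?", "?"],
    ["?", "Warm", "?", "?", "?", "?"],
]


def covers(hypothesis, instance):
    """True if the hypothesis accepts the instance ('?' accepts any value)."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))


def consistent_with_all(hypothesis, examples):
    """True if the hypothesis covers exactly the positive examples."""
    return all(covers(hypothesis, x) == (y == "Yes") for x, y in examples)


print(consistent_with_all(S, training))                # True
print([consistent_with_all(g, training) for g in G])   # [True, True]
```

Both checks passing means the specific hypothesis and the two general hypotheses all agree with the training labels, which is what membership in the version space requires.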
