## Experiment 2
---
Implement the Candidate Elimination Algorithm.

A little theory:

Concept: a subset of objects or events defined over a larger set. Equivalently, it is a boolean-valued function defined over that larger set.


- Given a set of examples labeled as members or non-members of a concept, concept-learning consists of automatically inferring the general definition of this concept.

- In other words, concept-learning consists of approximating a boolean-valued function from training examples of its input and output.

- The concept to be learned is called the Target Concept (denoted by c: X --> {0,1}, where X is the set of all instances).

- The set of Training Examples is a set of instances, x, along with their target concept value c(x).

- Members of the concept (instances for which c(x)=1) are called positive examples.

- Non-members of the concept (instances for which c(x)=0) are called negative examples.

- H represents the set of all possible hypotheses. H is determined by the human designer’s choice of a hypothesis representation.

- The goal of concept-learning is to find a hypothesis h: X --> {0,1} such that h(x)=c(x) for all x in X.
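The definitions above can be made concrete with a small sketch. It assumes one common hypothesis representation, a tuple of attribute constraints where `'?'` accepts any value and `'0'` accepts none; the helper names `matches` and `is_consistent` and the three training examples are made up for illustration.

```python
# Assumed representation: a hypothesis is a tuple of attribute constraints.
# '?' accepts any value, '0' accepts no value, anything else must match exactly.
# (This assumes no attribute value is literally '0' or '?'.)

def matches(hypothesis, instance):
    """Return True if h(x) = 1, i.e. every constraint accepts the instance."""
    return all(c == '?' or c == v for c, v in zip(hypothesis, instance))

def is_consistent(hypothesis, examples):
    """Return True if h(x) == c(x) for every training example (x, c(x))."""
    return all(matches(hypothesis, x) == label for x, label in examples)

# Made-up training examples: (instance x, target value c(x))
examples = [
    (('sunny', 'warm', 'normal'), True),   # positive example
    (('sunny', 'warm', 'high'), True),     # positive example
    (('rainy', 'cold', 'high'), False),    # negative example
]

h = ('sunny', 'warm', '?')
print(matches(h, ('sunny', 'warm', 'high')))     # True
print(is_consistent(h, examples))                # True
print(is_consistent(('?', '?', '?'), examples))  # False: it accepts the negative example
```

A hypothesis is only a candidate for the target concept if `is_consistent` holds on the training examples; agreement on the rest of X cannot be checked directly.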

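The Candidate Elimination Algorithm (CEA) finds the version space V, the set of all hypotheses in H that are consistent with the training examples, by identifying its general and specific boundaries, G and S. G contains the most general hypotheses in V, and S contains the most specific ones. Every hypothesis in V lies between the two boundaries: we can obtain all of them by specializing the hypotheses in G until we reach S, or by generalizing those in S until we reach G.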
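The ordering behind "specializing" and "generalizing" can be sketched in code. The snippet below assumes hypotheses are tuples of constraints where `'?'` accepts any value and `'0'` accepts none; the function names and the example boundaries are hypothetical, chosen only to illustrate how membership in V follows from G and S.

```python
def more_general_or_equal(h1, h2):
    """Return True if every instance accepted by h2 is also accepted by h1."""
    # A hypothesis containing '0' accepts nothing, so any h1 is at least as general.
    if '0' in h2:
        return True
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def in_version_space(h, G, S):
    """h belongs to V(G, S) if it lies between some g in G and some s in S."""
    return (any(more_general_or_equal(g, h) for g in G)
            and any(more_general_or_equal(h, s) for s in S))

# Made-up boundaries over three attributes
G = [('sunny', '?', '?'), ('?', 'warm', '?')]
S = [('sunny', 'warm', '?')]

print(in_version_space(('sunny', '?', '?'), G, S))     # True: it is a member of G
print(in_version_space(('sunny', 'warm', '?'), G, S))  # True: it is the member of S
print(in_version_space(('?', '?', 'high'), G, S))      # False: no g in G is more general
```

Storing only G and S is enough because this check reconstructs every other member of V on demand.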

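Let’s denote as V(G, S) the version space whose general and specific boundaries are G and S. At the start of CEA, G contains only the most general hypothesis in H. Similarly, S contains only the most specific hypothesis in H. Each training example then moves the boundaries toward each other: a positive example generalizes S just enough to cover it and removes the hypotheses in G that reject it, while a negative example specializes G just enough to exclude it and removes the hypotheses in S that accept it.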

```python
import csv

# Load the training examples; each row holds the attribute values followed by the label.
with open(r'C:\Users\kannu\OneDrive\Documents\github-repos\University\year-3\sem-6\appl-ml-industries\data\weather-1.csv', newline='') as csv_file:
    examples = [tuple(row) for row in csv.reader(csv_file)]

n_attributes = len(examples[0]) - 1

# Initialize the boundaries: '0' accepts no value, '?' accepts any value.
specific_boundary = ['0'] * n_attributes
general_boundary = ['?'] * n_attributes

# Loop through the training examples and update the specific boundary.
for example in examples:
    if example[-1] == 'yes':
        # Positive example: minimally generalize the specific boundary to cover it.
        for i in range(n_attributes):
            if specific_boundary[i] == '0':
                specific_boundary[i] = example[i]
            elif example[i] != specific_boundary[i]:
                specific_boundary[i] = '?'
    # Rows not labeled 'yes' are where the general boundary would be specialized.
    # This simplified version tracks a single general hypothesis and never
    # specializes it, so general_boundary keeps its initial all-'?' value.

print("Specific Boundary: ", specific_boundary)
print("General Boundary: ", general_boundary)
```


Output:

```
Specific Boundary:  ['?', '?', 'normal', '?']
General Boundary:  ['?', '?', '?', '?']
```
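The run above leaves the general boundary at its most general value. A fuller variant keeps G as a set of hypotheses and specializes it on negative examples. The following is a minimal sketch of that variant for conjunctive hypotheses, not the code used in this experiment. It assumes the attribute domains are known up front, and it runs on the well-known EnjoySport toy data from Mitchell's textbook rather than on `weather-1.csv`; all function names are illustrative.

```python
def matches(h, x):
    """True if hypothesis h accepts instance x ('?' accepts any value)."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 accepts every instance that h2 accepts."""
    if '0' in h2:  # h2 accepts nothing
        return True
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def generalize(s, x):
    """Minimal generalization of s that accepts the positive instance x."""
    return tuple(v if c == '0' else (c if c == v else '?') for c, v in zip(s, x))

def specializations(g, x, domains):
    """Minimal specializations of g that reject the negative instance x."""
    result = []
    for i, c in enumerate(g):
        if c == '?':
            for value in domains[i]:
                if value != x[i]:
                    result.append(g[:i] + (value,) + g[i + 1:])
    return result

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [('0',) * n]  # most specific hypothesis
    G = [('?',) * n]  # most general hypothesis
    for x, positive in examples:
        if positive:
            # Drop general hypotheses that reject x, then generalize S to cover x.
            G = [g for g in G if matches(g, x)]
            S = [generalize(s, x) for s in S]
            S = [s for s in S if any(more_general_or_equal(g, s) for g in G)]
        else:
            # Drop specific hypotheses that accept x, then specialize G to reject x.
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                for h in specializations(g, x, domains):
                    if any(more_general_or_equal(h, s) for s in S):
                        new_G.append(h)
            # Remove duplicates, then keep only the maximally general hypotheses.
            new_G = list(dict.fromkeys(new_G))
            G = [g for g in new_G
                 if not any(o != g and more_general_or_equal(o, g) for o in new_G)]
    return S, G

# EnjoySport attributes: sky, air temperature, humidity, wind, water, forecast
domains = [
    ('sunny', 'cloudy', 'rainy'), ('warm', 'cold'), ('normal', 'high'),
    ('strong', 'weak'), ('warm', 'cool'), ('same', 'change'),
]
training_data = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'), True),
    (('sunny', 'warm', 'high', 'strong', 'warm', 'same'), True),
    (('rainy', 'cold', 'high', 'strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'high', 'strong', 'cool', 'change'), True),
]

S, G = candidate_elimination(training_data, domains)
print("S:", S)  # [('sunny', 'warm', '?', 'strong', '?', '?')]
print("G:", G)  # [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]
```

Here the third, negative example is what forces G away from the all-`'?'` hypothesis, which is the step the single-hypothesis version skips.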
