### Find-S Algorithm

In [None]:
# Code by Bhavy Kharbanda
# Sap Id: 500082531

In [20]:
# References for the algorithm:
# https://www.edureka.co/blog/find-s-algorithm-in-machine-learning/
# https://www.youtube.com/watch?v=O6vwN74aSGY

In [21]:
# > What happens in the Find-S algorithm?

# 1. Initialize ‘h’ to the most specific hypothesis.
# 2. Find-S considers only the positive examples and ignores the negative ones. For each positive example, it compares every attribute value with the corresponding value in the hypothesis. If the two values match, the hypothesis is left unchanged. If they differ, that position in the hypothesis is replaced with ‘?’, meaning any value is accepted.

In [22]:
# > Algorithm
# 1. Initialize ‘h’ to the most specific hypothesis; in practice, this is usually the first positive example in the data set.
# 2. Go through the examples one by one. Skip an example if it is negative; if it is positive, use it in the next step.
# 3. Compare each attribute value in the example with the corresponding value in the hypothesis.
# 4. If the values match, make no change.
# 5. If the values do not match, replace the hypothesis value with ‘?’.
# 6. Repeat until the last positive example in the data set has been processed.
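
The steps above can be sketched in plain Python, without pandas or NumPy. This is a minimal illustration rather than the notebook's own implementation: the function name `find_s_sketch`, the `positive` parameter, and the three-row toy data are assumptions made for the example.

```python
def find_s_sketch(rows, labels, positive="Yes"):
    """Return the most specific hypothesis consistent with the positive rows."""
    hypothesis = None
    for row, label in zip(rows, labels):
        if label != positive:
            continue  # step 2: negative examples are skipped
        if hypothesis is None:
            hypothesis = list(row)  # step 1: start from the first positive example
            continue
        for k, value in enumerate(row):
            if hypothesis[k] != value:  # steps 3-5: generalize on a mismatch
                hypothesis[k] = "?"
    return hypothesis  # None if there was no positive example


# Hypothetical toy data: two attributes and a Yes/No label.
toy_rows = [["Sunny", "Warm"], ["Rainy", "Cold"], ["Sunny", "Cold"]]
toy_labels = ["Yes", "No", "Yes"]
toy_result = find_s_sketch(toy_rows, toy_labels)
print(toy_result)  # ['Sunny', '?']
```

The second attribute becomes `?` because the two positive rows disagree on it (`Warm` versus `Cold`); the negative row has no effect.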

In [24]:
# Importing Libraries
import pandas as pd
import numpy as np

In [25]:
# Reading the data from the CSV file
data = pd.read_csv("weather_dataset1.csv")
print("Dataset being used: ")
print(data)

Dataset being used: 
      Time Weather      Temp Company Humidity    Wind Target
0  Morning   Sunny      Warm     Yes     Mild  Strong    Yes
1  Evening   Rainy      Cold      No     Mild  Normal     No
2  Morning   Sunny  Moderate     Yes   Normal  Normal    Yes
3  Evening   Sunny      Cold     Yes     High  Strong    Yes


In [26]:
type(data)

pandas.core.frame.DataFrame

In [27]:
# Array of all the attributes
# Converting the DataFrame to an array, excluding the target column.
att = np.array(data)[:, :-1]
print("The attributes are: \n", att)

The attributes are: 
 [['Morning' 'Sunny' 'Warm' 'Yes' 'Mild' 'Strong']
 ['Evening' 'Rainy' 'Cold' 'No' 'Mild' 'Normal']
 ['Morning' 'Sunny' 'Moderate' 'Yes' 'Normal' 'Normal']
 ['Evening' 'Sunny' 'Cold' 'Yes' 'High' 'Strong']]


In [28]:
# Separating the target that has positive and negative examples
# Converting the target column to an array

target = np.array(data)[:, -1]
print("The target is: \n", target)

The target is: 
 ['Yes' 'No' 'Yes' 'Yes']


In [29]:
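# Main training function for the algorithm:

def train1(gen_att, tar):
    # Start from the first positive example as the most specific hypothesis.
    Spec_Hypothesis = None
    for i, val in enumerate(tar):
        if val == "Yes":
            Spec_Hypothesis = gen_att[i].copy()
            break
    if Spec_Hypothesis is None:
        raise ValueError("No positive ('Yes') example found in the target.")

    # Replace every attribute that differs in a later positive example with '?'.
    for i, val in enumerate(gen_att):
        if tar[i] == "Yes":
            for x in range(len(Spec_Hypothesis)):
                if val[x] != Spec_Hypothesis[x]:
                    Spec_Hypothesis[x] = '?'

    return Spec_Hypothesis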

In [30]:
# The Final Hypothesis is:

print("The final hypothesis is:", train1(att, target))

The final hypothesis is: ['?' 'Sunny' '?' 'Yes' '?' '?']
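
To see how this result arises, the sketch below replays the update rule on the rows printed earlier and records the hypothesis after each positive example. The rows are copied by hand from the dataset output above, and the helper name `trace_find_s` is an assumption for illustration, not part of the notebook.

```python
def trace_find_s(rows, labels, positive="Yes"):
    """Return the hypothesis after each positive example, in order."""
    history = []
    hypothesis = None
    for row, label in zip(rows, labels):
        if label != positive:
            continue  # negative rows do not change the hypothesis
        if hypothesis is None:
            hypothesis = list(row)
        else:
            # Keep a value only where hypothesis and example agree.
            hypothesis = [h if h == v else "?" for h, v in zip(hypothesis, row)]
        history.append(list(hypothesis))
    return history


rows = [
    ["Morning", "Sunny", "Warm", "Yes", "Mild", "Strong"],
    ["Evening", "Rainy", "Cold", "No", "Mild", "Normal"],
    ["Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal"],
    ["Evening", "Sunny", "Cold", "Yes", "High", "Strong"],
]
labels = ["Yes", "No", "Yes", "Yes"]

history = trace_find_s(rows, labels)
for step in history:
    print(step)
```

Row 1 is negative and is skipped. Row 2 generalizes Temp, Humidity, and Wind, and row 3 generalizes Time, where `Evening` disagrees with `Morning`.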


### Trying the same algorithm on another dataset

In [31]:
# Reading the data from the CSV file
data2 = pd.read_csv("weather_dataset2.csv")
print("Dataset being used: ")
print(data2)

Dataset being used: 
     sky airtemp humidity    wind water forcast enjoysport
0  sunny    warm   normal  strong  warm    same        yes
1  sunny    warm     high  strong  warm    same        yes
2  rainy    cold     high  strong  warm  change         no
3  sunny    warm     high  strong  cool  change        yes


In [32]:
type(data2)

pandas.core.frame.DataFrame

In [33]:
# Array of all the attributes
# Converting the DataFrame to an array, excluding the target column.
att2 = np.array(data2)[:, :-1]
print("The attributes are: \n", att2)

The attributes are: 
 [['sunny' 'warm' 'normal' 'strong' 'warm' 'same']
 ['sunny' 'warm' 'high' 'strong' 'warm' 'same']
 ['rainy' 'cold' 'high' 'strong' 'warm' 'change']
 ['sunny' 'warm' 'high' 'strong' 'cool' 'change']]


In [34]:
# Separating the target that has positive and negative examples
# Converting the target column to an array

target2 = np.array(data2)[:, -1]
print("The target is: \n", target2)

The target is: 
 ['yes' 'yes' 'no' 'yes']


In [35]:
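# Main training function for the algorithm:
def train2(gen_att2, tar2):
    # Start from the first positive example as the most specific hypothesis.
    obt_hypothesis = None
    for i, val in enumerate(tar2):
        if val == "yes":
            obt_hypothesis = gen_att2[i].copy()
            break
    if obt_hypothesis is None:
        raise ValueError("No positive ('yes') example found in the target.")

    # Replace every attribute that differs in a later positive example with '?'.
    for i, val in enumerate(gen_att2):
        if tar2[i] == "yes":
            for x in range(len(obt_hypothesis)):
                if val[x] != obt_hypothesis[x]:
                    obt_hypothesis[x] = '?'

    return obt_hypothesis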

In [36]:
# The Final Hypothesis is:
print("The final hypothesis is:", train2(att2, target2))

The final hypothesis is: ['sunny' 'warm' '?' 'strong' '?' '?']
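
A learned hypothesis can be read as a rule for new instances: an instance is predicted positive when every attribute either equals the hypothesis value or meets a `?`. The helper `covers` and the two unseen instances below are hypothetical, added only to illustrate this reading of the result.

```python
def covers(hypothesis, instance):
    """True if every attribute matches the hypothesis or the hypothesis has '?'."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))


# Final hypothesis printed above for the second dataset.
final_hypothesis = ["sunny", "warm", "?", "strong", "?", "?"]

# Hypothetical unseen instances (sky, airtemp, humidity, wind, water, forcast).
new_positive = ["sunny", "warm", "normal", "strong", "cool", "same"]
new_negative = ["rainy", "warm", "high", "strong", "warm", "same"]

print(covers(final_hypothesis, new_positive))  # True
print(covers(final_hypothesis, new_negative))  # False
```

The second instance is rejected because its `sky` value is `rainy`, while the hypothesis requires `sunny`.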
