# For a given set of training data examples stored in a .CSV file, implement and demonstrate the Find-S algorithm to output a description of the set of all hypotheses consistent with the training examples.

## Find-S is a basic concept-learning algorithm that finds the most specific hypothesis consistent with all positive training examples. It ignores negative examples entirely.

Data Structure:
- Each row represents a training example
- Each column (except last) represents an attribute
- Last column ('EnjoysMarket') is the target concept
- 'Yes' means positive example, 'No' means negative example

Algorithm Steps:
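- Start with the most specific hypothesis (this implementation uses the first positive example as the starting point)
- Look at each positive training example in turn and skip the negative ones
- Generalize the hypothesis only on the attributes where it fails to cover the example
- Keep the hypothesis as specific as possible while still covering all positive examples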

Generalization Rules:
- If the hypothesis and a positive example differ on an attribute, replace that attribute's value with '?'
- '?' means "any value" (the most general value)
- Keep the specific value when every positive example seen so far agrees on it

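The generalization rule can be sketched as a small helper. This is a minimal illustration, not the notebook's own code: `generalize` is a hypothetical name, and hypotheses are plain Python lists here rather than pandas Series. The example values are the three positive rows of the training data used in this notebook.

```python
def generalize(hypothesis, example):
    """Return the minimal generalization of `hypothesis` that covers `example`.

    Hypothetical helper: keeps a value where both agree, otherwise uses '?'.
    """
    return [h if h == x else '?' for h, x in zip(hypothesis, example)]


# Start from the first positive example, then fold in the other two.
h = ['Sunny', 'Warm', 'Normal', 'Strong']
h = generalize(h, ['Sunny', 'Warm', 'High', 'Weak'])
print(h)  # ['Sunny', 'Warm', '?', '?']
h = generalize(h, ['Cloudy', 'Warm', 'High', 'Weak'])
print(h)  # ['?', 'Warm', '?', '?']
```

Once an attribute becomes '?', it stays '?', because '?' never equals a concrete value in a later example.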
Output:
- Shows the most specific hypothesis that covers all positive examples
- '?' marks attributes whose values varied across the positive examples, so the hypothesis accepts any value for them
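To make the meaning of '?' concrete, the following sketch shows how a hypothesis could be matched against a new example. The function name `covers` and the two test examples are assumptions made for illustration. The hypothesis is the one this notebook prints for its training data.

```python
def covers(hypothesis, example):
    """Return True if every attribute is either '?' or equal to the example's value.

    Hypothetical helper, not part of the notebook's code.
    """
    return all(h == '?' or h == x for h, x in zip(hypothesis, example))


# Attribute order: Weather, Temperature, Humidity, Wind
hypothesis = ['?', 'Warm', '?', '?']

print(covers(hypothesis, ['Rainy', 'Warm', 'High', 'Strong']))  # True
print(covers(hypothesis, ['Sunny', 'Cold', 'Normal', 'Weak']))  # False
```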

In [2]:
import pandas as pd

def find_s_algorithm(data):
    # No hypothesis yet; it is set from the first positive example.
    # If the data contains no positive example, the function returns None.
    hypothesis = None

    # Process each training example in order
    for _, row in data.iterrows():
        # Find-S considers only positive examples and ignores negative ones
        if row['EnjoysMarket'] == 'Yes':  # 'EnjoysMarket' is the target column
            # First positive example: use its attribute values as the hypothesis
            if hypothesis is None:
                hypothesis = row.iloc[:-1].copy()  # All columns except the last (target)
            # Later positive examples: generalize where they disagree
            else:
                for attr in hypothesis.index:
                    # A mismatch means this attribute cannot stay specific
                    if hypothesis[attr] != row[attr]:
                        hypothesis[attr] = '?'

    return hypothesis

df = pd.read_csv('weather_data.csv')
print("Training Data:")
print(df)
print("\nApplying Find-S Algorithm...")

# Apply Find-S algorithm
hypothesis = find_s_algorithm(df)
print("\nFinal Hypothesis:")
print(hypothesis)

Training Data:
  Weather Temperature Humidity    Wind EnjoysMarket
0   Sunny        Warm   Normal  Strong          Yes
1   Rainy        Cold     High  Strong           No
2   Sunny        Warm     High    Weak          Yes
3  Cloudy        Warm     High    Weak          Yes
4   Sunny        Cold   Normal    Weak           No

Applying Find-S Algorithm...

Final Hypothesis:
Weather           ?
Temperature    Warm
Humidity          ?
Wind              ?
Name: 0, dtype: object


This means:
- Temperature must be 'Warm'
- Weather, Humidity, and Wind can take any value
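Because Find-S never looks at negative examples, it is worth checking that the result does not accidentally cover any of them. The sketch below does this for the five training rows shown in the output above. The `predict` helper and the list-based data layout are assumptions made for illustration, not part of the notebook's code.

```python
# Training rows copied from the printed DataFrame:
# (Weather, Temperature, Humidity, Wind), EnjoysMarket
training_data = [
    (['Sunny', 'Warm', 'Normal', 'Strong'], 'Yes'),
    (['Rainy', 'Cold', 'High', 'Strong'], 'No'),
    (['Sunny', 'Warm', 'High', 'Weak'], 'Yes'),
    (['Cloudy', 'Warm', 'High', 'Weak'], 'Yes'),
    (['Sunny', 'Cold', 'Normal', 'Weak'], 'No'),
]

hypothesis = ['?', 'Warm', '?', '?']


def predict(hypothesis, example):
    """Hypothetical helper: 'Yes' if the hypothesis covers the example, else 'No'."""
    matches = all(h == '?' or h == x for h, x in zip(hypothesis, example))
    return 'Yes' if matches else 'No'


# Collect every row whose label disagrees with the hypothesis
mismatches = [(ex, label) for ex, label in training_data
              if predict(hypothesis, ex) != label]
print(mismatches)  # []
```

An empty list means the hypothesis agrees with every training row. Both negative examples have Temperature 'Cold', so neither is covered.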