
Prediction of South Indian Travel Destinations Using Holiday Data

1. Identify the top 5 attributes of South India destinations.

2. Identify the attribute associated with the most-liked travel spots.

3. Find the maximum and minimum attributes of choice for South India tourism.

4. What is the role of beaches, theatres, malls, and parks in South India tourism?

5. Identify the sports with the most attributes in South India.

6. Apply either a classification model or a clustering model to evaluate the dataset.

In [18]:
import pandas as pd

# Hand-entered sample data: each destination maps to a list of attributes.
data = {
    'Destination': ['Kochi', 'Chennai', 'Ooty', 'Pondicherry', 'Madurai', 'Mysore', 'Kovalam', 'Bangalore'],
    'Attributes': [
        ['Beaches', 'Temples', 'Parks'],
        ['Beaches', 'Malls', 'Parks'],
        ['Temples', 'Mountains', 'Parks'],
        ['Beaches', 'Cultural Heritage'],
        ['Temples', 'Cultural Heritage', 'Parks'],
        ['Cultural Heritage', 'Parks'],
        ['Beaches', 'Parks', 'Water Sports'],
        ['Malls', 'Parks', 'Cultural Heritage']
    ]
}

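# One row per destination; the 'Attributes' column holds a list per row.
df = pd.DataFrame(data)

# Flatten the lists to one attribute per row and count how often each occurs.
attributes = df['Attributes'].explode().value_counts()
top_5_attributes = attributes.head(5)
print("Top 5 Attributes:")
print(top_5_attributes)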

Top 5 Attributes:
Attributes
Parks                7
Beaches              4
Cultural Heritage    4
Temples              3
Malls                2
Name: count, dtype: int64


In [19]:

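# Ratings are hard-coded sample values, one per destination (same row order as above).
df['Rating'] = [4.5, 4.7, 4.3, 4.6, 4.8, 4.4, 4.7, 4.2]

# One row per (destination, attribute) pair so ratings can be grouped by attribute.
df_exploded = df.explode('Attributes')

# Mean rating per attribute, highest first. An attribute listed by a single
# destination (here Water Sports, only at Kovalam) inherits that destination's rating.
avg_rating_per_attribute = df_exploded.groupby('Attributes')['Rating'].mean().sort_values(ascending=False)
most_liked_attribute = avg_rating_per_attribute.head(1)
print("Most Liked Attribute:")
print(most_liked_attribute)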

Most Liked Attribute:
Attributes
Water Sports    4.7
Name: Rating, dtype: float64
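
The winner above rests on a single destination, so it may help to report how many destinations support each mean. The sketch below rebuilds the same sample data in a separate frame; `toy`, `summary`, and the `min_support` threshold are illustrative names and values, not part of the original notebook.

```python
import pandas as pd

# Same sample data as the cells above (original attribute lists and ratings).
toy = pd.DataFrame({
    'Destination': ['Kochi', 'Chennai', 'Ooty', 'Pondicherry',
                    'Madurai', 'Mysore', 'Kovalam', 'Bangalore'],
    'Attributes': [
        ['Beaches', 'Temples', 'Parks'],
        ['Beaches', 'Malls', 'Parks'],
        ['Temples', 'Mountains', 'Parks'],
        ['Beaches', 'Cultural Heritage'],
        ['Temples', 'Cultural Heritage', 'Parks'],
        ['Cultural Heritage', 'Parks'],
        ['Beaches', 'Parks', 'Water Sports'],
        ['Malls', 'Parks', 'Cultural Heritage'],
    ],
    'Rating': [4.5, 4.7, 4.3, 4.6, 4.8, 4.4, 4.7, 4.2],
})

# Mean rating and number of supporting destinations per attribute.
summary = (
    toy.explode('Attributes')
       .groupby('Attributes')['Rating']
       .agg(mean_rating='mean', n_destinations='count')
       .sort_values('mean_rating', ascending=False)
)

# min_support is an assumed threshold: skip attributes seen at fewer destinations.
min_support = 2
supported = summary[summary['n_destinations'] >= min_support]

print(summary)
print("Most liked attribute with at least", min_support, "destinations:", supported.index[0])
```

With a threshold of 2, attributes that occur at only one destination no longer decide the result; the right threshold depends on how much data is available.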


In [20]:

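# Count how many destinations list each attribute.
attribute_counts = df['Attributes'].explode().value_counts()

# idxmax/idxmin return only the first label when several attributes tie;
# Mountains and Water Sports both appear once in this data.
max_attribute = attribute_counts.idxmax()
min_attribute = attribute_counts.idxmin()

print("Max Attribute of Choice:", max_attribute)
print("Min Attribute of Choice:", min_attribute)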

Max Attribute of Choice: Parks
Min Attribute of Choice: Mountains
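
`idxmin` reports a single label even when several attributes share the lowest count. A minimal sketch that lists every tied attribute, rebuilding the attribute lists so it runs on its own (`attribute_lists`, `most_common`, and `least_common` are illustrative names):

```python
import pandas as pd

# Original attribute lists, one per destination.
attribute_lists = [
    ['Beaches', 'Temples', 'Parks'],
    ['Beaches', 'Malls', 'Parks'],
    ['Temples', 'Mountains', 'Parks'],
    ['Beaches', 'Cultural Heritage'],
    ['Temples', 'Cultural Heritage', 'Parks'],
    ['Cultural Heritage', 'Parks'],
    ['Beaches', 'Parks', 'Water Sports'],
    ['Malls', 'Parks', 'Cultural Heritage'],
]

counts = pd.Series(attribute_lists).explode().value_counts()

# Keep every label that reaches the extreme count, not just the first one.
most_common = sorted(counts[counts == counts.max()].index)
least_common = sorted(counts[counts == counts.min()].index)

print("Most common:", most_common)
print("Least common:", least_common)
```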


In [21]:

# Select the destinations whose attribute list contains each feature of interest.
# 'Theatres' never appears in the sample data, so that selection is empty.
beaches_role = df[df['Attributes'].apply(lambda x: 'Beaches' in x)]
theatres_role = df[df['Attributes'].apply(lambda x: 'Theatres' in x)]
malls_role = df[df['Attributes'].apply(lambda x: 'Malls' in x)]
parks_role = df[df['Attributes'].apply(lambda x: 'Parks' in x)]

print("Destinations with Beaches:")
print(beaches_role['Destination'])
print("\nDestinations with Theatres:")
print(theatres_role['Destination'])
print("\nDestinations with Malls:")
print(malls_role['Destination'])
print("\nDestinations with Parks:")
print(parks_role['Destination'])

Destinations with Beaches:
0          Kochi
1        Chennai
3    Pondicherry
6        Kovalam
Name: Destination, dtype: object

Destinations with Theatres:
Series([], Name: Destination, dtype: object)

Destinations with Malls:
1      Chennai
7    Bangalore
Name: Destination, dtype: object

Destinations with Parks:
0        Kochi
1      Chennai
2         Ooty
4      Madurai
5       Mysore
6      Kovalam
7    Bangalore
Name: Destination, dtype: object
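
Listing the destinations shows where each feature occurs but not how much it matters. One possible way to quantify the "role" is the share of destinations offering the feature together with their mean rating. The sketch below assumes that definition; `toy`, `focus`, and `role` are illustrative names.

```python
import pandas as pd

# Same sample data as the cells above (original attribute lists and ratings).
toy = pd.DataFrame({
    'Destination': ['Kochi', 'Chennai', 'Ooty', 'Pondicherry',
                    'Madurai', 'Mysore', 'Kovalam', 'Bangalore'],
    'Attributes': [
        ['Beaches', 'Temples', 'Parks'],
        ['Beaches', 'Malls', 'Parks'],
        ['Temples', 'Mountains', 'Parks'],
        ['Beaches', 'Cultural Heritage'],
        ['Temples', 'Cultural Heritage', 'Parks'],
        ['Cultural Heritage', 'Parks'],
        ['Beaches', 'Parks', 'Water Sports'],
        ['Malls', 'Parks', 'Cultural Heritage'],
    ],
    'Rating': [4.5, 4.7, 4.3, 4.6, 4.8, 4.4, 4.7, 4.2],
})

focus = ['Beaches', 'Theatres', 'Malls', 'Parks']
rows = []
for attr in focus:
    # Boolean mask: True where the destination lists this attribute.
    has_attr = toy['Attributes'].apply(lambda attrs: attr in attrs)
    rows.append({
        'Attribute': attr,
        'Destinations': int(has_attr.sum()),
        'Share': has_attr.mean(),
        # NaN when no destination has the attribute (Theatres here).
        'MeanRating': toy.loc[has_attr, 'Rating'].mean(),
    })

role = pd.DataFrame(rows).set_index('Attribute')
print(role)
```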


In [22]:

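# NOTE: this appends the same three sports to every destination, so the counts
# below describe this synthetic addition rather than observed data. Kovalam already
# lists 'Water Sports', so it is counted twice there (9 entries for 8 destinations).
sports_list = ['Cricket', 'Football', 'Water Sports']
df['Attributes'] = df['Attributes'].apply(lambda x: x + sports_list)

# Count every attribute, then keep only the sports so that non-sport
# attributes such as Parks or Beaches do not appear in the ranking.
attribute_totals = df['Attributes'].explode().value_counts()
sports = attribute_totals[attribute_totals.index.isin(sports_list)]

sports_with_most_attributes = sports.head(5)
print("Sports with Most Attributes:")
print(sports_with_most_attributes)

Sports with Most Attributes:
Attributes
Water Sports    9
Cricket         8
Football        8
Name: count, dtype: int64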
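
Because the cell above appends the same three sports to every destination, its counts mostly reflect that appended data. A sketch that instead counts sports only where the original attribute lists mention them, assuming a hand-picked `SPORTS` set (an illustrative name, not part of the original notebook):

```python
import pandas as pd

# Original attribute lists, before any sports are appended.
attribute_lists = [
    ['Beaches', 'Temples', 'Parks'],
    ['Beaches', 'Malls', 'Parks'],
    ['Temples', 'Mountains', 'Parks'],
    ['Beaches', 'Cultural Heritage'],
    ['Temples', 'Cultural Heritage', 'Parks'],
    ['Cultural Heritage', 'Parks'],
    ['Beaches', 'Parks', 'Water Sports'],
    ['Malls', 'Parks', 'Cultural Heritage'],
]

# Assumed set of labels that should be treated as sports.
SPORTS = {'Cricket', 'Football', 'Water Sports'}

all_attributes = pd.Series(attribute_lists).explode()

# Keep only sport labels, then count how many destinations list each one.
observed_sports = all_attributes[all_attributes.isin(SPORTS)].value_counts()
print(observed_sports)
```

On this sample data only Water Sports is actually observed, at one destination, so any ranking of sports would need richer input data.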


In [23]:
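from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Join each attribute list into one string so it can be vectorized as text.
# df['Attributes'] now also contains the sports appended in the previous cell.
df['Attributes_Str'] = df['Attributes'].apply(lambda x: ' '.join(x))

# The default tokenizer splits on word boundaries, so multi-word attributes such
# as 'Cultural Heritage' become separate tokens ('cultural', 'heritage').
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['Attributes_Str'])

# Group the destinations into 3 clusters; random_state fixes the initialization.
# Cluster numbers are arbitrary labels, not a ranking.
kmeans = KMeans(n_clusters=3, random_state=42)
df['Cluster'] = kmeans.fit_predict(X)

print("Clustering Results:")
print(df[['Destination', 'Cluster']])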

Clustering Results:
   Destination  Cluster
0        Kochi        2
1      Chennai        2
2         Ooty        0
3  Pondicherry        1
4      Madurai        1
5       Mysore        1
6      Kovalam        2
7    Bangalore        1
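
The TF-IDF step treats the attributes as free text, which splits multi-word labels into separate tokens. An alternative sketch, not the approach used above, encodes each attribute as one binary column with scikit-learn's `MultiLabelBinarizer` and clusters the original attribute lists. The names `destinations`, `attribute_lists`, and `result`, and the choice of `n_init=10`, are assumptions made for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import MultiLabelBinarizer

destinations = ['Kochi', 'Chennai', 'Ooty', 'Pondicherry',
                'Madurai', 'Mysore', 'Kovalam', 'Bangalore']

# Original attribute lists, before any sports are appended.
attribute_lists = [
    ['Beaches', 'Temples', 'Parks'],
    ['Beaches', 'Malls', 'Parks'],
    ['Temples', 'Mountains', 'Parks'],
    ['Beaches', 'Cultural Heritage'],
    ['Temples', 'Cultural Heritage', 'Parks'],
    ['Cultural Heritage', 'Parks'],
    ['Beaches', 'Parks', 'Water Sports'],
    ['Malls', 'Parks', 'Cultural Heritage'],
]

# One 0/1 column per distinct attribute; multi-word labels stay intact.
mlb = MultiLabelBinarizer()
X_binary = mlb.fit_transform(attribute_lists)

# Same number of clusters as the cell above; n_init=10 is an assumed setting.
kmeans_binary = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans_binary.fit_predict(X_binary)

result = pd.DataFrame({'Destination': destinations, 'Cluster': labels})
print("Encoded attributes:", list(mlb.classes_))
print(result)
```

With only eight destinations, either encoding gives clusters that are sensitive to small changes in the data, so the assignments are best read as exploratory.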
