## Loading the data

In [1]:
from sklearn.datasets import load_iris
import pandas as pd

# Load the iris dataset
iris = load_iris()
# Convert to DataFrame
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Add the species column
df['species'] = iris.target

# Display the first few rows of the DataFrame
print(df.head())


   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                5.1               3.5                1.4               0.2   
1                4.9               3.0                1.4               0.2   
2                4.7               3.2                1.3               0.2   
3                4.6               3.1                1.5               0.2   
4                5.0               3.6                1.4               0.2   

   species  
0        0  
1        0  
2        0  
3        0  
4        0  
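
The `species` column above holds the integer codes from `iris.target`, which is why every row shown reads `0`. If readable labels are preferred, one option is to map the codes through `iris.target_names`. The sketch below does this with a `species_name` column, which is an illustrative addition and not part of the original notebook.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target

# Hypothetical extra column: map the integer codes (0, 1, 2) to species names
df['species_name'] = pd.Categorical.from_codes(iris.target, categories=iris.target_names)

# Show one row per species to see the code-to-name mapping
print(df[['species', 'species_name']].drop_duplicates())
```

The integer `species` column is kept as the label for modelling; the named column is only for reading the data.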


## Splitting the data

In [2]:
from sklearn.model_selection import train_test_split

# Features and Labels
X = df.drop('species', axis=1)
y = df['species']

# Splitting the data - 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
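
A quick sanity check on the split can be sketched as follows. It rebuilds `X` and `y` so that it runs on its own. The class-count check is an addition for illustration, not a step from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='species')

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 150 rows in total, so an 80/20 split gives 120 training rows and 30 test rows
print(X_train.shape, X_test.shape)

# Class counts in the test set; the split is not stratified, so they need not be equal
print(y_test.value_counts().sort_index())
```

If equal class proportions in both sets matter, `train_test_split` accepts `stratify=y`. The notebook does not use it.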


## Building and training the models

In [5]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Decision Tree Model
dt_model = DecisionTreeClassifier(random_state=42)
dt_model.fit(X_train, y_train)

# Random Forest Model (no random_state is set, so accuracy and importances can vary slightly between runs)
rf_model = RandomForestClassifier()
rf_model.fit(X_train, y_train)

# Predictions
dt_predictions = dt_model.predict(X_test)
rf_predictions = rf_model.predict(X_test)

# Accuracy
print("Decision Tree Accuracy:", accuracy_score(y_test, dt_predictions))
print("Random Forest Accuracy:", accuracy_score(y_test, rf_predictions))


Decision Tree Accuracy: 1.0
Random Forest Accuracy: 1.0
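
Both models score 1.0 on a 30-row test set, so this single split cannot separate them. One possible follow-up, sketched here as an assumption rather than a step from the original notebook, is 5-fold cross-validation on the full dataset. The fold count and the fixed `random_state` on the forest are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = {
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Random Forest': RandomForestClassifier(random_state=42),
}

# Collect the per-fold accuracies for each model
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    cv_scores[name] = scores
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

The mean and standard deviation across folds give a steadier basis for comparison than one accuracy figure.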


## Feature importance comparison

In [6]:
# Feature importances from both models
dt_importances = dt_model.feature_importances_
rf_importances = rf_model.feature_importances_

# Create a DataFrame for better visualization
feature_importances_df = pd.DataFrame({
    'Feature': iris.feature_names,
    'DT Importance': dt_importances,
    'RF Importance': rf_importances
})

print(feature_importances_df)


             Feature  DT Importance  RF Importance
0  sepal length (cm)       0.000000       0.076213
1   sepal width (cm)       0.016670       0.028866
2  petal length (cm)       0.906143       0.439660
3   petal width (cm)       0.077186       0.455262


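Comparing the feature importances shows how differently the two models distribute credit across the features. In the output above, the single Decision Tree assigns about 0.91 of the importance to petal length and almost none to the sepal measurements, whereas the Random Forest splits it almost evenly between petal length (0.44) and petal width (0.46). Because the Random Forest averages over many trees, each trained on a bootstrap sample with a random subset of features considered at each split, its importances tend to give a more balanced picture, especially when several features carry similar information and a single Decision Tree might pick one and largely ignore the others.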
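
The aggregated view can be made concrete. In scikit-learn, a fitted forest exposes its individual trees through `estimators_`, and the forest-level `feature_importances_` is the average of the per-tree importances. The sketch below fits its own forest on the full dataset, with `random_state=42` and `n_estimators=100` as assumptions, and also reports how much the trees disagree.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(iris.data, iris.target)

# One row of importances per tree in the forest
per_tree = np.array([tree.feature_importances_ for tree in rf.estimators_])
mean_importance = per_tree.mean(axis=0)
spread = per_tree.std(axis=0)

for name, m, s in zip(iris.feature_names, mean_importance, spread):
    print(f"{name:20s} mean={m:.3f} std={s:.3f}")

# The average over trees matches the forest-level attribute
print(np.allclose(mean_importance, rf.feature_importances_))
```

A large standard deviation for a feature means individual trees rank it very differently, which is the variation a single Decision Tree cannot show.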

### Conclusion
If the primary concern is predictive accuracy and generalizability, and computational resources are not a constraint, a Random Forest is usually the better choice.
In terms of interpretability and computational efficiency, or when working with a relatively simple dataset where overfitting is not a major concern, a single Decision Tree may be sufficient. On this test split both models reached an accuracy of 1.0.
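
To illustrate the interpretability point, a fitted Decision Tree can be printed as plain if/else rules with `sklearn.tree.export_text`. This is a minimal sketch. The `max_depth=2` limit is an assumption made to keep the output short and is not a setting from the notebook above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Shallow tree (assumed depth limit) so the printed rules stay readable
tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(iris.data, iris.target)

# Render the learned splits as indented text rules
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

A Random Forest has no equivalent single set of rules. Each of its trees can be printed this way, but the prediction comes from their combined vote.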