# Feature Selection for Survival Prediction on the Titanic Dataset

## Introduction
This analysis identifies the features most relevant to predicting passenger survival on the Titanic. Using a set of visualizations, we examine how passenger class, sex, age, fare, and embarkation point relate to survival rates. These insights will guide feature selection for building a predictive model.


### Passenger Class (Pclass)

![Survival by Passenger Class](figures/survival_by_class.png)

Passengers in higher classes (e.g., first class) had markedly higher survival rates than those in lower classes. This suggests that `Pclass` is an important predictor of survival, potentially reflecting socioeconomic status and access to resources during the evacuation.
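
The class-level comparison in the figure can also be checked numerically. The following is a minimal sketch, assuming the data sits in a pandas DataFrame with the `Pclass` column and a 0/1 `Survived` column; the helper name `survival_rate_by` and the toy rows are hypothetical and are not taken from the real dataset.

```python
import pandas as pd

# Toy rows for illustration only; NOT the real Titanic data.
toy = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Survived": [1, 1, 0, 1, 0, 0, 0, 1, 0],
})


def survival_rate_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return the passenger count and survival rate for each value of `column`."""
    return (
        df.groupby(column)["Survived"]
        # The mean of a 0/1 column is the share of survivors in the group.
        .agg(passengers="count", survival_rate="mean")
        .sort_index()
    )


rates = survival_rate_by(toy, "Pclass")
print(rates)
```

Reporting the group size next to the rate helps judge how much weight a difference between groups can bear.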


### Sex

![Survival by Sex](figures/survival_by_sex.png)

Survival rates are noticeably higher for females than for males, likely because women were given priority during the evacuation. `Sex` is thus a strong predictor of survival and should be included in the model.
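
Including `Sex` in a model usually requires turning the text labels into numbers. The sketch below shows one possible encoding, assuming the column holds the labels `male` and `female`; the function name `encode_sex` and the 0/1 assignment are illustrative choices, not something fixed by the analysis.

```python
import pandas as pd


def encode_sex(sex: pd.Series) -> pd.Series:
    """Map 'male' -> 0 and 'female' -> 1, failing loudly on anything else."""
    # Normalize case and stray whitespace before mapping (assumed cleanup step).
    encoded = sex.str.strip().str.lower().map({"male": 0, "female": 1})
    # Unmapped or missing labels become NaN; surface them instead of hiding them.
    if encoded.isna().any():
        raise ValueError("unexpected or missing value in Sex column")
    return encoded.astype("int64")


# Toy values for illustration only.
example = encode_sex(pd.Series(["male", "female", "Female ", "male"]))
print(example.tolist())
```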


### Age

![Age Distribution of Survivors vs Non-Survivors](figures/age_distribution.png)

The age distribution plot shows that children survived at a higher rate than adults. This indicates that age may influence survival, possibly because younger passengers were given priority. `Age` is therefore a relevant feature for prediction.
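
One way to make the child-versus-adult comparison explicit is to bin `Age` into bands and compute the survival rate per band. This is a rough sketch: the band edges (12, 18, 60), the labels, and the toy rows are assumptions chosen for illustration, and rows with a missing age are simply left out of the bands.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; NOT the real Titanic data.
toy = pd.DataFrame({
    "Age":      [4, 9, 15, 22, 35, 47, 63, np.nan],
    "Survived": [1, 1, 0, 0, 1, 0, 0, 1],
})

# Hypothetical band edges; right=False makes each interval [low, high).
bins = [0, 12, 18, 60, np.inf]
labels = ["child", "teen", "adult", "senior"]
toy["AgeBand"] = pd.cut(toy["Age"], bins=bins, labels=labels, right=False)

# Rows with a missing Age get no band and are excluded from the groupby.
band_rates = toy.groupby("AgeBand", observed=False)["Survived"].mean()
missing_ages = int(toy["Age"].isna().sum())

print(band_rates)
print("rows without an age:", missing_ages)
```

Counting the rows without an age alongside the rates shows how many passengers the comparison leaves out.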


### Fare

![Fare Distribution by Survival](figures/fare_distribution.png)

Passengers who paid higher fares tended to have higher survival rates. `Fare` could serve as a proxy for socioeconomic status, making it relevant for survival prediction, although for the same reason it may carry information that overlaps with `Pclass`.
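
The fare trend can be summarized by splitting passengers into fare quartiles and comparing survival rates across them. The snippet below is a minimal sketch under that assumption; the quartile labels and the toy fares are hypothetical.

```python
import pandas as pd

# Toy rows for illustration only; NOT the real Titanic data.
toy = pd.DataFrame({
    "Fare":     [7.25, 8.05, 13.0, 26.0, 30.5, 52.0, 71.28, 151.55],
    "Survived": [0, 0, 0, 1, 0, 1, 1, 1],
})

# qcut builds bins with (roughly) equal numbers of passengers in each.
toy["FareQuartile"] = pd.qcut(toy["Fare"], q=4, labels=["Q1", "Q2", "Q3", "Q4"])

quartile_rates = toy.groupby("FareQuartile", observed=False)["Survived"].mean()
print(quartile_rates)
```

If many passengers share the same fare, `pd.qcut` can fail because of duplicate bin edges; its `duplicates="drop"` option is one way to handle that, at the cost of fewer bins.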


### Embarkation Point (Embarked)

![Survival Rate by Embarkation Point](figures/survival_by_embarked.png)

Survival rates vary slightly by embarkation point. While this feature may not be as strong as the others, it could still provide additional context, especially in combination with `Pclass`.
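
To see how `Embarked` behaves in combination with `Pclass`, the two can be cross-tabulated. The following sketch assumes the port codes `S`, `C`, and `Q`; the toy rows are made up for illustration and say nothing about the real distribution.

```python
import pandas as pd

# Toy rows for illustration only; NOT the real Titanic data.
toy = pd.DataFrame({
    "Embarked": ["S", "S", "S", "C", "C", "C", "Q", "Q"],
    "Pclass":   [3, 3, 1, 1, 1, 3, 3, 3],
    "Survived": [0, 0, 1, 1, 1, 0, 1, 0],
})

# Survival rate for every (port, class) cell; empty cells show up as NaN.
rate_table = toy.pivot_table(
    index="Embarked", columns="Pclass", values="Survived", aggfunc="mean"
)

# Passenger counts for the same cells, to show how well each rate is supported.
count_table = pd.crosstab(toy["Embarked"], toy["Pclass"])

print(rate_table)
print(count_table)
```

If the differences between ports shrink once class is held fixed, that would suggest `Embarked` mostly reflects the class mix at each port.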


## Conclusion

This analysis highlights the features most likely to be effective predictors of survival on the Titanic. `Pclass`, `Sex`, and `Age` emerge as the most important, given the survival differences they show in the plots above. `Fare` and `Embarked` may also provide value, especially when combined with other features. These features will be considered when developing a survival prediction model, as they capture the passenger demographics and socioeconomic factors that influenced survival.
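
As a starting point for the modeling step, the selected columns could be assembled into a numeric feature matrix. The sketch below is one possible preparation, not a prescribed pipeline: the function name `build_feature_matrix`, the median imputation of `Age`, and the one-hot encoding of `Embarked` are all assumptions made for illustration, and the rows are toy values.

```python
import numpy as np
import pandas as pd

FEATURES = ["Pclass", "Sex", "Age", "Fare", "Embarked"]


def build_feature_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Turn the selected columns into an all-numeric feature matrix."""
    X = df[FEATURES].copy()
    # Binary encoding for Sex (assumes the labels 'male' and 'female').
    X["Sex"] = X["Sex"].map({"male": 0, "female": 1})
    # Placeholder imputation: fill missing ages with the median of the rows
    # passed in. With a train/test split, compute it on the training rows only.
    X["Age"] = X["Age"].fillna(X["Age"].median())
    # One indicator column per embarkation port.
    X = pd.get_dummies(X, columns=["Embarked"], prefix="Embarked", dtype=int)
    return X


# Toy rows for illustration only; NOT the real Titanic data.
toy = pd.DataFrame({
    "Pclass":   [1, 3, 2],
    "Sex":      ["female", "male", "male"],
    "Age":      [29.0, np.nan, 41.0],
    "Fare":     [80.0, 7.9, 13.0],
    "Embarked": ["C", "S", "Q"],
})

X = build_feature_matrix(toy)
print(X)
```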
