# Coastal Hazard Assessment Project

## Introduction
This notebook documents how coastal environmental data is collected with a custom Android application and then analyzed. The data supports coastal hazard assessment and includes observations of sea life, plant life, and beach composition.

### Background
The application was built specifically for this project. Users submit observations of environmental variables such as the types of sea life present, beach composition, and other relevant metrics.

![Android App Screenshot](INSERT_IMAGE_LINK_HERE)

### Data Collection
The application allows users to input data while on-site at various beach locations. This data includes geographical coordinates, environmental conditions, and other pertinent observations.

## Data Preprocessing
The collected data is stored in Google Firestore and periodically exported to CSV format for analysis. This section will discuss the process of transforming raw data from Firestore into a structured format suitable for analysis.
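As a rough illustration of the transformation step, the sketch below flattens nested records into CSV rows using only the standard library. It assumes each submission arrives as a nested dictionary, as a document store typically yields. The field names (`beach`, `location`, `lat`, `lon`) and the helper names `flatten` and `records_to_csv` are hypothetical and not taken from the project's actual schema.

```python
import csv
import io

def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts: {"location": {"lat": 1}} -> {"location_lat": 1}."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

def records_to_csv(records):
    """Write flattened records to CSV text; missing fields become empty cells."""
    rows = [flatten(r) for r in records]
    # Union of all keys in first-seen order, so sparse submissions still fit.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical submissions (made-up values, assumed field names).
sample_records = [
    {"beach": "North Cove", "location": {"lat": 50.1, "lon": -5.2}, "Barnacles": 3},
    {"beach": "South Spit", "location": {"lat": 50.3, "lon": -5.0}, "Mussels": 1},
]
csv_text = records_to_csv(sample_records)
print(csv_text)
```

Taking the union of keys across all records means a submission that omits a field still produces a well-formed row, with the missing value left empty for pandas to read as `NaN`.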



## Feature Engineering

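To facilitate analysis, we derive three metrics from the collected data:

- **Sea Life Metric**: A composite score reflecting the diversity and abundance of sea life at each location. In this notebook it is the mean of the Anemones, Barnacles, Mussels, Oysters, and Snails columns.
- **Plant Life Metric**: A similar composite score for plant life. The current dataset has no plant columns, so it is set to zero as a placeholder.
- **Beach Composition Index**: An index representing the composition of the beach based on the proportions of sand, pebbles, rocks, and boulders. In this notebook it is the mean of the Sand, Pebbles, Rocks, and Stone columns.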
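For the Beach Composition Index, one possible proportion-based formulation is sketched below. It is only an illustration and not the method this notebook uses. The coarseness weights (1 to 4), the `Boulders` key, and the function name `composition_index` are assumptions introduced for the example.

```python
# Illustrative coarseness weights (assumed, not from the project): finer -> lower.
COARSENESS = {"Sand": 1, "Pebbles": 2, "Rocks": 3, "Boulders": 4}

def composition_index(observation):
    """Return a proportion-weighted coarseness score in [1, 4], or None if empty."""
    total = sum(observation.get(name, 0) for name in COARSENESS)
    if total == 0:
        return None  # no substrate recorded for this observation
    return sum(
        COARSENESS[name] * observation.get(name, 0) / total
        for name in COARSENESS
    )

# Made-up observation: mostly sand with some pebbles and rocks.
print(composition_index({"Sand": 6, "Pebbles": 2, "Rocks": 2, "Boulders": 0}))
```

Normalizing by the row total makes the index depend on the mix of substrates and not on how large the raw scores are, so observations recorded on different scales stay comparable.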



In [None]:
# Load the dataset
import pandas as pd

df = pd.read_csv('data.csv')  # Update this with the current CSV file
df.head()


In [None]:
# Feature engineering
import pandas as pd

# Load the dataset
file_path = '/mnt/data/data.csv'
data = pd.read_csv(file_path)

# Sea Life Metric: row-wise mean of the five sea life columns
sea_life_cols = ['Anemones', 'Barnacles', 'Mussels', 'Oysters', 'Snails']
data['Sea_Life_Metric'] = data[sea_life_cols].mean(axis=1)

# Plant Life Metric: the dataset has no plant columns yet, so this is a
# placeholder of zero. Replace it once plant observations are available.
data['Plant_Life_Metric'] = 0

# Composition Index: row-wise mean of the four beach composition columns
composition_cols = ['Sand', 'Pebbles', 'Rocks', 'Stone']
data['Composition_Index'] = data[composition_cols].mean(axis=1)

# Display the first few rows of the updated dataframe
data.head()


## Visual Analysis: Sea Life Distribution

The first part of our analysis looks at how sea life relates to beach composition. Visualizing this relationship helps identify patterns and areas of rich biodiversity, which matter for ecological balance and marine life. The scatter plot below, with a fitted regression line, plots the Sea Life Metric against the Composition Index for each observation.

Interpretation: [Here, describe what the scatter plot and regression line reveal about sea life distribution and any patterns or outliers you observe.]

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Scatter plot with regression line.
# Uses `data`, the dataframe that holds the engineered metric columns.
sns.lmplot(x='Composition_Index', y='Sea_Life_Metric', data=data, aspect=2, line_kws={'color': 'red'})
plt.title('Relationship between Composition Index and Sea Life Metric')
plt.xlabel('Composition Index')
plt.ylabel('Sea Life Metric')
plt.show()
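The plot draws a regression line but does not report its coefficients. To put numbers on the trend, a minimal least-squares sketch in plain Python might look like the following. The helper name `linear_fit` and the sample values are made up for illustration and are not project data.

```python
from math import sqrt

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r is undefined when y is constant (e.g. a placeholder metric of all zeros).
    r = sxy / sqrt(sxx * syy) if syy > 0 else float("nan")
    return slope, intercept, r

# Made-up values standing in for the two engineered columns.
composition = [1.0, 2.0, 3.0, 4.0]
sea_life = [2.0, 4.1, 5.9, 8.0]
slope, intercept, r = linear_fit(composition, sea_life)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

With the notebook's dataframe, the same function could be called as `linear_fit(data['Composition_Index'].tolist(), data['Sea_Life_Metric'].tolist())`, assuming neither column contains missing values.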


## Visual Analysis: Plant Life Distribution

Next, we examine plant life along the coast. As with sea life, this visualization is meant to identify areas with rich plant biodiversity, which can indicate healthy coastal ecosystems. The scatter plot and regression line below plot the Plant Life Metric against the Composition Index. Because the Plant Life Metric is currently a placeholder of zero, the plot will be flat until plant observations are added to the dataset.

Interpretation: [Here, describe what the scatter plot and regression line reveal about plant life distribution and any notable trends or deviations.]

In [None]:
# Scatter plot with regression line.
# Uses `data`, the dataframe that holds the engineered metric columns.
sns.lmplot(x='Composition_Index', y='Plant_Life_Metric', data=data, aspect=2, line_kws={'color': 'green'})
plt.title('Relationship between Composition Index and Plant Life Metric')
plt.xlabel('Composition Index')
plt.ylabel('Plant Life Metric')
plt.show()


## Visual Analysis: Composition Index

We also developed a Composition Index to quantify the physical makeup of the beach, including the presence of sand, pebbles, rocks, and boulders. This index can help us understand how the beach's physical characteristics might contribute to coastal erosion or habitat stability. The two plots above use it as the explanatory variable. The scatter plot below complements them by comparing the Sea Life Metric and the Plant Life Metric directly.

Interpretation: [Here, explain the implications of the Composition Index values, any correlations with other features, and what this means for coastal health.]

In [None]:
# Scatter plot with regression line.
# Uses `data`, the dataframe that holds the engineered metric columns.
sns.lmplot(x='Sea_Life_Metric', y='Plant_Life_Metric', data=data, aspect=2, line_kws={'color': 'blue'})
plt.title('Interrelation between Sea Life and Plant Life Metrics')
plt.xlabel('Sea Life Metric')
plt.ylabel('Plant Life Metric')
plt.show()


## Conclusion and Recommendations

Based on our visual analyses and feature engineering, we can draw several conclusions about coastal health and hazards. Areas with higher biodiversity in sea life and plant life tend to indicate healthier ecosystems. However, the physical composition of the beaches also plays a critical role in determining their vulnerability to erosion and other hazards.

Recommendations: [Here, provide recommendations based on your findings, such as conservation efforts, further studies needed, or policy suggestions.]