# **Setup, Data Preparation Imports, and Initial Visualization** 🛠️

This cell runs the first end-to-end pass of the earthquake prediction project. It imports the libraries for data handling, machine learning, and visualization, loads the training data and fills in missing values, then defines and trains a neural network classifier and plots its training history.

### 📚 **Core Libraries**
* **`csv`** and **`numpy` (`np`)**: `numpy` loads the CSV file (`np.genfromtxt`) and handles the numerical/array operations; `csv` is imported but not used in this cell.
* **`tensorflow` (`tf`)**: The primary library for **building, compiling, and training the neural network model**.
* **`matplotlib.pyplot` (`plt`)**: Essential for **visualization** of the training results (accuracy and loss).

### ⚙️ **Data Preparation**
* **`sklearn.impute.SimpleImputer`**: A crucial utility for **handling missing data** (NaN values).
* The code initializes an imputer (`imp`) with the **mean strategy**, which replaces each missing feature value with the mean of its column so that no NaN values reach the model during training.
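
To make the mean strategy concrete, here is a minimal NumPy sketch of the calculation it performs per column. The helper name `impute_column_means` and the sample array are hypothetical, and this is an illustration of the idea rather than scikit-learn's actual implementation. The fallback of `0.0` for a column with no observed values is an assumption meant to mirror what `keep_empty_features=True` is documented to do for the mean strategy.

```python
import numpy as np


def impute_column_means(features):
    """Replace each NaN with the mean of the observed values in its column."""
    features = np.asarray(features, dtype=float)
    missing = np.isnan(features)

    # Sum and count only the observed (non-NaN) entries of every column.
    observed_count = (~missing).sum(axis=0)
    column_sums = np.where(missing, 0.0, features).sum(axis=0)

    # Assumption: a column with no observed values is filled with 0.0.
    column_means = np.divide(
        column_sums,
        observed_count,
        out=np.zeros(features.shape[1]),
        where=observed_count > 0,
    )

    # Keep observed values, substitute the column mean where data is missing.
    return np.where(missing, column_means, features)


# Hypothetical 3x3 sample: the last column is entirely missing.
sample = np.array([
    [1.0, np.nan, np.nan],
    [3.0, 4.0, np.nan],
    [np.nan, 8.0, np.nan],
])
filled = impute_column_means(sample)
print(filled)
```

In this sample the first two columns are filled with their observed means (2.0 and 6.0), and the all-missing third column is kept and filled with 0.0 instead of being dropped.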

### 📈 **Post-Training Visualization**
* The final lines plot the **model's training history** (`fitting.history`).
* It separates the plots for **accuracy** and **loss** to visually assess model performance and convergence over the training epochs.
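
Alongside reading the plots by eye, convergence can also be checked numerically. The sketch below is one possible approach, not part of the original cell: the helper `first_stalled_epoch` and its `patience` and `min_delta` parameters are hypothetical names, loosely modelled on the idea behind early stopping. It scans a list of per-epoch losses, such as `fitting.history['loss']`, and reports the first epoch at which the loss has gone `patience` epochs without a meaningful improvement.

```python
def first_stalled_epoch(losses, patience=5, min_delta=1e-4):
    """Return the first epoch index at which the loss has failed to improve
    by more than `min_delta` for `patience` consecutive epochs, or None."""
    best = float("inf")
    epochs_without_improvement = 0

    for epoch, loss in enumerate(losses):
        if best - loss > min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch

    return None  # The loss was still improving when the history ended.


# Hypothetical loss curve that flattens out after the third epoch.
toy_losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7]
print(first_stalled_epoch(toy_losses, patience=3))
```

If this returns an epoch far below the configured number of epochs, that suggests most of the later training added little, which is worth comparing against the loss plot.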

**In summary, this cell imports the machine learning toolkit, fills missing feature values with column means, trains a small dense classifier, and plots its training accuracy and loss.**


In [None]:
# Import necessary libraries
import csv
import numpy as np
import tensorflow as tf
from sklearn.impute import SimpleImputer
import matplotlib.pyplot as plt


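# Initialize an imputer that replaces NaN values with the column mean
# keep_empty_features=True keeps all-NaN columns in the output (requires scikit-learn 1.2 or newer)
imp = SimpleImputer(missing_values=np.nan, strategy='mean', keep_empty_features=True)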

# Load the dataset from 'train.csv'
# skip_header=1 skips the header row
data = np.genfromtxt('train.csv', delimiter=',', skip_header=1)

# Separate features (inputs) and labels (outputs)
# Columns 0-34 (the first 35) are the features; the label is read from
# column index 36 (the 37th column), so column index 35 is not used here
inputs = data[:, 0:35]
outputs = data[:, 36]

# Extract the 7th feature column (index 6) for a quick visual check (optional)
# This runs before imputation, so the values may still contain NaN
test = inputs[:, 6]
print("Sample of the 7th feature column:", test[:10])  # Print the first 10 values

# Fit the imputer on the training data and replace missing values in one step
inputs = imp.fit_transform(inputs)

# Define the deep learning model structure using a Sequential model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),  # Flatten each sample to 1-D (no effect on rows that are already flat)
    tf.keras.layers.Dense(55, activation='relu'),  # First dense layer with ReLU activation
    tf.keras.layers.Dense(20, activation='relu'),  # Second dense layer with ReLU activation
    tf.keras.layers.Dense(20, activation='relu'),  # Third dense layer with ReLU activation
    tf.keras.layers.Dense(20, activation='relu'),  # Fourth dense layer with ReLU activation
    tf.keras.layers.Dense(20, activation='relu'),  # Fifth dense layer with ReLU activation
    tf.keras.layers.Dense(4, activation='softmax')  # Output layer: softmax over 4 classes
])

# Compile the model
model.compile(
    loss='sparse_categorical_crossentropy',  # Loss for multi-class classification with integer labels
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),  # Adam optimizer with a small learning rate
    metrics=['accuracy'],  # Metric to monitor during training
)

# Train the model
# fitting stores the training history (per-epoch loss and accuracy)
fitting = model.fit(
    inputs,  # Input features
    outputs,  # Output labels
    epochs=100000  # Number of training epochs (a very long run; no validation data is used)
)

# Print the training accuracy history
print("Training Accuracy History:", fitting.history['accuracy'])

# Visualize the training accuracy
plt.plot(fitting.history['accuracy'])
plt.title('Model Accuracy') # Add a title to the plot
plt.xlabel('Epoch') # Add a label to the x-axis
plt.ylabel('Accuracy') # Add a label to the y-axis
plt.show() # Display the plot

# Visualize the training loss
plt.plot(fitting.history['loss'])
plt.title('Model Loss') # Add a title to the plot
plt.xlabel('Epoch') # Add a label to the x-axis
plt.ylabel('Loss') # Add a label to the y-axis
plt.show() # Display the plot
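
Because the model pairs a 4-unit softmax output with `sparse_categorical_crossentropy`, the labels are expected to be integer class indices from 0 to 3. The sketch below is an optional sanity check that could be run on `outputs` before training. The helper name `check_sparse_labels` is hypothetical, and the assumption that the label column holds class indices (rather than, say, raw magnitudes) is not confirmed by the code above.

```python
import numpy as np


def check_sparse_labels(labels, num_classes):
    """Validate labels for sparse categorical cross-entropy and return them
    as an int64 array. Raises ValueError if any label is unusable."""
    labels = np.asarray(labels, dtype=float)

    if np.isnan(labels).any():
        raise ValueError("labels contain NaN values")
    if not np.all(labels == np.round(labels)):
        raise ValueError("labels must be whole numbers (class indices)")
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError(
            f"labels must lie in [0, {num_classes - 1}], "
            f"got range [{labels.min():g}, {labels.max():g}]"
        )

    return labels.astype(np.int64)


# np.genfromtxt yields floats, so whole-number floats are accepted here.
clean = check_sparse_labels([0.0, 3.0, 1.0], num_classes=4)
print(clean)
```

In the notebook this would be called as `outputs = check_sparse_labels(outputs, num_classes=4)` after the feature/label split, so that a mislabeled or shifted column fails early instead of during training.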