# **ADD HERE THE NOTEBOOK NAME**

## Objectives

* Write here your notebook objective, for example, "Fetch data from Kaggle and save as raw data", or "engineer features for modelling"

## Inputs

* Write here which data or information you need to run the notebook

## Outputs

* Write here which files, code or artefacts you generate by the end of the notebook

## Additional Comments


* In case you have any additional comments that don't fit in the previous bullets, please state them here.


---

# Install Python packages in the notebook

<span style="color:red;">IMPORTANT: In the path below, replace the workspace folder ("ml-template-forked" in the original template) with the name you gave your GitHub/GitPod workspace.</span>

In [1]:
%pip install -r /workspace/manned-unmanned-airplane-classifer/requirements.txt

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


---

# Change working directory

* We assume the notebooks are stored in a subfolder, so when you run a notebook in the editor you need to change the working directory.

We need to change the working directory from its current folder to its parent folder.
* We access the current directory with os.getcwd()

In [2]:
import os
current_dir = os.getcwd()
current_dir

'/workspace/manned-unmanned-airplane-classifer/jupyter_notebooks'

We want to make the parent of the current directory the new current directory.
* os.path.dirname() gets the parent directory
* os.chdir() defines the new current directory

In [3]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [4]:
current_dir = os.getcwd()
current_dir

'/workspace/manned-unmanned-airplane-classifer'
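
Re-running the cell that calls os.chdir() moves the working directory up one more level each time. A minimal sketch of a guard against that, assuming the notebooks live in a folder named `jupyter_notebooks` (as in the path shown earlier); the helper name `move_to_project_root` is hypothetical and not part of the template:

```python
import os
from pathlib import Path


def move_to_project_root(notebook_folder="jupyter_notebooks"):
    """Move to the parent directory only if we are still inside the notebook folder."""
    cwd = Path.cwd()
    if cwd.name == notebook_folder:
        os.chdir(cwd.parent)
    # Return the working directory after the (possible) change
    return Path.cwd()
```

Calling `move_to_project_root()` a second time leaves the working directory unchanged, because the current folder is then no longer named `jupyter_notebooks`.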

---

## Table of Contents

- [Section 1](#section-1)
- [Section 2](#section-2)
- [Save files to workspace](#save-files-to-workspace)


---

In [None]:
# --- Step 1: Import Libraries
import joblib
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Biplanes vs Monoplanes from the 1930s to the early 1940s

In [None]:
# --- Step 2: Load Test Data and Models
X_train, X_test, y_train, y_test = joblib.load('aircraft_data_scaled.pkl')
models = {
    'KNN': joblib.load('model_knn.pkl'),
    'Random Forest': joblib.load('model_random_forest.pkl'),
    'SVM': joblib.load('model_svm.pkl'),
    'Neural Net': joblib.load('model_neural_net.pkl')
}
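
Loading fails on the first file that is missing, so it can help to check all expected artefacts up front. A minimal sketch, assuming the `.pkl` files named in Step 2 sit in the current working directory; the helper `find_missing_files` is hypothetical:

```python
from pathlib import Path

# File names taken from Step 2 of this notebook.
EXPECTED_FILES = [
    "aircraft_data_scaled.pkl",
    "model_knn.pkl",
    "model_random_forest.pkl",
    "model_svm.pkl",
    "model_neural_net.pkl",
]


def find_missing_files(filenames, base_dir="."):
    """Return the names in `filenames` that do not exist as files under `base_dir`."""
    base = Path(base_dir)
    return [name for name in filenames if not (base / name).is_file()]


missing = find_missing_files(EXPECTED_FILES)
if missing:
    print(f"Missing files: {missing}")
```

An empty list means every file is in place and the loading cell should be able to find them.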

In [None]:
# --- Step 3: Evaluate and Compare All Models
summary = []
for name, model in models.items():
    y_pred = model.predict(X_test)
    report = classification_report(y_test, y_pred, output_dict=True)
    summary.append({
        'Model': name,
        'Accuracy': report['accuracy'],
        'Precision (class 0)': report['0']['precision'],
        'Recall (class 0)': report['0']['recall'],
        'Precision (class 1)': report['1']['precision'],
        'Recall (class 1)': report['1']['recall']
    })

summary_df = pd.DataFrame(summary)
print(summary_df.round(2))
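
To make the summary columns concrete, here is a small sketch that computes per-class precision and recall by hand. The toy labels are made up for illustration and are not taken from the aircraft data; the helper `precision_recall` is hypothetical. Precision for a class is the share of predictions of that class that were correct; recall is the share of actual members of that class that were found.

```python
def precision_recall(y_true, y_pred, label):
    """Compute (precision, recall) for one class label."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == label and p == label)
    predicted = sum(1 for _, p in pairs if p == label)  # times the class was predicted
    actual = sum(1 for t, _ in pairs if t == label)     # times the class truly occurred
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Toy labels, made up for illustration only.
y_true_toy = [0, 0, 1, 1, 1]
y_pred_toy = [0, 1, 1, 1, 0]

for label in (0, 1):
    p, r = precision_recall(y_true_toy, y_pred_toy, label)
    print(f"class {label}: precision={p:.2f}, recall={r:.2f}")
```

Note that the keys `'0'` and `'1'` used in Step 3 assume the target labels are the integers 0 and 1; with other labels, the report dictionary is keyed by the string form of those labels instead.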

In [None]:
# --- Step 4: Confusion Matrix Visualization for Best Model
best_model_name = summary_df.sort_values(by='Accuracy', ascending=False).iloc[0]['Model']
print(f"\nBest Model: {best_model_name}")

best_model = models[best_model_name]
y_pred = best_model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title(f'Confusion Matrix - {best_model_name}')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
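
As a reading aid for the heatmap, the sketch below builds a confusion matrix by hand with the same layout scikit-learn uses: rows are actual classes and columns are predicted classes. The toy labels are made up for illustration, and the helper `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred, labels=(0, 1)):
    """Build a confusion matrix as nested lists: rows are actual, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Toy labels, made up for illustration only.
toy_matrix = confusion_counts([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
print(toy_matrix)  # [[1, 1], [1, 2]]
```

The diagonal holds the correct predictions; each off-diagonal cell counts samples of the row's class that were predicted as the column's class.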


---

# Section 2


Section 2 content

---

# Save files to workspace

We will generate the following files
* Train set
* Test set
* Data cleaning and Feature Engineering pipeline
* Modeling pipeline
* etc.

In [None]:
topic = 'topic'  # datasets
notebook = 'notebook'  # collections
version = 'v1'
# Build outputs/<topic>/<notebook>/<version>
file_path = f'outputs/{topic}/{notebook}/{version}'

try:
    os.makedirs(name=file_path)
except Exception as e:
    # Typically raised when the folder already exists
    print(e)
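
The try/except in the cell above prints any error, including the expected "folder already exists" message on a re-run. A minimal alternative sketch uses `exist_ok=True`, so re-running is silent while other errors (for example, missing permissions) still raise. The helper name `make_output_dir` and its `root` parameter are hypothetical, not part of the template:

```python
import os


def make_output_dir(topic, notebook, version, root="outputs"):
    """Create root/topic/notebook/version if needed and return the path."""
    path = os.path.join(root, topic, notebook, version)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path
```

For example, `make_output_dir("datasets", "collection", "v1")` would create `outputs/datasets/collection/v1` relative to the current working directory.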

In [None]:
import os
try:
    os.makedirs(name='outputs/datasets/collection')  # create outputs/datasets/collection folder
except Exception as e:
    print(e)

# `df` must be a DataFrame defined in an earlier cell
df.to_csv("outputs/datasets/collection/TelcoCustomerChurn.csv", index=False)

## Train Set

Note that ...

In [None]:
print(X_train.shape)
X_train.head()

X_train.to_csv(f"{file_path}/X_train.csv", index=False)

In [None]:
y_train

In [None]:
y_train.to_csv(f"{file_path}/y_train.csv", index=False)