## Step 1: Load the Saved Model

In [None]:
from tensorflow.keras.models import load_model

# Load the saved models (one per property type)
nn_model_sf = load_model('../model/single_family_nn_model.keras')
nn_model_th = load_model('../model/townhouse_nn_model.keras')

## Step 2: Prepare the New Dataset
- Preprocess the new dataset exactly as you preprocessed the dataset used to train the model. This includes:
    - Handling Missing Values: Fill or drop missing data the same way you did for the training data.
    - Feature Scaling: Reuse the scaler (e.g., StandardScaler) that was fitted on the original training data. Load the saved scaler and apply its transform method to the new data; do not fit a new scaler on the new data.
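
To see why the scaler has to be reused rather than refitted, here is a minimal sketch of what standardization does, written with plain NumPy instead of StandardScaler. The arrays and their values are made up for illustration; the point is that the mean and standard deviation come from the training data only.

```python
import numpy as np

# Hypothetical feature matrices (rows = samples, columns = features)
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_new = np.array([[2.0, 400.0]])

# Statistics are computed once, from the training data only
train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# New data is standardized with the training statistics, not its own
X_new_scaled = (X_new - train_mean) / train_std
print(X_new_scaled)
```

Refitting on the new data would shift and rescale the features differently from what the models saw during training, so their predictions would no longer be meaningful.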


In [None]:
# Load and preprocess the new dataset
import pandas as pd

new_data = pd.read_csv('new_dataset.csv')

# Convert 'Date' to datetime if present
if 'Date' in new_data.columns:
    new_data['Date'] = pd.to_datetime(new_data['Date'], format='%Y-%m')

# Remove commas and convert GDP column to numeric if present
if 'All industries GDP' in new_data.columns:
    new_data['All industries GDP'] = new_data['All industries GDP'].replace({',': ''}, regex=True).astype(float)

# Handle missing values
new_data.fillna(new_data.mean(numeric_only=True), inplace=True)

# Select features (excluding target variables if present)
X_new = new_data.drop(columns=['Date', 'Single_Family_Benchmark_SA', 'Townhouse_Benchmark_SA'], errors='ignore')

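# Apply the scaler that was fitted on the training data
# (assumes it was saved as 'scaler.pkl'; see "Save and Load the Scaler" below)
import pickle
with open('scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)
X_new_scaled = scaler.transform(X_new)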

## Step 3: Make Predictions
Use the loaded model to make predictions on the new dataset.

In [None]:
# Predict Single Family Benchmark SA for the new data
predictions_sf = nn_model_sf.predict(X_new_scaled)

# Predict Townhouse Benchmark SA for the new data
predictions_th = nn_model_th.predict(X_new_scaled)

## Step 4: Interpret the Predictions
The outputs **predictions_sf** and **predictions_th** are NumPy arrays of predicted values for Single_Family_Benchmark_SA and Townhouse_Benchmark_SA, respectively, with one row per row of the new dataset.
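
As a small illustration of the shapes involved, the sketch below uses a stand-in array in place of a real call to predict. It assumes each model ends in a single output unit, so the result has shape (n_samples, 1); the numbers are made up.

```python
import numpy as np

# Stand-in for nn_model_sf.predict(X_new_scaled): one column, one row per sample
predictions_sf = np.array([[1250000.0], [1310000.0], [1295000.0]])

# flatten() turns the (n_samples, 1) array into a 1-D array of length n_samples,
# which is the shape a DataFrame column expects
flat = predictions_sf.flatten()
print(predictions_sf.shape, flat.shape)
```

This is why the predictions are flattened before being placed into DataFrame columns.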

In [None]:
# Convert predictions to a DataFrame (optional)
predictions_df = pd.DataFrame({
    'Predicted_Single_Family_Benchmark_SA': predictions_sf.flatten(),
    'Predicted_Townhouse_Benchmark_SA': predictions_th.flatten()
})

# Optionally, save the predictions to a CSV file
predictions_df.to_csv('predicted_house_prices.csv', index=False)


## Additional Considerations:
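- **Consistency:** Ensure that the features in your new dataset match the features used to train the model, in both name and order. A mismatch in the number of columns raises an error, and a mismatch in column order can produce wrong predictions without any error.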
- **Scaling:** Always apply the same scaler used during training to your new data. If the scaler was saved, load it and use it to transform the new data.
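
One way to enforce the consistency point is to compare column names before scaling. The helper below is a hypothetical sketch, not part of the original notebook; it assumes you kept the list of training feature names (for example, by saving it alongside the scaler).

```python
def check_feature_columns(expected, actual):
    """Raise ValueError if `actual` differs from `expected` in content or order."""
    expected, actual = list(expected), list(actual)
    missing = [c for c in expected if c not in actual]
    extra = [c for c in actual if c not in expected]
    if missing or extra:
        raise ValueError(f"Column mismatch. Missing: {missing}; unexpected: {extra}")
    if expected != actual:
        raise ValueError("Columns match but are in a different order than during training")


# Hypothetical usage: training_columns is the list of feature names saved at training time
# check_feature_columns(training_columns, X_new.columns)
```

Calling **check_feature_columns** just before the scaler is applied turns a silent ordering problem into an explicit error.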

By following these steps, you can use your saved neural network models to make predictions on new datasets while keeping your preprocessing consistent with the original training process.

# Automation

To automate the process of loading a saved model, preprocessing a new dataset, making predictions, and saving the results, you can create a Python script or function. 

Below is an example of how to structure this into a reusable function.

## Step 1: Create the Automation Function

In [None]:
import pandas as pd
from tensorflow.keras.models import load_model

def predict_house_prices(model_path, scaler, input_csv, output_csv):
    """Load a saved model, preprocess input_csv, and write the predictions to output_csv."""
    # Load the saved model
    model = load_model(model_path)

    # Load the new dataset
    new_data = pd.read_csv(input_csv)

    # Convert 'Date' to datetime if present
    if 'Date' in new_data.columns:
        new_data['Date'] = pd.to_datetime(new_data['Date'], format='%Y-%m')

    # Remove commas and convert GDP column to numeric if needed
    if 'All industries GDP' in new_data.columns:
        new_data['All industries GDP'] = new_data['All industries GDP'].replace({',': ''}, regex=True).astype(float)

    # Handle missing values
    new_data.fillna(new_data.mean(numeric_only=True), inplace=True)

    # Select features (excluding target variables if present)
    features_to_drop = ['Date', 'Single_Family_Benchmark_SA', 'Townhouse_Benchmark_SA']
    X_new = new_data.drop(columns=features_to_drop, errors='ignore')

    # Apply the same scaling as was used for the training data
    X_new_scaled = scaler.transform(X_new)

    # Make predictions
    predictions = model.predict(X_new_scaled)

    # Create a DataFrame with the predictions
    predictions_df = pd.DataFrame(predictions, columns=['Predicted_House_Prices'])

    # Combine with the original data if needed
    result_df = pd.concat([new_data, predictions_df], axis=1)

    # Save the predictions to a CSV file
    result_df.to_csv(output_csv, index=False)
    print(f'Predictions saved to {output_csv}')

# Example usage
if __name__ == "__main__":
    # Load the scaler fitted during training (assumes it was saved as a pickle file)
    import pickle
    with open('scaler.pkl', 'rb') as f:
        scaler = pickle.load(f)
    
    # Predict Single Family prices
    predict_house_prices('single_family_nn_model.keras', scaler, 'new_dataset.csv', 'predicted_single_family_prices.csv')

    # Predict Townhouse prices
    predict_house_prices('townhouse_nn_model.keras', scaler, 'new_dataset.csv', 'predicted_townhouse_prices.csv')


## Step 2: Save and Load the Scaler
When you trained your model, you fitted a StandardScaler to standardize the features.

Save that fitted scaler so that you can load it again when predicting on new data:

In [None]:
# Saving the scaler
import pickle

# Assuming you have already fitted the scaler
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

Loading the Scaler:

- The **predict_house_prices** function above assumes the scaler is loaded before being passed to the function.
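
A minimal sketch of the loading side is shown below. The helper name **load_scaler** is hypothetical; the default file name scaler.pkl matches the one used in the saving cell above.

```python
import pickle
from pathlib import Path


def load_scaler(path='scaler.pkl'):
    """Load a pickled scaler, failing with a clear message if the file is missing."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Scaler file not found: {path}. Save it after training first.")
    with path.open('rb') as f:
        return pickle.load(f)


# Hypothetical usage:
# scaler = load_scaler('scaler.pkl')
# predict_house_prices('single_family_nn_model.keras', scaler, 'new_dataset.csv', 'out.csv')
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.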

## Step 3: Running the Script
Save the script to a .py file (e.g., predict_house_prices.py). You can then run the script from the command line:

![image](https://github.com/hitechparadigm/team_project_2/blob/team-project-2/img/automate_script.png)
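
If you would rather pass the file paths on the command line than hard-code them in the example usage, a minimal argparse sketch could look like this. The flag names are hypothetical and are not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; all flag names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Predict house prices with a saved model")
    parser.add_argument('--model', required=True, help="Path to the saved .keras model")
    parser.add_argument('--scaler', default='scaler.pkl', help="Path to the pickled scaler")
    parser.add_argument('--input', required=True, help="CSV file with the new data")
    parser.add_argument('--output', required=True, help="CSV file to write predictions to")
    return parser.parse_args(argv)
```

Inside the main block you would then call **parse_args**, load the scaler from the parsed scaler path, and pass the model, input, and output paths to **predict_house_prices**.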