<a href="https://colab.research.google.com/github/DavidSenseman/BIO1173/blob/main/Class_02_4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---------------------------
**COPYRIGHT NOTICE:** This Jupyterlab Notebook is a Derivative work of [Jeff Heaton](https://github.com/jeffheaton) licensed under the Apache License, Version 2.0 (the "License"); You may not use this file except in compliance with the License. You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

------------------------

# **BIO 1173: Intro Computational Biology**

##### **Module 2: Neural Networks with Tensorflow and Keras**

* Instructor: [David Senseman](mailto:David.Senseman@utsa.edu), [Department of Biology, Health and the Environment](https://sciences.utsa.edu/bhe/), [UTSA](https://www.utsa.edu/)

### Module 2 Material

* Part 2.1: Introduction to Neural Networks with Tensorflow and Keras
* Part 2.2: Encoding Feature Vectors
* Part 2.3: Controlling Overfitting
* **Part 2.4: Saving and Loading a Keras Neural Network**

## Google CoLab Instructions

You MUST run the following code cell to get credit for this class lesson. By running this code cell, you will map your GDrive to /content/drive and print out your Google GMAIL address. Your Instructor will use your GMAIL address to verify the author of this class lesson.

In [None]:
# You must run this cell first
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    from google.colab import auth
    auth.authenticate_user()
    Colab = True
    print("Note: Using Google CoLab")
    import requests
    gcloud_token = !gcloud auth print-access-token
    gcloud_tokeninfo = requests.get('https://www.googleapis.com/oauth2/v3/tokeninfo?access_token=' + gcloud_token[0]).json()
    print(gcloud_tokeninfo['email'])
except:
    print("**WARNING**: Your GMAIL address was **not** printed in the output below.")
    print("**WARNING**: You will NOT receive credit for this lesson.")
    Colab = False

You should see the following output except your GMAIL address should appear on the last line.

![__](https://biologicslab.co/BIO1173/images/class_01/class_01_6_image01A.png)

If your GMAIL address does not appear your lesson will **not** be graded.

### Create Custom Functions

Run the cell below to create the function needed for this lesson.

In [None]:
# Simple function to print out elapsed time
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)
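
As a quick check of the formatting, the sketch below repeats the function so that the block runs on its own and converts an illustrative duration of 3725.5 seconds (a made-up value, not a timing from this lesson).

```python
# Copy of hms_string from the cell above, repeated so this sketch is self-contained
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)

# 3725.5 seconds is 1 hour, 2 minutes and 5.5 seconds
example = hms_string(3725.5)
print(example)   # 1:02:05.50
```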

# **Saving and Loading a Keras Neural Network**

Complex neural networks will take a _long_ time to fit/train.  It is helpful to be able to save a trained neural network so that you can reload it and use it again. A reloaded neural network will **not** require retraining.

Keras provides the following two formats for saving neural networks:

* **JSON** - Stores the neural network structure (no weights) in the [JSON file format](https://en.wikipedia.org/wiki/JSON).
* **Keras** - Stores the complete neural network (with weights) in the native Keras format.

Usually, you will want to save in native `Keras` format.

### Example 1A: Build, Compile and Train Classification Neural Network

The code in `Example 1` builds, compiles and trains a neural network called `or_model` that can classify the `Quality` of an orange based on its physical and chemical characteristics.

The code in the cell below reads the **`Orange Quality dataset`** from the course HTTP server and creates a DataFrame called **`or_df`** (i.e. "orange" DataFrame).

In order to create a feature vector, the 3 non-numeric columns in the dataset, `Color`, `Variety` and `Blemished`, must be dealt with. The column `Color` is handled by mapping its strings to integers. The columns `Variety` and `Blemished` are not used in this example: they are simply excluded (dropped) from the column list when generating the `X-values`.

There are 7 numeric feature columns in `or_df`:
* `Size (cm)`
* `Weight (g)`
* `Brix (Sweetness)`
* `pH (Acidity)`
* `Softness (1-5)`
* `HarvestTime (days)`
* `Ripeness (1-5)`

The following code chunk identifies these numeric columns:
```text
numeric_cols = or_df.select_dtypes(include=['int64', 'float64']).columns
```
Using this variable `numeric_cols`, we can normalize all numeric values to their Z-scores with this code chunk:
```text
or_df[numeric_cols] = or_df[numeric_cols].apply(zscore)
```
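
To make the Z-score concrete, here is a minimal sketch that uses only the Python standard library. The `weights` list is made-up illustration data, not values from the dataset. Like `scipy.stats.zscore` with its default settings, the sketch divides by the population standard deviation.

```python
import statistics

# Illustrative values only (not taken from the Orange Quality dataset)
weights = [150.0, 160.0, 170.0, 180.0, 190.0]

mean = statistics.fmean(weights)      # 170.0
std = statistics.pstdev(weights)      # population standard deviation

# z = (x - mean) / std for every value
z_scores = [(w - mean) / std for w in weights]

print([round(z, 3) for z in z_scores])   # [-1.414, -0.707, 0.0, 0.707, 1.414]
```

After the transformation the values have a mean of 0 and a standard deviation of 1, which puts columns measured in different units on a comparable scale.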

We generate the `X` feature vector (`or_X`) by simply listing each column name that we wish to include as follows:
```text
# Generate X-values
or_X = or_df[['Size (cm)', 'Weight (g)', 'Brix (Sweetness)', 'pH (Acidity)',
       'Softness (1-5)', 'HarvestTime (days)', 'Ripeness (1-5)',
        'Color']].values
or_X = np.asarray(or_X).astype('float32')
```
It is important to emphasize the need to write each column name **exactly** as it appears in the `or_df` DataFrame.

Since we are building a classification neural network, we will need to one-hot encode the column `Quality (1-5)` which contains the `Y-values`.
```text
# Generate Y-values
dummies = pd.get_dummies(or_df['Quality (1-5)'], dtype=int) # Classification
or_Y = dummies.values
or_Y = np.asarray(or_Y).astype('float32')
```

It should be noted that this column is already numeric, so we are **not** using one-hot encoding to replace string values with an integer. Rather, one-hot encoding the `Y-values` is necessary to give the `Y-values` the **correct format** for a classification neural network.
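
As a minimal sketch of what one-hot encoding produces, the pure-Python example below encodes a few made-up quality labels. The helper name `one_hot` is hypothetical; in the lesson itself `pd.get_dummies()` does this work.

```python
def one_hot(labels):
    """Return (classes, rows): sorted unique classes and one 0/1 row per label."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0.0] * len(classes)
        row[index[label]] = 1.0      # mark the column of the correct class
        rows.append(row)
    return classes, rows

# Made-up quality labels for illustration
labels = [3.0, 5.0, 4.0, 3.0]
classes, encoded = one_hot(labels)

print(classes)    # [3.0, 4.0, 5.0]
for row in encoded:
    print(row)
```

Each row contains exactly one `1.0`, so a label such as `5.0` becomes `[0.0, 0.0, 1.0]`.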

The standard loss function for multi-class problems is `categorical cross-entropy` (softmax loss). This function expects the **true label** to be a probability distribution over the classes, i.e. a vector that contains a `1` for the correct class and `0s` everywhere else. This is exactly what one-hot encoding does.
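
For a single sample with a one-hot label, categorical cross-entropy reduces to the negative log of the probability that the model assigned to the correct class. The sketch below works this out with made-up probabilities. The function name `categorical_cross_entropy` is only for illustration; this is not the Keras implementation.

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(y_true[i] * log(y_pred[i]))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 1.0, 0.0]          # one-hot label: the correct class is index 1

confident = [0.1, 0.8, 0.1]       # made-up softmax outputs
unsure    = [0.4, 0.3, 0.3]

loss_confident = categorical_cross_entropy(y_true, confident)   # equals -log(0.8)
loss_unsure = categorical_cross_entropy(y_true, unsure)         # equals -log(0.3)

print(round(loss_confident, 4), round(loss_unsure, 4))
```

The loss is small when the model puts most of its probability on the correct class and grows as that probability shrinks.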

Finally, we will train the model with `verbose = 0`. Therefore we will not see any output during training.


In [None]:
# Example 1A: Build, Compile and Train Classification Model

# ------------------------------------------------------------
# 0️⃣  Imports
# ------------------------------------------------------------
import pandas as pd
import time
import numpy as np
from scipy.stats import zscore
from keras.models import Sequential
from keras.layers import Dense, Input
from keras.layers import BatchNormalization, Dropout
from keras.callbacks import EarlyStopping, ModelCheckpoint

# ------------------------------------------------------------
# 1️⃣  Parameters
# ------------------------------------------------------------
EPOCHS        = 100
PATIENCE      = 10
VERBOSE       = 0     # 0 means no output during training

# ------------------------------------------------------------
# 2️⃣  Load data
# ------------------------------------------------------------
or_df = pd.read_csv(
    "https://biologicslab.co/BIO1173/data/orange_quality.csv",
    na_values=['NA', '?'])


# ------------------------------------------------------------
# 5️⃣  Preprocessing
# ------------------------------------------------------------

# Map str to int
mapping = {'Orange':0,'Deep Orange':1,
           'Light Orange':2,'Orange-Red':3,
           'Yellow-Orange':4}
or_df['Color'] = or_df['Color'].map(mapping)

# Standardize all numeric columns with z-score
numeric_cols = or_df.select_dtypes(include=['int64', 'float64']).columns
or_df[numeric_cols] = or_df[numeric_cols].apply(zscore)


# Generate X-values
or_X = or_df[['Size (cm)', 'Weight (g)', 'Brix (Sweetness)', 'pH (Acidity)',
       'Softness (1-5)', 'HarvestTime (days)', 'Ripeness (1-5)',
        'Color']].values
or_X = np.asarray(or_X).astype('float32')

# Generate Y-values
dummies = pd.get_dummies(or_df['Quality (1-5)'], dtype=int) # Classification
or_Y = dummies.values
or_Y = np.asarray(or_Y).astype('float32')

# ------------------------------------------------------------
# 6️⃣  Build & compile model
# ------------------------------------------------------------

# Build model
or_model = Sequential([
    Input(shape=(or_X.shape[1],)),
    Dense(128, activation='relu'),
    BatchNormalization(),
    Dropout(0.3),
    Dense(64, activation='relu'),
    BatchNormalization(),
    Dropout(0.3),
    Dense(or_Y.shape[1], activation='softmax')
])

or_model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

# ------------------------------------------------------------
# 7️⃣  Callbacks (monitor accuracy)
# ------------------------------------------------------------
checkpoint_path = "or_best_classification_model.keras"
callbacks = [
    EarlyStopping(
        monitor="val_accuracy",
        patience=PATIENCE,
        restore_best_weights=True,
        mode="max"          # a higher val_accuracy is better
    ),
    ModelCheckpoint(
        filepath=checkpoint_path,
        monitor="val_accuracy",
        save_best_only=True,
        mode="max"
    ),
]

# ------------------------------------------------------------
# 8️⃣  Train model
# ------------------------------------------------------------
print(f"------Training Starting for {EPOCHS} epochs --------------")
start_time = time.time()
or_history = or_model.fit(
    or_X, or_Y,
    epochs=EPOCHS,
    validation_split=0.2,
    callbacks=callbacks,
    verbose=VERBOSE,
)

# ---------------------------------------------------------------------------
# 9️⃣ Inspect training
# ---------------------------------------------------------------------------
print(f"\nTraining finished.")
print(f"Best val accuracy: {np.max(or_history.history['val_accuracy']):.4f}")

# Print the last epoch that was executed
if or_history.epoch:                       # safety guard – should always be true
    last_epoch = or_history.epoch[-1] + 1  # +1 → human‑friendly (1‑based) count
    print(f"Training ran through epoch #{last_epoch} (total {len(or_history.epoch)} epochs).")
else:
    print("No epochs were run.")

elapsed_time = time.time() - start_time
print("Elapsed time: {}".format(hms_string(elapsed_time)))

If the code is correct you should see something _similar_ to the following output

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image01B.png)

The `or_model` neural network trained very quickly (< 1 min) but the best validation accuracy (`val accuracy`) is only about 40-45%. It should also be noted that in this particular run, `EarlyStopping` terminated training at the 25th epoch.
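
The stopping behaviour can be pictured with a simplified pure-Python sketch. It is not Keras's actual `EarlyStopping` code, and the `val_acc` values are invented for illustration: training stops once the monitored value has failed to improve for `patience` consecutive epochs.

```python
def early_stopping_epoch(values, patience):
    """Return the 1-based epoch at which training would stop (mode='max').

    Simplified sketch: stop when `patience` epochs in a row fail to beat
    the best value seen so far; otherwise run through every epoch.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value > best:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(values)

# Invented validation accuracies; the best value (0.43) occurs at epoch 3
val_acc = [0.30, 0.38, 0.43, 0.41, 0.42, 0.40, 0.39, 0.41]

stop_epoch = early_stopping_epoch(val_acc, patience=3)
print(stop_epoch)   # 6
```

Under this simplified rule, stopping at epoch 25 with `PATIENCE = 10` would place the last improvement in validation accuracy at about epoch 15.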

### Example 1B: Visualize Training

The code in the cell below generates two side-by-side plots, an **Accuracy Curve** and a **Loss Curve**. These curves provide a visual way to follow what happened during training of `or_model`.

In [None]:
# Example 1B: Visualize training

import numpy as np

# Show the best validation accuracy
best_val_acc = np.max(or_history.history['val_accuracy'])
print(f"Best validation accuracy: {best_val_acc:.4f}")

# Show the best validation loss
best_val_loss = np.min(or_history.history['val_loss'])
print(f"Best validation loss: {best_val_loss:.4f}")

# Plot training history
import matplotlib.pyplot as plt

plt.figure(figsize=(12,5))

plt.subplot(1,2,1)
plt.plot(or_history.history['accuracy'], label='train')
plt.plot(or_history.history['val_accuracy'], label='val')
plt.title('Accuracy')
plt.xlabel('Epoch')
plt.legend()

plt.subplot(1,2,2)
plt.plot(or_history.history['loss'], label='train')
plt.plot(or_history.history['val_loss'], label='val')
plt.title('Loss')
plt.xlabel('Epoch')
plt.legend()

plt.show()


If the code is correct you should see something _similar_ to the following output

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image02B.png)

Here's an interpretation of the training and validation curves shown in the example above:

#### **Accuracy Curve (Left Plot)**
* **Training Accuracy** steadily increases over epochs, reaching about **0.6** by epoch 25. This suggests the model is learning and fitting the training data well.
* **Validation Accuracy** improves initially but **plateaus around 0.4** after epoch 5, indicating that the model's performance on unseen data stops improving early on.

#### **Loss Curve (Right Plot)**
* **Training Loss** decreases consistently, reaching around 1.0 by epoch 25, which aligns with the increasing training accuracy.
* **Validation Loss** drops slightly in the first few epochs but then **stabilizes around 2.0**, showing that the model isn't generalizing well to the validation set.

#### **Summary**
* **Best Validation Accuracy:** 0.4286
* **Best Validation Loss:** 1.7553

#### **Interpretation**

The model is likely **overfitting**: it's learning the training data well but not generalizing to the validation data.
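
One simple way to quantify this is the gap between training and validation accuracy at each epoch. The sketch below uses a small made-up `history` dictionary shaped like `or_history.history`; the numbers are illustrative, not the actual training results.

```python
# Made-up values shaped like or_history.history (illustration only)
history = {
    "accuracy":     [0.30, 0.42, 0.50, 0.56, 0.60],
    "val_accuracy": [0.32, 0.40, 0.41, 0.40, 0.41],
}

# Gap = training accuracy minus validation accuracy, epoch by epoch
gaps = [round(t - v, 2)
        for t, v in zip(history["accuracy"], history["val_accuracy"])]

print(gaps)   # [-0.02, 0.02, 0.09, 0.16, 0.19]
```

A gap that keeps widening while validation accuracy stays flat is the typical signature of overfitting.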

### **Exercise 1: Build and Train a Classification Neural Network**

In the cell below build and train a new classification neural network called `ap_model`.

Start by copying and pasting the code from Example 1 into the cell below.

Read the **`Apple Quality dataset`** and create a DataFrame called `ap_df` ("apple" DataFrame) using this code chunk:
~~~text
ap_df = pd.read_csv(
    "https://biologicslab.co/BIO1173/data/apple_quality.csv",
    na_values=['NA', '?'])
~~~

The goal of training your neural network model `ap_model` is for it to learn how to predict the quality classification of apples using the values in the following columns: 'Size', 'Weight', 'Sweetness', 'Crunchiness', 'Juiciness', 'Acidity' and 'Ripeness'. Since all of these columns are numeric, you should standardize their values by converting them to their Z-scores. You can simply re-use the code in `Example 1`, making sure to change the prefix `or_` to `ap_`.

Again, since all of these columns are numeric, there is no need to map any strings to integers as was done in `Example 1`. The easiest and _safest_ way to accomplish this is to simply comment out the unnecessary code from `Example 1` as follows:
```text
# Map str to int
#mapping = {'Orange':0,'Deep Orange':1,
#           'Light Orange':2,'Orange-Red':3,
#           'Yellow-Orange':4}
#or_df['Color'] = or_df['Color'].map(mapping)
```
Commenting out code instead of simply deleting it has the distinct advantage that if you make a mistake, it is relatively easy to correct by adding (or removing) the **`#`** at the start of the line.


To generate your X feature vector change this code chunk
```text
# Generate X-values
or_X = or_df[['Size (cm)', 'Weight (g)', 'Brix (Sweetness)', 'pH (Acidity)',
       'Softness (1-5)', 'HarvestTime (days)', 'Ripeness (1-5)',
        'Color']].values
or_X = np.asarray(or_X).astype('float32')
```
> to read as

```text
# Generate X‑values
ap_X = ap_df[['Size', 'Weight', 'Sweetness', 'Crunchiness',
              'Juiciness', 'Acidity', 'Ripeness']].values
ap_X = np.asarray(ap_X).astype('float32')
```
This will make sure that only the correct columns are used to generate your `X` feature vector.

Since you are building a classification neural network, you will need to one-hot encode the column called `Quality` containing your `Y-values`. You can simply re-use the code in `Example 1`, making sure to use the correct name for the target column (i.e. `Quality`).

Finally, train (fit) your model on your X-values (`ap_X`) and your Y-values (`ap_Y`) for 100 epochs with verbose set to `0`.  

In [None]:
# Insert your code for Exercise 1A here



If the code is correct you should see something _similar_ to the following output

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image03B.png)

Your `ap_model` neural network appears to have done a much better job than `or_model`: in the example output the best validation accuracy (`val accuracy`) is about 95%. It should also be noted that in this particular run, `EarlyStopping` terminated training at the 46th epoch.

### **Exercise 1B: Visualize Training**

In the cell below write the code to generate two side-by-side plots, an **Accuracy Curve** and a **Loss Curve** for your `ap_model`.

In [None]:
# Insert your code for Exercise 1B here



If the code is correct you should see something _similar_ to the following output

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image04B.png)

### **Analysis**

The following is an analysis of the two plots shown above as example outputs.

#### **Left Plot: Accuracy**
* **Training Accuracy** (blue line):
  * Starts around 0.75 and steadily increases to about 0.93 by epoch 40.
* **Validation Accuracy** (orange line):
  * Starts higher than training accuracy (~0.85) and quickly rises to ~0.95, stabilizing after epoch 20.
* **Interpretation:**
  * The model is learning effectively and generalizing well.
  * Validation accuracy being consistently higher than training accuracy suggests **good generalization** and possibly regularization at play.

#### **Right Plot: Loss**
* **Training Loss** (blue line):
  * Starts around 0.55 and decreases steadily to below 0.2.
* **Validation Loss** (orange line):
  * Starts similarly but drops faster, stabilizing around 0.14, which is lower than training loss.
* **Interpretation:**
  * The model is not overfitting; in fact, it performs better on validation data than on training data.
  * This could be due to:
    * Dropout or other regularization techniques
    * Well-shuffled and representative validation set
    * Early stopping or careful tuning

#### **Summary Stats**
* **Best Validation Accuracy:** 0.9563
* **Best Validation Loss:** 0.141
------------------

#### **Overall Assessment:**

Your model shows **excellent performance** and **strong generalization**.

----------------------------------------
## **Time and Cost of Training Large Language Models (LLMs)**

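Large Language Models (LLMs) require a lot of time and money to train. The table below collects some rough figures available as of September 2025. The developers of PaLM 2, GPT‑4 and Claude 3 have not published parameter counts or training costs for these models, so those entries are unofficial estimates. The durations, GPU counts and GPU‑hour totals are rough guides only and do not always agree with one another.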

#### **Largest LLMs (as of September 2025)**

| Model | Approx. size (parameters) | Rough training duration | Rough training cost | Source |
|-------|---------------------------|-------------------------|---------------------|--------|
| **PaLM 2‑G** (Google) | **540 B** | ~4 months on ≈ 5 000–10 000 GPUs (≈ 2 million GPU‑hours) | **$200 M – $250 M** | Google AI blog (2024), “PaLM 2: Language Models for the Web” |
| **GPT‑4** (OpenAI) | **175 B** (largest released variant) | ~3 months on ≈ 10 000–15 000 GPUs (≈ 1.3 million GPU‑hours) | **$30 M – $50 M** | OpenAI press release (2023), estimates from *Bloomberg* and *The Verge* |
| **Claude 3** (Anthropic) | **200 B** | ~3 months on ≈ 8 000 GPUs (≈ 1 million GPU‑hours) | **$30 M – $60 M** | Anthropic blog (2024) |
| **LLaMA‑2‑70B** (Meta) | **70 B** | ~1 month on ≈ 1 500 GPUs (≈ 0.2 million GPU‑hours) | **$5 M – $10 M** | Meta AI research paper (2023) |


#### **Quick take-aways**

1. **Largest LLM in the table above:**  
   *PaLM 2‑G*, listed at **540 billion** parameters, well above the estimates listed for GPT‑4 (175 B) and Claude 3 (200 B).

2. **Training time:**  
   Even the smallest “state-of-the-art” models require **weeks to months** on **thousands of GPUs**.  
   * PaLM 2-G ≈ **2 million GPU-hours** → ~4 months on a 5 000‑GPU cluster.

3. **Training cost:**  
   Costs run in the **tens to hundreds of millions** of dollars.  
   * 175B-parameter models: **\$30-50M**  
   * 540B-parameter models: **\$200-250M**

> **Bottom line:**  
> The field is rapidly moving toward ever larger models, but the practical ceiling is still in the *hundreds of billions* of parameters.  Training such a model is a **multi-month, multi‑million‑GPU‑hour operation** that costs **\$30-250 million**, depending on size and hardware budget.

---------------------


### Example 2: Determine the Model's Accuracy

The overall objective of this assignment is to convince you that you can save a _trained_ neural network to a file, and then later recreate the neural network from the file, **without changing the model's accuracy**.

#### **Why is this important?**

As you already know, it can take significant time and processing power to train even the relatively small neural networks that we have created so far in this course. Neural networks that are used commercially (think Siri, Alexa or ChatGPT) are many times larger and require enormous resources as well as weeks (or months) to train (see above).

Obviously, if you had to train a neural network every time you wanted to use it, it wouldn't be very practical and there would be little interest in "AI". However, once a neural network has been trained, you can save it to a file and then re-use it over and over again, without any loss in the neural network's ability to solve problems (i.e. no loss in accuracy).

The code in Example 2 calculates the ability of the `or_model` neural network to predict an orange's quality (`Y-value`) based on its physical and chemical characteristics (`X-values`). The important point here is not the accuracy _per se_ but that this accuracy will not be changed by saving the model and later re-creating the model.

Here is a summary of what the different accuracy metrics mean and how they are calculated.

#### **1. Precision**  
**Definition**  
The proportion of *predicted positive* instances that are actually positive.  

$$
\text{Precision}= \frac{\text{True Positives (TP)}}{\text{True Positives (TP)}+\text{False Positives (FP)}}
$$

*A high precision means that when the model says “class X”, it is usually correct.*

#### **2. Recall**  
**Definition**  
The proportion of *actual positive* instances that the model correctly identifies.  
$$
\text{Recall}= \frac{\text{True Positives (TP)}}{\text{True Positives (TP)}+\text{False Negatives (FN)}}
$$  

*A high recall means that the model finds almost all of the examples belonging to a class.*

#### **3. F1-score**  
**Definition**  
The harmonic mean of precision and recall, balancing the two.  
$$
\text{F1}= 2 \times \frac{\text{Precision}\times\text{Recall}}{\text{Precision}+\text{Recall}}
$$  

*It is a single metric that penalizes extreme values of either precision or recall.*

#### **4. Support**  
**Definition**  
The number of *actual* instances of a given class in the dataset (i.e., the class count).  

*It tells you how many samples the metrics are based on; larger support gives more reliable statistics.*

--------------------

### **Quick Recap**

- **Precision** - “Of the times we predicted this class, how often were we right?”  
  Formula: $TP/(TP+FP)$

- **Recall** - “Of all the times this class actually occurred, how often did we predict it?”  
  Formula: $TP/(TP+FN)$

- **F1-score** - A balanced measure of precision and recall  
  Formula: $2\times\frac{P\cdot R}{P+R}$

- **Support** - How many true examples of the class exist  
  (Just the count of that class in the data)

You can use these metrics to understand not just overall accuracy but *how* the model behaves for each individual class, especially when the classes are **imbalanced**.
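
The formulas above can be worked through for a single class with a short pure-Python sketch. The counts are made up for illustration and the helper name `precision_recall_f1` is hypothetical. Returning `0.0` when a denominator is zero mirrors the `zero_division=0` setting used with `classification_report` in this lesson.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts for one class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up counts for one class: 8 correct hits, 2 false alarms, 4 misses
tp, fp, fn = 8, 2, 4
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
support = tp + fn          # number of actual instances of the class

print(round(precision, 3), round(recall, 3), round(f1, 3), support)
# 0.8 0.667 0.727 12
```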


### Example 2: Determine the Model's Accuracy Before Saving to Disk

The code in the cell below calculates 4 **accuracy metrics** about the `or_model` before we save it to disk. To keep these values separate, we will assign the prefix `or_before_` to each metric.

In [None]:
# Example 2: Determine the model's accuracy before saving to disk

import os
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, log_loss

# 1️⃣  Predict probabilities
or_before_pred_prob = or_model.predict(or_X)

# 2️⃣  Convert to class indices
true_cls  = np.argmax(or_Y, axis=1)
pred_cls  = np.argmax(or_before_pred_prob, axis=1)

# 3️⃣  Compute metrics
or_before_acc      = accuracy_score(true_cls, pred_cls)
or_before_class_report    = classification_report(true_cls, pred_cls, zero_division=0)
or_before_log_loss = log_loss(true_cls, or_before_pred_prob)

# 4️⃣  Prepare the output string (the same text that will be printed)
output_text = (
    f"Accuracy: {or_before_acc:.4f}\n\n"
    f"Classification report (before the model was saved):\n{or_before_class_report}\n"
    f"Log‑loss: {or_before_log_loss:.4f}"
)

# 5️⃣  Print to console (unchanged behaviour)
print(output_text)

If the code is correct you should see something _similar_ to the following output

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image13B.png)
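
Step 2 of Example 2 converts each row of probabilities into a class index with `np.argmax`. The pure-Python sketch below shows that step and the accuracy calculation on made-up values; the helper `argmax` is written out here only for illustration.

```python
def argmax(row):
    """Index of the largest value in a list (the first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)

# Made-up one-hot labels and predicted probabilities for four samples
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_prob = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.5, 0.3, 0.2], [0.1, 0.8, 0.1]]

true_cls = [argmax(r) for r in y_true]   # [0, 1, 2, 1]
pred_cls = [argmax(r) for r in y_prob]   # [0, 1, 0, 1]

# Accuracy = fraction of samples whose predicted class matches the true class
accuracy = sum(t == p for t, p in zip(true_cls, pred_cls)) / len(true_cls)
print(accuracy)   # 0.75
```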

### **Exercise 2: Determine the Model's Accuracy**

In the cell below, write the code to calculate the 4 **accuracy metrics** shown in `Example 2` about the `ap_model` before we save it to disk. To keep these values separate, assign the prefix `ap_before_` to each metric.

In [None]:
# Insert your code for Exercise 2 here



If your code is correct you should see something _similar_ to the following output:

![__](https://biologicslab.co/BIO1173/images/class_02/class_02_4_image12B.png)

According to the output shown above, your `ap_model` is better than 95% accurate when it comes to predicting apple quality. Apparently, it's a little easier to predict an apple's `Quality` with a classification neural network than to predict orange quality.

### Example 3: Save the Model

The code in the cell below saves the _trained_ neural network `or_model` as a file in two different file formats: `JSON` and `keras`.

Each file is saved in the current working directory (`save_path = "."`). The filename of the JSON file is `or_model.json` while the filename of the `keras` file is `or_model.keras`.

In [None]:
# Example 3: Save the model & list the directory vertically

import os
from keras.models import save_model

# ------------------------------------------------------------------
# 1️⃣  Define the folder where everything will be stored
save_path = "."                      # current working directory
os.makedirs(save_path, exist_ok=True)   # just in case

# ------------------------------------------------------------------
# 2️⃣  Save the network architecture as JSON (no weights)
or_model_json = or_model.to_json()
json_path = os.path.join(save_path, "or_model.json")
with open(json_path, "w", encoding="utf-8") as json_file:
    json_file.write(or_model_json)

# ------------------------------------------------------------------
# 3️⃣  Save the full model in Keras' native format (architecture + weights)
keras_path = os.path.join(save_path, "or_model.keras")
or_model.save(keras_path)

# ------------------------------------------------------------------
# 4️⃣  Print the files in the directory **vertically** (one file per line)
print("\nFiles in the current directory:")
for fname in sorted(os.listdir(save_path)):
    print(f" - {fname}")


If your code is correct you should see something _similar_ to the following output:

![__](https://biologicslab.co/BIO1173/images/class_02/class_02_4_image14B.png)

After running the code cell above, there should now be two new files in your `current directory`, **`or_model.json`** and **`or_model.keras`**.

### **Exercise 3: Save the Model**

In the code cell below, save your _trained_ neural network `ap_model` as a JSON file with the filename `ap_model.json`, and as a native Keras file with the filename `ap_model.keras`. Save both files to your current working directory (`save_path = "."`).

In [None]:
# Insert your code for Exercise 3 here



If your code is correct you should see something similar to the following output:

![__](https://biologicslab.co/BIO1173/images/class_02/class_02_4_image15B.png)

You should now see two more files containing your neural network, **`ap_model.json`** and **`ap_model.keras`**.

The advantage of the `JSON` format is that it can be visually inspected: just click on the file name in the file browser panel. The `JSON` file preserves the model's _architecture_, which you can see by looking at the file, but not the `weights`. So if you want to use a model rebuilt from the `JSON` file, you will need to train it all over again.

On the other hand, you can't view the contents of the `keras` file in a text editor, since it is a binary archive rather than a plain-text file. Nevertheless, you should always save your model in the **`keras`** format since this **preserves both the architecture and the values of the weights** of the model's connections. By preserving these values you don't have to waste time retraining the model.
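
Because the JSON file is plain text, it can also be inspected programmatically with Python's built-in `json` module. The sketch below parses a heavily trimmed, hypothetical JSON string standing in for the contents of `or_model.json`. The real file contains many more fields, its exact layout depends on the Keras version, and the layer settings shown here are assumptions for illustration.

```python
import json

# Hypothetical, trimmed stand-in for the contents of or_model.json
model_json = """
{
  "class_name": "Sequential",
  "config": {
    "layers": [
      {"class_name": "InputLayer", "config": {}},
      {"class_name": "Dense", "config": {"units": 128, "activation": "relu"}},
      {"class_name": "Dropout", "config": {"rate": 0.3}},
      {"class_name": "Dense", "config": {"units": 5, "activation": "softmax"}}
    ]
  }
}
"""

architecture = json.loads(model_json)

# List the layer types in order; only structure is stored, no trained values
layer_types = [layer["class_name"] for layer in architecture["config"]["layers"]]
print(layer_types)   # ['InputLayer', 'Dense', 'Dropout', 'Dense']
```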

### Example 4: Create New Model from Saved Model

Once a trained model has been saved in the native `keras` format, it is a simple matter to read the file and make an **exact copy** of the model using the Keras function **`load_model()`** as shown in the cell below. In Example 4 we have given the re-loaded neural network the name `or2_model` to differentiate it from the one that we built previously, `or_model`.  

In [None]:
# Example 4: Create new model from saved model

import os
from keras.models import load_model

# Look in current folder
save_path = "."

# Create or2_model from the saved model
or2_model = load_model(os.path.join(save_path,"or_model.keras"))

# Print out model summary
or2_model.summary()

If your code is correct you should see the following output:

![__](https://biologicslab.co/BIO1173/images/class_02/class_02_4_image09B.png)

### **Exercise 4: Create New Model from Saved Model**

In the cell below create a new neural network called `ap2_model` from the file `ap_model.keras` that you saved to your current directory in **Exercise 3**. Print out a summary of your new `ap2_model`.

In [None]:
# Insert your code for Exercise 4 here



If your code is correct you should see the following output:

![__](https://biologicslab.co/BIO1173/images/class_02/class_02_4_image10B.png)

### Example 5: Determine the Model's Accuracy After Restoring from Disk

The code in the cell below calculates the same 4 **accuracy metrics** for our newly created `or2_model`. To keep these values separate, we will assign the prefix `or2_after_` to each metric.

In [None]:
# Example 5: Determine the model's accuracy after restoring from disk

import os
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, log_loss

# Before saving-----------------------------------------------------------------

print("Accuracy Metrics of `or_model` before saving to disk:\n")
# 1️⃣  Predict probabilities
or_before_pred_prob = or_model.predict(or_X)

# 2️⃣  Convert to class indices
true_cls  = np.argmax(or_Y, axis=1)
pred_cls  = np.argmax(or_before_pred_prob, axis=1)

# 3️⃣  Compute metrics
or_before_acc      = accuracy_score(true_cls, pred_cls)
or_before_class_report    = classification_report(true_cls, pred_cls, zero_division=0)
or_before_log_loss = log_loss(true_cls, or_before_pred_prob)

# 4️⃣  Prepare the output string (the same text that will be printed)
output_text = (
    f"Accuracy: {or_before_acc:.4f}\n\n"
    f"Classification report (before the model was saved):\n{or_before_class_report}\n"
    f"Log‑loss: {or_before_log_loss:.4f}"
)

# 5️⃣  Print to console (unchanged behaviour)
print(output_text)

# After restoring --------------------------------------------------------------
print("\n\nAccuracy Metrics of `or2_model` after restoring from disk:\n")

# 1️⃣  Predict probabilities
or2_after_pred_prob = or2_model.predict(or_X)

# 2️⃣  Convert to class indices
true_cls  = np.argmax(or_Y, axis=1)
pred_cls  = np.argmax(or2_after_pred_prob, axis=1)

# 3️⃣  Compute metrics
or2_after_acc      = accuracy_score(true_cls, pred_cls)
or2_after_class_report    = classification_report(true_cls, pred_cls, zero_division=0)
or2_after_log_loss = log_loss(true_cls, or2_after_pred_prob)

# 4️⃣  Prepare the output string (the same text that will be printed)
output_text = (
    f"Accuracy: {or2_after_acc:.4f}\n\n"
    f"Classification report (after the model was saved):\n{or2_after_class_report}\n"
    f"Log‑loss: {or2_after_log_loss:.4f}"
)

# 5️⃣  Print to console (unchanged behaviour)
print(output_text)


If the code is correct, the output from the last cell should be **identical** to the output generated in `Example 2`.

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image16B.png)

As you can see, there is **_no difference_** in the accuracy of the saved model compared to the original one.
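Comparing the two printed reports by eye works, but the same check can also be done numerically. The snippet below is a minimal sketch, not part of the original lesson: `predictions_match` is a hypothetical helper name, and the small demo arrays stand in for `or_before_pred_prob` and `or2_after_pred_prob`, which you would pass in instead inside this notebook.

```python
import numpy as np

def predictions_match(before_prob, after_prob, atol=1e-6):
    """Return True when two probability arrays agree element-wise within atol."""
    before_prob = np.asarray(before_prob, dtype=float)
    after_prob = np.asarray(after_prob, dtype=float)
    # Arrays of different shape cannot come from the same model and inputs
    if before_prob.shape != after_prob.shape:
        return False
    return bool(np.allclose(before_prob, after_prob, rtol=0.0, atol=atol))

# Stand-in arrays (assumption: replace with the real prediction arrays)
demo_before = np.array([[0.9, 0.1], [0.2, 0.8]])
demo_after = demo_before.copy()

print("Predictions match:", predictions_match(demo_before, demo_after))
```

A small tolerance (`atol`) is used rather than exact equality because floating-point values can differ in the last few digits after a round trip through a file.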

>> ### **_Train Once_...Use Anywhere!**

Big generative models like `ChatGPT` can take days or even months to train, but once they are trained and saved, they can process new data very quickly and at very little cost.

### **Exercise 5: Determine the Model's Accuracy After Restoring from Disk**

In the cell below, write the code to calculate the same 4 **accuracy metrics** for your newly created `ap2_model`. To keep these values separate, assign the prefix `ap2_after_` to each metric.

In [None]:
# Insert your code for Exercise 5 here



If the code is correct, the output from the last cell should be **identical** to the output that you generated in **`Exercise 2`**.

![__](https://biologicslab.co:/BIO1173/images/class_02/class_02_4_image17B.png)

## **Lesson Turn-in**

When you have completed and run all of the code cells, use **File --> Print --> Save to PDF** to generate a PDF of your Colab notebook. Save your PDF as `Class_02_4.lastname.pdf`, where _lastname_ is your last name, and upload the file to Canvas.

## **Lizard Tail**


## **Stable Diffusion**

![___](https://upload.wikimedia.org/wikipedia/commons/thumb/8/82/Astronaut_Riding_a_Horse_%28SD3.5%29.webp/1024px-Astronaut_Riding_a_Horse_%28SD3.5%29.webp.png)

>*An image generated with Stable Diffusion 3.5 based on the text prompt*

**Stable Diffusion** is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of `Stability AI` and is considered to be a part of the ongoing artificial intelligence boom.

It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt.[3] Its development involved researchers from the CompVis Group at Ludwig Maximilian University of Munich and Runway with a computational donation from Stability and training data from non-profit organizations.

**Stable Diffusion** is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly, and it can run on most consumer hardware equipped with a modest GPU with at least 4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services.

**Development**

Stable Diffusion originated from a project called Latent Diffusion, developed in Germany by researchers at Ludwig Maximilian University in Munich and Heidelberg University. Four of the original 5 authors (Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz) later joined Stability AI and released subsequent versions of Stable Diffusion.

The technical license for the model was released by the CompVis group at Ludwig Maximilian University of Munich. Development was led by Patrick Esser of Runway and Robin Rombach of CompVis, who were among the researchers who had earlier invented the latent diffusion model architecture used by Stable Diffusion. Stability AI also credited EleutherAI and LAION (a German nonprofit which assembled the dataset on which Stable Diffusion was trained) as supporters of the project.

**Technology**

In the denoising process used by Stable Diffusion, the model generates images by iteratively denoising random noise until a configured number of steps has been reached. The process is guided by the CLIP text encoder, pretrained on concepts, along with the attention mechanism, and results in an image depicting a representation of the trained concept.

**Architecture**

Diffusion models, introduced in 2015, are trained with the objective of removing successive applications of Gaussian noise on training images, which can be thought of as a sequence of denoising autoencoders. The name comes from thermodynamic diffusion, since these models were first developed with inspiration from thermodynamics.

Models in the Stable Diffusion series before SD 3 all used a variant of the diffusion model called the latent diffusion model (LDM), developed in 2021 by the CompVis (Computer Vision & Learning) group at LMU Munich.

Stable Diffusion consists of 3 parts: the variational autoencoder (VAE), U-Net, and an optional text encoder. The VAE encoder compresses the image from pixel space to a smaller dimensional latent space, capturing a more fundamental semantic meaning of the image. Gaussian noise is iteratively applied to the compressed latent representation during forward diffusion. The U-Net block, composed of a ResNet backbone, denoises the output from forward diffusion backwards to obtain a latent representation. Finally, the VAE decoder generates the final image by converting the representation back into pixel space.
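The forward diffusion step described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not Stable Diffusion's actual code: it uses the common variance-preserving update `x_t = sqrt(1 - beta_t) * x_(t-1) + sqrt(beta_t) * noise`, a linear noise schedule, and a small random array standing in for the VAE's latent representation. The function name `forward_diffusion` and all of the sizes are illustrative choices.

```python
import numpy as np

def forward_diffusion(latent, betas, rng):
    """Add Gaussian noise step by step and return every intermediate latent."""
    x = np.asarray(latent, dtype=float)
    trajectory = [x]
    for beta in betas:
        noise = rng.standard_normal(x.shape)
        # Shrink the signal slightly and mix in a small amount of fresh noise
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))   # toy stand-in for a VAE latent
betas = np.linspace(1e-4, 0.02, 1000)     # assumed linear noise schedule
steps = forward_diffusion(latent, betas, rng)

def corr(a, b):
    """Pearson correlation between two arrays of the same shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

print("Correlation with original after 1 step:   ", round(corr(steps[0], steps[1]), 4))
print("Correlation with original after all steps:", round(corr(steps[0], steps[-1]), 4))
```

After one step the latent is still almost identical to the original, while after many steps it is close to pure noise. The U-Net is trained to run this process in reverse.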

The denoising step can be flexibly conditioned on a string of text, an image, or another modality. The encoded conditioning data is exposed to denoising U-Nets via a cross-attention mechanism. For conditioning on text, the fixed, pretrained CLIP ViT-L/14 text encoder is used to transform text prompts to an embedding space. Researchers point to increased computational efficiency for training and generation as an advantage of LDMs.
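To make the cross-attention idea more concrete, here is a minimal NumPy sketch of a single attention head in which image tokens "look at" text tokens. It uses the standard scaled dot-product attention formula. The token counts, embedding sizes, and the random projection matrices (which stand in for learned weights) are all assumptions chosen for illustration and do not reflect the real model's dimensions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Queries come from the image latents; keys and values come from the text."""
    q = image_tokens @ w_q                      # (n_img, d)
    k = text_tokens @ w_k                       # (n_txt, d)
    v = text_tokens @ w_v                       # (n_txt, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # similarity of each image token to each text token
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(1)
image_tokens = rng.standard_normal((16, 32))    # 16 toy image-latent tokens
text_tokens = rng.standard_normal((5, 24))      # 5 toy text-embedding tokens
w_q = rng.standard_normal((32, 8))              # random stand-ins for learned weights
w_k = rng.standard_normal((24, 8))
w_v = rng.standard_normal((24, 8))

attended, weights = cross_attention(image_tokens, text_tokens, w_q, w_k, w_v)
print("Output shape:", attended.shape)
print("Attention weights shape:", weights.shape)
```

Each image token ends up as a weighted mixture of information from the text tokens, which is how the prompt can steer the denoising.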

With 860 million parameters in the U-Net and 123 million in the text encoder, Stable Diffusion is considered relatively lightweight by 2022 standards, and unlike other diffusion models, it can run on consumer GPUs, and even CPU-only if using the OpenVINO version of Stable Diffusion.
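A quick back-of-the-envelope calculation shows why a model of this size can fit on a consumer GPU. The sketch below only counts the storage needed for the weights of the two components quoted above. The choice of 4 bytes per parameter (32-bit floats) and 2 bytes per parameter (16-bit floats) is an assumption, and the VAE and the working memory needed during image generation are not included.

```python
unet_params = 860_000_000
text_encoder_params = 123_000_000
total_params = unet_params + text_encoder_params

def weight_gigabytes(n_params, bytes_per_param):
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

fp32_gb = weight_gigabytes(total_params, 4)   # assumed 32-bit floats
fp16_gb = weight_gigabytes(total_params, 2)   # assumed 16-bit floats

print(f"Total parameters: {total_params:,}")
print(f"Weights at 32-bit: {fp32_gb:.3f} GB")
print(f"Weights at 16-bit: {fp16_gb:.3f} GB")
```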

**SD XL**

The XL version uses the same LDM architecture as previous versions, but scaled up: a larger UNet backbone, a larger cross-attention context, two text encoders instead of one, and training on multiple aspect ratios (not just the square aspect ratio used by previous versions).

The SD XL Refiner, released at the same time, has the same architecture as SD XL, but it was trained for adding fine details to preexisting images via text-conditional img2img.

**SD 3.0**

The 3.0 version completely changes the backbone. Instead of a UNet, it uses a Rectified Flow Transformer, which implements the rectified flow method with a Transformer.

The Transformer architecture used for SD 3.0 has three "tracks", for original text encoding, transformed text encoding, and image encoding (in latent space). The transformed text encoding and image encoding are mixed during each transformer block.

The architecture is named "multimodal diffusion transformer" (MMDiT), where "multimodal" means that it mixes text and image encodings inside its operations. This differs from previous versions of DiT, where the text encoding affects the image encoding, but not vice versa.

**Training data**

Stable Diffusion was trained on pairs of images and captions taken from LAION-5B, a publicly available dataset derived from Common Crawl data scraped from the web, where 5 billion image-text pairs were classified based on language and filtered into separate datasets by resolution, a predicted likelihood of containing a watermark, and predicted "aesthetic" score (e.g. subjective visual quality). The dataset was created by LAION, a German non-profit which receives funding from Stability AI. The Stable Diffusion model was trained on three subsets of LAION-5B: laion2B-en, laion-high-resolution, and laion-aesthetics v2 5+. A third-party analysis of the model's training data identified that out of a smaller subset of 12 million images taken from the original wider dataset used, approximately 47% of the sample size of images came from 100 different domains, with Pinterest taking up 8.5% of the subset, followed by websites such as WordPress, Blogspot, Flickr, DeviantArt and Wikimedia Commons. An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data.

**Training procedures**

The model was initially trained on the laion2B-en and laion-high-resolution subsets, with the last few rounds of training done on LAION-Aesthetics v2 5+, a subset of 600 million captioned images which the LAION-Aesthetics Predictor V2 predicted that humans would, on average, give a score of at least 5 out of 10 when asked to rate how much they liked them. The LAION-Aesthetics v2 5+ subset also excluded low-resolution images and images which LAION-5B-WatermarkDetection identified as carrying a watermark with greater than 80% probability. Final rounds of training additionally dropped 10% of text conditioning to improve Classifier-Free Diffusion Guidance.
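The last sentence above can be illustrated with a small sketch. During training, some text conditioning is randomly dropped so that the model also learns to denoise without a prompt. At generation time, classifier-free guidance then combines the unconditional and the text-conditioned noise predictions. The code below uses the standard guidance formula with toy arrays. The function names, the all-zero "null" embedding, and the guidance scale are assumptions for illustration and are not taken from Stable Diffusion's training code.

```python
import numpy as np

def drop_conditioning(text_embeddings, null_embedding, drop_prob, rng):
    """Replace each sample's text embedding with a null embedding with probability drop_prob."""
    text_embeddings = np.asarray(text_embeddings, dtype=float)
    drop = rng.random(text_embeddings.shape[0]) < drop_prob
    out = text_embeddings.copy()
    out[drop] = null_embedding
    return out, drop

def guided_noise(noise_uncond, noise_cond, guidance_scale):
    """Push the prediction away from the unconditional one, toward the conditioned one."""
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

rng = np.random.default_rng(2)
embeddings = rng.standard_normal((10_000, 4))   # toy text embeddings
null = np.zeros(4)                              # assumed "empty prompt" embedding
dropped, mask = drop_conditioning(embeddings, null, drop_prob=0.10, rng=rng)
print(f"Fraction of samples with conditioning dropped: {mask.mean():.3f}")

noise_uncond = np.array([0.0, 1.0])
noise_cond = np.array([1.0, 1.0])
print("Guided prediction:", guided_noise(noise_uncond, noise_cond, guidance_scale=7.5))
```

With a guidance scale of 1 the result is just the conditioned prediction, and larger values exaggerate the effect of the prompt.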

The model was trained using 256 Nvidia A100 GPUs on Amazon Web Services for a total of 150,000 GPU-hours, at a cost of $600,000.
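These figures can be turned into more intuitive numbers with some simple arithmetic. The sketch below assumes that all 256 GPUs ran at the same time for the entire job, which is a simplification.

```python
num_gpus = 256
gpu_hours = 150_000
total_cost_usd = 600_000

# If every GPU runs concurrently, wall-clock time is GPU-hours divided by GPU count
wall_clock_hours = gpu_hours / num_gpus
wall_clock_days = wall_clock_hours / 24
cost_per_gpu_hour = total_cost_usd / gpu_hours

print(f"Wall-clock time: {wall_clock_hours:.1f} hours (about {wall_clock_days:.1f} days)")
print(f"Cost per GPU-hour: ${cost_per_gpu_hour:.2f}")
```

Under that assumption, training took a little over three weeks of continuous computing at about $4 per GPU-hour.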

![__](https://upload.wikimedia.org/wikipedia/commons/f/f6/Stable_Diffusion_architecture.png)

>Diagram of the latent diffusion architecture used by Stable Diffusion


**Limitations**

Stable Diffusion has issues with degradation and inaccuracies in certain scenarios. Initial releases of the model were trained on a dataset that consists of 512×512 resolution images, meaning that the quality of generated images noticeably degrades when user specifications deviate from its "expected" 512×512 resolution; the version 2.0 update of the Stable Diffusion model later introduced the ability to natively generate images at 768×768 resolution. Another challenge is in generating human limbs due to poor data quality of limbs in the LAION database. The model is insufficiently trained to replicate human limbs and faces due to the lack of representative features in the database, and prompting the model to generate images of such type can confound the model. Stable Diffusion XL (SDXL) version 1.0, released in July 2023, introduced native 1024×1024 resolution and improved generation for limbs and text.

Accessibility for individual developers can also be a problem. In order to customize the model for new use cases that are not included in the dataset, such as generating anime characters ("waifu diffusion"), new data and further training are required. Fine-tuned adaptations of Stable Diffusion created through additional retraining have been used for a variety of different use-cases, from medical imaging to algorithmically generated music. However, this fine-tuning process is sensitive to the quality of new data; low resolution images or different resolutions from the original data can not only fail to learn the new task but degrade the overall performance of the model. Even when the model is additionally trained on high quality images, it is difficult for individuals to run models in consumer electronics. For example, the training process for waifu-diffusion requires a minimum 30 GB of VRAM, which exceeds the usual resource provided in such consumer GPUs as Nvidia's GeForce 30 series, which has only about 12 GB.

The creators of Stable Diffusion acknowledge the potential for algorithmic bias, as the model was primarily trained on images with English descriptions. As a result, generated images reinforce social biases and are from a western perspective, as the creators note that the model lacks data from other communities and cultures. The model gives more accurate results for prompts that are written in English in comparison to those written in other languages, with western or white cultures often being the default representation.

**End-user fine-tuning**

To address the limitations of the model's initial training, end-users may opt to implement additional training to fine-tune generation outputs to match more specific use-cases, a process also referred to as personalization. There are three methods in which user-accessible fine-tuning can be applied to a Stable Diffusion model checkpoint:

* An "embedding" can be trained from a collection of user-provided images, and allows the model to generate visually similar images whenever the name of the embedding is used within a generation prompt.[44] Embeddings are based on the "textual inversion" concept developed by researchers from Tel Aviv University in 2022 with support from Nvidia, where vector representations for specific tokens used by the model's text encoder are linked to new pseudo-words. Embeddings can be used to reduce biases within the original model, or mimic visual styles.
* A "hypernetwork" is a small pretrained neural network that is applied to various points within a larger neural network, and refers to the technique created by NovelAI developer Kurumuz in 2021, originally intended for text-generation transformer models. Hypernetworks steer results towards a particular direction, allowing Stable Diffusion-based models to imitate the art style of specific artists, even if the artist is not recognised by the original model; they process the image by finding key areas of importance such as hair and eyes, and then patch these areas in secondary latent space.
* DreamBooth is a deep learning generation model developed by researchers from Google Research and Boston University in 2022 which can fine-tune the model to generate precise, personalised outputs that depict a specific subject, following training via a set of images which depict the subject.