# Saving a Model

Colab link [here](https://colab.research.google.com/drive/1qNgCG7wlgfcuxrnyL6JbpBV9TAwNILr4?usp=sharing)

<br>

Now that we have a trained model, we want to save it so we can access it later. We use the `.pth` extension for the saved file.

<br>

When we save a model, we save its weights (the learned parameters), not the model object itself. This may be a little confusing at first, but the example below should make the idea clearer.
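
To make "weights, not the model" concrete, here is a minimal sketch of what a `state_dict` holds. `TinyModel` is a hypothetical stand-in invented for this illustration, not the model used in this notebook.

```python
import torch
from torch import nn


# TinyModel is a hypothetical stand-in for the tutorial's model.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


model = TinyModel()
state = model.state_dict()

# The state dict maps each parameter name to a tensor of values.
# It contains no code, so the class definition is not stored in it.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
# prints:
# fc.weight (2, 4)
# fc.bias (2,)
```

Because only tensors are stored, the class definition has to be available again when the weights are loaded.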

In [None]:
# import necessary modules
import torch

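# initialize model (fakeModel is a placeholder for your own nn.Module subclass)
model = fakeModel()

# train model (train is a placeholder for your own training loop)
train(model)

# now that the model is trained, we can save its weights
torch.save(model.state_dict(), 'model_weights.pth')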

***
# Loading a Model

Now we need to load our trained model weights. Let's see how to do this.

In [None]:
# define a new model with the same architecture as the one we saved
model = fakeModel()

# load the saved weights into the new model
model.load_state_dict(torch.load('model_weights.pth'))
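
As a quick sanity check, the save and load steps can be run end to end on a small model. This is a sketch under assumptions: `make_model` and its layer sizes are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import torch
from torch import nn


def make_model():
    # Hypothetical stand-in architecture; it must be identical
    # when saving and when loading.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


trained = make_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_weights.pth")
    torch.save(trained.state_dict(), path)

    restored = make_model()  # starts with fresh random weights
    restored.load_state_dict(torch.load(path))

restored.eval()  # switch to evaluation mode before running inference

# Compare every tensor in the two state dicts.
weights_match = all(
    torch.equal(a, b)
    for a, b in zip(trained.state_dict().values(),
                    restored.state_dict().values())
)
print(weights_match)  # prints True
```

If the architecture of the new model differs from the saved one, `load_state_dict` raises an error about missing or unexpected keys.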

***
# Checkpointing

Now let's discuss checkpointing a model. By checkpointing I mean saving the entire training state: the epoch, the optimizer state, the loss, and the model weights. With all of these saved, training can resume from where it stopped instead of starting over.

Here's an example of creating and then loading a model checkpoint:

In [None]:
# create checkpoint dictionary (dictionary entries use ':' rather than '=')
checkpoint = {
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss
}

# save the checkpoint
torch.save(checkpoint, 'checkpoint.pth')

# loading the checkpoint
# first recreate the model and optimizer, then restore their saved states
model = fakeModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

checkpoint = torch.load('checkpoint.pth')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1
loss = checkpoint['loss']
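
The following sketch shows how the restored `start_epoch` might feed back into a training loop. The model, the random data, and the helpers `build` and `train_one_epoch` are all hypothetical, chosen only to keep the example small and runnable.

```python
import os
import tempfile

import torch
from torch import nn


def build():
    # Hypothetical model and optimizer, recreated identically on each run.
    model = nn.Linear(3, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    return model, optimizer


def train_one_epoch(model, optimizer, inputs, targets):
    # One full-batch gradient step; returns the loss as a Python float.
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
inputs, targets = torch.randn(16, 3), torch.randn(16, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pth")

    # First run: train epochs 0, 1, 2 and checkpoint after the last one.
    model, optimizer = build()
    for epoch in range(3):
        loss = train_one_epoch(model, optimizer, inputs, targets)
    torch.save({
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'loss': loss,
    }, path)

    # Second run: rebuild everything, then restore the saved state.
    model, optimizer = build()
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    start_epoch = checkpoint['epoch'] + 1

# Continue from the epoch after the checkpoint, up to 5 epochs in total.
for epoch in range(start_epoch, 5):
    loss = train_one_epoch(model, optimizer, inputs, targets)

print(start_epoch, epoch)  # prints 3 4
```

Restoring the optimizer state matters for optimizers such as Adam, which keep running statistics per parameter; without it, those statistics would restart from zero.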

***
# Best Practices

There are several best practices for checkpointing models. Here are some key pointers:

<br>

1. Name your checkpoints with a descriptive filename (e.g. `epoch_21_cp.pth`).

<br>

2. Checkpoint as often as is practical. If your model only takes up 2 MB, it's perfectly fine to checkpoint every epoch. On the other hand, if the model is 1 GB, you may want to checkpoint only every ten epochs.
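
Both pointers can be sketched with two small helpers. The names `checkpoint_name` and `should_checkpoint` are hypothetical, and the every-ten-epochs interval is just the example figure from the list above.

```python
def checkpoint_name(epoch, prefix="epoch", suffix="cp"):
    """Build a descriptive filename such as 'epoch_21_cp.pth'."""
    return f"{prefix}_{epoch}_{suffix}.pth"


def should_checkpoint(epoch, every):
    """Return True on every `every`-th epoch, counting epochs from 1."""
    return epoch % every == 0


# For a large model, checkpoint only every ten epochs out of thirty.
saved = [checkpoint_name(e) for e in range(1, 31)
         if should_checkpoint(e, every=10)]
print(saved)
# prints ['epoch_10_cp.pth', 'epoch_20_cp.pth', 'epoch_30_cp.pth']
```

In a real training loop, the filename returned by `checkpoint_name` would be passed to `torch.save` whenever `should_checkpoint` returns `True`.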