A modern implementation of SpyNet in PyTorch, along with some useful utilities such as optical flow transformations.
As stated in the paper's abstract:
We learn to compute optical flow by combining a classical spatial-pyramid formulation with deep learning. This estimates large motions in a coarse-to-fine approach by warping one image of a pair at each pyramid level by the current flow estimate and computing an update to the flow.
This model is 96% smaller than FlowNet in terms of parameters.
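The coarse-to-fine scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the code used by this package: the names `warp`, `coarse_to_fine`, and `estimators` are hypothetical, and the per-level networks are replaced by arbitrary callables that return a residual flow. It assumes flow is stored as `(N, 2, H, W)` in pixel units, with channel 0 horizontal and channel 1 vertical.

```python
import torch
import torch.nn.functional as F


def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `image` (N, C, H, W) by `flow` (N, 2, H, W), in pixels."""
    _, _, h, w = image.shape
    xs = torch.arange(w, dtype=flow.dtype, device=flow.device).view(1, 1, w)
    ys = torch.arange(h, dtype=flow.dtype, device=flow.device).view(1, h, 1)
    # Pixel positions to sample from: base grid plus displacement.
    x = xs + flow[:, 0]
    y = ys + flow[:, 1]
    # grid_sample expects coordinates normalised to [-1, 1].
    x = 2.0 * x / max(w - 1, 1) - 1.0
    y = 2.0 * y / max(h - 1, 1) - 1.0
    grid = torch.stack((x, y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(image, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)


def coarse_to_fine(frame1, frame2, estimators):
    """Run one hypothetical estimator per pyramid level, coarsest first.

    Each estimator maps (frame1, warped_frame2, current_flow) to a
    residual flow with the same shape as current_flow.
    """
    # Build the image pyramids by repeated 2x average pooling.
    pyr1, pyr2 = [frame1], [frame2]
    for _ in range(len(estimators) - 1):
        pyr1.insert(0, F.avg_pool2d(pyr1[0], 2))
        pyr2.insert(0, F.avg_pool2d(pyr2[0], 2))

    n, _, h, w = pyr1[0].shape
    flow = frame1.new_zeros(n, 2, h, w)  # start from zero motion
    for level, (f1, f2, estimate) in enumerate(zip(pyr1, pyr2, estimators)):
        if level > 0:
            # Upsample the flow to this level; displacements double
            # because the resolution doubles.
            flow = 2.0 * F.interpolate(flow, size=f1.shape[-2:],
                                       mode="bilinear", align_corners=True)
        warped = warp(f2, flow)                   # warp by the current estimate
        flow = flow + estimate(f1, warped, flow)  # add the residual update
    return flow


if __name__ == "__main__":
    # Placeholder estimators that predict no residual motion.
    dummy = [lambda f1, warped, flow: torch.zeros_like(flow)] * 3
    a, b = torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32)
    print(coarse_to_fine(a, b, dummy).shape)
```

Because each level only has to predict a small residual on top of the upsampled flow, large motions are handled by the coarse levels while the fine levels refine the details.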
import spynet
import torchvision.transforms as T
from PIL import Image
tfms = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[.485, .406, .456],
                std=[.229, .225, .224])
])
model = spynet.SpyNet.from_pretrained('sentinel')
model.eval()
frame1 = tfms(Image.open('..')).unsqueeze(0)
frame2 = tfms(Image.open('..')).unsqueeze(0)
flow = model((frame1, frame2))[0]
flow = spynet.flow.flow_to_image(flow)
Image.fromarray(flow).show()
To fine-tune the default pretrained sentinel model, run:
$ python -m spynet.train \
    --root data/ \
    --checkpoint-dir models/finetuned-sentinel \
    --finetune-name sentinel \
    --batch-size 32 \
    --epochs 10
Then load the fine-tuned model:
import spynet
model = spynet.SpyNet.from_pretrained('models/finetuned-sentinel/final.pt')
model.eval()