Skip to content

This is a conversion of the official paddle-implemented PaintTransformer.

Notifications You must be signed in to change notification settings

NeverGiveU/PaintTransformer-Pytorch-master

Repository files navigation

PaintTransformer-Pytorch-,aster

hi, guys!
this is a conversion of the paddle implementation of the paper Paint Transformer: Feed Forward Neural Painting with Stroke Prediction (ICCV 2021, oral). Here are the project site 🌐 and the official codes 💻.

Target Image From Pytorch From Paddle

1. Simple Start

Try to run

python inference.py 

and check the output frames in ./sampels/outputs/bingbing/.
Goto ./samples/outputs/ and run

python frames2mp4.py

to merge the frames into a video.

2. Model Conversion

Run

python paddle2pytorch.py

to convert the trained paddle checkpoints into pytorch. BTW, Remember to check the checkpoints paths.

3. Test on Yr Images.

Put yr images in ./samples/inputs/, and modify the lines of #12 in inference.py.

if __name__ == "__main__":
    ## files
    input_path = "samples/inputs/darling.jpg" # line# = 12
    output_dir = "samples/outputs/"

Just do as Section 1.

4. Attention

4.1 Problems by Pytorch Versions


Notice that this codes are dubugged in Pytorch-1.9.0, and the input shape of transformer is in (B,N,C), i.e., "batch first".
In Pytorch of earlier versions, no argument "batch_first" is defined. So you may try to modify the network.py as follows.

  1. In network.py #39, change
    self.transformer = nn.Transformer(hidden_dim, n_heads, n_enc_layers, n_dec_layers, batch_first=True) ➡️ self.transformer = nn.Transformer(hidden_dim, n_heads, n_enc_layers, n_dec_layers)
  2. In #72-74, change
src = (pos_embedding + feat.view(b, c, -1).permute(2, 0, 1)).permute(1, 0, 2)
tgt = self.query_pos_embedding.unsqueeze(1).repeat(1, b, 1).permute(1, 0, 2)
hidden_state = self.transformer(src, tgt)

➡️

src = (pos_embedding + feat.view(b, c, -1).permute(2, 0, 1))
tgt = self.query_pos_embedding.unsqueeze(1).repeat(1, b, 1)
hidden_state = self.transformer(src, tgt).permute(1, 0, 2) 

4.2 A Difference Between Pytorch & Paddle


We have found that the APIs in torch.nn.functional.affine_grid and paddle.nn.functional.affine_grid outputs slightly differently when fed with the same input $\theta$'s.
Fortunately, this seems not to make negative effects on final results.

5. Training

MAYBE come ... 😄

About

This is a conversion of the official paddle-implemented PaintTransformer.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages