Redrawing normalized samples using QR slows down training #6

Parskatt · 2020-10-20T16:10:36Z

Doing the QR-decomposition:

performer-pytorch/performer_pytorch/performer_pytorch.py

Lines 67 to 70 in f9765c4

    
    def orthogonal_matrix_chunk(cols, device = None):
        unstructured_block = torch.randn((cols, cols), device = device)
        q, _ = torch.qr(unstructured_block, some = True)
        return q.t()

This slows down training substantially (at least for batch sizes of ~4). For example, in my own experiments I get ~2.5 batches/s per GPU without redrawing, and ~1.4 batches/s with redrawing.

I found one solution in GPyTorch, which dispatches small QR factorizations to the CPU:

cornellius-gp/gpytorch#1224

Perhaps a similar strategy could be used? I think num_cols should never really be more than ~100 though, so perhaps you should always use the CPU here?
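
A minimal sketch of the "always use the CPU" idea, assuming the block is small enough that the host-to-device copy is cheap. It is not necessarily what the linked GPyTorch change or any performer-pytorch release does. It uses torch.linalg.qr, the newer replacement for the torch.qr call quoted above, so it assumes a PyTorch version that provides it.

```python
import torch

def orthogonal_matrix_chunk(cols, device=None):
    # Draw the Gaussian block on the CPU, regardless of the target device.
    unstructured_block = torch.randn((cols, cols), device="cpu")
    # Reduced QR on the CPU; for a square input, q is a (cols, cols) orthogonal matrix.
    q, _ = torch.linalg.qr(unstructured_block, mode="reduced")
    q = q.t()
    # Move only the finished factor to the requested device, if one was given.
    if device is not None:
        q = q.to(device)
    return q

# Example: a 64x64 orthogonal block, left on the CPU here.
block = orthogonal_matrix_chunk(64)
```

Whether this is faster than factorizing on the GPU depends on the matrix size and hardware, so it is worth timing both variants in the actual training loop.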

Parskatt · 2020-10-20T16:18:05Z

Using CPU instead of GPU gives me ~2 batches/s.
It's not perfect, but it's better.

lucidrains · 2020-10-20T22:35:46Z

@Parskatt thank you for looking into this! i noticed this as well, but didn't know CPU would be faster

https://github.com/lucidrains/performer-pytorch/releases/tag/0.0.9

Parskatt · 2020-10-21T07:09:30Z

Thanks for being super fast as usual :)

I think I will personally use trainable projection matrices, initialized as N(0,I). I'll let you know if it works out ;)
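
For illustration, a trainable projection initialized as N(0, I) could look roughly like the sketch below. The class name TrainableProjection and the arguments nb_features and dim are hypothetical and not taken from performer-pytorch; the sketch only covers the projection itself, not the rest of the attention computation.

```python
import torch
from torch import nn

class TrainableProjection(nn.Module):
    # Hypothetical module: a learned projection that replaces the redrawn random one.
    def __init__(self, nb_features, dim):
        super().__init__()
        # Entries drawn i.i.d. from N(0, 1), then updated by the optimizer like any weight.
        self.projection = nn.Parameter(torch.randn(nb_features, dim))

    def forward(self, x):
        # x: (..., dim) -> (..., nb_features)
        return x @ self.projection.t()

# Example: project a batch of 2 sequences of length 5 with feature size 8.
proj = TrainableProjection(nb_features=16, dim=8)
out = proj(torch.randn(2, 5, 8))
```

Since the matrix is learned, no QR factorization or redrawing is needed during training, at the cost of dropping the orthogonality of the random features.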

I'll close this issue

lucidrains · 2020-10-21T19:48:21Z

@Parskatt please do!

Parskatt closed this as completed Oct 21, 2020
