Add lens for typed tensor #523

junjihashimoto · 2021-02-14T17:25:29Z

This PR provides an embed tensor that has fields, together with lenses for those fields.

For example, when we define RGB data as follows,

data RGB = RGB { r :: Float, g :: Float, b :: Float}

the type of the embed tensor is TensorData device dtype shape RGB,
and the internal data of the embed tensor becomes Tensor device dtype (shape ++ [3]).

The lenses (e.g. field @"r" or field @"g") then provide both a setter and a getter for each RGB field.
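To make the idea concrete, here is a minimal sketch that uses only base. It is not the PR's implementation: the TensorData newtype below is a hypothetical stand-in backed by nested lists, the lens type and the view/set helpers are hand-written, and column, fieldR, fieldG and fieldB are illustrative names standing in for field @"r" and friends.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Record describing the last dimension (taken from the PR description).
data RGB = RGB { r :: Float, g :: Float, b :: Float }

-- Hypothetical stand-in for TensorData: each row has length 3,
-- and the phantom parameter records which structure the row encodes.
newtype TensorData a = TensorData [[Float]] deriving (Eq, Show)

-- Hand-written van Laarhoven lens and the two helpers we need.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- Lens onto column i of the last dimension.
-- Assumption: the replacement column has as many entries as there are rows
-- (zipWith would silently truncate otherwise).
column :: Int -> Lens' (TensorData RGB) [Float]
column i f (TensorData rows) = TensorData . zipWith put rows <$> f (map (!! i) rows)
  where
    put row x = take i row ++ [x] ++ drop (i + 1) row

fieldR, fieldG, fieldB :: Lens' (TensorData RGB) [Float]
fieldR = column 0
fieldG = column 1
fieldB = column 2

main :: IO ()
main = do
  let t = TensorData [[1, 2, 3], [4, 5, 6]] :: TensorData RGB
  print (view fieldG t)        -- getter: the green channel of every row
  print (set fieldR [0, 0] t)  -- setter: overwrite the red channel
```

In the PR itself the column index would be derived from the record definition via generics instead of being written by hand.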

tscholak · 2021-02-14T18:49:49Z

Hi, looks interesting. I don’t understand what TensorData is good for. Can you explain this differently, please?

junjihashimoto · 2021-02-15T09:37:21Z

The root problem is that the typed tensor doesn't specify a structure.
For example, there are two definitions of bounding boxes. One uses the points at both ends (e.g. BoundingBox below), and the other uses the center point and the width and height of the box (e.g. BoxWH below). But both definitions are represented as the same Tensor device dtype [batch,4] on typed tensor.

Both tensor expressions are Tensor device dtype [batch,4], but TensorData can distinguish TensorData device dtype [batch] BoundingBox from TensorData device dtype [batch] BoxWH.

data BoundingBox = BoundingBox {
  x0::Float,
  y0::Float,
  x1::Float,
  y1::Float
}

data BoxWH = BoxWH {
  centerX::Float,
  centerY::Float,
  width::Float,
  height::Float
}

These two definitions are often used. They are also used in detr.
https://github.com/facebookresearch/detr/blob/a54b77800eb8e64e3ad0d8237789fcbf2f8350c5/util/box_ops.py#L40
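As a small illustration of why the distinction matters, the sketch below converts between the two records on plain Floats. The function names toBoundingBox and toBoxWH are made up for this example, and the formulas assume the usual convention that (x0, y0) and (x1, y1) are opposite corners with x0 <= x1 and y0 <= y1.

```haskell
-- The two box layouts from the comment above, with Eq/Show added for the demo.
data BoundingBox = BoundingBox { x0, y0, x1, y1 :: Float }
  deriving (Eq, Show)

data BoxWH = BoxWH { centerX, centerY, width, height :: Float }
  deriving (Eq, Show)

-- Center/size representation to corner representation.
toBoundingBox :: BoxWH -> BoundingBox
toBoundingBox (BoxWH cx cy w h) =
  BoundingBox (cx - w / 2) (cy - h / 2) (cx + w / 2) (cy + h / 2)

-- Corner representation to center/size representation.
toBoxWH :: BoundingBox -> BoxWH
toBoxWH (BoundingBox left top right bottom) =
  BoxWH ((left + right) / 2) ((top + bottom) / 2) (right - left) (bottom - top)

main :: IO ()
main = do
  let box = BoxWH 5 5 4 2
  print (toBoundingBox box)
  print (toBoxWH (toBoundingBox box) == box)
```

With a bare [batch,4] tensor, nothing stops a caller from passing one layout where the other is expected; with distinct types, the conversion has to be explicit.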

tscholak · 2021-02-15T19:25:50Z

I see now. In that case, why don't we make it so that a Tensor's shape can be a heterogeneous list of arbitrary type constructors of kind * -> *? One could use:

[] for a dimension with no name and undefined size.
(,) for a dimension with no name and size 2.
data RGB a = RGB a a a for a dimension with name RGB and size 3.
and so on.

What do you think?

junjihashimoto · 2021-02-15T20:38:30Z

The idea is great, because it can support permutations of tensor dimensions.

https://github.com/jasigal/hasktorch-naperian/blob/master/src/Torch/Naperian.hs#L74-L76
Jessie has a similar idea, too.
It also works well with monads and representable functors.

I don't understand how to use '[]' and '(,)' with heterogeneous lists.

When there is Tensor device dtype [4,64,16,16] in the current definition,
does the new tensor become Tensor device dtype (4:.64:.16:.16) ?
I think the examples using RGB would look like this.

Tensor device dtype (4:.RGB:.16:.16)
Tensor device dtype (4:.16:.16:.RGB)

tscholak · 2021-02-15T20:46:43Z

I suppose we'd need some sized list for that.
With sized lists, e.g. https://hackage.haskell.org/package/vector-sized-1.4.3.1/docs/Data-Vector-Generic-Sized.html, I believe your examples could look like this:

Tensor device dtype '[ Sized [] 4, Sized [] 64, Sized [] 16, Sized [] 16]
Tensor device dtype '[ Sized [] 4, RGB, Sized [] 16, Sized [] 16]

and so on, where

newtype Sized (f :: * -> *) (n :: Nat) a = Sized (f a)

So Sized [] 4 is just a wrapper around [] with a size annotation.
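A rough sketch of how such a wrapper could be used follows. The newtype is the one given above (written with Type instead of *); mkSized and unSized are hypothetical helpers, not part of any library, and the length check happens at run time because an ordinary list carries no length in its type.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- A container f tagged with a type-level size n.
newtype Sized (f :: Type -> Type) (n :: Nat) a = Sized (f a)

-- Hypothetical smart constructor: accept the list only if its
-- run-time length matches the type-level annotation.
mkSized :: forall n a. KnownNat n => [a] -> Maybe (Sized [] n a)
mkSized xs
  | fromIntegral (length xs) == natVal (Proxy :: Proxy n) = Just (Sized xs)
  | otherwise = Nothing

-- Forget the size annotation.
unSized :: Sized f n a -> f a
unSized (Sized xs) = xs

main :: IO ()
main = do
  print (unSized <$> (mkSized "abcd" :: Maybe (Sized [] 4 Char)))
  print (unSized <$> (mkSized "abc" :: Maybe (Sized [] 4 Char)))
```

Because the raw Sized constructor performs no check, a real design would probably hide it behind the smart constructor.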

tscholak · 2021-02-15T20:50:56Z

or, perhaps, a data family that makes the data type construction inductive:

data family List (n :: Natural) a

data instance List 'Z a = Empty

data instance List ('S n) a = a ::: !(List n a)

(from https://github.com/input-output-hk/ouroboros-high-assurance/blob/master/Haskell/fixed-length-lists/src/Data/List/FixedLength.hs)
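For readers unfamiliar with data families, here is a self-contained sketch of the same inductive construction. The Peano-style Natural type and the ToList class are defined locally for illustration and are assumptions of this sketch, not the linked package's API.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

-- Peano naturals, promoted to the type level by DataKinds.
data Natural = Z | S Natural

-- One data instance per shape of the length index.
data family List (n :: Natural) a

data instance List 'Z a = Empty

data instance List ('S n) a = a ::: !(List n a)

-- Hypothetical helper class: recurse over the length index
-- to convert a fixed-length list to an ordinary list.
class ToList (n :: Natural) where
  toList :: List n a -> [a]

instance ToList 'Z where
  toList Empty = []

instance ToList n => ToList ('S n) where
  toList (x ::: xs) = x : toList xs

-- The length 3 is part of the type; a fourth element would not type-check.
three :: List ('S ('S ('S 'Z))) Int
three = 1 ::: (2 ::: (3 ::: Empty))

main :: IO ()
main = print (toList three)
```

The trade-off compared with the Sized newtype is that the length is enforced by construction, at the cost of more type-level machinery.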

junjihashimoto · 2021-02-16T03:50:58Z

Thx!
It looks good, except that first-time readers may find it a bit long and unintuitive.
It's new, so there may not be a better way to write it.
I'd like to try the new tensor. Would you like to move forward?

tscholak · 2021-02-16T12:51:14Z

Yes, let's try your thing.

austinvhuang · 2021-02-17T12:58:58Z

It seems lenses are a generalization of Sasha's namedtensors http://nlp.seas.harvard.edu/NamedTensor? One difference is that there, names refer to dimensions, and here they refer to values (I think, I still have difficulty reading generics code).

One potential footgun is if users access tensor values both by indexing and lenses. If they were to insert a field or otherwise change the field ordering, that could break code in weird / silent ways.

Can you include an example that shows the RGB usage? I'm a little unclear where/when the tensor gets initialized and how the ADT data gets bound to tensor values.

junjihashimoto · 2021-02-17T14:42:13Z

> It seems lenses are a generalization of Sasha's namedtensors http://nlp.seas.harvard.edu/NamedTensor? One difference is that there, names refer to dimensions, and here they refer to values (I think, I still have difficulty reading generics code).

Yes. The lens is like namedtensors with fields.

> One potential footgun is if users access tensor values both by indexing and lenses. If they were to insert a field or otherwise change the field ordering, that could break code in weird / silent ways.

I think the footgun depends on the definition of shapes.
If the shape is a list of Nat, we can shoot ourselves in the foot; if it's a list of functors, we avoid it.
BTW, indexes and lenses may become isomorphic.

> Can you include an example that shows the RGB usage? I'm a little unclear where/when the tensor gets initialized and how the ADT data gets bound to tensor values.

Here is an example of converting from RGB to YCoCg.
https://en.wikipedia.org/wiki/YCoCg

An example with lens.

toYCoCG :: Tensor device dtype [Sized [] n, RGB] -> Tensor device dtype [Sized [] n, YCoCg]
toYCoCG rgb =
  set (field @"y") ((r + g * 2 + b) / 4) $
  set (field @"co") ((r - b) / 2) $
  set (field @"cg") ((-r + g * 2 - b) / 4) $
  mempty
  where
    r = rgb ^. field @"r"
    g = rgb ^. field @"g"
    b = rgb ^. field @"b"

An example without lens.

toYCoCG :: Tensor device dtype [n, 3] -> Tensor device dtype [n, 3] 
toYCoCG rgb =  stack @1 (
   (r + g * 2+ b)/4 :.
   (r - b)/2 :.
   (-r + g * 2 - b)/4 :.
   HNil
  )
  where 
    r = slice @1 @0 rgb
    g = slice @1 @1 rgb
    b = slice @1 @2 rgb

For now, this PR does not include the initialization of the tensor, but we can get an initializer by making the tensor an instance of Monoid, with mempty defined by the zeros function.
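The following sketch shows the Monoid-as-initializer idea on a single pixel, using only base. It is an illustration under stated assumptions and not the PR's code: YCoCg holds plain Floats instead of tensors, the Semigroup instance (elementwise addition) is chosen only so that the all-zero value is a lawful identity, and the lenses yL, coL and cgL are hand-written stand-ins for field @"y" and friends.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity (..))

data RGB = RGB { r, g, b :: Float }

data YCoCg = YCoCg { y, co, cg :: Float }
  deriving (Eq, Show)

-- Assumption: combine elementwise by addition, so zeros is the identity.
instance Semigroup YCoCg where
  YCoCg a1 b1 c1 <> YCoCg a2 b2 c2 = YCoCg (a1 + a2) (b1 + b2) (c1 + c2)

-- mempty plays the role of the zeros initializer.
instance Monoid YCoCg where
  mempty = YCoCg 0 0 0

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- Hand-written field lenses.
yL, coL, cgL :: Lens' YCoCg Float
yL f s = (\v -> s { y = v }) <$> f (y s)
coL f s = (\v -> s { co = v }) <$> f (co s)
cgL f s = (\v -> s { cg = v }) <$> f (cg s)

-- Same shape as the lens example above: start from mempty, then set each field.
toYCoCg :: RGB -> YCoCg
toYCoCg (RGB red green blue) =
  set yL ((red + green * 2 + blue) / 4) $
  set coL ((red - blue) / 2) $
  set cgL ((-red + green * 2 - blue) / 4) $
  mempty

main :: IO ()
main = print (toYCoCg (RGB 1 0 0))
```

For typed tensors the shape is known from the type, which is what would let mempty produce a zero tensor of the right size without extra arguments.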

junjihashimoto · 2021-04-08T07:24:34Z

I've realized that the sized vector from vector-sized is not an instance of Generic.
https://hackage.haskell.org/package/vector-sized-1.4.3.1/docs/Data-Vector-Generic-Sized.html

junjihashimoto · 2021-04-09T12:20:09Z

The feature of this PR has been reimplemented in #536.

Add lens for typed tensor

fb3528d

junjihashimoto closed this Apr 9, 2021
