This repository has been archived by the owner on Oct 10, 2023. It is now read-only.

Expanding VectorXY API #4

Closed
jackparmer opened this issue Dec 6, 2022 · 0 comments · Fixed by #8
Assignees
Labels
enhancement New feature or request

Comments

@jackparmer
Contributor

jackparmer commented Dec 6, 2022

VectorXY currently supports X and Y arrays. In addition to x-y pairs, there are other types of data (like matrices, scalars, videos, and images) that Flojoy programs need to support.

Here I propose an API to expand VectorXY support to:

  • Pandas dataframes
  • color images
  • black & white ("BW") images
  • XYZ(t) triples
  • scalars

For reference, here is the VectorXY setter (https://github.com/flojoy-io/flojoy-python/blob/main/flojoy/flojoy.py#L83):

    def __setitem__(self, key, value):
        if key not in ['x','y']:
            raise KeyError(key)
        else:
            value = self._ndarrayify(value)
            super().__setitem__(key, value)

Currently, usage of VectorXY looks like this:

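    import numpy as np
    from flojoy import VectorXY  # assumes the flojoy package exports VectorXY

    v = VectorXY()
    # np.linspace takes a sample count as its third argument; np.arange takes a step
    v.x = np.arange(1, 20, 0.1)
    v.y = np.sin(v.x)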

Images

🖤 BW images would be encoded as a 2-dim matrix under the m key. Example:

import numpy as np
from skimage import data

im = data.clock()    # black & white image
np.shape(im)         # (300, 400)

v = VectorXY()
v.m = im             # "m" for "matrix"
v.type = 'grayscale'

The only difference between a grayscale image and a 2d matrix that does not represent an image is that the vector's type is set to grayscale (see the "Types" section below).

🌈 Color images would always be encoded as RGB channels (with values between 0 and 255). Example:

import numpy as np
from skimage import data

im = data.astronaut()    # color image
np.shape(im)             # (512, 512, 3)

v = VectorXY()
v.type = 'image'

# set RGB values, one 2d channel per key
v.r = im[:, :, 0]
v.g = im[:, :, 1]
v.b = im[:, :, 2]
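
As a quick check of the channel layout, here is a minimal sketch that splits an RGB array into three 2d channels and stacks them back. It uses a small synthetic array in place of the skimage sample, and plain variables in place of VectorXY attributes, since the r, g, and b keys are only proposed here.

```python
import numpy as np

# Synthetic 2x3 RGB image standing in for the skimage astronaut sample.
im = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Split into channels, as the proposal would store them under r, g, and b.
r, g, b = im[:, :, 0], im[:, :, 1], im[:, :, 2]

# Stacking along a new last axis recovers the (height, width, 3) layout.
restored = np.stack([r, g, b], axis=-1)

print(r.shape)                        # (2, 3)
print(np.array_equal(restored, im))   # True
```

Each channel is a plain 2d matrix, so a node that already handles the m key could in principle process one channel at a time.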

Dataframes

Pandas dataframes would be converted to a numpy matrix and stored under the m key. Example:

from pydataset import data
df = data('iris')

v = VectorXY()
v.type = 'dataframe'

v.m = df.to_numpy()

Types

A user can explicitly set a VectorXY type so that nodes can apply different logic depending on the vector's type.

Allowed type values:

  • image (a color image; a special case of a matrix)
  • grayscale (a BW image; a special case of a matrix)
  • matrix (any numpy matrix of N dimensions)
  • dataframe (tabular data encoded as rows of numpy arrays)
  • ordered_pair (eg, an x and y numpy array or Python list)
  • ordered_triple (eg, x, y, and z numpy arrays/lists)
  • scalar (a Python integer or float)
  • parametric_[TYPE] (any of the types above, parameterized by t)

If type has not been explicitly set, the Getter infers it from the keys that are set whenever the type is read (eg, print(v.type)). Eg,

  • ordered_pair: only x and y are set
  • ordered_triple: only x, y, and z are set
  • parametric_ordered_pair: only x, y, and t are set
  • etc
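
The inference rule can be sketched as a lookup on the set of keys. This is only one reading of the proposal: infer_type is a hypothetical helper, and the mapping from m to matrix and from c to scalar is an assumption.

```python
def infer_type(keys):
    """Guess a VectorXY type from the names of the keys that are set.

    Hypothetical helper: the mapping below is one reading of the proposal.
    """
    keys = set(keys)
    parametric = "t" in keys
    data_keys = frozenset(keys - {"t"})

    base_types = {
        frozenset({"x", "y"}): "ordered_pair",
        frozenset({"x", "y", "z"}): "ordered_triple",
        frozenset({"m"}): "matrix",    # assumed default for a bare m key
        frozenset({"c"}): "scalar",
    }
    if data_keys not in base_types:
        raise ValueError(f"cannot infer a type from keys {sorted(keys)}")

    base = base_types[data_keys]
    return f"parametric_{base}" if parametric else base


print(infer_type(["x", "y"]))        # ordered_pair
print(infer_type(["x", "y", "t"]))   # parametric_ordered_pair
```

Types that share the m key, such as grayscale and dataframe, cannot be told apart from the keys alone, so under this sketch they would still need to be set explicitly.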

Explicitly setting a vector's type, then setting a key that does not belong to that type should result in an error.

For example,

v.type = 'image'
v.x = [1,2,3]

should throw an error, since the vector has been defined as an image but the x key, which does not belong to an image, is subsequently set.
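
A minimal sketch of that check, modeled on the existing `__setitem__`, might look like the following. TypedVector, vtype, and the ALLOWED table are hypothetical names. The per-type key sets are assumptions drawn from the examples in this proposal (r, g, and b for color images, m for matrices), and the sketch uses item access on a plain dict subclass rather than the real VectorXY class.

```python
class TypedVector(dict):
    """Hypothetical stand-in for VectorXY with type-checked keys."""

    # Assumed key sets per type; "t" is always allowed for parameterization.
    ALLOWED = {
        "image": {"r", "g", "b"},
        "grayscale": {"m"},
        "matrix": {"m"},
        "dataframe": {"m"},
        "ordered_pair": {"x", "y"},
        "ordered_triple": {"x", "y", "z"},
        "scalar": {"c"},
    }

    def __init__(self, vtype=None):
        super().__init__()
        if vtype is not None and vtype not in self.ALLOWED:
            raise ValueError(f"unknown type {vtype!r}")
        self.vtype = vtype

    def __setitem__(self, key, value):
        # Only enforce key membership once a type has been set explicitly.
        if self.vtype is not None:
            allowed = self.ALLOWED[self.vtype] | {"t"}
            if key not in allowed:
                raise KeyError(f"{key!r} is not valid for type {self.vtype!r}")
        super().__setitem__(key, value)


v = TypedVector(vtype="image")
try:
    v["x"] = [1, 2, 3]
    rejected = False
except KeyError:
    rejected = True
print(rejected)  # True
```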

Parameterization

Any vector type can be parameterized with a t attribute, so long as the dimensions are equal. Examples:

x-y coordinates measured each second for 4 seconds (eg, throwing a ball):

t = [1, 2, 3, 4]
x = [1, 2, 3, 4]
y = [1, 3, 9, 27]

x-y-z coordinates measured each second for 4 seconds (eg, a drone flight in 3d space):

t = [1, 2, 3, 4]
x = [1, 2, 3, 4]
y = [1, 3, 9, 27]
z = [0, -4, 0, 4]

A 4 second color video with only 4 frames

t = [1, 2, 3, 4]
# pseudocode: each frame is a VectorXY whose type is image
m = [ VectorXY[type=image],
      VectorXY[type=image],
      VectorXY[type=image],
      VectorXY[type=image] ]

A constant value over time

t = [1, 2, 3, 4]
c = [ 2.0, 2.0, 2.0, 2.0]

# or equivalently:
t = [1, 2, 3, 4]
c = 2.0

t ("time") can be irregularly spaced but must always be in ascending order. Otherwise an error should be thrown.
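
The two constraints on t (ascending order, and the same length as the data it parameterizes) could be checked along these lines. validate_time is a hypothetical helper, and treating "ascending" as strictly increasing is an assumption.

```python
def validate_time(t, *series):
    """Check a hypothetical t axis against the series it parameterizes."""
    # Assumption: "ascending" means strictly increasing, so repeats are rejected.
    for earlier, later in zip(t, t[1:]):
        if later <= earlier:
            raise ValueError("t must be in ascending order")
    for values in series:
        if len(values) != len(t):
            raise ValueError("each series must have the same length as t")


# Irregular spacing is accepted as long as t keeps increasing.
validate_time([1, 2, 3.5, 10], [1, 2, 3, 4], [1, 3, 9, 27])

try:
    validate_time([1, 3, 2, 4], [1, 2, 3, 4])
    ok = True
except ValueError:
    ok = False
print(ok)  # False
```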

Scalars

Scalars (constants) can be set as floating point or integer values. Examples:

v = VectorXY()
v.c = 2.0
print(v.c) # 2.0

v.c = 10
v.t = [1, 2, 3, 4]
print(v.c) # [10, 10, 10, 10]
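
The getter behavior shown above amounts to repeating the constant once per time step. A minimal sketch, using a hypothetical expand_scalar function in place of the real getter:

```python
def expand_scalar(c, t=None):
    """Return c unchanged, or repeated once per time step when t is given."""
    if t is None:
        return c
    return [c] * len(t)


print(expand_scalar(2.0))                  # 2.0
print(expand_scalar(10, t=[1, 2, 3, 4]))   # [10, 10, 10, 10]
```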

Summary

In summary, VectorXY will accept these keys:

  • x
  • y
  • z
  • t
  • m ("matrix")
  • c ("scalar")

Sensible combinations of these keys will be enforced by the VectorXY Getter. For example,

v.x =  [1,2,3]
v.m =  [[1,2,3],  [4,5,6]]

would throw an error - there's no reason to have a vector and matrix ride along with each other.

Similarly,

v.c =  2.0
v.x =  [1,2,3]

would throw an error - the VectorXY has already been defined as a scalar by having its c ("constant") attribute set.
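
One way to express these rules is to sort the keys into mutually exclusive groups and reject any vector that draws from more than one. The grouping and the check_combination helper are assumptions made for illustration, not part of the existing VectorXY code.

```python
# Assumed grouping: coordinate keys, the matrix key, and the scalar key
# cannot be mixed; "t" may accompany any of them.
GROUPS = {
    "coordinates": {"x", "y", "z"},
    "matrix": {"m"},
    "scalar": {"c"},
}


def check_combination(keys):
    """Return the group the keys belong to, or raise if they mix groups."""
    keys = set(keys) - {"t"}
    used = [name for name, members in GROUPS.items() if keys & members]
    if len(used) > 1:
        raise ValueError(f"keys mix incompatible groups: {', '.join(used)}")
    return used[0] if used else None


print(check_combination(["x", "y", "t"]))  # coordinates

try:
    check_combination(["x", "m"])
    mixed_ok = True
except ValueError:
    mixed_ok = False
print(mixed_ok)  # False
```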

@jackparmer jackparmer added the enhancement New feature or request label Dec 6, 2022
@smahmed776 smahmed776 linked a pull request Dec 9, 2022 that will close this issue