
FeatureSpace multiple output from one input #19697

Open

zippeurfou opened this issue May 10, 2024 · 3 comments

Labels: stat:awaiting keras-eng (Awaiting response from Keras engineer), type:support (User is asking for help / asking an implementation question. Stackoverflow would be better suited.)

Comments

zippeurfou commented May 10, 2024

This is more of an ask to add something to a tutorial than a feature request, as I believe this is already doable today.
The FeatureSpace tutorials assume that you create one output per feature.
I am basing this on this tutorial, where the assumption seems to be that one input produces one output.
For example:

feature_space = FeatureSpace(
    features={
        ...
        # Numerical feature to discretize into bins
        "age": FeatureSpace.float_discretized(num_bins=4),
        ...
    },
    ...
    output_mode="concat",
)

In this example there is an assumption that you only create one float_discretized output for the input age.
However, in practice (e.g. the YouTube recommendations paper), multiple outputs can be created from one input.

[Screenshot from the paper showing one raw feature expanded into several transformed features]

It would be nice to add an example of how to do this to the tutorial, or somewhere in the docs. I have tried to replicate the paper's treatment of numerical features using FeatureSpace, and I found it difficult to do without performing the extra transformations in the model itself, which I think goes against the initial idea of this functionality.
Please also note that not all preprocessing requires an adapt method in practice.
For example, Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees? encourages the transformation x = log(1 + |x|) * sign(x).
In this case you don't really need an adapt method to implement it. I also think it would be beneficial to consider opening this up to more than preprocessing-class implementations, but that is a different topic; I can open a separate ticket if that is preferable.
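For reference, a minimal sketch of that stateless transform as a Keras layer (the class name SymmetricLog1p is illustrative, not from the paper):

import keras
from keras import ops

class SymmetricLog1p(keras.layers.Layer):
    """Stateless x -> log(1 + |x|) * sign(x); no adapt() step needed."""

    def call(self, inputs):
        return ops.log1p(ops.abs(inputs)) * ops.sign(inputs)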

fchollet (Member) commented
This does not appear to require any extra features:

  1. Use FeatureSpace to get x.
  2. Call keras.ops.square(x) and keras.ops.sqrt(x) to get your other features.

> I found it difficult to do so without doing the extra transformation in the model itself

You can do the above either inside the model or in a data pipeline. Inside the model would look like this:

# Retrieve a dict of Keras Input objects
inputs = feature_space.get_inputs()
# Retrieve the corresponding encoded Keras tensors
encoded_features = feature_space.get_encoded_features()
x = encoded_features["x"]
x_sqrt = keras.ops.sqrt(x)
x_square = keras.ops.square(x)
output = ...  # build the rest of the model on x, x_sqrt, x_square
model = keras.Model(inputs, output)
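The data-pipeline variant would look roughly like this (a sketch, assuming a tf.data Dataset of (features, label) pairs and an already-adapted feature_space; expand_features and raw_train_ds are illustrative names):

import tensorflow as tf

def expand_features(features, label):
    # Encoded, concatenated features from the FeatureSpace.
    x = feature_space(features)
    # Append the derived features (sqrt assumes non-negative encoded values).
    return tf.concat([x, tf.sqrt(x), tf.square(x)], axis=-1), label

train_ds = raw_train_ds.map(expand_features, num_parallel_calls=tf.data.AUTOTUNE)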

zippeurfou (Author) commented May 10, 2024

Thank you @fchollet for the quick answer. The method you mention is how I already do it today.
I was hoping to be able to do it as part of the FeatureSpace creation (maybe wrongly?), just because it feels cleaner to me in terms of code organization, and still part of feature creation.
It also raises the question of combinations within the FeatureSpace (e.g. Normalization + x²).
So I guess the question is whether it would make sense to extend FeatureSpace to allow more flexibility than a single preprocessing transformation. In practice, if you for example send it a preprocessor that does Normalization and then x², it would apply adapt to the Normalization and then apply x².
In pseudo-code it could look as follows:

custom_layer = keras.Sequential(
    [keras.layers.Normalization(), keras.layers.Lambda(lambda x: x ** 2)]
)
feature_space = FeatureSpace(
    features={
        ...
        # Numerical feature to normalize, then square
        "age": FeatureSpace.feature(
            preprocessor=custom_layer, dtype="float", output_mode="float"
        ),
        ...
    },
    ...
    output_mode="concat",
)
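Since keras.Sequential does not itself expose an adapt() method, one way to make this combination adaptable would be a small wrapper layer that forwards adapt() to its stateful sublayer (a sketch; NormalizeThenSquare is a hypothetical name, and it relies on FeatureSpace calling adapt() on preprocessors that define it, as I note below):

import keras
from keras import layers

class NormalizeThenSquare(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.norm = layers.Normalization()

    def adapt(self, data):
        # Only the Normalization sublayer is stateful; squaring is stateless.
        self.norm.adapt(data)

    def call(self, inputs):
        return keras.ops.square(self.norm(inputs))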

This could be extended to more advanced transformations with multiple outputs from one input, e.g.:

import keras
from keras import layers

class DiscretizeAndParallelLambda(layers.Layer):
    def __init__(self, num_bins, **kwargs):
        super().__init__(**kwargs)
        self.num_bins = num_bins
        # Create the sublayers here rather than in build() so that
        # adapt() can reach self.discretize before the layer is built.
        # Note: Discretization's first positional argument is
        # bin_boundaries, so num_bins must be passed by keyword.
        self.discretize = layers.Discretization(num_bins=self.num_bins, name="discretize")
        self.lambda_square = layers.Lambda(lambda x: keras.ops.square(x), name="square")
        self.lambda_sqrt = layers.Lambda(lambda x: keras.ops.sqrt(x), name="sqrt")
        self.concat = layers.Concatenate(name="concat")

    def call(self, inputs):
        # Discretization returns integer bin indices; cast to float
        # before the elementwise transforms.
        discretized = keras.ops.cast(self.discretize(inputs), "float32")
        squared = self.lambda_square(discretized)
        sqrt = self.lambda_sqrt(discretized)
        return self.concat([squared, sqrt])

    def get_config(self):
        config = super().get_config()
        config.update({"num_bins": self.num_bins})
        return config

feature_space = FeatureSpace(
    features={
        ...
        # Numerical feature discretized, then squared and square-rooted in parallel
        "age": FeatureSpace.feature(
            preprocessor=DiscretizeAndParallelLambda(...), dtype="float", output_mode="float"
        ),
        ...
    },
    ...
    output_mode="concat",
)

In these examples the first layer does require adapt, but with the log1p example it would not. Then, when you call adapt on the FeatureSpace, "behind the scenes" it would look for the preprocessing layer and adapt it whenever the layer implements a preprocessing class.
Edit: Looking at the source code, this might work out of the box, since it checks whether adapt exists.
So in my previous layer I might just be able to do:

    def adapt(self, data):
        self.discretize.adapt(data)

and since it checks whether adapt exists before executing it, the log1p case would then work out of the box.
So maybe this just needs an example in the docs?

zippeurfou (Author) commented May 13, 2024

Another, more trivial example: say you want to transform one feature (e.g. age) into:

  1. A discretized version that, once discretized, is crossed with another feature.
  2. A normalized version that is not crossed.

In this scenario, one input yields two preprocessed features, one of which is used in a feature cross (one possible workaround is sketched below).
I don't think the current architecture allows you to do this, but maybe I am missing something.
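One possible workaround under the current API (a sketch, not an endorsed pattern) is to duplicate the raw column under two names so each copy gets its own preprocessor, then cross only the discretized copy. The occupation feature and the crossing_dim value here are illustrative:

feature_space = FeatureSpace(
    features={
        # Both "age_*" columns are copies of the raw age values,
        # supplied as separate keys by the data pipeline.
        "age_binned": FeatureSpace.float_discretized(num_bins=4),
        "age_normalized": FeatureSpace.float_normalized(),
        "occupation": FeatureSpace.string_categorical(),
    },
    crosses=[
        FeatureSpace.cross(
            feature_names=("age_binned", "occupation"), crossing_dim=32
        )
    ],
    output_mode="concat",
)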

sachinprasadhs added the type:support and stat:awaiting keras-eng labels on May 13, 2024