Hello,
As I understand from the documentation here, I can call tft.compute_and_apply_vocabulary(s)
to convert a categorical column into a numerical feature.
As a beginner in TensorFlow, I am wondering whether there is a way to apply a custom mapping from a raw feature column to a numerical column. I have seen that the hash_bucket method described here almost does what I want. But instead of hashing the entries, I need a custom mapping function to be called, so that the 'm' unique elements in the categorical column are mapped to 'n' unique numerical or string values, where n < m.
Use case: I ran into this issue while experimenting with the KDD CUP 99 dataset, where the target class of the training set contains 23 different attack types that need to be identified and grouped into four classes of attacks. I am looking for a transformation function that maps all 23 unique elements in the target class to 4 attack classes numbered [1,2,3,4]. Including normal connections, which would be mapped to [0], the target would contain 5 classes, and I could then directly train a multiclass classification model. More on the KDD CUP 99 dataset is here
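To make the intended behavior concrete, here is a minimal plain-Python sketch of the mapping I have in mind. It only illustrates the desired result and is not a tf.Transform solution. The names ATTACK_TO_CLASS and map_labels are hypothetical, the dictionary lists only a handful of the labels, and the assignment of numbers to attack categories is my own assumption.

```python
# Hypothetical, partial mapping from raw labels to class ids.
# Only a few labels are listed; the numbering of categories is an assumption.
ATTACK_TO_CLASS = {
    "normal": 0,
    "smurf": 1, "neptune": 1,            # denial of service
    "portsweep": 2, "ipsweep": 2,        # probing
    "guess_passwd": 3, "ftp_write": 3,   # remote to local
    "buffer_overflow": 4, "rootkit": 4,  # user to root
}


def map_labels(labels, mapping=ATTACK_TO_CLASS):
    """Map each raw label to its class id; fail loudly on unmapped labels."""
    unknown = sorted(set(labels) - set(mapping))
    if unknown:
        raise KeyError(f"unmapped labels: {unknown}")
    return [mapping[label] for label in labels]


raw = ["normal", "smurf", "rootkit", "ipsweep"]
print(map_labels(raw))
```

What I am looking for is a way to apply the same kind of many-to-few lookup (m raw values to n classes, n < m) inside the preprocessing step, rather than in a separate pass over the data.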
Can anyone help?