transform raw categorical features using custom mapping function #147

@rahulrajpl

Hello,

As I understand from the documentation here, I can call tft.compute_and_apply_vocabulary(s) to convert a categorical column into a numerical feature.

As a beginner in TensorFlow, I am wondering whether there is a way to apply a custom mapping from a raw feature column to a numerical column. I have already seen that the hash_bucket method described here almost does what I want. But instead of hashing the entries, I need a custom mapping function to be called, so that the 'm' unique elements in the categorical column are mapped to 'n' unique numerical or string values, where n < m.

Use case: I ran into this during my experiments with the KDD CUP 99 dataset, where the target class of the training set contains 23 different attack types that need to be identified and categorized into four classes of attack. I am looking for a transformation function that maps all 23 unique elements in the target class to 4 classes of attack numbered [1,2,3,4]. Including normal connections, which could be mapped to [0], the target class would contain 5 classes, so I could directly train a multiclass classification model. More on the KDD CUP 99 dataset is here.
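To make the desired behaviour concrete, here is a minimal plain-Python sketch of the many-to-few mapping described above. The label names, the class ids assigned to each group, the `ATTACK_CLASS` table, the `map_labels` helper, and the `-1` fallback are all illustrative assumptions, not taken from the dataset files or from any tf.Transform API. Only a handful of labels are listed, so this is not the complete 23-entry table.

```python
# Hypothetical lookup table: several raw labels collapse onto one class id.
# Label spellings and the id chosen for each group are assumptions.
ATTACK_CLASS = {
    "normal": 0,            # normal connection
    "smurf": 1,             # group 1 (e.g. denial of service)
    "neptune": 1,
    "ipsweep": 2,           # group 2 (e.g. probing)
    "portsweep": 2,
    "guess_passwd": 3,      # group 3 (e.g. remote-to-local)
    "warezclient": 3,
    "buffer_overflow": 4,   # group 4 (e.g. user-to-root)
    "rootkit": 4,
}


def map_labels(labels, mapping=ATTACK_CLASS, default=-1):
    """Map each raw label to its class id; unknown labels get `default`."""
    return [mapping.get(label, default) for label in labels]


raw = ["normal", "smurf", "rootkit", "ipsweep", "unknown_attack"]
mapped = map_labels(raw)
```

What I am looking for is the equivalent of `map_labels` inside a preprocessing function. A static lookup table such as `tf.lookup.StaticHashTable` might be one way to express it, but whether that works within tf.Transform is an assumption on my part.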

Can anyone help?
