# Feature Selection in Keras Data Processor

The Keras Data Processor includes a sophisticated feature selection mechanism based on the Gated Residual Variable Selection Network (GRVSN) architecture. This document explains the components, usage, and benefits of this feature.

## Overview

The feature selection mechanism uses a combination of gated units and residual networks to automatically learn the importance of different features in your data. It can be applied to both numeric and categorical features, either independently or together.

## Components

### 1. GatedLinearUnit

The `GatedLinearUnit` is the basic building block that implements a gated activation function:

```python
import tensorflow as tf

from kdp.layers import GatedLinearUnit  # import path may differ in your KDP version

gl = GatedLinearUnit(units=64)
x = tf.random.normal((32, 100))  # batch of 32 samples, 100 input features
y = gl(x)                        # gated projection, shape (32, 64)
```

Key features:
- Applies a linear transformation followed by a sigmoid gate
- Selectively filters input data based on learned weights
- Helps control information flow through the network (sketched below)
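
For intuition, here is a minimal sketch of the gating idea as a standalone Keras layer. This is an illustrative reimplementation, not KDP's exact code:

```python
import tensorflow as tf


class GatedLinearUnitSketch(tf.keras.layers.Layer):
    """Illustrative GLU: output = linear(x) * sigmoid(gate(x))."""

    def __init__(self, units: int, **kwargs):
        super().__init__(**kwargs)
        self.linear = tf.keras.layers.Dense(units)
        self.gate = tf.keras.layers.Dense(units, activation="sigmoid")

    def call(self, inputs):
        # The sigmoid gate maps each unit into [0, 1], so the layer can
        # suppress or pass information learned from the same input.
        return self.linear(inputs) * self.gate(inputs)
```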

### 2. GatedResidualNetwork

The `GatedResidualNetwork` combines gated linear units with residual connections:

```python
import tensorflow as tf

from kdp.layers import GatedResidualNetwork  # import path may differ in your KDP version

grn = GatedResidualNetwork(units=64, dropout_rate=0.2)
x = tf.random.normal((32, 100))
y = grn(x)  # non-linear transform with a gated residual skip, shape (32, 64)
```

Key features:
- Uses ELU activation for non-linearity
- Includes dropout for regularization
- Adds residual connections to help with gradient flow
- Applies layer normalization for stability (all four pieces are sketched below)
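
As a rough composition of those pieces (again illustrative rather than KDP's exact implementation), reusing the `GatedLinearUnitSketch` defined above:

```python
import tensorflow as tf


class GatedResidualNetworkSketch(tf.keras.layers.Layer):
    """Illustrative GRN: Dense+ELU -> Dense -> Dropout -> GLU, with residual skip and LayerNorm."""

    def __init__(self, units: int, dropout_rate: float = 0.2, **kwargs):
        super().__init__(**kwargs)
        self.elu_dense = tf.keras.layers.Dense(units, activation="elu")
        self.linear = tf.keras.layers.Dense(units)
        self.dropout = tf.keras.layers.Dropout(dropout_rate)
        self.glu = GatedLinearUnitSketch(units)  # from the previous sketch
        self.project = tf.keras.layers.Dense(units)  # aligns input width with `units`
        self.norm = tf.keras.layers.LayerNormalization()

    def call(self, inputs, training=False):
        x = self.elu_dense(inputs)
        x = self.linear(x)
        x = self.dropout(x, training=training)
        # Residual connection: project the raw input so shapes match, then add
        x = self.project(inputs) + self.glu(x)
        return self.norm(x)
```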

### 3. VariableSelection

The `VariableSelection` layer is the main feature selection component:

```python
import tensorflow as tf

from kdp.layers import VariableSelection  # import path may differ in your KDP version

vs = VariableSelection(nr_features=3, units=64, dropout_rate=0.2)
# Three features with different input dimensions
x1 = tf.random.normal((32, 100))
x2 = tf.random.normal((32, 200))
x3 = tf.random.normal((32, 300))
selected_features, weights = vs([x1, x2, x3])
```

Key features:
- Processes each feature independently using GRNs
- Calculates feature importance weights using softmax
- Returns both selected features and their weights
- Supports different input dimensions for each feature (see the sketch below)
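
A condensed sketch of the idea, built on the GRN sketch above. The weight-computation details here are assumptions for illustration, not necessarily how KDP computes them:

```python
import tensorflow as tf


class VariableSelectionSketch(tf.keras.layers.Layer):
    """Illustrative variable selection: per-feature GRNs plus softmax importance weights."""

    def __init__(self, nr_features: int, units: int, dropout_rate: float = 0.2, **kwargs):
        super().__init__(**kwargs)
        # One GRN per feature, so inputs of different widths all map to `units`
        self.feature_grns = [
            GatedResidualNetworkSketch(units, dropout_rate) for _ in range(nr_features)
        ]
        # A shared GRN over the concatenated raw inputs yields one logit per feature
        self.weight_grn = GatedResidualNetworkSketch(units, dropout_rate)
        self.weight_dense = tf.keras.layers.Dense(nr_features)

    def call(self, inputs, training=False):
        # inputs: list of tensors, each of shape (batch, feature_dim_i)
        transformed = tf.stack(
            [grn(x, training=training) for grn, x in zip(self.feature_grns, inputs)],
            axis=1,
        )  # (batch, nr_features, units)
        logits = self.weight_dense(
            self.weight_grn(tf.concat(inputs, axis=-1), training=training)
        )
        weights = tf.nn.softmax(logits, axis=-1)  # (batch, nr_features), rows sum to 1
        # Importance-weighted sum across features -> (batch, units)
        selected = tf.reduce_sum(transformed * weights[:, :, None], axis=1)
        return selected, weights
```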

## Usage in Preprocessing Model

### Configuration

Configure feature selection in your preprocessing model:

```python
model = PreprocessingModel(
    # ... other parameters ...
    feature_selection_placement="all_features",  # or "numeric" or "categorical"
    feature_selection_units=64,     # width of the selection network's GRNs
    feature_selection_dropout=0.2,  # dropout rate inside the selection network
)
```

### Placement Options

The `FeatureSelectionPlacementOptions` enum provides several options for where to apply feature selection (an enum-based example follows the list):

1. `NONE`: Disable feature selection
2. `NUMERIC`: Apply only to numeric features
3. `CATEGORICAL`: Apply only to categorical features
4. `ALL_FEATURES`: Apply to all features
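
If you prefer the enum to raw strings, something like the following should work; note that the import path here is an assumption, so check where `FeatureSelectionPlacementOptions` lives in your KDP version:

```python
from kdp.processor import FeatureSelectionPlacementOptions  # assumed import path

model = PreprocessingModel(
    # ... other parameters ...
    feature_selection_placement=FeatureSelectionPlacementOptions.ALL_FEATURES,
)
```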

### Accessing Feature Weights

After processing, feature weights are available in the `processed_features` dictionary:

```python
# Process your data (`data` is your raw input, e.g. a dict of feature arrays)
processed = model.transform(data)

# Access the learned feature-importance weights
numeric_weights = processed["numeric_feature_weights"]
categorical_weights = processed["categorical_feature_weights"]
```
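
Because the weights come from a softmax, each row is non-negative and sums to 1 across features, so averaging over the batch gives a quick global importance estimate. A sketch, assuming the weights squeeze down to a `(batch, nr_features)` array:

```python
import numpy as np

# Mean per-feature weight across the batch (shape assumption: (batch, nr_features))
mean_importance = np.squeeze(np.asarray(numeric_weights)).mean(axis=0)
for i, w in enumerate(mean_importance):
    print(f"numeric feature {i}: mean weight {w:.3f}")
```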

## Benefits

1. **Automatic Feature Selection**: The model learns which features are most important for your task.
2. **Interpretability**: Feature weights provide insights into feature importance.
3. **Improved Performance**: By focusing on relevant features, the model can achieve better performance.
4. **Regularization**: Dropout and residual connections help prevent overfitting.
5. **Flexibility**: Can be applied to different feature types and combinations.

## Integration with Other Features

The feature selection mechanism integrates seamlessly with other preprocessing components:

1. **Transformer Blocks**: Can be used before or after transformer blocks
2. **Tabular Attention**: Complements tabular attention by focusing on important features
3. **Custom Preprocessors**: Works with any custom preprocessing steps

## Example

Here's a complete example of using feature selection:

```python
from kdp.processor import PreprocessingModel
from kdp.features import NumericalFeature, CategoricalFeature

# Define features
features = {
    "numeric_1": NumericalFeature("numeric_1"),
    "numeric_2": NumericalFeature("numeric_2"),
    "category_1": CategoricalFeature("category_1"),
}

# Create a model with feature selection enabled for every feature
model = PreprocessingModel(
    features_specs=features,
    feature_selection_placement="all_features",
    feature_selection_units=64,
    feature_selection_dropout=0.2,
)

# Build and use the model (`data` is your raw input dataset)
preprocessor = model.build_preprocessor()
processed_data = model.transform(data)

# Analyze feature importance via the learned selection weights
for feature_name in features:
    weights = processed_data[f"{feature_name}_weights"]
    print(f"Feature {feature_name} importance: {weights.mean()}")
```

## Testing

The feature selection components include comprehensive unit tests that verify the following (a representative check is sketched after the list):

1. Output shapes and types
2. Gating mechanism behavior
3. Residual connections
4. Dropout behavior
5. Feature weight properties
6. Serialization/deserialization

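For illustration, a check on the weight properties might look like this (a sketch; the import path is an assumption, and the squeeze accounts for a possible trailing singleton dimension):

```python
import numpy as np
import tensorflow as tf

from kdp.layers import VariableSelection  # import path may differ in your KDP version


def test_variable_selection_weights_form_a_distribution():
    vs = VariableSelection(nr_features=2, units=8, dropout_rate=0.0)
    x1 = tf.random.normal((4, 10))
    x2 = tf.random.normal((4, 20))
    _, weights = vs([x1, x2])
    w = np.squeeze(weights.numpy())  # expect (batch, nr_features)
    assert (w >= 0).all()
    assert np.allclose(w.sum(axis=-1), 1.0, atol=1e-5)
```
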
Run the tests using:

```bash
python -m pytest test/test_feature_selection.py -v
```