Is it possible to prevent fields from being used as features but keep them as output fields? #209

alexnikitchuk · 2019-01-16T08:21:42Z

Problem
More question than problem:
In my case, some fields contain auxiliary information that I need in the output, but these fields can also be highly correlated with the target label.
Is it possible to prevent fields from being used as features but keep them as output fields?

Solution
Unknown

Alternatives
Unknown

Additional context
N/A

tovbinm · 2019-01-17T03:53:22Z

@alexnikitchuk when you say "have in the output" do you mean as an output of the model.score execution?

alexnikitchuk · 2019-01-17T07:28:26Z

@tovbinm yes, exactly

snabar · 2019-01-26T07:57:55Z

Actually here is how you can do it:

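import com.salesforce.op._
import com.salesforce.op.features.FeatureLike
import com.salesforce.op.features.types._
import com.salesforce.op.stages.impl.classification.BinaryClassificationModelSelector

val features: FeatureLike[OPVector] = ...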
val label: FeatureLike[RealNN] = ...
val excluded: FeatureLike[_ <: FeatureType] = ... // say we want to exclude this feature from modeling

// define the model selector with label & features
val pred = BinaryClassificationModelSelector().setInput(label, features).getOutput()

// set the result features to the workflow including the excluded feature
val workflow = new OpWorkflow().setReader(reader).setResultFeatures(excluded, pred)

// train & score the model
val workflowModel = workflow.train()
val df = workflowModel.score()

df should be a DataFrame with excluded and pred as columns, where pred was fitted using only features as the predictor.

snabar · 2019-01-26T08:07:06Z

In general, this diagram may help you understand what happens when you call “train” on a workflow:

https://github.com/salesforce/TransmogrifAI/blob/master/resources/workflows.png

Essentially the underlying DAG needed to materialize ResultFeatures gets prepped, i.e., any estimators (in the above case, the binary classification model selector) get fitted on the data.

When you then call score on the resulting workflowModel, data is run through the prepped DAG and all the ResultFeatures get materialized, whether or not they were a part of any estimator in the DAG.

https://github.com/salesforce/TransmogrifAI/blob/master/resources/materializingdata.png
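
To make the train/score behaviour described above concrete, here is a toy sketch in plain Scala. It does not use TransmogrifAI at all: the types Raw and Predicted, the functions train and score, and the column names f1, aux and label are all hypothetical, and the "estimator" is a simple lookup table standing in for a real model selector. It only illustrates the idea that a result feature is materialized at scoring time even when no estimator ever received it as an input.

```scala
// Toy model of a workflow DAG. None of these names come from TransmogrifAI.
object ResultFeatureSketch {
  type Row = Map[String, Double]

  sealed trait Feature { def name: String }
  // A raw feature is copied straight from the input row.
  final case class Raw(name: String) extends Feature
  // A predicted feature comes from an estimator fitted on `inputs` only.
  final case class Predicted(name: String, label: Raw, inputs: Seq[Raw]) extends Feature

  // Keep only the columns the estimator was explicitly given.
  private def project(row: Row, inputs: Seq[Raw]): Row = {
    val keep = inputs.map(_.name).toSet
    row.filter { case (column, _) => keep(column) }
  }

  // Stand-in estimator: mean label per distinct combination of input values,
  // falling back to the overall mean label for unseen combinations.
  private def fit(data: Seq[Row], p: Predicted): Row => Double = {
    val overall = data.map(_(p.label.name)).sum / data.size
    val byInputs: Map[Row, Double] = data
      .groupBy(row => project(row, p.inputs))
      .map { case (key, rows) => key -> rows.map(_(p.label.name)).sum / rows.size }
    row => byInputs.getOrElse(project(row, p.inputs), overall)
  }

  // "train": fit every estimator found among the result features.
  def train(data: Seq[Row], resultFeatures: Seq[Feature]): Map[String, Row => Double] =
    resultFeatures.collect { case p: Predicted => p.name -> fit(data, p) }.toMap

  // "score": materialize every result feature, whether or not it fed an estimator.
  def score(
      data: Seq[Row],
      resultFeatures: Seq[Feature],
      fitted: Map[String, Row => Double]
  ): Seq[Row] =
    data.map { row =>
      resultFeatures.map {
        case Raw(name)    => name -> row(name)
        case p: Predicted => p.name -> fitted(p.name)(row)
      }.toMap
    }

  // Hypothetical data: "aux" is the auxiliary field kept out of the model.
  val data: Seq[Row] = Seq(
    Map("f1" -> 0.0, "aux" -> 10.0, "label" -> 0.0),
    Map("f1" -> 1.0, "aux" -> 20.0, "label" -> 1.0)
  )
  val excluded: Raw = Raw("aux")
  val pred: Predicted = Predicted("pred", Raw("label"), Seq(Raw("f1")))
  val resultFeatures: Seq[Feature] = Seq(excluded, pred)

  val scored: Seq[Row] = score(data, resultFeatures, train(data, resultFeatures))
}
```

In this sketch, scored contains only the aux and pred columns. The aux values pass through unchanged, while the fitted function only ever looks at the projected f1 column.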

tovbinm closed this as completed Jan 26, 2019

tovbinm mentioned this issue Jul 11, 2019

Release 3.3.3 #26

Merged
