Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow customizing serialization for feature generator extract functions #352

Merged
merged 4 commits into from
Jul 6, 2019

Conversation

tovbinm
Copy link
Collaborator

@tovbinm tovbinm commented Jul 1, 2019

Related issues
It's not possible to save a model with a feature generator stage that has an extract function with ctor arguments.

Describe the proposed solution
Allow customizing read/write for extract functions, e.g.

// extract function IntMultiplyExtractor has an argument 'multiplier' passed during training time
val multiplier = 10
val multiplied = FeatureBuilder.Integral[Int].extract(new IntMultiplyExtractor(multiplier)).asPredictor

// one can now specify a read/write implementation to serialize for the extractor instance
@ReaderWriter(classOf[IntMultiplyExtractorReadWrite])
class IntMultiplyExtractor(val multiplier: Int) extends Function1[Int, Integral] with Serializable {
  def apply(i: Int): Integral = (i * multiplier).toIntegral
}
class IntMultiplyExtractorReadWrite extends ValueReaderWriter[IntMultiplyExtractor] {
  implicit val formats = DefaultFormats
  def read(valueClass: Class[IntMultiplyExtractor], json: JValue): Try[IntMultiplyExtractor] = Try {
    new IntMultiplyExtractor((json \ "multiplier").extract[Int])
  }
  def write(value: IntMultiplyExtractor): Try[JValue] = Try {
    "multiplier" -> JInt(value.multiplier)
  }
}

Describe alternatives you've considered
NA

@tovbinm tovbinm requested a review from leahmcguire as a code owner July 1, 2019 23:14
@tovbinm tovbinm changed the title Allow customizing serialization for FeatureGenerator extract function Allow customizing serialization for FeatureGenerator extract functions Jul 1, 2019
@tovbinm tovbinm changed the title Allow customizing serialization for FeatureGenerator extract functions Allow customizing serialization for feature generator extract functions Jul 1, 2019
@codecov
Copy link

codecov bot commented Jul 1, 2019

Codecov Report

Merging #352 into master will increase coverage by 0.05%.
The diff coverage is 94.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #352      +/-   ##
==========================================
+ Coverage   86.68%   86.73%   +0.05%     
==========================================
  Files         335      336       +1     
  Lines       10781    10799      +18     
  Branches      347      346       -1     
==========================================
+ Hits         9345     9367      +22     
+ Misses       1436     1432       -4
Impacted Files Coverage Δ
...op/stages/DefaultOpPipelineStageReaderWriter.scala 69.23% <ø> (-0.77%) ⬇️
...m/salesforce/op/stages/FeatureGeneratorStage.scala 95.08% <100%> (+5.08%) ⬆️
...alesforce/op/stages/DefaultValueReaderWriter.scala 100% <100%> (ø)
...sforce/op/stages/OpPipelineStageReaderWriter.scala 85.71% <76.92%> (+1.09%) ⬆️
...la/com/salesforce/op/features/FeatureBuilder.scala 35.66% <0%> (+1.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6cc43f...31fa50d. Read the comment docs.

Copy link
Contributor

@michaelweilsalesforce michaelweilsalesforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this internally. Seems to work as expected.

@tovbinm tovbinm merged commit c25f4eb into master Jul 6, 2019
@tovbinm tovbinm deleted the mt/serialize-parametrized-extract branch July 6, 2019 00:20
This was referenced Jul 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants