Add support for HOCON config. Fixes #29. #31

sacsar · 2019-06-15T21:45:36Z

Changes:

- Enable TestNG and add a missing runtime test dependency.
- Add support for HOCON config in addition to JSON config.

Because the transform config has type Map[String, Map[String, Any]], the decoding of Feature and InputFeatureInfo has to be done fairly manually. I was unable to find a Scala library that supported decoding typesafe Config objects to a Map[String, Any]. The downside to this is that you lose a nice error message in the case of a malformed InputFeatureInfo. (OutputTensorInfo will still give a nice message.) I can think of three solutions to this:

1. Parse the config string, immediately convert it back to JSON, and use Jackson to build the TensorizeIn config object. (This is certainly a hacky option, and wasteful if most configs will be JSON, but it gets the job done.)
2. Replace the Any with an Either[HashInfo, TokenizationInfo] (it doesn't look like TokenizationInfo exists today). This obviously doesn't extend nicely if other transformations become available in the future.
3. Have HashInfo and (a new) TokenizationInfo extend the same sealed trait. I think this is the best option, but it involves changing the code where the transformers are used, so I didn't want to go ahead and do it without discussion.
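To make the trade-off concrete, here is a minimal sketch of what hand-decoding one `Map[String, Any]` entry involves. The `HashInfo` below is a hypothetical stand-in with an assumed `hashBucketSize` field, not the project's actual class, and the error strings are illustrative only.

```scala
object ManualDecodeSketch {
  // Hypothetical stand-in for the project's HashInfo; the field name is an assumption.
  final case class HashInfo(hashBucketSize: Int)

  // Because the values are typed as Any, every field has to be looked up and
  // type-checked by hand, and every error message has to be written by hand too.
  def decodeHashInfo(raw: Map[String, Any]): Either[String, HashInfo] =
    raw.get("hashBucketSize") match {
      case Some(n: Int) => Right(HashInfo(n))
      case Some(other)  => Left(s"hashBucketSize must be an integer, got: $other")
      case None         => Left("missing required field: hashBucketSize")
    }
}
```

Any field that is not checked explicitly like this fails later with a generic cast or lookup error, which is the loss of "nice error messages" described above.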

zhangxuhong · 2019-06-18T22:44:50Z

I can help review it once the tests are fixed, thank you!

sacsar · 2019-06-22T12:01:51Z

@zhangxuhong Frustratingly, the tests pass locally and fail on Travis. I've got Travis running Docker with failing tests, so hopefully I'll be able to figure out what's going on.

zhangxuhong · 2019-06-24T18:37:43Z

avro2tf/src/main/scala/com/linkedin/avro2tf/parsers/TensorizeInConfigParser.scala

+    * @param inputFeatureInfoConfig
+    * @return
+    */
+  private def parseInputFeatureInfo(inputFeatureInfoConfig: Config): InputFeatureInfo = {


Could you add a comment here explaining why we need a separate parser for InputFeatureInfo? Is it mainly because we want to convert some fields to Option types? If so, what about the shape field in OutputTensorInfo: does it need special handling as well? Thanks.

It's because the transform config has type Map[String, Map[String, Any]], and none of the Scala libraries for Typesafe Config seem to support decoding to Map[String, Any]. I think the right solution is to have HashInfo and (a new) TokenizationInfo extend the same sealed trait, call it TransformType. Then the transform config becomes Map[String, Map[String, TransformType]] and you should be able to decode it, complete with nice exceptions.

I'm happy to make this change, but it involves touching other parts of the codebase beyond just the config, so I wanted to talk about it first.
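A rough sketch of the sealed-trait idea, for discussion. `TransformType` is the name proposed above; the fields on `HashInfo` and `TokenizationInfo` are assumptions made up for illustration, and `TokenizationInfo` does not exist in the codebase today.

```scala
object SealedTraitSketch {
  // Common parent proposed in the comment above.
  sealed trait TransformType
  // Field names below are hypothetical placeholders.
  final case class HashInfo(hashBucketSize: Int) extends TransformType
  final case class TokenizationInfo(removeStopWords: Boolean) extends TransformType

  // The transform config type as it would read after the change.
  type TransformConfig = Map[String, Map[String, TransformType]]

  // With a sealed trait the compiler can warn when a match misses a case.
  def describe(t: TransformType): String = t match {
    case HashInfo(size)        => s"hash into $size buckets"
    case TokenizationInfo(rsw) => s"tokenize (removeStopWords = $rsw)"
  }
}
```

Code that consumes the transformers would then match on `TransformType` instead of casting from `Any`, which is the part that touches the rest of the codebase.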

Actually, looking at this some more, the transform config should probably be a case class with two optional fields hashInfo and tokenization. That gets rid of the sealed trait and enforces the field names in the config as it's parsed.
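As a sketch of that alternative, assuming the two field names `hashInfo` and `tokenization` from the comment above: the inner field names (`hashBucketSize`, `removeStopWords`) and the `fromRaw` helper are hypothetical, written only to show how a fixed set of fields can be enforced at parse time.

```scala
object TransformConfigSketch {
  // Inner field names are hypothetical placeholders.
  final case class HashInfo(hashBucketSize: Int)
  final case class TokenizationInfo(removeStopWords: Boolean)

  // Two optional fields replace the open-ended Map[String, Map[String, Any]].
  final case class TransformConfig(
      hashInfo: Option[HashInfo] = None,
      tokenization: Option[TokenizationInfo] = None)

  private val knownFields = Set("hashInfo", "tokenization")

  // Hypothetical helper: rejects unknown top-level names so typos surface while parsing.
  // Missing or wrongly typed inner values are simply dropped to keep the sketch short.
  def fromRaw(raw: Map[String, Map[String, Any]]): Either[String, TransformConfig] = {
    val unknown = raw.keySet -- knownFields
    if (unknown.nonEmpty) {
      val names = unknown.toSeq.sorted.mkString(", ")
      Left("unknown transform field(s): " + names)
    } else {
      val hash = raw.get("hashInfo")
        .flatMap(_.get("hashBucketSize"))
        .collect { case n: Int => HashInfo(n) }
      val tok = raw.get("tokenization")
        .flatMap(_.get("removeStopWords"))
        .collect { case b: Boolean => TokenizationInfo(b) }
      Right(TransformConfig(hash, tok))
    }
  }
}
```

In practice a derived decoder from a config library would likely replace `fromRaw`; the point is only that a case class fixes the allowed field names in the type itself.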

zhangxuhong

Thanks for the contribution! I left a minor comment.

sacsar force-pushed the hocon branch 2 times, most recently from 29a57a5 to abd5f4b · June 22, 2019 22:26

Add support for HOCON config. Fixes linkedin#29.

893a4a8

sacsar force-pushed the hocon branch from 264729b to 893a4a8 · June 22, 2019 23:29

zhangxuhong reviewed Jun 24, 2019

zhangxuhong approved these changes Jun 24, 2019

zhangxuhong merged commit 3aa1fdd into linkedin:master Jun 24, 2019

sacsar deleted the hocon branch June 25, 2019 00:22

This was referenced Jul 9, 2019

Fully use circe to parse TensorizeIn config #39

Merged

Restore support for "accept single value as array" in config? #40

Open
