Add support for HOCON config. Fixes #29. #31

Merged
merged 1 commit into from Jun 24, 2019

@sacsar commented Jun 15, 2019

Changes:

  • Enable TestNG and add a missing runtime test dependency.
  • Add support for HOCON config in addition to JSON config.

Because the transform config has type Map[String, Map[String, Any]], Feature and InputFeatureInfo have to be decoded largely by hand: I was unable to find a Scala library that supports decoding Typesafe Config objects to a Map[String, Any]. The downside is that a malformed InputFeatureInfo no longer produces a nice error message. (OutputTensorInfo will still give a nice message.) I can think of three solutions to this:

  • Parse the config string, immediately convert it back to JSON, and use Jackson to build the TensorizeIn config object. (This is certainly hacky, and wasteful if most configs will be JSON, but it gets the job done.)
  • I believe the Any could be an Either[HashInfo, TokenizationInfo] (it doesn't look like TokenizationInfo exists today). This obviously doesn't extend nicely if other transformations become available in the future.
  • Have HashInfo and a new TokenizationInfo extend the same sealed trait. I think this is the best option, but it involves changing the code where the transformers are used, so I didn't want to go ahead without discussion.
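
To make the error-message problem concrete, here is a minimal sketch of what hand-decoding an untyped entry looks like. It is not the code from this PR: the `HashInfo` shape, its `hashBucketSize` field, and the `decodeHashInfo` helper are all hypothetical stand-ins chosen for illustration.

```scala
object ManualDecodingSketch {
  // Hypothetical stand-in for the project's HashInfo; the field name is assumed.
  final case class HashInfo(hashBucketSize: Int)

  // Decode one untyped entry of the transform config by hand.
  // Every field needs its own presence check and type check, and every
  // error message has to be written manually as well.
  def decodeHashInfo(raw: Map[String, Any]): Either[String, HashInfo] =
    raw.get("hashBucketSize") match {
      case Some(n: Int) => Right(HashInfo(n))
      case Some(other)  => Left(s"hashBucketSize: expected Int, got $other")
      case None         => Left("hashBucketSize: missing")
    }
}
```

With `Any` as the value type, the compiler cannot help here; any check that is forgotten only shows up at runtime, typically as a cast failure rather than a descriptive message.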

@zhangxuhong

I can help review it once the tests are fixed, thank you!

sacsar commented Jun 22, 2019

@zhangxuhong Frustratingly, the tests pass locally and fail on Travis. I've got Travis running Docker with failing tests, so hopefully I'll be able to figure out what's going on.

@sacsar force-pushed the hocon branch 2 times, most recently from 29a57a5 to abd5f4b, June 22, 2019 22:26
* @param inputFeatureInfoConfig
* @return
*/
private def parseInputFeatureInfo(inputFeatureInfoConfig: Config): InputFeatureInfo = {
Contributor

Could you add a comment here explaining why we need a separate parser for InputFeatureInfo? Is it mainly because we want to convert some fields to Option types? And what about the shape field in OutputTensorInfo: does it need special handling too? Thanks.

Contributor Author

It's because the transform config has type Map[String, Map[String, Any]], and none of the Scala libraries for Typesafe Config seem to support decoding to Map[String, Any]. I think the right solution is to have HashInfo and a new TokenizationInfo extend the same sealed trait, call it TransformType. Then the transform config becomes Map[String, Map[String, TransformType]] and you should be able to decode it, complete with nice exceptions.

I'm happy to make this change, but it involves touching other parts of the codebase beyond just the config, so I wanted to talk about it first.
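
The sealed-trait proposal in this reply could look roughly like the sketch below. Only the names `TransformType`, `HashInfo`, and `TokenizationInfo` come from the discussion; the fields (`hashBucketSize`, `removeStopWords`), the `TransformConfig` alias, and the `describe` helper are assumptions made for illustration.

```scala
object SealedTraitSketch {
  // Common parent so the config's value type is no longer Any.
  sealed trait TransformType

  // Field names below are hypothetical; the PR does not list them.
  final case class HashInfo(hashBucketSize: Int) extends TransformType
  final case class TokenizationInfo(removeStopWords: Boolean) extends TransformType

  // The transform config type described in the comment.
  type TransformConfig = Map[String, Map[String, TransformType]]

  // Because the trait is sealed, the compiler can check that this
  // match covers every known transform.
  def describe(t: TransformType): String = t match {
    case HashInfo(size)         => s"hash into $size buckets"
    case TokenizationInfo(stop) => s"tokenize (removeStopWords=$stop)"
  }
}
```

The trade-off noted in the thread still applies: code that currently consumes the untyped map would need to switch to matching on `TransformType`.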

Contributor Author

Actually, looking at this some more, the transform config should probably be a case class with two optional fields hashInfo and tokenization. That gets rid of the sealed trait and enforces the field names in the config as it's parsed.
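
A rough sketch of that alternative, under stated assumptions: the field names `hashInfo` and `tokenization` come from the comment above, while the `TransformConfig` name, the inner field names, and the `appliedTransforms` helper are hypothetical.

```scala
object OptionalFieldsSketch {
  // Hypothetical shapes; the real classes may carry different fields.
  final case class HashInfo(hashBucketSize: Int)
  final case class TokenizationInfo(removeStopWords: Boolean)

  // One case class with two optional fields replaces Map[String, Any].
  // The allowed keys are now fixed by the field names themselves.
  final case class TransformConfig(
    hashInfo: Option[HashInfo] = None,
    tokenization: Option[TokenizationInfo] = None
  )

  // List which transforms a config actually requests.
  def appliedTransforms(c: TransformConfig): List[String] =
    c.hashInfo.map(_ => "hashInfo").toList ++
      c.tokenization.map(_ => "tokenization").toList
}
```

Compared with a sealed trait, this keeps both transforms addressable by name and lets a feature carry both at once, at the cost of adding a field whenever a new transform is introduced.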

@zhangxuhong left a comment

Thanks for the contribution! I left a minor comment.
