New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New regexp flag, checks for bad yaml tags, bug in usage of UnmarshalYAML #62

Closed
wants to merge 1 commit into
base: v2
from

Conversation

Projects
None yet
2 participants
@advance512

advance512 commented Dec 13, 2014

  • Added new regexp flag: Unmarshal all encountered YAML values with keys
    that match the regular expression into the tagged field of a struct,
    which must be a map or a slice of a type that the YAML value should
    be unmarshaled into. [Unmarshaling only]
  • Now dies in case of a badly formatted YAML tag in a struct field
  • Added CONTRIBUTORS file

Bugs:

  • When a type implementing UnmarshalYAML calls the the unmarshaler func()
    to unmarshal to a specific type, which fails, followed by it calling
    the func() again with a different output value which suceeds, the YAML
    unmarshaling process still failed. Issue was d.terrs == nil check, but
    not len(d.terrs) == 0

Tests:

  • Lots of new tests for the regexp flag - regexp unmarshaling into maps,
    slices, regexp priority etc.
New features:
*  Added new regexp flag: Unmarshal all encountered YAML values with keys
   that match the regular expression into the tagged field of a struct,
   which must be a map or a slice of a type that the YAML value should
   be unmarshaled into. [Unmarshaling only]
*  Now dies in case of a badly formatted YAML tag in a struct field
*  Added CONTRIBUTORS file

Bugs:
*  When a type implementing UnmarshalYAML calls the the unmarshaler func()
   to unmarshal to a specific type, which fails, followed by it calling
   the func() again with a different output value which suceeds, the YAML
   unmarshaling process still failed. Issue was d.terrs == nil check, but
   not len(d.terrs) == 0

Tests:
*  Lots of new tests for the regexp flag - regexp unmarshaling into maps,
   slices, regexp priority etc.
@advance512

This comment has been minimized.

Show comment
Hide comment
@advance512

advance512 Dec 13, 2014

Please note that the tests won't work since decode_tests.go imports "gopkg.in/yaml.v2" instead of "github.com/advance512/yaml" which has the regexp flag.

Change it temporarily to "github.com/advance512/yaml" to see it in action.

advance512 commented Dec 13, 2014

Please note that the tests won't work since decode_tests.go imports "gopkg.in/yaml.v2" instead of "github.com/advance512/yaml" which has the regexp flag.

Change it temporarily to "github.com/advance512/yaml" to see it in action.

@advance512

This comment has been minimized.

Show comment
Hide comment
@advance512

advance512 Dec 24, 2014

Any update on this?

advance512 commented Dec 24, 2014

Any update on this?

@niemeyer

This comment has been minimized.

Show comment
Hide comment
@niemeyer

niemeyer Jan 16, 2015

Contributor

Thanks for getting on board and contributing code to the project. This is very welcome.

A couple of points on the submission: first, can you please sign the contributor agreement so that I can integrate any changes you do to this project or any other Canonical project? It's a straightforward CLA, and easy to sign online:

Then, above you report a problem which I don't quite understand. The errors that are appended to the slice are those returned by the function, so it's fine to call unmarshal multiple times in theory. We have test cases covering this:

Your test cases apparently do not include anything about this.

Finally, when submitting pull requests, can you please make sure they are about a specific problem or feature proposal, rather than several unrelated changes? It's problematic for reviewing and it's going to be frustrating for you as well.

I'm closing this ticket but please do resubmit the independent changes so we can get them in.

Except for the regexp validator. To save your time, this is not a task for the unmarshaling package. It should be done by custom logic after unmarshaling.

Contributor

niemeyer commented Jan 16, 2015

Thanks for getting on board and contributing code to the project. This is very welcome.

A couple of points on the submission: first, can you please sign the contributor agreement so that I can integrate any changes you do to this project or any other Canonical project? It's a straightforward CLA, and easy to sign online:

Then, above you report a problem which I don't quite understand. The errors that are appended to the slice are those returned by the function, so it's fine to call unmarshal multiple times in theory. We have test cases covering this:

Your test cases apparently do not include anything about this.

Finally, when submitting pull requests, can you please make sure they are about a specific problem or feature proposal, rather than several unrelated changes? It's problematic for reviewing and it's going to be frustrating for you as well.

I'm closing this ticket but please do resubmit the independent changes so we can get them in.

Except for the regexp validator. To save your time, this is not a task for the unmarshaling package. It should be done by custom logic after unmarshaling.

@niemeyer niemeyer closed this Jan 16, 2015

@advance512

This comment has been minimized.

Show comment
Hide comment
@advance512

advance512 Jan 17, 2015

Hi,

  1. I signed the agreement.
  2. The bug was actually already fixed in #52 (and #58, and here, advance512@e401b2b#diff-e51f89b39d976550831afb510a926be0L93, too), and though it allowed calling unmarshal() multiples times in theory, it prevented it in practice (if any of the calls failed).
  3. The 3 actual code changes in the big commit are:
  4. The rest of the changes are mostly improved documentation and code comments, and some formatting fixes (a la gofmt).

Regarding the Regexp matching, I think it is a very useful feature. Here's a concrete use case: a RAML parser, RAML being a YAML-based language that describes RESTful APIs. Study the following example RAML document:

#%RAML 0.8
title: GitHub API
version: v3
baseUri: https://api.github.com
/gists:
  displayName: Gists
  /public:
    displayName: Public Gists

As the RAML spec specifies:

Every property whose key begins with a slash (/), and is either at the root of the API definition or is the child property of a resource property, is a resource property.

So, using the existing code base - how do you unmarshal the document root mapping (which contains: title, version, baseUri and the various resource properties) into a struct? What would /gists map into? A field? Or into a map/slice? How would the go-yaml tag look? Please advice.

One way to go would be to unmarshal into trees of map[interface{}]interface{}, and then run over them and manually parse them and create the go-raml data structure. That, besides being very inelegant, defeats the point of unmarshaling into specific types. I assume that isn't what you meant when you referred to custom logic.

I could not have written the go-raml parser without the regexp matching feature.

advance512 commented Jan 17, 2015

Hi,

  1. I signed the agreement.
  2. The bug was actually already fixed in #52 (and #58, and here, advance512@e401b2b#diff-e51f89b39d976550831afb510a926be0L93, too), and though it allowed calling unmarshal() multiples times in theory, it prevented it in practice (if any of the calls failed).
  3. The 3 actual code changes in the big commit are:
  4. The rest of the changes are mostly improved documentation and code comments, and some formatting fixes (a la gofmt).

Regarding the Regexp matching, I think it is a very useful feature. Here's a concrete use case: a RAML parser, RAML being a YAML-based language that describes RESTful APIs. Study the following example RAML document:

#%RAML 0.8
title: GitHub API
version: v3
baseUri: https://api.github.com
/gists:
  displayName: Gists
  /public:
    displayName: Public Gists

As the RAML spec specifies:

Every property whose key begins with a slash (/), and is either at the root of the API definition or is the child property of a resource property, is a resource property.

So, using the existing code base - how do you unmarshal the document root mapping (which contains: title, version, baseUri and the various resource properties) into a struct? What would /gists map into? A field? Or into a map/slice? How would the go-yaml tag look? Please advice.

One way to go would be to unmarshal into trees of map[interface{}]interface{}, and then run over them and manually parse them and create the go-raml data structure. That, besides being very inelegant, defeats the point of unmarshaling into specific types. I assume that isn't what you meant when you referred to custom logic.

I could not have written the go-raml parser without the regexp matching feature.

@niemeyer

This comment has been minimized.

Show comment
Hide comment
@niemeyer

niemeyer Jan 19, 2015

Contributor

For 3.3, I'm happy to include a feature equivalent to http://gopkg.in/mgo.v2/bson#Unmarshal which allows you to inline a map or struct into the document (part of the logic is already in, actually), but picking what to do with the values that do not map any of the fields is a task for the application, not the parser.

If you are in control of the file format, it also feels like YAML is being misused. If there are a bunch of keys with a common prefix, these should be put under a common key instead of doing that sort of matching logic.

For everything else, again, unrelated changes should be pushed independently. It's a bad practice to discuss all of these unrelated points at the same time.

Contributor

niemeyer commented Jan 19, 2015

For 3.3, I'm happy to include a feature equivalent to http://gopkg.in/mgo.v2/bson#Unmarshal which allows you to inline a map or struct into the document (part of the logic is already in, actually), but picking what to do with the values that do not map any of the fields is a task for the application, not the parser.

If you are in control of the file format, it also feels like YAML is being misused. If there are a bunch of keys with a common prefix, these should be put under a common key instead of doing that sort of matching logic.

For everything else, again, unrelated changes should be pushed independently. It's a bad practice to discuss all of these unrelated points at the same time.

@niemeyer

This comment has been minimized.

Show comment
Hide comment
@niemeyer

niemeyer Jan 19, 2015

Contributor

Finished the support for inline maps, if you're interested: 5d6f7e0

Contributor

niemeyer commented Jan 19, 2015

Finished the support for inline maps, if you're interested: 5d6f7e0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment