Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Types with restrictions should have Marshal/Unmarshal methods #17

Open
onitake opened this issue Jul 19, 2017 · 4 comments
Open

Types with restrictions should have Marshal/Unmarshal methods #17

onitake opened this issue Jul 19, 2017 · 4 comments

Comments

@onitake
Copy link

onitake commented Jul 19, 2017

In xsdgen/xsdgen.go, any restrictions on types aren't given special treatment, only a simple comment is added to the resulting type.

This should be changed to proper Mashal/Unmarshal methods so full XML validation is possible.

@droyo
Copy link
Owner

droyo commented Aug 4, 2017

My concern with this is creating an explosion in the amount of generated code and even the number of types. One feature of the xsd package is that it tries to use the builtin Go types if introducing a new type adds no value (for some definition of "value"). I didn't intend for this package to be used for validation, as that is far more work. But let's give this a try.

For some of the restrictions, there are many ways to do this.

Enumerations

Consider the following type (taken from w3schools):

<xs:element name="car">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Audi"/>
      <xs:enumeration value="Golf"/>
      <xs:enumeration value="BMW"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Here's what xsdgen does today:

// Must be one of Audi, Golf, BMW
type Car string

Here's one possible solution:

// Must be one of Audi, Golf, BMW
type Car string

var validCar = map[Car]struct{} {
	"Audi": struct{},
	"Golf": struct{},
	"BMW": struct{},
}

func (c Car) MarshalText() ([]byte, error) {
	if _, ok := validCar[c]; !ok {
		return nil, fmt.Errorf("%q is not a valid %T", c, c)
	}
	return []byte(c), nil
}

func (c *Car) UnmarshalText(text []byte) error {
	if _, ok := validCar[Car(text)]; !ok {
		return fmt.Errof("%q is not a valid %T", text, c)
	}
	*c = Car(text)
	return nil
}
  • Pro: validation is simple and "generic" for all comparable types
  • Pro: it's obvious that Car is some kind of string
  • Con: have to write []byte -> int, []byte -> float, []byte -> byte, and any other base type conversions (we already do this for handling list simpleTypes).

Another option mapping to an int:

type Car int
const (
	Audi Car = iota
	Golf
	BMW
)

var validCars = map[string]Car{
	"Audi": Audi,
	"Golf": Golf,
	"BMW": BMW,
}

func (c *Car) UnmarshalText(text []byte) error {
	if x, ok := validCars[string(text)]; ok {
		*c = x
		return nil
	}
	return fmt.Errorf("%q is an invalid %T", text, c)
}

func (c Car) MarshalText() ([]byte, error) {
	return []byte(c.String()), error
}
  • Pro: Arguably easier to work with int-based enum, rather than string.
  • Pro: Set of valid Car's exposed at the package level
  • Pro: Can leverage stringer to make the mapping for MarshalText
  • Con: Have to make sure enum vals are valid Go identifiers. Will interfere with stringer.
  • Con: High likelihood of name collisions may have to use hungarian notation

Patterns

The regular expressions defined by XSD seem pretty close to RE2, used by the regexp package. I would be satisfied with opportunistically generating Marshal/Unmarshal methods iff the regexp package can compile the pattern.

Assertion

Nope. Not worth it. I don't want to link in an xpath implementation just to validate an xml type.

Other restrictions:

Min/max inclusive/exclusive could be useful, and should be easy enough to implement.

The whitespace restriction is necessary, I think, as it changes how a value should be parsed.

explicitTimezone? pass :D

@onitake
Copy link
Author

onitake commented Aug 5, 2017

I actually went ahead and did already implement some of these.

But I'm still struggling with a few other things related to the MPEG-DASH XSD I mentioned in #16.
In particular, some types may contain both an element and an attribute called "title". xsdgen is currently not capable of generating separate fields for this case.

@droyo
Copy link
Owner

droyo commented Aug 6, 2017

I opened #23 for that

@SoMuchForSubtlety
Copy link

A possible compromise would be to add constants of all possible values for enumerations (without validation during unmarshalling). That will help with type safety when using these values later on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants