encoding/xml: memoize names during decode #38332

@qmuntal

Counterpart of #32779 for encoding/xml, but tracking it as a performance improvement instead of a proposal.

When an XML document is decoded, the same element and attribute names appear over and over in the input, and each occurrence is copied and reconstructed in the result. This was partially discussed in #21823, where @nussjustin and @a-h proposed a good initial implementation:

// Get name: /first(first|second)*/
// Do not set d.err if the name is missing (unless unexpected EOF is received):
// let the caller provide better context.
func (d *Decoder) name() (s string, ok bool) {
	d.buf.Reset()
	if !d.readName() {
		return "", false
	}

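	// Check the cache first: a hit means the name was already validated,
	// so the stored string is reused instead of allocating a new one.
	// (d.names is assumed to be a map[string]string initialized before use.)
	b := d.buf.Bytes()
	if s, ok = d.names[string(b)]; ok {
		return s, ok
	}
	// Cache miss: validate the characters before storing the name.
	if !isName(b) {
		d.err = d.syntaxError("invalid XML name: " + string(b))
		return "", false
	}
	s = string(b)
	d.names[s] = s
	return s, true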
}
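
To show the memoization idea in isolation, here is a minimal standalone sketch. The `nameCache` type and its `intern` method are hypothetical names used only for illustration; they are not part of encoding/xml. The sketch assumes the usual Go compiler behavior that a map lookup keyed by `string(b)` does not allocate, so only the first occurrence of each distinct name pays for a string copy.

```go
package main

import "fmt"

// nameCache is a hypothetical helper (not part of encoding/xml) that maps
// each distinct name to a single shared string value.
type nameCache map[string]string

// intern returns a string equal to b, reusing the cached copy when one exists.
func (c nameCache) intern(b []byte) string {
	// Lookup keyed by string(b): the conversion is only used for the
	// lookup, so no new string needs to be kept on a cache hit.
	if s, ok := c[string(b)]; ok {
		return s
	}
	// Cache miss: copy the bytes into a string once and remember it.
	s := string(b)
	c[s] = s
	return s
}

func main() {
	cache := make(nameCache)
	buf := []byte("item")

	first := cache.intern(buf)
	second := cache.intern(buf) // served from the cache

	fmt.Println(first == second, len(cache))
}
```

One trade-off to keep in mind: the cache grows with the number of distinct names, so input with many unique names would hold on to all of them for the lifetime of the cache.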

    Labels

    NeedsInvestigation, Performance
