What's the best way to compute "#seq" when inserting new data into a MapSeq? #102

Closed
davidkrauser opened this issue Aug 19, 2022 · 7 comments

@davidkrauser

davidkrauser commented Aug 19, 2022

I'm currently using mxj.MapSeq to modify XML documents, but I find adding new nodes cumbersome. It's likely that I'm doing something wrong, so I come to you for advice 😅

Right now, when I want to add a new child node to some XML element, I assume that I need to figure out manually what the sequence number of that child node should be. To do so, I walk the MapSeq to find the largest number, then I add one to that. Is there a better way to do this?

In code, that looks like:

// mapSeqNumber returns the sequence number associated with this map
func mapSeqNumber(m map[string]interface{}) (int, bool) {
	if seq, ok := m["#seq"].(int); ok {
		return seq, true
	}

	if seq, ok := m["#seq"].(float64); ok {
		return int(seq), true
	}

	return 0, false
}

// mapSeqMaxSeqNumber returns the maximum sequence number of all child maps
func mapSeqMaxSeqNumber(m map[string]interface{}) int {
	maxSeqNumber := -1
	for _, val := range m {
		switch data := val.(type) {
		case map[string]interface{}:
			seqNumber, ok := mapSeqNumber(data)
			if !ok {
				continue
			}
			if seqNumber <= maxSeqNumber {
				continue
			}
			maxSeqNumber = seqNumber
		case []interface{}:
			for _, v := range data {
				vMap, ok := v.(map[string]interface{})
				if !ok {
					continue
				}
				seqNumber, ok := mapSeqNumber(vMap)
				if !ok {
					continue
				}
				if seqNumber <= maxSeqNumber {
					continue
				}
				maxSeqNumber = seqNumber
			}
		}
	}
	return maxSeqNumber
}

// Load some XML
m, _ := mxj.NewMapXmlSeq(/*... XML DATA ...*/)
// Get the maximum child sequence number
maxSeqNumber := mapSeqMaxSeqNumber(m)
// Create a new child with the next sequence number
m["NewChildNode"] = map[string]interface{}{
    "#seq": maxSeqNumber + 1,
}
@clbanning
Owner

clbanning commented Aug 19, 2022

A bit of background: MapSeq was originally defined to address an edge case for folks who wanted to modify values in an XML object and then re-encode it while preserving the element/sub-element sequence. (Note: the original intent of the mxj package was to convert XML objects to JSON objects, where the sequence of key:value pairs is meaningless.) The example used in the documentation therefore covers a rather unusual case:

//	      <doc>
//	         <ltag>value 1</ltag>
//	         <newtag>value 2</newtag>
//	         <ltag>value 3</ltag>
//	      </doc>
//	  is decoded as:
//	    doc :
//	      ltag :[[]interface{}]
//	        [item: 0]
//	          #seq :[int] 0
//	          #text :[string] value 1
//	        [item: 1]
//	          #seq :[int] 2
//	          #text :[string] value 3
//	      newtag :
//	        #seq :[int] 1
//	        #text :[string] value 2

Suppose I have the following XML doc.

<doc>
      <ltag>value 1</ltag>
      <newtag>value 2</newtag>
      <ltag>value 3</ltag>
      <newtag>value 4</newtag>
</doc>

NewMapXmlSeq would decode the XML object to

doc :
    ltag :[[]interface{}]
        [item: 0]
          #seq :[int] 0
          #text :[string] value 1
        [item: 1]
          #seq :[int] 2
          #text :[string] value 3
    newtag :[[]interface{}]
        [item: 0]
          #seq :[int] 1
          #text :[string] value 2
        [item: 1]
          #seq :[int] 3
          #text :[string] value 4

Now I want to add another ltag element to it and re-encode it to XML. If I grab the list of k:v pairs for ltag and append the new element with #seq:3, it conflicts with the second newtag element. If I append it with #seq:4 (which means scanning all the element #seq values at that level of the XML object), then when the map value is re-encoded to XML I'd get something like:

<doc>
      <ltag>value 1</ltag>
      <newtag>value 2</newtag>
      <ltag>value 3</ltag>
      <newtag>value 4</newtag>
      <ltag>value whatever</ltag>
</doc>

Is that what I want? Or did I really want:

<doc>
      <ltag>value 1</ltag>
      <newtag>value 2</newtag>
      <ltag>value 3</ltag>
      <ltag>value whatever</ltag>
      <newtag>value 4</newtag>
</doc>

I suppose that I could just say that you get what you get based on the original sequence pattern in the XML object.
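
To make the two outcomes above concrete, here is a minimal sketch that works on plain map[string]interface{} values shaped like the decoded doc. The helpers seqOf, forEachChild and insertAt are hypothetical names and are not part of mxj. The caller picks the position, and every sibling at or after that position is shifted up by one, so the choice between the two orderings becomes explicit.

```go
package main

import "fmt"

// seqOf extracts the "#seq" value from an element map, if present.
func seqOf(elem map[string]interface{}) (int, bool) {
	switch s := elem["#seq"].(type) {
	case int:
		return s, true
	case float64:
		return int(s), true
	}
	return 0, false
}

// forEachChild calls fn for every child element map directly under parent,
// whether it is stored as a singleton or as an item of a list.
// Keys starting with '#' (such as #text or #attr) are skipped.
func forEachChild(parent map[string]interface{}, fn func(elem map[string]interface{})) {
	for key, val := range parent {
		if len(key) > 0 && key[0] == '#' {
			continue
		}
		switch v := val.(type) {
		case map[string]interface{}:
			fn(v)
		case []interface{}:
			for _, item := range v {
				if elem, ok := item.(map[string]interface{}); ok {
					fn(elem)
				}
			}
		}
	}
}

// insertAt adds a new text element named tag at sequence position pos and
// shifts every sibling whose #seq is >= pos up by one (hypothetical helper).
func insertAt(parent map[string]interface{}, tag, text string, pos int) {
	forEachChild(parent, func(elem map[string]interface{}) {
		if s, ok := seqOf(elem); ok && s >= pos {
			elem["#seq"] = s + 1
		}
	})
	newElem := map[string]interface{}{"#seq": pos, "#text": text}
	switch existing := parent[tag].(type) {
	case nil:
		parent[tag] = newElem
	case []interface{}:
		parent[tag] = append(existing, newElem)
	default:
		// A singleton becomes a two-item list.
		parent[tag] = []interface{}{existing, newElem}
	}
}

func main() {
	doc := map[string]interface{}{
		"ltag": []interface{}{
			map[string]interface{}{"#seq": 0, "#text": "value 1"},
			map[string]interface{}{"#seq": 2, "#text": "value 3"},
		},
		"newtag": []interface{}{
			map[string]interface{}{"#seq": 1, "#text": "value 2"},
			map[string]interface{}{"#seq": 3, "#text": "value 4"},
		},
	}
	// Position 3 places the new ltag before the second newtag;
	// position 4 would place it after all existing elements.
	insertAt(doc, "ltag", "value whatever", 3)
	fmt.Println(doc)
}
```

With position 4 nothing is shifted and the new element lands last, which matches the first of the two XML results; position 3 yields the second.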

The functionality is a cool idea, but I'll have to noodle on it a bit to see what might be possible without restricting the functionality to specific use cases. Of course, if the order of XML elements and subelements isn't meaningful, an Mxj value works great.

@davidkrauser
Author

What do you think of a mechanism to get/replace nodes at an XML level like the following? I think this could solve the use-case you describe (and would definitely help with mine) 🙂

type ChildNode struct {
    name string
    data map[string]interface{} // Each of these could be a MapSeq
}
func (m MapSeq) GetSortedChildNodes() []ChildNode {
    // Get all XML nodes at this level, sort, and return the list
}
func (m MapSeq) ReplaceChildNodes(childNodes []ChildNode) {
    // Replace all XML nodes at this level with childNodes.
    // Compute the values of #seq using the slice order
}
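
For illustration, the get/replace idea above could be sketched as free functions over plain map[string]interface{} values rather than as MapSeq methods. This is only a sketch under assumptions: sortedChildNodes, replaceChildNodes and seqOf are hypothetical names, the struct fields are exported here for convenience, keys starting with '#' are left untouched, and elements without a #seq sort as if they had #seq 0.

```go
package main

import (
	"fmt"
	"sort"
)

// ChildNode pairs an element name with its decoded content.
type ChildNode struct {
	Name string
	Data map[string]interface{}
}

// seqOf extracts the "#seq" value from an element map, if present.
func seqOf(elem map[string]interface{}) (int, bool) {
	switch s := elem["#seq"].(type) {
	case int:
		return s, true
	case float64:
		return int(s), true
	}
	return 0, false
}

// sortedChildNodes flattens singleton and list children of parent into one
// slice ordered by #seq (hypothetical helper).
func sortedChildNodes(parent map[string]interface{}) []ChildNode {
	var nodes []ChildNode
	for name, val := range parent {
		if len(name) > 0 && name[0] == '#' {
			continue
		}
		switch v := val.(type) {
		case map[string]interface{}:
			nodes = append(nodes, ChildNode{Name: name, Data: v})
		case []interface{}:
			for _, item := range v {
				if elem, ok := item.(map[string]interface{}); ok {
					nodes = append(nodes, ChildNode{Name: name, Data: elem})
				}
			}
		}
	}
	sort.SliceStable(nodes, func(i, j int) bool {
		si, _ := seqOf(nodes[i].Data)
		sj, _ := seqOf(nodes[j].Data)
		return si < sj
	})
	return nodes
}

// replaceChildNodes drops the existing child elements of parent and stores
// nodes instead, assigning #seq from the slice order (hypothetical helper).
func replaceChildNodes(parent map[string]interface{}, nodes []ChildNode) {
	for name := range parent {
		if len(name) == 0 || name[0] != '#' {
			delete(parent, name)
		}
	}
	for i, n := range nodes {
		n.Data["#seq"] = i
		switch existing := parent[n.Name].(type) {
		case nil:
			parent[n.Name] = n.Data
		case []interface{}:
			parent[n.Name] = append(existing, n.Data)
		default:
			parent[n.Name] = []interface{}{existing, n.Data}
		}
	}
}

func main() {
	doc := map[string]interface{}{
		"ltag": []interface{}{
			map[string]interface{}{"#seq": 0, "#text": "value 1"},
			map[string]interface{}{"#seq": 2, "#text": "value 3"},
		},
		"newtag": []interface{}{
			map[string]interface{}{"#seq": 1, "#text": "value 2"},
			map[string]interface{}{"#seq": 3, "#text": "value 4"},
		},
	}
	nodes := sortedChildNodes(doc)
	// Insert a new ltag at index 3, i.e. before the second newtag.
	added := ChildNode{Name: "ltag", Data: map[string]interface{}{"#text": "value whatever"}}
	nodes = append(nodes[:3], append([]ChildNode{added}, nodes[3:]...)...)
	replaceChildNodes(doc, nodes)
	for _, n := range sortedChildNodes(doc) {
		fmt.Println(n.Data["#seq"], n.Name, n.Data["#text"])
	}
}
```

Because the caller edits an ordered slice, the position of a new element is decided by where it is inserted, and no #seq arithmetic is needed outside replaceChildNodes.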

Additionally, some helper functions like the following could help avoid dealing with #seq numbers at all:

func (m MapSeq) SetAttr(key, value string) {
    // Sets the value in the map for the attribute
    // And computes the value of #seq as needed
}

@davidkrauser
Author

I can propose a PR with a real implementation if you think it would be helpful. Or if you'd rather noodle on it for a while and/or implement something yourself, that's fine, too 🙂

@clbanning
Owner

I'd like to keep anything like this as minimal as possible, since it should also be available as a method for mxj.Map values for consistency.

How about this?

m, _ := mxj.NewMapXmlSeq(<xml_object>)

// Append a #text:value entry to a list or
// create a list of #text:value entries from a singleton #text:value.
// An error would be returned if the specified dot-notation key did not exist.
// There would be no #seq tag:value entry created.
err := m.Add(<dot-notation_key>, value) 

Then, m.Xml() would just marshal the appended values at the end in the order they were appended, irrespective of whether the original tag:values were interleaved with a different tag:value list as discussed above.
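
As a rough sketch of the behaviour described above, the following stand-alone function appends a #text entry without a #seq member. It is not the proposed m.Add method: addText is a hypothetical name, it operates on a plain map[string]interface{}, and its dot-notation handling only walks nested maps (no list indexes or wildcards). How the encoder would order entries that lack #seq is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// addText appends a {"#text": value} entry at the dot-notation path.
// A singleton at that path is turned into a two-item list, an existing list
// is extended, and a missing key is reported as an error (hypothetical helper).
func addText(m map[string]interface{}, path string, value interface{}) error {
	keys := strings.Split(path, ".")
	cur := m
	// Walk down to the map that holds the final key.
	for _, k := range keys[:len(keys)-1] {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			return fmt.Errorf("path %q: key %q is missing or not a map", path, k)
		}
		cur = next
	}
	last := keys[len(keys)-1]
	existing, ok := cur[last]
	if !ok {
		return fmt.Errorf("path %q: key %q does not exist", path, last)
	}
	// The new entry deliberately carries no "#seq" member.
	entry := map[string]interface{}{"#text": value}
	if list, isList := existing.([]interface{}); isList {
		cur[last] = append(list, entry)
	} else {
		cur[last] = []interface{}{existing, entry}
	}
	return nil
}

func main() {
	m := map[string]interface{}{
		"doc": map[string]interface{}{
			"newtag": map[string]interface{}{"#seq": 1, "#text": "value 2"},
		},
	}
	if err := addText(m, "doc.newtag", "value 4"); err != nil {
		fmt.Println("error:", err)
	}
	if err := addText(m, "doc.missing", "x"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(m)
}
```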

@davidkrauser
Author

@clbanning that could work 🙂. I think that solves my issue of "How do I insert new data".

That doesn't help with this issue, though: #101
It's still cumbersome to get the list of actual data in sorted order to manipulate.

Maybe I was premature in closing it 😅. Should I re-open that and we can discuss this particular issue over there?

@jessicafarias

jessicafarias commented Sep 27, 2022

Hi @clbanning, I have a temporary solution for adding nested values to a MapSeq under a specific parent.

input:

<root>
    <parent>
        <child>true</child>
    </parent>
</root>

output:

<root>
    <parent>
        <child>true</child>
        <newChildTagName>tagContent</newChildTagName>
    </parent>
</root>

parentPath = "root.parent"

// AddChildTextToXMLSeqParent rebuilds the element at parentPath with one extra
// text child appended. Only children decoded as a single map are carried over.
func AddChildTextToXMLSeqParent(XMLdata *mxj.Map, parentPath, newChildTagName, tagContent string) error {
	existingValues, err := XMLdata.ValuesForPath(parentPath)
	if err != nil {
		return err
	}
	newParent := make(map[string]interface{})

	// Copy each existing map-valued child; maxseq counts every key visited
	// under the parent and becomes the #seq of the new child.
	maxseq := 0
	for _, v := range existingValues {
		cast, _ := v.(map[string]interface{})
		for childComponent, childValues := range cast {
			inside := make(map[string]interface{})
			mapOfChild, _ := childValues.(map[string]interface{})
			maxseq++
			for key, value2 := range mapOfChild {
				inside[key] = value2
				newParent[childComponent] = inside
			}
		}
	}

	// Build the new child and give it the next sequence number.
	inside := make(map[string]interface{})
	inside["#text"] = tagContent
	inside["#seq"] = maxseq
	newParent[newChildTagName] = inside

	return XMLdata.SetValueForPath(newParent, parentPath)
}

@clbanning
Owner

clbanning commented Oct 24, 2022

Sorry for the belated response.

I'd suggest folks use your (or davidkrauser's) code if it works for them. However, as outlined in the Aug 19 comment, #102 (comment), such solutions will only work in well-behaved cases. I had outlined a simple general solution similar to yours, but one that would append new subelements after the existing subelements of the "parent", using "#text" and no "#seq" map members. Such an approach would also allow the same functionality for mxj.Map values as well as mxj.MapSeq values.

This whole direction feels a little convoluted right now, since clbanning/mxj was never intended as an XML manipulation package but was solely meant to facilitate JSON and map[string]interface{} transformation operations. Perhaps forking the package and creating your own extensions may be the best solution.
