Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

confused Similarity func when compare two array #38

Open
djangocc opened this issue Sep 6, 2020 · 0 comments
Open

confused Similarity func when compare two array #38

djangocc opened this issue Sep 6, 2020 · 0 comments

Comments

@djangocc
Copy link

djangocc commented Sep 6, 2020

I know it's hard to locate two exactly index which supposed to be compared in an array
and i can understand that the similarity() is a way to make the similarity of two nodes quantifiable in the json, but I don't think your code way to compare two json objects is feasible.
In your code, the complement of the Object delta similarity function is to divide the sum of all the sub deltas similarity by the count of the sum deltas, the formula is like sum(sub deltas similarity)/count(sub deltas). but when compare two objects, the field which is exactly the same will not generate the delta object. so in some way, it's not a fair similarity way.
I think the fair similarity formula is like (sum(sub deltas similarity)+count(same fields))/(count(sub deltas)+count(same fields))
if you can agree with me, I will try to send the pr.
here is some example:
left json: [{"a":1,"b":2,"c":3}] right json:[{"a":2,"b":2,"c":4},{"a":1,"b":2,"c":3,"d":4}]

and the delta result is like
{
"placeholder": [
0: {
- "a": 1,
+ "a": 2,
- "b": 2,
+ "b": 3,
- "c": 3
+ "c": 4
}
+ 0: {
+ "a": 2,
+ "b": 3,
+ "c": 4
+ }
]
}

there are two bugs in the above example I think.
one is the first object {"a":2,"b":2,"c":4} in the right json appear twice in the deltas, which means that object has been compared twice, it may be a bug.
bug code is on below

var x, y int
for x, y = 0, 0; x <= sizeX-2 && y <= sizeY-2; {
	current := dpTable[x][y]
	nextX := dpTable[x+1][y]
	nextY := dpTable[x][y+1]
	xValidLength := len(left) - maxInvalidLength + y
	yValidLength := len(right) - maxInvalidLength + x
	if x+1 < xValidLength && current == nextX {
		freeLeft = append(freeLeft, left[x])
		x++
	} else if y+1 < yValidLength && current == nextY {
		freeRight = append(freeRight, right[y])
		y++
	} else {
		resultDeltas = append(resultDeltas, deltaTable[x][y])
		x++
		y++
	}
}
for ; x < sizeX-1; x++ {
	freeLeft = append(freeLeft, left[x-1]) //here 
}
for ; y < sizeY-1; y++ {
	freeRight = append(freeRight, right[y-1]) //here
}

the x-1 or y-1 index has been appended to the result Deltas, so they are not supposed to be appended into the deleted ary or added ary again , so they should be fixed to freeLeft = append(freeLeft, left[x])
and when i fixed the first bug, the result becomes to
{
"placeholder": [
0: {
- "a": 1,
+ "a": 2,
- "b": 2,
+ "b": 3,
- "c": 3
+ "c": 4
}
+ 1: {
+ "a": 1,
+ "b": 2,
+ "c": 3,
+ "d": 4
+ }
]
}
in this case, I think the first element in the left json array should be compared to the second element in the right json array, cause they're much similar.
the reason is the similarity() way may be not that fair.

@djangocc djangocc changed the title confused Similarity func when compare two object confused Similarity func when compare two array Sep 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant