Fix some "_id" fields not being unique. #17

Senexis · 2018-05-22T17:46:01Z

The field not being unique may cause issues with the difference checker as it just randomly decides the order of certain objects, causing false positives in difference checkers. It was also causing some issues in my own overlay project.

Make the "_id" fields use more than just the name, as duplicate names are possible in the cases of enemies.
Add an "_id" field to transients and refactor the file a bit.
Updates the README to reflect the "_id" addition.

These are the only current files that have ambiguity, but there may be more cases of this issue possible later (think items, primarily). Might need to improve the "_id" field of those later, too. For now, however, this suffices.

Difference can be checked here, compare with the live version of the diff checker.

The field not being unique may cause issues with the difference checker, as it just randomly decides the order of certain objects, causing false positives in difference checkers. - Make the "_id" fields use more than just the name, as duplicate names are possible in the cases of enemies. - Add an "_id" field to transients and refactor the file a bit. These are the only current files that have ambiguity, but there may be more cases of this issue possible later (think items, primarily). Might need to improve the "_id" field of those later, too. For now, however, this suffices.

TobiTenno · 2018-05-22T18:16:59Z

lib/enemyModTables.js

@@ -41,6 +38,7 @@ module.exports = function($) {
    }

    // push the last one too
+    enemy._id = hash(enemy.enemyName + enemy.ememyModDropChance + enemy.mods.length)


use a template string instead of concat

Just wondering: Why?

The point of this string is not to be human readable at all, in fact, the user won't directly interact with this in any way. The only purpose is to make the hash target more strict instead of just using "text" which is the cause of duplicate identifiers.

If this text were in any way supposed to be readable, I would immediately agree but that's not the point in this case, so why? 😃

Less chance of misunderstanding this as a numerical operation, since aspects of it are numerical

Subjectively: it looks better

Across several projects we enforce the usage of templates over concatenation, so from this point of view: consistency.

(I don't think that I'm consistent with that in this project either tho :D)

TobiTenno · 2018-05-22T18:18:34Z

lib/transientRewards.js

@@ -48,7 +45,9 @@ module.exports = function($) {
        }
    }

-    transientRewards.push(curr)
+    // push the last one too
+    transient._id = hash(transient.objectiveName + transient.rewards.length)


use a template string instead of concat

TobiTenno · 2018-05-22T18:19:08Z

lib/transientRewards.js

                }

-                name = text
-                curr = {objectiveName: name, rewards: []}
+                transient = {_id: null, objectiveName: text, rewards: []}


don't define _id if there's no intention to set it until later, which will create it anyway.

TobiTenno · 2018-05-22T18:19:29Z

lib/modLocations.js

                mods.push(mod)
            }

-            mod = {_id: hash(text), modName: text, enemies: []}
+            mod = {_id: null, modName: text, enemies: []}


don't define _id if there's no intention to set it until later, which will create it anyway.

TobiTenno · 2018-05-22T18:19:39Z

lib/modLocations.js


            mod.enemies.push({
-                _id: hash(text),
+                _id: hash(text + enemyModDropChance + chance.rarity),


template string, don't concat

Senexis · 2018-05-23T15:25:51Z

Oh dang, I just figured out where "ememy" came from, it's from the enemy mod locations file, and has been since pretty much always, dating back to 2017. Does it matter that this fixes that too (I didn't think twice about it and just auto-corrected it) or should we add "ememy" back for existing implementations?

Again, "ememy" has been in there since forever.

atomicptr · 2018-05-23T15:27:22Z

I don't like that changes in the data will change the ID because you add rarity to the hash stuff. Instead I'd suggest salting the file name into it like hash(text + "modLocations.js") or something that helps to turn it into a more unique thing

Senexis · 2018-05-23T15:29:52Z

That wouldn't work. This would result in double items like "Bailiff" still having the same ID (reference, see enemyModTables) in the same file. There has to be something unique added to it other than something which is the same in both cases, and drop chance and mod items count is as close as it will get that's both unique (enough) and consistent across updates.

atomicptr · 2018-05-23T15:31:59Z

Hm, how about adding a string that explains what this is then?

hash(text + "this is mod locations")

or something, just to salt it.

Because adding rarity values to it will cause the ID to change every single time DE changes the rarity which makes _id obsolete because it always changes and you can't tell if it's still the same field as before.

Edit: What is salted into it isn't relevant, but it shouldn't be something that might change, the purpose of the ID is to make the entry "unique" and I agree that it should probably be unique for each use case. But lets say the reward "400 Endo" from Blah Rotation B should always have the same ID no matter what DE thinks the drop rate should be. Potential values to salt the hash in this example btw are game mode, rotation, location etc and maybe a random string that describes it.

Senexis · 2018-05-23T15:40:20Z

That's the thing, though, anything based off just a text string is going to cause duplicate IDs. Any static salt will which is file-based. Therefor we have to use an object-based salt, which this is...

atomicptr · 2018-05-23T15:47:58Z

Hmm... yeah I see that... the question is in taht case if it wouldn't make more sense to merge these two 🤔

atomicptr · 2018-05-23T15:50:15Z

I don't even know WHY they did this this way lol...

Senexis · 2018-05-23T15:50:59Z

You mean like an "_id" (text only) field in addition to "_uid" (text + other properties) or something?

Yeah, the best way would have been to have THEM set an ID for every item. Additionally, why is it that one enemy has multiple drop tables? 🤔

TobiTenno · 2018-05-23T15:58:31Z

lib/modLocations.js


            mod.enemies.push({
-                _id: hash(text),
+                _id: hash(`${text} ${enemyModDropChance} ${chance.rarity}`),


now there's spaces, which shouldn't be there....

This doesn't matter to be fair, as the output will be the same length and will change no matter what due to the earlier changes.

TobiTenno · 2018-05-23T15:59:39Z

I think, going from an object-based standpoint, it should be treated as, effectively, a different object, since the drop chance did change.. we could almost think of the id as also representing a specific state.

Senexis · 2018-05-23T16:11:57Z

This is why I wrote it like this. It doesn't matter as much whether the ID changes sometimes, compared to being an identifier for that object, and that object alone.

What's the point of an identifier if it's not unique? If a non-unique identifier is something people are looking for, they could just use the text field.

Edit: Only using the chance and and item name would cause less identifier changes. However, this would not fix cases like the Eidolon Gantulyst (2 entries, same chances), not solving the original problem. I don't think there's any way around needing to effectively hash the entire object to be truly unique.

atomicptr · 2018-05-23T16:49:39Z

Yes it should be unique, but I also don't like the ID changing (every time) because this makes it fairly obsolete. The entire reason why I added this in the first place is to check if an object changed between builds, if you change the ID there is no link between them anymore. Which is why I'm against that.

One idea to solve this problem would be to merge the drop lists of these multi entry things, because I don't see a good argument to have 2x "Drop by enemy" entries at all. If that doesn't work, we could maybe add a counter for occurrences of names, this would rely on DE having a consistent order in the list tho....

Senexis · 2018-05-23T17:06:49Z

That would be an option. However, this wouldn't solve cases where the drop chance differs. How would you denote the drop chance for the entire table? It would need to be something like:

{
  "name": "Bailiff",
  "drops": {
    "1.25": [
       "items..."
    ],
    "25.00": [
       "items..."
    ]
  }
}

Edit: Another option might be to rename the duplicate items themselves into something like "Bailiff" and "Bailiff (Alternative)", though that would cater to only 2 duplicate items, and what would you do with more than 2? Something like "Bailiff (1)" and "Bailiff (2)" seems a bit blunt.

Edit 2: Won't work, there's no guarantee that the order DE prints their drops is consistent, so it could occur that "Bailiff (Alternative)" gets to be first sometime and become "Bailiff", bringing us right back to square 1...

atomicptr · 2018-05-24T05:39:48Z

Have you found a case where they're actually different? Bailiff and Eidolon Gantulyst seemto be the same. If that happens you could reference that value in the drop itself tho.

Senexis · 2018-05-24T05:43:56Z

Yes -- the Commander has this issue.

Referencing the table drop chance and mod drop chance in the objects will cause more data as in bigger objects, which is more than what it's worth. I would still prefer changing the ID over the structure of objects to be honest.

atomicptr · 2018-05-24T06:03:06Z

I would still prefer changing the ID over the structure of objects to be honest.

Might as well remove the _id then, it's completely useless for its intended purpose if it changes whenever the data gets changed. Tho I agree with not trying to change the structure of objects.

Edit 2: Won't work, there's no guarantee that the order DE prints their drops is consistent, so it could occur that "Bailiff (Alternative)" gets to be first sometime and become "Bailiff", bringing us right back to square 1...

I wanted to use the counter only for salt, assuming that most of the time they don't change the order this makes this option a lot more viable then the other ones (though still not perfect obviously), also because this doesn't affect a lot of objects. I don't know if there is a better solution 🤔.

Senexis · 2018-05-24T06:12:51Z

I actually think DE not printing their list in a set order is how this issue occurs in the first place, so I really wouldn't go that route.

Maybe it's useless for that purpose but it would still be useful for any other purpose. You would still have the name to identify it by, don't forget that. 😉

Also, the only time the id is going to change is if the drop chance or amount of mods change, which isn't that much of an issue, is it?

TobiTenno · 2018-05-24T13:08:01Z

I think I've mentioned that if we sorted after applying the ids and such, it would fix the issue here: #18

So, i'm ok with this if we add sorting as well.

Reverts the dynamic salt in 90% of the cases, so comparisons can be more easily made. Now, when the same ID has been found in an array, try to hash it with some properties until giving up eventually and just skipping adding the object when it's the _exact_ same. This applies to enemyModTables.js and modLocations.js, as those are the currently known cases of duplicates occurring. Also refactors the other files having inconsistent formatting.

Senexis · 2018-05-24T18:31:16Z

There we go. I went the middle route with this. Objects whose IDs are unique are not changed at all by this, making comparison easier in theory. Objects whose IDs are not unique get additional values added to the salt until falling back onto the entire JSON object stringified. If even that fails, we have a duplicate object which gets removed, so this acts as both a unique-ifier and a de-duplicator at the same time.

Diff is available at my gh-pages as soon as Github decides to update it.

Edit: You may notice more removals and additions than before. This is because these items just happened to go OK on the current iteration, but they could have been compared with each other on a whim. These potential issues are also fixed now.

Edit 2: @atomicptr There is no way to keep the transients from not being counted as removals and additions in this state (adding an _id field). However, in the next actual update from DE it will be compared as it should, so it will resolve itself then.

Edit 3: @TobiTenno Sorting won't help as the diff checker is already sorting the items, as mentioned in the issue as well. Have tested this, doesn't change anything.

This reverts my previous fix for ememy > enemy for now and discards all changes to the object already in the array when all previous checks fail. This causes less removals and additions when just de-duplicating the array as the original item doesn't get edited at all.

Senexis · 2018-05-24T19:14:41Z

Those commits make this solution dang-near perfect. It now provides the least amount of id changes I can do while still fixing the issue and de-duplicating the arrays.

Also since my question about the "ememy" fix got completely ignored (no worries about that!), I just reverted my "ememy" fix for now.

Edit: I know checking against == true is not needed, will remove if there's another change request. Leftover from stupidity.

atomicptr

You don't have to duplicate the checkUnique function, move that to utils or something and also don't use "==" for comparison but use "===" instead. Beyond that it looks fine.

Senexis · 2018-05-25T09:10:13Z

It isn't a duplicate function, one uses the length of a collection, the other uses the value of "chance". I'm sure there's a better way to write that, but I can't think of it right now. Will fix the comparisons soon.

atomicptr · 2018-05-25T10:30:44Z

It's almost 90% the same function tho, which is a duplication. You could pass a function as parameter that returns the value it's using for comparison or something along those lines:

checkUnique(..., obj => obj.chance);
checkUnique(..., obj => obj.collection.length);

Adds checkUniqueHash(items, source, [...properties]) function. Expects "_id" field to exist within the source object, and experts the first property to be the value that's the default hash.

Senexis · 2018-05-25T19:09:24Z

Took a while (due to the very nice weather here), but there it is. Fully modular (apart of the "_id" field being required). Can easily be used in other places now.

atomicptr

Just a minor nitpick, but looks good otherwise :)

atomicptr · 2018-05-28T09:36:27Z

lib/utils.js

+                let sourceHash = []
+
+                // Add one to the properties length to add an additional static check.
+                for (var i = 0; i < (properties.length + 1); i++) {


minor nitpick, but I'd change var to let because I don't see a reason why you'd want to hoist i. This can introduce some unexpected/unintended behavior.

If you know how variable hoisting works in JS you can ignore this, if not this saves one Google search :).

var declares the variable in the entire function not just the scope:

function f() { for(let i = 0; i < 10; i++) { for(var j = 0; j < 10; j++) { // do shit } } console.log(j); // works fine even though we're not in the scope anymore, thank you var? :P console.log(i); // this will throw an ReferenceError because i was used with let which doesn't hoist the variable } f(); // j doesn't exist here obviously...

I'll change the code template I use. Simply a side effect of using the "for" template. Will fix.

atomicptr · 2018-05-28T09:44:47Z

You should probably also merge the changes from the main branch into this and rebuilt, because #19 introduced a change in the data.

Senexis · 2018-05-28T10:43:11Z

Would suggest you merge this and do the rebuild yourself. My fork is too messed up at this point unfortunately due to my own stupidity and doing this all on my gh-page branch.

atomicptr · 2018-05-28T12:06:02Z

Done :), ty for your work!

Nadermane added 3 commits May 22, 2018 19:34

Manual regeneration of the 2018-05-17 17:00 droptables.

a1f9b23

Update README to add the transient "_id" field.

0d19a57

TobiTenno requested changes May 23, 2018

View reviewed changes

Nadermane added 2 commits May 23, 2018 17:20

Concat strings and remove redundant null JSON properties.

00fd6dd

Manual regeneration of the 2018-05-17 17:00 droptables.

0b4edbd

TobiTenno reviewed May 23, 2018

View reviewed changes

Nadermane added 2 commits May 24, 2018 20:27

Manual regeneration of the 2018-05-17 17:00 droptables.

aec26f1

Nadermane added 2 commits May 24, 2018 21:04

Manual regeneration of the 2018-05-17 17:00 droptables.

d35b74f

atomicptr requested changes May 25, 2018

View reviewed changes

Nadermane added 2 commits May 25, 2018 21:05

Add modular way to check hashes of arrays.

e75c51a

Adds checkUniqueHash(items, source, [...properties]) function. Expects "_id" field to exist within the source object, and experts the first property to be the value that's the default hash.

Manual regeneration of the 2018-05-17 17:00 droptables.

9d5c61c

TobiTenno approved these changes May 26, 2018

View reviewed changes

atomicptr requested changes May 28, 2018

View reviewed changes

var -> let

a85e523

atomicptr merged commit 84d110d into WFCD:gh-pages May 28, 2018

atomicptr mentioned this pull request May 28, 2018

Diff page still compares things that shouldn't be compared #18

Closed

Fix some "_id" fields not being unique. #17

Fix some "_id" fields not being unique. #17

Conversation

Senexis commented May 22, 2018 • edited

Choose a reason for hiding this comment

Senexis May 23, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Senexis commented May 23, 2018 • edited

atomicptr commented May 23, 2018

Senexis commented May 23, 2018 • edited

atomicptr commented May 23, 2018 • edited

Senexis commented May 23, 2018

atomicptr commented May 23, 2018

atomicptr commented May 23, 2018

Senexis commented May 23, 2018

Choose a reason for hiding this comment

Senexis May 23, 2018 • edited

Choose a reason for hiding this comment

TobiTenno commented May 23, 2018

Senexis commented May 23, 2018 • edited

atomicptr commented May 23, 2018

Senexis commented May 23, 2018 • edited

atomicptr commented May 24, 2018

Senexis commented May 24, 2018 • edited

atomicptr commented May 24, 2018

Senexis commented May 24, 2018

TobiTenno commented May 24, 2018

Senexis commented May 24, 2018 • edited

Senexis commented May 24, 2018 • edited

atomicptr left a comment

Choose a reason for hiding this comment

Senexis commented May 25, 2018

atomicptr commented May 25, 2018 • edited

Senexis commented May 25, 2018

atomicptr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

atomicptr commented May 28, 2018

Senexis commented May 28, 2018

atomicptr commented May 28, 2018

Senexis commented May 22, 2018 •

edited

Senexis May 23, 2018 •

edited

Senexis commented May 23, 2018 •

edited

Senexis commented May 23, 2018 •

edited

atomicptr commented May 23, 2018 •

edited

Senexis May 23, 2018 •

edited

Senexis commented May 23, 2018 •

edited

Senexis commented May 23, 2018 •

edited

Senexis commented May 24, 2018 •

edited

Senexis commented May 24, 2018 •

edited

Senexis commented May 24, 2018 •

edited

atomicptr commented May 25, 2018 •

edited