-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle null item times and adjust cost thresholds in getHeroItemPopularity
#2664
Handle null item times and adjust cost thresholds in getHeroItemPopularity
#2664
Conversation
Here is the existing logic: const startGameItems = countItemPopularity(
items.filter((item) => item.time <= 0)
);
const earlyGameItems = countItemPopularity(
items.filter(
(item) => item.time > 0 && item.time < 60 * 10 && item.cost > 700
)
);
const midGameItems = countItemPopularity(
items.filter(
(item) =>
item.time >= 60 * 10 && item.time < 60 * 25 && item.cost > 2000
)
);
const lateGameItems = countItemPopularity(
items.filter((item) => item.time >= 60 * 25 && item.cost > 4000)
); My first proposition is that start game items should be redefined: const startGameItems = countItemPopularity(
items.filter((item) => item.time <= 0 && item.cost <= 600)
); because, apparently My other proposition is to adjust the item.cost filters used in mid and late game items. Because, 2000 and 4000 feel like really high thresholds, so will discard a lot of other items, especially for supports. My recommendation would be: const midGameItems = countItemPopularity(
items.filter(
(item) =>
item.time >= 60 * 10 && item.time < 60 * 25 && item.cost > 1200
)
);
const lateGameItems = countItemPopularity(
items.filter((item) => item.time >= 60 * 25 && item.cost > 1800)
); @howardchung what do you think? |
I feel like if we want to check for null, we should explicitly do that check rather than relying on the 600 gold filter (since that could change) |
store/queries.js
Outdated
@@ -315,7 +315,7 @@ function getHeroItemPopularity(db, redis, heroId, options, cb) { | |||
.flatMap((purchaseLog) => purchaseLog.purchase_log) | |||
|
|||
.map((item) => { | |||
const time = parseInt(item.time, 10); | |||
const time = item.time ? parseInt(item.time, 10) : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does item?.time work?
Also I think the very next line is going to encounter the same crash when we try item.key. So maybe we should be using filter(Boolean) to remove all null items before we try doing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
item.time ? parseInt(item.time, 10) : 0
will default to 0 when item.time is null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree filtering any null would make sense. I'm just curious what would cause the null item.time
. If it only ever happens for pre-game items, then defaulting to 0 would make sense. If it can happen to late game items, then the item.cost <= 600 would filter that. But would also make sense to just remove all nulls from the start, assuming it's not too prevalent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if item is null then item.key will crash. I think you want .filter(Boolean) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I added .filter((item) => item !== null && item.key !== null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
null time will then still default to 0, with const time = item.time ? parseInt(item.time, 10) : 0;
,
so if it was a pre-game purchase it'll be counted by :
const startGameItems = countItemPopularity(
items.filter((item) => item.time <= 0 && item.cost <= 600)
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think .filter(Boolean) is simpler and does what we want there. But if you want to check item and item.key specifically then use != to include checking for undefined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest changes I added a filter statement: https://github.com/odota/core/pull/2664/files#diff-6483f8be8db993ad2d2cb82be6da1067d8cb8c925a8559e532e063aefa4f5047R315
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add filtering to remove cases that have null item.time
too. I just can't know what makes most sense, given I can't see how many cases this filters, and what causes those cases.
If item.time defaults to 0, then it can potentially only be counted in starting items, and only if it's less than 600 gold. That feels to me like an adequate solution. But I'm open to just remove null cases for time too, it won't change much
So, I added a filter for null keys, just in case. I'm trying to inspect cases where time is null, playing around with the /explorer. I finally got GPT to help write a query that runs, but it just times out, even with select 1. It's basically like this: SELECT purchase_log
FROM player_matches
WHERE purchase_log::text LIKE '%"time":null%'
LIMIT 1 Anyway, sharing that in case you can query without timeout and see an example of a problematic row |
store/queries.js
Outdated
); | ||
const earlyGameItems = countItemPopularity( | ||
items.filter( | ||
(item) => item.time > 0 && item.time < 60 * 10 && item.cost > 700 | ||
(item) => item.time > 0 && item.time < 60 * 10 && item.cost >= 600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>= 600
instead of >700
Essentially will now add: Bottle (675), Voodoo Mask, Ring of Health, Voice Stone (700)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@howardchung last thing I'm wondering about is if there's any justification to reducing this threshold further, maybe to >=500, to include items like boots of speed, bracer, null talisman, wraith band, chainmail.
Reducing to >= 400 would include things like magic wands, buckler, gloves of haste, blades of attack, belt of strength, etc.
So it's not a HUGE amount of items being added by reducing threshold, but it's up to you if it makes sense to include magic wand, blades of attack, etc as "early game items". I don't see why not. I think 500 threshold makes most sense to include bracers/etc. Maybe 450 / 400 if you're okay with including wand, band of elvenskin, robe of magi, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like it may be interesting to see how common boots of speed is in first 10 minutes for a hero. Should be almost every game, but will be differences across heroes. And bracer/wraith/null builds are quite common too, would be interesting for stats to see that too.
store/queries.js
Outdated
) | ||
); | ||
const midGameItems = countItemPopularity( | ||
items.filter( | ||
(item) => | ||
item.time >= 60 * 10 && item.time < 60 * 25 && item.cost > 2000 | ||
item.time >= 60 * 10 && item.time < 60 * 25 && item.cost >= 1200 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>= 1200
instead of >2000
Essentially will now add: Cornucopia, Point Booster (1200), Arcane Boots, Talisman of Evasion (1300), Claymore (1350), Aghs Shard, Platemail, Perseverance, Pavise (1400), Ghost Scepter, Oblivion Staff, Phase Boots (1500), Veil of Discord (1525), Mithril Hammer (1600), Drum of Endurance (1650), Vanguard (1700), Mask of Madness, Mekansm (1775), Dragon Lance (1900), Crystalys (1950), Hyperstone (2000).
) | ||
); | ||
const lateGameItems = countItemPopularity( | ||
items.filter((item) => item.time >= 60 * 25 && item.cost > 4000) | ||
items.filter((item) => item.time >= 60 * 25 && item.cost >= 2000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>= 2000
instead of >4000
Essentially ... will include every item between 2000 and 4000 which was previously ignored
getHeroItemPopularity
The LIKE query is going to be slow because there isn't a text index on that column. |
@howardchung thoughts on these changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@howardchung I noticed that PublicMatchesResponse had some missing params in its definition. So, patching that in this PR
}, | ||
cluster: { | ||
type: "integer", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here*
My only blocking concern is changing the filter condition to: |
How do you mean? I've added: |
That depends on the behavior we want if time is null/undefined. If we want to default it to 0, then we shouldn't add it. But the change I'm asking for is either to just check for item && item.key or item != null && item.key != null to handle undefined (which may not be possible, but I figure it's safer) |
.flatMap((purchaseLog) => purchaseLog.purchase_log) | ||
|
||
.filter((item) => item && item.key && item.time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@howardchung I edited as follows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to not need !== null
check - I'm new to javascript so wasn't sure on diff between the two, but asserting truthiness is probably better than only asserting not null. Filtering by time also makes sense because should be an anomaly
store/queries.js
Outdated
.flatMap((purchaseLog) => purchaseLog.purchase_log) | ||
|
||
.filter((item) => item !== null && item.key !== null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should use != here to catch undefined in addition to null.
@howardchung thanks 👍 |
getHeroItemPopularity
can error if item.time is null. #2663Also adjust cost thresholds used in defining early, mid, and late game items. See review comments