Correlate sum with count of distinct users #32
Comments
So this is a request to show the total number of people who ever added this tag, right?
Yes, the focus is on adding. The result would be a time series of the size of a tag's active user base. If I understand correctly, the history has to be crawled from the start on every run in order to know the previous state, and I think that is what taghistory does. I have no idea, though, how much that would increase execution time or memory requirements.
PS: Meanwhile I learned that the ohsome API has a /users endpoint, where I can retrieve the number of users interacting with something that has, or has had, a certain tag. I have yet to find out what kind of interaction that applies to: creation, modification, deletion, some, or all… The documentation there is sparse.
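To make the idea of a user-base time series concrete, here is a minimal sketch. It assumes the history crawl yields one `(date, user_id)` pair each time the tag is added to an object; that event format and the function name `cumulative_user_base` are hypothetical, not part of taghistory.

```python
def cumulative_user_base(events):
    """Return {date: number of distinct users seen up to and including that date}.

    `events` is an iterable of (date, user_id) pairs, one per tag addition.
    This input format is an assumption made for illustration only.
    """
    seen = set()   # users who have added the tag at least once so far
    series = {}
    for day, user in sorted(events):  # ISO date strings sort chronologically
        seen.add(user)
        series[day] = len(seen)       # the last event of a day sets that day's value
    return series


# Made-up sample data: alice adds the tag twice, bob and carol once each.
events = [
    ("2020-01-01", "alice"),
    ("2020-01-01", "bob"),
    ("2020-02-01", "alice"),
    ("2020-03-01", "carol"),
]
series = cumulative_user_base(events)
```

The `seen` set is the state that has to be carried along from the start of the history, which is why the crawl cannot simply begin at an arbitrary date.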
Of course this is not perfect: "active" must not be understood as active within a short timeframe, but as an accumulated value of users who actively set the tag at some time in the past. Splits will also overstate the user base, especially when they are motivated only by relation building. But it will be better than the number currently shown on taginfo. Also, if deletions are not handled, the total will differ from the number of items having a certain tag. This could possibly be handled by software that treats the user not as a boolean but as a counter, which could provide a basis for further statistical tools that work not on time series but on the distribution of activity within a window of time, e.g. it would allow calculating not only the mean but also the median. I would have liked something like that last year when researching how to recognize hiking trails that are not bound to a relation, as there are several tags that are commonly found only on those. The only method to get a sense of the user base that added them was to look at random items returned from an Overpass query and examine each one separately.
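The "counter instead of boolean" idea could look roughly like the sketch below. It again assumes hypothetical `(date, user_id)` addition events within some time window; the function name `additions_per_user` and the sample data are made up for illustration.

```python
from collections import Counter
from statistics import mean, median


def additions_per_user(events):
    """Count how many times each user added the tag (a counter, not a boolean)."""
    return Counter(user for _, user in events)


# Made-up sample data: one heavy contributor and two occasional ones.
events = [
    ("2020-01-01", "alice"),
    ("2020-01-05", "alice"),
    ("2020-01-09", "alice"),
    ("2020-02-01", "alice"),
    ("2020-02-02", "bob"),
    ("2020-03-01", "carol"),
]
counts = additions_per_user(events)

distinct_users = len(counts)              # size of the user base
mean_additions = mean(counts.values())    # pulled upward by the heavy contributor
median_additions = median(counts.values())  # closer to the typical user
```

With per-user counts available, a mean well above the median would hint that a tag is carried mostly by a few heavy contributors.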
This is a wishlist item; perhaps you can see value in it. (I cannot assess whether it is in scope of taghistory at all…)
If one considers taghistory a popularity contest that shows the count of votes (i.e. objects carrying a specific tag), where every voter has unlimited votes, it would be nice to also have a count of voters (the people who applied a tag to an object).
For rarely or moderately used tags, one could immediately see whether a tag was applied (voted on, in contest speak) by a single user, a few, or many. For tags with many occurrences, the distribution will be quite flat, but it might still be reasonable to have the number when comparing tags that are close in meaning, e.g.
Hope it is clear :)