Speedscan bug fixes #1900
Many bug fixes and improvements to Speedscan
This PR contains many bug fixes and changes to the SpeedScan scheduler.
Motivation and Context
Reviewing the logs showed many warning messages complaining about Pokemon not being where they should be. I investigated these and fixed the underlying bugs so the warnings go away. TTHs should also be identified correctly eventually (this can still take several hours).
When we queue a spawn point, it only needs to be queued once, so the changes attribute each spawnpoint to one scan location only (see the sketch below).
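A minimal sketch of that idea, assuming hypothetical helper and parameter names (this is not the PR's actual code): each spawnpoint is assigned to its single nearest scan location, so it only ever appears in one queue.

```python
# Hypothetical sketch: assign each spawnpoint to exactly one scan location,
# so it is queued by only one location instead of every overlapping one.
from math import radians, sin, cos, asin, sqrt


def distance_m(a, b):
    # Haversine distance in metres between two (lat, lng) tuples.
    lat1, lng1, lat2, lng2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2 +
         cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(h))


def assign_spawnpoints(spawnpoints, scan_locations):
    # spawnpoints: {sp_id: (lat, lng)}, scan_locations: {loc_id: (lat, lng)}
    assignment = {}
    for sp_id, sp_pos in spawnpoints.items():
        # Pick the single closest scan location for this spawnpoint.
        assignment[sp_id] = min(
            scan_locations,
            key=lambda loc_id: distance_m(sp_pos, scan_locations[loc_id]))
    return assignment
```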
Another significant change was to use the 'current_timestamp_ms' from the GET_MAP_OBJECTS response. A lot of problems were caused by using the current time, rather than the time the scan took place.
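As a hedged illustration (the function and the response layout here are assumptions based on the usual pgoapi response structure, not the PR's code), the point is to derive timing from the response itself rather than from the wall clock at processing time:

```python
# Hypothetical sketch: take the scan time from the GET_MAP_OBJECTS response
# rather than the wall clock, so delayed processing doesn't skew timings.
def scan_time_from_response(map_dict):
    # 'current_timestamp_ms' is reported by the server in the GMO response.
    ts_ms = map_dict['responses']['GET_MAP_OBJECTS']['current_timestamp_ms']
    return ts_ms / 1000.0  # epoch seconds, for use everywhere downstream
```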
Performance improvements are also included, and a previously uncaught exception is now handled.
How Has This Been Tested?
Testing is documented per change, relating to each issue. Some tests run on a clean database; others use data that is already fully populated. Some require data manipulation to force specific scenarios. All SQL used will be put here:
Here are some CPU graphs created using netdata. They were taken with the server running 2 instances (-st 35 and -st 40), covering a total of 18,000 spawnpoints with a total of 480 workers.
As far as I can see, the overall CPU usage may actually be slightly lower with the new code, but there is not much in it.
I would say it's 40% lower on my setup than current dev.
Speedscan delay in scanning Spawns is too high:
On average the scan delay is about 100 seconds.
Speedscan distance calculations take too long:
Large -ST hexes have distorted Cells at the map edges:
RocketMap can stop with an uncaught exception:
ScannedLocation bands don't match the spawnpoint detection data and don't correspond to when the scan actually took place:
In this example the _919 detection data doesn't match the band4 time of 918 on the cell. This is not a major difference, but unifying the time will help solve a lot of problems.
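To illustrate why unifying the time helps, here is a rough sketch (the helper is hypothetical and the band layout is simplified, not the PR's code): if the band is always derived from the scan's own timestamp, the ScannedLocation band and the spawnpoint detection data can't drift apart.

```python
# Hypothetical sketch: derive the band from the scan's own timestamp so both
# the ScannedLocation record and the spawnpoint detection use the same value.
from datetime import datetime, timezone


def band_for_scan(scan_timestamp_ms):
    # SpeedScan splits each hour into five 12-minute bands (band1..band5);
    # this simplified version ignores any per-cell band offset.
    dt = datetime.fromtimestamp(scan_timestamp_ms / 1000.0, tz=timezone.utc)
    seconds_into_hour = dt.minute * 60 + dt.second
    return seconds_into_hour // (12 * 60) + 1  # 1..5
```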
Unable to locate a TTH with a scan, just after the latest seen scan:
Spawnpoints have an incorrect 'kind':
Running the check_kind SQL compares the stored kind of a spawnpoint with a calculated one. This is only done for SPs with a TTH and where the cell scanning is complete. The output shows that 336 SPs have an incorrect kind:
Spawnpoints are not counted correctly:
The map is reporting fewer SPs than expected.
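For illustration, the comparison can be thought of roughly like this in Python (the quarter-hour encoding and the field names are my assumptions, not the actual check_kind SQL):

```python
# Hypothetical sketch: recompute each spawnpoint's kind from the quarters of
# the hour in which a pokemon was observed, and compare to the stored value.
def calc_kind(seen_quarters):
    # seen_quarters: four booleans, one per 15-minute quarter of the hour,
    # True if a pokemon was observed in that quarter. The kind string marks
    # each quarter as seen ('s') or hidden ('h'), e.g. 'hhhs'.
    return ''.join('s' if seen else 'h' for seen in seen_quarters)


def find_bad_kinds(spawnpoints):
    # spawnpoints: iterable of dicts with 'id', 'kind' and 'seen_quarters'.
    return [sp['id'] for sp in spawnpoints
            if sp['kind'] != calc_kind(sp['seen_quarters'])]
```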
Also, for a multi-instance -st 30 setup, running a count of spawnpoints reads:
The existing Spawnpoint.select_in_hex was renamed to Spawnpoint.select_in_hex_by_location and is still needed at startup in case the location has changed.
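Roughly, the by-location variant can be thought of as a filter around the scan centre; this is only an illustrative sketch with made-up names and a bounding-box approximation of the hex, not the actual query:

```python
# Hypothetical sketch of a location-based selection: keep spawnpoints whose
# position falls inside a bounding box around the scan centre.
def select_in_hex_by_location(spawnpoints, center, lat_span, lng_span):
    # spawnpoints: iterable of dicts with 'latitude' and 'longitude'.
    lat, lng = center
    return [sp for sp in spawnpoints
            if abs(sp['latitude'] - lat) <= lat_span / 2
            and abs(sp['longitude'] - lng) <= lng_span / 2]
```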
The queue can refresh between an item being taken from the queue for scanning and when it is sent to task_done:
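A minimal sketch of the kind of guard that avoids this race (class and method names are hypothetical, not the PR's code): keep a reference to the exact queue an item was taken from, so task_done() is always called on that queue even if the scheduler has since swapped in a fresh one.

```python
# Hypothetical sketch: remember which queue an item came from so task_done()
# is called on that queue, even if the scheduler refreshes its queue meanwhile.
import queue


class Scheduler(object):
    def __init__(self):
        self.queue = queue.Queue()

    def refresh(self):
        # A refresh replaces the queue object; in-flight items keep a
        # reference to the old one via next_item().
        self.queue = queue.Queue()

    def next_item(self):
        q = self.queue
        item = q.get()
        # Return the source queue with the item so the worker can complete it.
        return item, q


def worker(scheduler):
    item, source_queue = scheduler.next_item()
    try:
        pass  # scan 'item' here
    finally:
        # Mark done on the queue the item actually came from, not whatever
        # scheduler.queue happens to point at now.
        source_queue.task_done()
```

The key design choice is simply that the worker never reaches back into the scheduler for the queue after it has dequeued an item.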
SPs are not being re-classified when a cell is done, because the last scan was a 'bad' one:
Queue is continuously refreshed if there are no associated SPs within that area:
SPs taking forever to find a TTH:
Types of changes
Tested on (13) -st 9 after clearing the database and it's working well. Still shows the "Account (account name) returned empty scan for more than 3 scans; possibly IP is banned. Switching accounts..." message on the section of my map that intentionally doesn't have anything there. Not sure if there is even a possible fix for that, though. Good work.
SpeedScan to me seems key to a useful RocketMap, so I'm glad to see work to improve it. I plan to have a proper look through and test when I have time. Just wondering if you know whether this is likely to fix issues 1897, 1898 and 1900, which just recently popped up? (1898 is mine, and at first glance it may not be fixed.) If not, perhaps we could look at some of these at the same time while there are a few of us considering SpeedScan.
I have tested this really intensively for my ~140km² scan area with @rocketrobot99 for more than two days. We compared the data with previous SpeedScan DB, compared pokemon and spawnpoint counts and so on. It finishes initial scan, finds all the spawnpoints and times are correct as well.
I recommend a clean database (delete all scan* and spawn* tables) though.
@voxx @dsorc I have pushed a fix. The problem caused SPs' disappear times to be set incorrectly. What would happen is that an SP would be picked up right at the very end of its spawn by the scan of another SP. This is fine, and a simple lookup would normally show that the pokemon for the spawn already exists. However, the lookup was using the current time (by then just after the SP had despawned) rather than the time of the scan. Thanks to Voxx for his help; I have been running all day with this fix and have not found any more examples.
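As a hedged illustration of that fix (the helper and field names are hypothetical, not the actual code): the existence check should be evaluated at the scan's timestamp, not at processing time, because by processing time the spawn may already have despawned.

```python
# Hypothetical sketch: check whether the pokemon for a spawn already existed
# at the time the scan was taken, not at the time we happen to process it.
def pokemon_active_at(encounters, sp_id, scan_time):
    # encounters: iterable of dicts with 'spawnpoint_id', 'spawn_time'
    # and 'disappear_time' (all as epoch seconds).
    return any(e['spawnpoint_id'] == sp_id and
               e['spawn_time'] <= scan_time < e['disappear_time']
               for e in encounters)
```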
I've been running the latest patch 4fc0fd7 for about 12 hours on three fairly large instances now.
I spent some time working with @rocketrobot99 testing a handful of previously erroneous spawn points and it appears the bogus alarms for 1 hour spawns that were occurring are now alerting at proper times.
While I did still experience a small handful of bogus timers, these particular issues seem not to be related to this bug, and I think they can be considered out of scope for this PR. They are likely the infamous "rare" spawn timers that have been alluded to in the past, which neither the existing code base nor this PR accounts for.
I'll report back in another 12 hours or so, but thus far it looks good on my end. If no additional weirdness creeps up this evening you've got thumbs up to merge from me.
I have been using this PR for almost 2 weeks (updating as needed). When I went back to the dev version today to make a new PR, I noticed how fast the startup is in this PR, and that I only need 1 DB worker with the PR versus 3 without it.
I am not sure about CPU usage, but memory usage is similar.
@onilton I understand your point, although I have only heard of cases where this improves the scanning, and I just don't want to make any more changes in this PR. If your system is set up so that the queue is minimal, then, in theory, all SPs will be scanned as soon as possible. The situation you describe will not happen if you have enough workers, as another worker will pick up the missed SP, and that other worker may even be closer. The current dev logic just goes for the closest, but only considers moving once the spawn has already started. I have been experimenting with some amendments to next_item, which will be included in a follow-up PR; they combine minimising seconds after spawn with distance to travel (see the rough sketch below).
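For context, a rough sketch of that kind of scoring (weights, names and the distance approximation are hypothetical, not the follow-up PR's code): score each queued SP by a combination of how long ago it spawned and how far the worker would have to travel, and pick the minimum.

```python
# Hypothetical sketch: pick the next spawnpoint by trading off seconds since
# spawn against travel distance, rather than distance alone.
from math import cos, radians, hypot


def _approx_distance_m(a, b):
    # Equirectangular approximation; good enough for ranking nearby points.
    lat1, lng1 = a
    lat2, lng2 = b
    x = radians(lng2 - lng1) * cos(radians((lat1 + lat2) / 2))
    y = radians(lat2 - lat1)
    return 6371000 * hypot(x, y)


def next_item(candidates, worker_pos, now, distance_weight=1.0, delay_weight=2.0):
    # candidates: iterable of dicts with 'position' (lat, lng) and
    # 'spawn_time' (epoch seconds).
    def score(sp):
        secs_after_spawn = max(0, now - sp['spawn_time'])
        travel_m = _approx_distance_m(worker_pos, sp['position'])
        return delay_weight * secs_after_spawn + distance_weight * travel_m

    return min(candidates, key=score)
```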
Tested this heavily with a completely fresh database in comparison to the latest develop branch. I see significant improvements in (correct) TTH found, average scan delay and queue calculation speed. I now scan 7 instances with -st 30, which wasn't possible without the PR (I covered a slightly smaller area with 19 instances at -st 14). When I tried to scan with steps this big in a very spawn-intense area (a major city in Germany), it took ages to finish the initial scan and the scan delay was way higher. Even with the smaller steps, average scan delay was still ~150 seconds, while it is at ~30 seconds now. Also, the initial scan took only about 12 hours to finish with 99.9% TTH found. I'm still missing 6 of 26.5k spawnpoints after 48 hours, though, but I think that's acceptable. Also, all the timers seem to be correct now, as I haven't gotten any negative reports from my users so far.
I don't have the skills to review the code, but I see that it's working way better than before. I hope this helps a little.
I've been getting poked about merging this PR so often, by so many people (most of whom have never coded in their lives), that I'd like to say congratulations: you know how to annoy a person. And not just random people, but people that have contributed in some way to the project (#pr) and that I consider to be on the team.
I'm used to people poking us constantly for stuff they want us to do, but this time the "request a review" feature on GitHub has been used for the first time ever, by @FrostTheFox, to poke me towards this PR.
This makes it very clear to me that y'all don't want an actual review, because that would take more time - as I've explained several times (especially to the people in #pr on Discord). You just want it merged.
So here you are, I'm merging it. No review, reduced code quality.
No one here has any idea what the actual process is for code review outside of the button on GitHub. But this PR is extremely popular, and obviously many people wanted you to review it. So why didn't you? We don't know your timetable for review; you don't provide us with a roadmap or milestones.
Since you haven't reviewed this, why do you assume it will be reduced code quality? Are you immediately assuming that the code written here is inferior to your own?
Please, open up to us.. be transparent, about this review process, the timetable, who is doing it, the roadmap ahead... heck, even what you're working on right now... Because this is an open source project, and an open source community.. right?
@jagauthier Except you're oblivious to everything that's been happening in #pr.
I have been reviewing it. I have been replying to questions about where I was and how it was going; I addressed that in my earlier comment.
Everyone knew. I just needed more time, because proper review isn't something you force to finish. And I know it's bad code because it's the only thing I've been putting my time towards for the past few weeks. Even rocketrobot knew it was pretty bad and was open to being patient for an actual review - but no one else was.
I'm not even given the time by #pr to finish a review, even when I communicated properly with them, so I can tell you with absolute certainty that we don't have the time to put towards communicating everything to everyone. It doesn't work and it increases the amount of time we need to act like customer support to hold everyone's hands and to make sure we don't "offend" anyone with our decisions, actions, or delays because we're trying to communicate first rather than coding first.
For everything that the community has against commercial organisations (e.g. the negative feedback towards pogodev/pgoapi after BossLand entered the scene to solve the hashing problem), a commercial organisation with strict rules and nearly no transparency is actually what everyone seems to need. And of course that's going to annoy me, it's hypocritical.
As for your "are you assuming it's inferior to your own", that wasn't necessary.
The 6 people who upvoted your comment are all people who don't understand the consequences of some decisions, and the time and effort everything takes. Besides bluemode (you troll), I don't recognize anyone for any effort they've put towards RocketMap. And before I get another reaction of "then tell us, explain it all to us!", I'm sorry but I'm not here to hold your hands and teach you how this part of the world works.
It's open source. The thing you don't seem to get is that the equivalent of the time and effort we've been putting in for the past 9+ months is for you to start learning from scratch and to try to catch up to where we are in whatever way you can, rather than complaining about us. We don't expect you to suddenly catch up, but the effort of trying will move you forward and into a position where you can contribute (small parts) to the code and the project, rather than having to place some vague, idealistic comment about communication.
As for this PR, everyone could comment and participate. But that doesn't mean everyone will suddenly have the required experience to properly evaluate software code, and whether it's one person without the required skills/experience approving the PR or twenty, it doesn't make any of them more valid. Feedback is always welcome, but that doesn't mean it'll be enough to use as a single point of reference for approval or rejection of a PR.
Someone PM'ed me this, and I'm finding myself agreeing more lately: the community has been spoiled.
Perhaps it's time for us to take a step back as RocketMap and act more as an organization rather than your local pub where everyone can enter to improve the work and code quality. To set up formal processes of recruitment, development and communication. But I'm convinced that'll also bring complaints along with it.
Either way, we have things to discuss and we're discussing them. For now, these are just thoughts, and I've always been transparent in any way with the relevant people: #pr knows everything I've been up to, including where I went for the holidays and for how long. I've always included everyone that was necessary in any decision we've made so far, and in the cases where that wasn't always possible (e.g. a ban on Discord), I've always been open to receiving feedback about them and adjusting if necessary (FrostTheFox). So to use my own quotes to turn a lack of time into a way of saying I haven't been transparent enough, is kind of insulting.
So rather than find a solution to a practical problem (not getting a reply at the times you tried to contact us), you complain that we're not being transparent enough, rather than trying to understand our situation, without even once having contacted us about it? (What did you do, who did you talk to, in what timezone were you when sending your messages, was it when I was at Disneyland or snowboarding, what did you try when you noticed I didn't/couldn't reply, ...?)
I'd like to share people's ideas and thoughts with the community, but if it keeps coming back to the fact that people haven't even tried to put in the effort of understanding the other party, then it makes it a bit more difficult for me.
It only reinforces the suggestion of being a managed organization rather than "your local bar where everyone can enter".
It's not my responsibility to send you a message at the perfect time for you, or @FrostTheFox .
"and I've always been transparent in any way with the relevant people: #pr knows everything I've been up to,".
Being transparent behind closed doors (#pr) is exclusivity, not transparency.
I would like access to #pr. I would like to discuss RM and its development with all the fine people who contribute. I would like to have a friendly relationship with you (which is very achievable, because I am certainly willing, and I respect you).
So, this is my 4th formal request for access to #pr. You can search my name for PRs, and you know who I am on Discord. And I've tagged both you and Frosty in it.
Thanks! Looking forward to further discussions about RM!
@jagauthier I'm pretty sure I would remember if you had directly PM'd me recently while I was on and active. Sometimes I wake up to a hundred messages (yes, not even lying; not all from RM, but from all the servers I'm in) and sometimes I don't really feel like reading every single one of them.
I don't even know your discord name.
Yes it is. That's what you get when everyone is a volunteer, no one is paid, and all we have is what we put into it. You cannot choose not to be proactive and then complain we didn't help you out enough.
Either you want to be in #pr and you do the absolute minimum to make it work (which really doesn't take much, you can ask all the other #pr members), or you don't but then don't complain about it.
Transparency to everyone contributes nothing and wastes time, which we already don't have. Transparency to the relevant people keeps things transparent and takes up less time, especially since #pr is a rank that anyone with minimal coding experience can access, which makes it publicly accessible rather than exclusive.
The first thing I did was search for your name on Discord, I got zero search results. I have absolutely no idea who you are.
And this is the last comment I'll post about this here. This isn't showing us anything new, and it's not winning us any time.
Trying to think of takeaways from this and I think the obvious one is to encourage smaller pull requests. The larger the PR gets, the longer a code review will take (and the increase in turnaround time is not linear). This is kind of a microcosm for the argument of agile vs waterfall in software development. While this PR wasn't gargantuan, it wasn't small and it addressed multiple bugs which probably could have been broken up and submitted as separate PRs. The maintainers should encourage and/or enforce smaller PRs but outside contributors would do well to heed this advice if they want to see their code merged in a timely fashion.