
Update Phase has an issue where not all jobs come through and an infinite wait loop is started #18

Closed
6 of 7 tasks
andygello555 opened this issue Feb 23, 2023 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

@andygello555
Owner

andygello555 commented Feb 23, 2023

  • Add a timeout to the Scout procedure that will forcibly stop it after a certain amount of time
    • ScoutTimeout constant + update README
    • Start a timer at the start of the Scout procedure that will throw a panic if the time is reached
    • Update the deferred error email sender to also recover from any panics, as well as stop the timer if it hasn't already been stopped (see the sketch after this list)
  • Add a timeout to the producers in the Update Phase that will drop the current batch of jobs if there have been no finished jobs for a while
  • Find out why the bug is occurring
    • Race test the Update Phase

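A minimal sketch of what the first two sub-tasks could look like, assuming a hypothetical Scout entry point that runs a list of phases; the ScoutTimeout value and the commented-out sendErrorEmail helper are illustrative, not the project's actual definitions:

```go
package scout

import (
	"fmt"
	"time"
)

// ScoutTimeout is an illustrative default; the real value lives in Scrape.Constants.
const ScoutTimeout = 2 * time.Hour

// Scout is a stand-in for the real procedure; phases represents its sub-steps.
func Scout(phases []func() error) (err error) {
	// Start a timer at the beginning of the procedure.
	timeout := time.NewTimer(ScoutTimeout)

	// The deferred error-email sender also recovers from any panic (most
	// likely the timeout panic below) and stops the timer if it hasn't fired.
	defer func() {
		timeout.Stop()
		if p := recover(); p != nil {
			err = fmt.Errorf("scout panicked: %v", p)
			// sendErrorEmail(err) // hypothetical email sender
		}
	}()

	for _, phase := range phases {
		select {
		case <-timeout.C:
			// Forcibly stop the procedure once the timeout is reached.
			panic(fmt.Sprintf("Scout procedure exceeded its timeout of %s", ScoutTimeout))
		default:
			if err = phase(); err != nil {
				return err
			}
		}
	}
	return nil
}
```

Note that a panic raised from a time.AfterFunc callback would run on its own goroutine and escape Scout's deferred recover, which is why this sketch checks the timer between phases and panics from the Scout goroutine itself.
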
Monday.com Item ID: #4034318492

@andygello555 andygello555 self-assigned this Feb 23, 2023
@andygello555 andygello555 added bug Something isn't working and removed bug Something isn't working labels Feb 23, 2023
andygello555 added a commit that referenced this issue Feb 23, 2023
- Added a timeout timer to the Scout procedure that will cause a panic if the timeout is reached (23/02/2023 - 11:30:28)
- Added the ScoutTimeout constant to Scrape.Constants as well as adding documentation for this to the README (23/02/2023 - 11:31:32)
- The deferred function that is used to send Error emails in the Scout procedure is also now checking for panics that occur (most likely from the timeout timer) (23/02/2023 - 11:33:53)
@andygello555
Owner Author

andygello555 commented Feb 23, 2023

Can't find the bug at the moment and don't have enough time to keep looking. If this happens again, the timeout should hopefully stop the Scout procedure, and machinery will then restart/resume the procedure for that day.

andygello555 added a commit that referenced this issue Feb 23, 2023
- Added a uuid.UUID field to both updateDeveloperJob and updateDeveloperResult as a sanity check for bug #18 (23/02/2023 - 13:09:39)
- This now means that the finishedJobs variable in UpdatePhase is a channel of uuid.UUID instead of ints (23/02/2023 - 13:10:23)
- queueDeveloperRange now returns a set of uuid.UUIDs of the queued updateDeveloperJobs, this is then used by the producer to tick off each finished job that comes in from the consumer rather than checking the cardinality against high-low (23/02/2023 - 13:13:40)
- Hopefully, this will mean that any jobs that aren't queued up by queueDeveloperRange will not be taken into account (can't happen anyway, but sanity checks) (23/02/2023 - 13:19:23)
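A rough sketch of the UUID bookkeeping these commits describe, using github.com/google/uuid for illustration; the field names and the queueDeveloperRange signature here are simplified stand-ins rather than the project's exact definitions:

```go
package update

import "github.com/google/uuid"

type updateDeveloperJob struct {
	ID uuid.UUID
	// ... developer to update ...
}

type updateDeveloperResult struct {
	ID uuid.UUID
	// ... scrape results ...
}

// queueDeveloperRange queues a batch of jobs and returns the set of their
// UUIDs, so the producer can tick each finished job off against the set
// rather than comparing a count of finished jobs against high-low.
func queueDeveloperRange(jobs chan<- updateDeveloperJob, low, high int) map[uuid.UUID]struct{} {
	queued := make(map[uuid.UUID]struct{})
	for i := low; i < high; i++ {
		job := updateDeveloperJob{ID: uuid.New()}
		jobs <- job
		queued[job.ID] = struct{}{}
	}
	return queued
}
```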
@andygello555 andygello555 reopened this Apr 5, 2023
andygello555 added a commit that referenced this issue Apr 17, 2023
- Updated twitter.ClientWrapper struct and methods to better synchronise ClientWrapper.RateLimits and ClientWrapper.TweetCap (17/04/2023 - 12:05:17)
- Added the twitter.ClientWrapper.RateLimit method which returns the latest rate limit for the given BindingType. This was in order to make accessing the sync.Map that holds the rate limits a bit easier (17/04/2023 - 12:06:05)
- Access token that is held in reddit.Client can now only be accessed and set using methods. This is because there is now a RWMutex that manages synchronisation for it (17/04/2023 - 12:06:58)
- Changed all accesses to ClientWrapper.RateLimits in update.go to instead use the ClientWrapper.RateLimit method (17/04/2023 - 12:07:56)
- Updated gapi to v1.0.1 to consolidate synchronisation fixes (17/04/2023 - 12:10:15)
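This commit describes two synchronisation patterns; below is a rough, standalone sketch of both, with simplified stand-ins for twitter.ClientWrapper and reddit.Client rather than the real definitions:

```go
package clients

import "sync"

type BindingType string

type RateLimit struct {
	Remaining int
	// ... reset time, cap, etc. ...
}

type ClientWrapper struct {
	// RateLimits maps BindingType -> *RateLimit and is safe for concurrent use.
	RateLimits sync.Map
}

// RateLimit returns the latest rate limit for the given BindingType, hiding
// the type assertion needed to read from the sync.Map.
func (w *ClientWrapper) RateLimit(binding BindingType) (*RateLimit, bool) {
	value, ok := w.RateLimits.Load(binding)
	if !ok {
		return nil, false
	}
	return value.(*RateLimit), true
}

type Client struct {
	accessTokenMutex sync.RWMutex
	accessToken      string
}

// AccessToken takes a read lock so concurrent readers don't race with SetAccessToken.
func (c *Client) AccessToken() string {
	c.accessTokenMutex.RLock()
	defer c.accessTokenMutex.RUnlock()
	return c.accessToken
}

// SetAccessToken takes the write lock so token refreshes are serialised.
func (c *Client) SetAccessToken(token string) {
	c.accessTokenMutex.Lock()
	defer c.accessTokenMutex.Unlock()
	c.accessToken = token
}
```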
andygello555 added a commit that referenced this issue Apr 17, 2023
- Producers in the Update phase will now drop the current batch of jobs if they have not received a finished job in a while (17/04/2023 - 13:30:21)
- The above behaviour is controlled by 2 new scrape constants: UpdateProducerFinishedJobTimeout and UpdateProducerFinishedJobMaxTimeouts (17/04/2023 - 13:30:50)
- These constants + explanations have been added to the README (17/04/2023 - 13:31:17)
- Some scrape constants in the README were missing a default value, this has now been resolved (17/04/2023 - 13:31:43)
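Illustrative declarations for the two new constants; the values shown here are assumptions, the real defaults live in Scrape.Constants and the README:

```go
package update

import "time"

const (
	// How long a producer waits for a finished job before counting a timeout.
	UpdateProducerFinishedJobTimeout = 2 * time.Minute
	// How many consecutive timeouts are tolerated before the batch of jobs is dropped.
	UpdateProducerFinishedJobMaxTimeouts = 3
)
```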
@andygello555
Owner Author

I think the bug was caused by the creation and closure of the finished jobs queue within the producer. I've made it so that the finished jobs queue has a fixed length (len(unscrapedDevelopers)) in #2, and the producer will keep dequeuing (or waiting) until all jobs have been seen.

I also added a maximum number of consecutive waits; the batch of jobs will be dropped if this is exceeded.
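A sketch of what the producer's wait loop could look like after this change, reusing the constants and the queued UUID set from the sketches above (the snippets can be read as files of the same hypothetical update package); the channel itself would be created once with a fixed capacity, e.g. make(chan uuid.UUID, len(unscrapedDevelopers)), rather than being created and closed inside the producer:

```go
package update

import (
	"time"

	"github.com/google/uuid"
)

// waitForFinishedJobs ticks finished jobs off against the set returned by
// queueDeveloperRange. It keeps dequeuing (or waiting) until every queued job
// has been seen, and reports that the batch should be dropped once
// UpdateProducerFinishedJobMaxTimeouts timeouts occur in a row.
func waitForFinishedJobs(queued map[uuid.UUID]struct{}, finishedJobs <-chan uuid.UUID) (dropped bool) {
	timeouts := 0
	for len(queued) > 0 {
		select {
		case id := <-finishedJobs:
			delete(queued, id) // a UUID that was never queued is simply ignored
			timeouts = 0       // any progress resets the consecutive-timeout count
		case <-time.After(UpdateProducerFinishedJobTimeout):
			timeouts++
			if timeouts >= UpdateProducerFinishedJobMaxTimeouts {
				return true // drop the current batch rather than wait forever
			}
		}
	}
	return false
}
```

Resetting the timeout count whenever a finished job arrives means only a genuinely stalled batch (no progress at all for MaxTimeouts consecutive windows) gets dropped.
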

andygello555 added a commit that referenced this issue Apr 19, 2023
- Reduced the amount of logging that the Reddit API client does (19/04/2023 - 12:25:05)
- Fixed a synchronisation bug within SetTweetCap where TweetCapMutex was being acquired twice (19/04/2023 - 12:41:23)
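The commit message doesn't include the diff, but here is a minimal, hypothetical illustration of the kind of double-acquire bug it describes (sync.Mutex is not re-entrant, so locking TweetCapMutex twice on the same call path deadlocks); the types and helper names are stand-ins, not the real twitter.ClientWrapper code:

```go
package twitterwrap

import "sync"

// ClientWrapper is a simplified stand-in for twitter.ClientWrapper.
type ClientWrapper struct {
	TweetCapMutex sync.Mutex
	tweetCap      int
}

// Buggy shape: SetTweetCap locks TweetCapMutex and then calls a helper that
// locks it again, which deadlocks because sync.Mutex is not re-entrant:
//
//	func (w *ClientWrapper) SetTweetCap(newCap int) {
//		w.TweetCapMutex.Lock()
//		defer w.TweetCapMutex.Unlock()
//		w.storeTweetCap(newCap) // also acquires TweetCapMutex -> deadlock
//	}

// Fixed shape: the mutex is acquired exactly once per call path.
func (w *ClientWrapper) SetTweetCap(newCap int) {
	w.TweetCapMutex.Lock()
	defer w.TweetCapMutex.Unlock()
	w.tweetCap = newCap
}
```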