Fails on horribly huge inboxes #4
My inbox has 5k+ messages. Hit a rate limit with the script: …

Comments
I'm getting the same error (many times, rolled up in an email from Google most or all days), except that my inbox only has 58 items in it. My "All Mail" folder has 2700, so even that isn't all that enormous. I set the script to run every 10 minutes. I just decreased that to every 15 minutes to see if it makes a difference. I am using something very close to version dbb4a2d.
@mhagger does it give a line number for your error message? I'm curious which API it is hitting the limit for.
@mastahyeti: here's the start of the email: … Line 319 in my version of the script (my customization probably changed the line numbers) is …
For me it fails on an inbox that has fewer than 200 emails. See Google's quota limits on the dashboard (https://script.google.com/dashboard). It claims "50000 Gmail operations / day" for Apps for Business accounts but doesn't go into detail about what this means. It's possible that we all share the same daily quota. Or, it's possible that there is a finer-grained quota that would be alleviated by actually putting …
I wrote my own simpler version of OctoGAS that optimizes away a lot of the queries this one makes, but even that wasn't enough to get rid of the query limit failures. I'll experiment with one more level of optimization and post my findings here.
@matthewmccullough Looks like you're actually running into trouble with the muter script and not the labeler script. Google changed some behavior that broke this script, but I fixed that in #11. Can you try copy-pasting the current version of https://github.com/mastahyeti/OctoGAS/blob/master/muter.gs into your copy of the script? You might need to run the script manually once (they seem to rate-limit manual script runs differently) to clear out your backlog of muted messages. Ping me if you need help with any of this.
As for the rate limit problem with the labeler script, @josh had a good idea that I was meaning to follow up on. If the user adds a Gmail filter to add a …
👍 Whoops. Can do!
In my script, I cache the timestamp when the script last ran and then grab only threads that were updated since that time. This avoids iterating over threads that have already been processed:

```javascript
var query = 'in:inbox AND ( from:"notifications@github.com" OR from:"notifications@support.github.com" OR from:"noreply@github.com" )'
  , lastRunAt = cache.getLastRun()
  , newLastRun = new Date()

if (lastRunAt) {
  query += " after:" + lastRunAt
}

cache.recordLastRun(newLastRun)
```

However, I think individual … My plan was to experiment with saving the ID of the last message in the thread that was already processed, then start processing only new messages after that one. That will save on a lot of unnecessary …
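The `cache.getLastRun()` / `cache.recordLastRun()` helpers aren't shown in the snippet. Here's a minimal sketch of what they could look like, assuming they're backed by Apps Script's `CacheService` and that timestamps are stored as epoch milliseconds (Gmail's `after:` operator accepts epoch seconds); the key name and 2-hour TTL are my assumptions, not from the original script:

```javascript
// Hypothetical implementation of the cache helpers used above.
var cache = (function () {
  var store = CacheService.getUserCache();
  var KEY = "octogas:lastRun"; // assumed key name

  return {
    // Returns the last run time as epoch seconds (which Gmail's
    // "after:" operator accepts), or null on a cold cache.
    getLastRun: function () {
      var raw = store.get(KEY);
      return raw ? Math.floor(Number(raw) / 1000) : null;
    },

    // Stores the run timestamp with a 2-hour TTL, renewed on each run.
    recordLastRun: function (date) {
      store.put(KEY, String(date.getTime()), 7200);
    }
  };
})();
```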
Mentioned this to @ross earlier today, and he said he had been doing something similar in his own copy of the script. Please to share? 😁
i just pr'd the customizations i made. they're directly in the labeler.gs file since i didn't know anything about coffeescript and i was in the middle of onboarding when i did it. the nice thing about this route is that it only takes a single gmail filter to move all notifications into Github/Pending, and then once it processes them it moves them out, so it doesn't grow over time. i actually did it this way b/c my android notifications were happening almost immediately after receiving the messages, way before OctoGAS had a chance to run, so my phone was constantly showing dozens of notifications.
I've upgraded my "simpler OctoGAS" script to cache the last read message index for all processed threads and, when new replies arrive, process only the new messages in a thread rather than starting from the beginning of the thread:

```javascript
log("fetching messages for %d threads", todoThreads.length)

forEach(GmailApp.getMessagesForThreads(todoThreads), function(messages, i){
  var message
    , thread = todoThreads[i]
    , i = cache.getStartingMessageIndex(thread)

  log("fetching body for %d messages starting from index %d", messages.length - i, i)

  for (; i < messages.length; i++) {
    message = messages[i]
    // ...
  }
})
```
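Likewise, `cache.getStartingMessageIndex(thread)` isn't shown; one plausible sketch is to cache the processed message count per thread ID. The function names and key format here are hypothetical:

```javascript
// Hypothetical per-thread message-index cache.
function getStartingMessageIndex(thread) {
  var raw = CacheService.getUserCache().get("octogas:idx:" + thread.getId());
  return raw ? Number(raw) : 0; // cold cache: start from the first message
}

// Call this after a thread is fully processed so the next run skips
// messages that were already handled.
function recordMessageIndex(thread, count) {
  CacheService.getUserCache().put("octogas:idx:" + thread.getId(), String(count), 7200);
}
```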
Here's one of my labeling scripts:

```javascript
function processQueue() {
  var githubReasonLabels = {
    "assign": GmailApp.getUserLabelByName("GitHub/Assign"),
    "author": GmailApp.getUserLabelByName("GitHub/Author"),
    "comment": GmailApp.getUserLabelByName("GitHub/Comment"),
    "mention": GmailApp.getUserLabelByName("GitHub/Mention"),
    "team_mention": GmailApp.getUserLabelByName("GitHub/Team Mention"),
    "manual": GmailApp.getUserLabelByName("GitHub/Manual")
  };

  function processThread(thread, messages) {
    for (var i = 0; i < messages.length; i++) {
      if (!messages[i].isUnread()) continue;
      var rawContents = messages[i].getRawContent();
      var match = rawContents.match(/^X-GitHub-Reason: ((.|\r\n\s)+)\r\n/m);
      if (match) {
        var reasonLabel = githubReasonLabels[match[1]];
        if (reasonLabel) reasonLabel.addToThread(thread);
      }
    }
  }

  var label = GmailApp.getUserLabelByName("Queue");
  var threads = label.getThreads();
  var messages = GmailApp.getMessagesForThreads(threads);

  for (var i = 0; i < threads.length; i++) {
    Logger.log("Process Thread[" + i + "]");
    processThread(threads[i], messages[i]);
    threads[i].removeLabel(label);
  }
}
```
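Presumably this runs on a schedule so the `Queue` label keeps draining. For completeness, a one-off setup function along these lines would install a time-driven trigger; the 10-minute cadence is borrowed from earlier in the thread, not specified by this comment:

```javascript
// One-off setup: run processQueue every 10 minutes on a time-driven trigger.
function installTrigger() {
  ScriptApp.newTrigger("processQueue")
    .timeBased()
    .everyMinutes(10)
    .create();
}
```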
seems like anything relying on the cache is eventually going to run into trouble when the cache is wiped/key is evicted. to address that it would seem like it would have to process some number of things each pass and stop (recording the cache key) and then pick up at that point the next time. another option might be to label the last processed item and use that non-ephemeral marker.
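A rough sketch of that label-as-marker idea, assuming a hypothetical `GitHub/Last-Processed` label; a real version would also need to handle the marker move and the processing pass racing each other:

```javascript
// Sketch of a non-ephemeral resume marker: a label on the most recently
// processed thread survives cache evictions.
function getResumeDate() {
  var marker = GmailApp.getUserLabelByName("GitHub/Last-Processed");
  if (!marker) return null;
  var threads = marker.getThreads(0, 1);
  return threads.length ? threads[0].getLastMessageDate() : null;
}

function moveMarker(thread) {
  var marker = GmailApp.getUserLabelByName("GitHub/Last-Processed") ||
               GmailApp.createLabel("GitHub/Last-Processed");
  // Remove the marker from any previously marked threads, then re-apply it.
  marker.getThreads().forEach(function (t) { t.removeLabel(marker); });
  thread.addLabel(marker);
}
```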
I set my cache TTL to 2 hours and renew it when I run the script multiple times within that period. In my experience I didn't see the caches get wiped arbitrarily. But I agree, that's a downside. @josh Pretty cool trick with checking for …
Sorry for not responding here sooner. Looking at the labeler script again, we do have caching of which threads have already been processed, and threads should only be processed once:

```coffeescript
class Thread
  # Queue all threads to have the appropriate labels applied given our reason
  # for receiving them.
  #
  # Returns nothing.
  @labelAllForReason: ->
    @all[id].labelForReason() for id in @ids when !@all[id].alreadyDone()

  # Load a list of Thread ids that have already been labeled. Because the ids
  # are based on the messages in the thread, new messages in a thread will
  # trigger relabeling.
  #
  # Returns nothing.
  @loadDoneFromCache: ->
    cached = CACHE.get @doneKey
    @done = JSON.parse(cached) if cached

  # Save the list of ids that we have already labeled.
  #
  # Returns nothing.
  @dumpDoneToCache: ->
    CACHE.put @doneKey, JSON.stringify(@done)

  # Has this thread already been labeled?
  #
  # Returns a bool.
  alreadyDone: ->
    Thread.done.indexOf(@id) >= 0

# ...

Label.loadPersisted()
Thread.loadFromSearch QUERY
Thread.loadDoneFromCache()
Message.loadReasonsFromCache()

try
  Thread.labelAllForReason()
  Thread.archiveAll() if SHOULD_ARCHIVE
catch error
  Logger.log error
finally
  try
    Label.applyAll()
  catch error
    Logger.log error
  finally
    Thread.dumpDoneToCache()
    Message.dumpReasonsToCache()
```

Assuming that the error handling is correct, this should be able to process a large inbox over the course of many runs, even if it hits rate-limit issues. I don't have a large inbox to test this in, so maybe it isn't working. They've updated the cache API a bit, so I made a few changes in dcd1086 and edd8ac8.
I have a setup where I wrote a general rule that all incoming GitHub notifications go into a …