only process 500 emails? #6

Closed
shyderc opened this issue Mar 6, 2021 · 8 comments

shyderc commented Mar 6, 2021

I ran the latest extractor and it processes at most 500 messages. It seems to read only the first page of the search results?

C:\GmailAttachmentsExtractor_v1.0.1> java -jar GmailAttachmentsExtractor.jar --mime-type 'image|video|audio' larger:1M

=== SUMMARY ===
Processed 500 email(s)
Extracted attachments from 0 email(s)
Extracted 0 attachment(s)
Total extracted attachments size: 0 bytes
Extracted attachments types: []
NOT extracted (filtered) attachments types: [application/octet-stream x 300]

Please enhance your code to check the response for the presence of nextPageToken:

{
  "messages": [
    {
      "id": "177f47bb76a1f474",
      "threadId": "177f47bb75a1f484"
    },
    {
      "id": "177acb8f79ef9158",
      "threadId": "177acb8f79df9152"
    }
  ],
  "nextPageToken": "08833574449120401647",
  "resultSizeEstimate": 94
}

If it exists, please submit a new API call with that value as the pageToken parameter, and loop until the response has no nextPageToken:

{
  "messages": [
    {
      "id": "177562523a3c9083",
      "threadId": "177562523a2c9086"
    },
    {
      "id": "1773c736f8b81694",
      "threadId": "1773a736f8c81694"
    }
  ],
  "resultSizeEstimate": 10
}
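
The loop described above could be sketched roughly as follows. This is a minimal illustration and not the extractor's actual code: `PaginationSketch`, `Page`, and `listAllMessageIds` are hypothetical names, and the `fetchPage` function stands in for whatever call performs the list request with a given page token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PaginationSketch {

    /** Hypothetical stand-in for one page of a list response. */
    static final class Page {
        final List<String> messageIds;
        final String nextPageToken; // null when this is the last page

        Page(List<String> messageIds, String nextPageToken) {
            this.messageIds = messageIds;
            this.nextPageToken = nextPageToken;
        }
    }

    /**
     * Collects message IDs from every page. fetchPage receives the page
     * token (null for the first request) and returns the matching page.
     */
    static List<String> listAllMessageIds(Function<String, Page> fetchPage) {
        List<String> ids = new ArrayList<>();
        String pageToken = null;
        do {
            Page page = fetchPage.apply(pageToken);
            if (page.messageIds != null) { // an empty result may omit the list
                ids.addAll(page.messageIds);
            }
            pageToken = page.nextPageToken; // absent token ends the loop
        } while (pageToken != null);
        return ids;
    }
}
```

With the Gmail Java client, the corresponding pieces would presumably be `setPageToken(...)` on the list request and `getNextPageToken()` on the `ListMessagesResponse`. How commit fa3f281 actually implements this is not shown in this thread.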
@shyderc shyderc closed this as completed Mar 6, 2021
@shyderc shyderc reopened this Mar 6, 2021
Owner

TeWu commented Mar 7, 2021

Thanks for reporting this.

@TeWu TeWu self-assigned this Mar 7, 2021
@TeWu TeWu closed this as completed in fa3f281 Mar 7, 2021
@TeWu TeWu changed the title from "only process 100 emails?" to "only process 500 emails?" Mar 7, 2021
Author

shyderc commented Mar 9, 2021

I compiled your changes and did a run of 169 messages.
Then I ran with "larger:5M"; beforehand, the same search in the Gmail GUI paged to 1101 messages.
The extractor dies at 451/~656.
I will try again.

   Attachment saved: IMG_5108.JPG
    Inserting copy of email without extracted attachments to Gmail
451/~656 (68%) | Processing email 'More Mallard pics'
java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:981)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1434)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1038)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
        at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:343)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:754)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1623)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1528)
        at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:308)
        at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:37)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:105)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.getRawMessage(GmailAttachmentsExtractor.java:370)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:138)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
        at picocli.CommandLine.execute(CommandLine.java:1904)
        at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)

Author

shyderc commented Mar 9, 2021

Another try, and a different error:

    Inserting copy of email without extracted attachments to Gmail
java.net.SocketException: Unexpected end of file from server
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:866)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1623)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1528)
        at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:308)
        at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:37)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:105)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.addLabelToMessage(GmailAttachmentsExtractor.java:347)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:212)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
        at picocli.CommandLine.execute(CommandLine.java:1904)
        at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)
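
Both failures so far (`SocketTimeoutException` and `SocketException`) are subclasses of `IOException`, so a transient network error like these could in principle be retried instead of aborting the whole run. A minimal sketch of a retry wrapper with exponential backoff follows. `RetrySketch` and `withRetries` are hypothetical names, and nothing here reflects how the extractor actually handles these errors.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {

    /**
     * Runs the call, retrying on IOException up to maxAttempts times in
     * total and doubling the wait between attempts. Other exceptions,
     * and the last IOException, propagate to the caller.
     */
    static <T> T withRetries(Callable<T> call, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the original error
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // exponential backoff
            }
        }
    }
}
```

Retrying is safest for read-only calls such as fetching a raw message. A step that writes to the mailbox, like inserting the stripped copy of an email, might create duplicates if the first attempt succeeded on the server before the connection dropped, so it would need extra care.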

Author

shyderc commented Mar 9, 2021

I'm still getting issues, so I have not been able to run past 500 messages yet. I will keep trying.

222/~656 (33%) | Processing email 15f552104e762ce3
java.lang.NullPointerException: Cannot invoke "String.length()" because "name" is null
        at pl.geek.tewu.gmail_attachments_extractor.Utils.sanitizeFSName(Utils.java:76)
        at pl.geek.tewu.gmail_attachments_extractor.Utils.sanitizeDirName(Utils.java:72)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.createDirForAttachments(GmailAttachmentsExtractor.java:255)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:142)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
        at picocli.CommandLine.execute(CommandLine.java:1904)
        at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)

Author

shyderc commented Mar 9, 2021

The above error occurs because the Subject is blank. See #7.
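
One possible guard against a missing Subject is to fall back to a placeholder built from the message ID before sanitizing the name. The sketch below is only an illustration: `DirNameSketch` and `dirNameFor` are hypothetical, the set of replaced characters is an assumption, and this is not the project's `sanitizeFSName`.

```java
final class DirNameSketch {

    /**
     * Builds a directory name from the subject, falling back to the
     * message ID when the subject is null or blank.
     */
    static String dirNameFor(String subject, String messageId) {
        String base = (subject == null || subject.trim().isEmpty())
                ? "no subject " + messageId
                : subject;
        // Replace characters that are not allowed in Windows file names.
        return base.replaceAll("[\\\\/:*?\"<>|]", "_").trim();
    }
}
```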

Author

shyderc commented Mar 9, 2021

Another error:

java.lang.RuntimeException: Incorrect exported file size
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:174)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
        at picocli.CommandLine.execute(CommandLine.java:1904)
        at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)

and

com.sun.mail.util.DecodingException: BASE64Decoder: Error in encoded stream: needed 4 valid base64 characters but only got 2 before EOF, the 10 most recent characters were: "UUhH//2Q\r\n"
       at com.sun.mail.util.BASE64DecoderStream.decode(BASE64DecoderStream.java:265)
       at com.sun.mail.util.BASE64DecoderStream.read(BASE64DecoderStream.java:146)
       at java.base/java.io.FilterInputStream.read(FilterInputStream.java:106)
       at com.google.api.client.util.ByteStreams.copy(ByteStreams.java:51)
       at com.google.api.client.util.IOUtils.copy(IOUtils.java:94)
       at com.google.api.client.util.IOUtils.copy(IOUtils.java:63)
       at pl.geek.tewu.gmail_attachments_extractor.Utils.copyInputStreamToFile(Utils.java:115)
       at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.saveToFile(GmailAttachmentsExtractor.java:290)
       at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:167)
       at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
       at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
       at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
       at picocli.CommandLine.access$900(CommandLine.java:145)
       at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
       at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
       at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
       at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
       at picocli.CommandLine.execute(CommandLine.java:1904)
       at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)

and

461/~766 (60%) | Processing email 'Motion detected 2016-03-21 09:36:34'
    Extracting 5 attachment(s) to directory '2016.03.21 01_37_19 Motion detected 2016_03_21 09_36_34'
com.sun.mail.util.DecodingException: BASE64Decoder: Error in encoded stream: needed at least 2 valid base64 characters, but only got 0 before padding character (=), the 10 most recent characters were: "iGz/9k\r\n=="
        at com.sun.mail.util.BASE64DecoderStream.decode(BASE64DecoderStream.java:276)
        at com.sun.mail.util.BASE64DecoderStream.read(BASE64DecoderStream.java:146)
        at java.base/java.io.FilterInputStream.read(FilterInputStream.java:106)
        at com.google.api.client.util.ByteStreams.copy(ByteStreams.java:51)
        at com.google.api.client.util.IOUtils.copy(IOUtils.java:94)
        at com.google.api.client.util.IOUtils.copy(IOUtils.java:63)
        at pl.geek.tewu.gmail_attachments_extractor.Utils.copyInputStreamToFile(Utils.java:115)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.saveToFile(GmailAttachmentsExtractor.java:290)
        at pl.geek.tewu.gmail_attachments_extractor.GmailAttachmentsExtractor.extractAttachments(GmailAttachmentsExtractor.java:167)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:45)
        at pl.geek.tewu.gmail_attachments_extractor.Main.call(Main.java:14)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
        at picocli.CommandLine.execute(CommandLine.java:1904)
        at pl.geek.tewu.gmail_attachments_extractor.Main.main(Main.java:29)

Author

shyderc commented Mar 9, 2021

I was able to get beyond 500 emails in a single run and it worked well, so this ticket stays closed.
Thank you for your efforts. I freed up over 12GB of space!

Owner

TeWu commented Mar 12, 2021

@shyderc I'm happy you managed to free a lot of space 😄.
Thanks for reporting all those exceptions, I'm gonna take a look at them.
