HTTP Error 413 #34

Closed
want-to-export-group opened this issue Dec 20, 2019 · 8 comments · Fixed by #37

Comments

@want-to-export-group

I am trying to archive messages from a large private group. The script seems to run fine until the "Fetching data" step. Here is the output (the group name has been changed to "group"):

:: Downloading all topics (thread) pages...
:: Creating './group//threads/t.0' with 'categories/group'
:: Fetching data from 'https://groups.google.com/forum/?_escaped_fragment_=categories/group'...
--2019-12-20 13:16:16-- https://groups.google.com/forum/?_escaped_fragment_=categories/group
Resolving groups.google.com (groups.google.com)... 2607:f8b0:400d:c0f::8a, 172.217.197.102, 172.217.197.113, ...
Connecting to groups.google.com (groups.google.com)|2607:f8b0:400d:c0f::8a|:443... connected.
HTTP request sent, awaiting response... 413 Request Entity Too Large
2019-12-20 13:16:16 ERROR 413: Request Entity Too Large.

As you can see, there is an Error 413. What is causing this, and how can it be fixed?

@want-to-export-group
Author

The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?
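
One way to rule out stale cookies is to look at the expiry column of the cookie file. The sketch below assumes the cookies are stored in a Netscape-format cookies.txt file (the format read by wget's --load-cookies option). The function name expired_cookies and its optional second argument are made up for illustration and are not part of the crawler.

```shell
# Hypothetical helper (not part of the crawler): print the names of
# cookies in a Netscape-format cookie file whose expiry time has passed.
# $1 = path to the cookie file
# $2 = "now" as a Unix timestamp (optional, defaults to the current time)
expired_cookies() {
  local file now
  file=$1
  now=${2:-$(date +%s)}
  awk -F '\t' -v now="$now" '
    # curl writes HttpOnly cookies with this prefix; strip it so the
    # line is parsed like any other cookie line.
    /^#HttpOnly_/ { sub(/^#HttpOnly_/, "") }
    # Skip comments and lines that do not have the 7 expected fields.
    /^#/ || NF != 7 { next }
    # Field 5 is the expiry time; 0 marks a session cookie.
    $5 != 0 && $5 < now { print $6 }
  ' "$file"
}
```

Running expired_cookies cookies.txt prints one cookie name per line. Empty output only means that no cookie has expired. It does not prove that the session is still accepted by the server.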

@icy
Owner

icy commented Apr 9, 2020

@want-to-export-group Were you able to resolve the issue?

@icy
Owner

icy commented Apr 9, 2020

I haven't seen that issue. It may be a temporary network issue; you can look at the wget command and retry it to see if that helps.
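
If the failure is transient, retrying with a growing delay can be sketched as follows. The retry function is a hypothetical helper, not something the crawler provides, and the attempt count and delay are arbitrary choices. A persistent 413 or 500 response from the server will not go away with retries.

```shell
# Hypothetical helper (not part of the crawler): run a command up to
# $1 times. Wait $2 seconds after the first failure and double the
# wait after each further failure.
retry() {
  local max_attempts delay attempt
  max_attempts=$1
  delay=$2
  attempt=1
  shift 2
  while true; do
    # Run the command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt $attempt failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

For example, retry 3 5 wget -O t.0 'https://groups.google.com/forum/?_escaped_fragment_=categories/group' would try the download at most three times, sleeping 5 and then 10 seconds between attempts.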

@icy
Owner

icy commented Apr 12, 2020

> The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?

Yes, I can confirm this issue. Google seems to have changed something that prevents our script from working :(

@want-to-export-group
Author

want-to-export-group commented Apr 12, 2020 via email

@icy
Owner

icy commented Apr 12, 2020

:( It used to work. Now accessing the URL from a web browser also generates an error: https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2

@icy
Owner

icy commented Apr 13, 2020

google-group-crawler-public2 had been set to private mode by mistake. It's fine now. Btw, I have rewritten the script using curl; hopefully that helps to resolve a few strange issues. Stay tuned.

icy added a commit that referenced this issue Apr 13, 2020
icy mentioned this issue Apr 13, 2020
Merged
icy closed this as completed in #37 Apr 13, 2020
@icy
Owner

icy commented Apr 13, 2020

The problem should be fixed in the latest version, 2.0.0 (which uses curl). Please check whether it works better for you. Thanks.
