Unable to collect posts beyond a certain number #285
Comments
Try increasing the posts_per_page option.
I added some code to retry pagination requests on error (1cc8064). With that change and this test code:

import time
from facebook_scraper import get_posts

start = time.time()
posts = []
try:
    for post in get_posts("ChannelNewsAsia", cookies="cookies.txt", pages=200, timeout=60, options={"allow_extra_requests": False, "posts_per_page": 200}):
        posts.append(post)
except Exception:
    pass  # stop on any scraping error and report what was collected so far
print(f"{len(posts)} posts retrieved in {round(time.time() - start)}s. Oldest post: {posts[-1].get('time')}")

I get:

14201 posts retrieved in 910s. Oldest post: 2013-12-12 13:05:00
717d522 might also be useful for resuming from the last cursor that errored out; see #287 (comment) for usage.
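To make the retry-and-resume idea concrete, here is a minimal, library-independent sketch. It is not the code from 1cc8064 or 717d522. `collect_with_resume` and the `fetch_page` callable are hypothetical names used only for illustration, and the sketch assumes a fetcher that takes a cursor (`None` for the first page) and returns the page's items plus the next cursor. Backoff between retries is left out for brevity.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical fetcher type: cursor in, (items, next_cursor) out.
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def collect_with_resume(
    fetch_page: FetchPage,
    start_cursor: Optional[str] = None,
    max_retries: int = 3,
) -> Tuple[List[dict], Optional[str]]:
    """Collect items page by page, retrying a failed page a few times.

    Returns (items, resume_cursor). resume_cursor is None when pagination
    finished, otherwise it is the cursor of the page that kept failing,
    so a later call can pick up from there.
    """
    items: List[dict] = []
    cursor = start_cursor
    retries = 0
    while True:
        try:
            page, next_cursor = fetch_page(cursor)
        except Exception:
            retries += 1
            if retries > max_retries:
                # Give up, but keep the collected items and the failing cursor.
                return items, cursor
            continue
        retries = 0  # a successful page resets the retry budget
        items.extend(page)
        if next_cursor is None:
            return items, None
        cursor = next_cursor
```

If the first run returns a non-None cursor, a second call with `start_cursor` set to that value continues from the failing page instead of starting over.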
Thanks! I think the combination of:

from facebook_scraper import *
import requests
start = time.time()

and this:

cursor = "some URL from the logging output"
posts = []

has helped make the process more robust. Thanks for the help! Your project is amazing!
#291 might be useful.
Hi, I'm wondering how I can solve this issue: I only get 2000+ posts from the code below. I know there are more posts, but I'm not getting them.

from datetime import datetime
from facebook_scraper import get_posts

opt = {}
opt['daterange'] = False  # set to True to limit the search to startDate..endDate
opt['startDate'] = datetime.strptime('01/05/19 00:00:00', '%m/%d/%y %H:%M:%S')  # change to the range you want
opt['endDate'] = datetime.strptime('12/06/22 00:00:00', '%m/%d/%y %H:%M:%S')  # change to the range you want
page_name = 'ChannelNewsAsia'
fbcookies = { cookie details}
lst = []
for post in get_posts(page_name, cookies=fbcookies, pages=1000000, options={"allow_extra_requests": False}):
    ...  # loop body not included in the original post

It stopped collecting posts at 2020, which seems odd to me.
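The `opt` dictionary above implies a date-range filter that the snippet does not show. One way it could be applied is sketched below, assuming each post is a dict whose `time` value is a `datetime` (or missing). `in_date_range` and `filter_posts` are hypothetical helpers, not part of facebook_scraper.

```python
from datetime import datetime
from typing import Iterable, List


def in_date_range(post: dict, start: datetime, end: datetime) -> bool:
    """Return True if the post has a 'time' value within [start, end]."""
    posted = post.get("time")
    return posted is not None and start <= posted <= end


def filter_posts(posts: Iterable[dict], opt: dict) -> List[dict]:
    """Apply the optional date-range filter described by opt."""
    if not opt.get("daterange"):
        # Filtering disabled: keep every post.
        return list(posts)
    return [p for p in posts if in_date_range(p, opt["startDate"], opt["endDate"])]
```

Filtering this way only narrows what is kept. It does not make the scraper fetch older pages, so it would not by itself explain collection stopping at 2020.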