Many public pages are not returning posts even when they are available #195

adarsh2104 · 2021-04-01T13:26:48Z

Many public pages are returning empty post lists even though many recent posts have been published on the official page.
E.g.:
1. reidandtaylor (https://m.facebook.com/reidandtaylor/posts/)
2. zodiacclothing
3. TataHitachiCorporate (https://m.facebook.com/TataHitachiCorporate/posts/)

DerekChia · 2021-04-01T17:00:08Z

Hi, I think there might be something wrong with your network. I'm getting the data with this code:

from facebook_scraper import get_posts

for post in get_posts("reidandtaylor", pages=10):
	print(post)

{'post_id': '1976083782410298', 'text': 'Presenting the Autumn/Winter 18 Collection by Reid & Taylor.', 'post_text': 'Presenting the Autumn/Winter 18 Collection by Reid & Taylor.', 'shared_text': '', 'time': datetime.datetime(2018, 6, 29, 14, 57, 3), 'image': None, 'images': None, 'video': 'https://scontent.fsin2-1.fna.fbcdn.net/v/t42.9040-4/36367831_265117740909488_355914310502842368_n.mp4?_nc_cat=100&ccb=1-3&_nc_sid=985c63&efg=eyJ2ZW5jb2RlX3RhZyI6InN2ZV9zZCJ9&_nc_ohc=ASqzkokET7UAX_dbujb&_nc_ht=scontent.fsin2-1.fna&oh=ac7a849020aa1e28426931662e2712a1&oe=60661AE6', 'video_thumbnail': 'https://scontent.fsin2-1.fna.fbcdn.net/v/t15.5256-10/cp0/e15/q65/s320x320/34835304_1976086299076713_5308380875489017856_n.jpg?_nc_cat=105&ccb=1-3&_nc_sid=ccf8b3&_nc_ohc=xu-7D_uRJEkAX_vpev9&_nc_ht=scontent.fsin2-1.fna&tp=9&oh=a93bc9814976995baa438dfc4b64f2e7&oe=6089E2E4', 'video_id': '1976053362413340', 'likes': 496, 'comments': 11, 'shares': 0, 'post_url': 'https://facebook.com/reidandtaylor/videos/1976053362413340', 'link': None, 'user_id': '163066417045386', 'username': 'Reid & Taylor', 'is_live': False, 'factcheck': None, 'shared_post_id': None, 'shared_time': None, 'shared_user_id': None, 'shared_username': None, 'shared_post_url': None, 'available': True, 'comments_full': None}
{'post_id': '4077918145560174', 'text': 'Comfortable, lightweight neutrals', 'post_text': 'Comfortable, lightweight neutrals', 'shared_text': '', 'time': datetime.datetime(2021, 3, 23, 20, 44, 13), 'image': 'https://scontent.fsin2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164072753_4077917345560254_1277089394854544074_n.jpg?_nc_cat=102&ccb=1-3&_nc_sid=8024bb&_nc_ohc=YejSQ1pHEyYAX-_uLf0&_nc_ht=scontent.fsin2-1.fna&tp=14&oh=2a65a92ebed753fe227d9eaa1239b43f&oe=608D6380', 'images': ['https://scontent.fsin2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164072753_4077917345560254_1277089394854544074_n.jpg?_nc_cat=102&ccb=1-3&_nc_sid=8024bb&_nc_ohc=YejSQ1pHEyYAX-_uLf0&_nc_ht=scontent.fsin2-1.fna&tp=14&oh=2a65a92ebed753fe227d9eaa1239b43f&oe=608D6380'], 'video': None, 'video_thumbnail': None, 'video_id': None, 'likes': 19, 'comments': 1, 'shares': 0, 'post_url': 'https://facebook.com/reidandtaylor/posts/4077918145560174', 'link': None, 'user_id': '163066417045386', 'username': 'Reid & Taylor', 'is_live': False, 'factcheck': None, 'shared_post_id': None, 'shared_time': None, 'shared_user_id': None, 'shared_username': None, 'shared_post_url': None, 'available': True, 'comments_full': None}
...

xobius · 2021-04-02T15:08:26Z

I had the same problem. When I changed my IP, the posts were downloaded.

neon-ninja · 2021-04-07T01:25:49Z

Try passing cookies, as per #28 (comment)

adarsh2104 · 2021-04-07T04:17:49Z

Is there a limit to the number of requests that can be made? I have added proxy and user agent rotation to the HTML request made in the get function in facebook_scraper.py. It still returns an empty response for some time, and only after about an hour does it begin returning posts for the same pages again. I am not passing any credentials with the requests.

neon-ninja · 2021-04-07T04:29:20Z

It would seem so, yes. If you scrape too hard, Facebook starts to serve the message "You're Temporarily Blocked" in the HTML. I even prepped some code like

if "Temporarily Blocked" in raw_page.text:
    logger.error("Temporarily blocked by Facebook")

But I realised it wouldn't do much good for the average user, as the default log handler is NullHandler, so the message would never be displayed.
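A minimal sketch of how such a check could be made visible to a user, assuming the page HTML is available as a string. The logger name `block_check_demo` and the helper names `is_temporarily_blocked` and `check_page` are hypothetical and are not part of facebook_scraper; only the standard `logging` module is used.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("block_check_demo")


def is_temporarily_blocked(html: str) -> bool:
    """Return True if the page HTML contains the block message."""
    return "Temporarily Blocked" in html


def check_page(html: str) -> bool:
    """Log an error when the block message is present and return the result."""
    blocked = is_temporarily_blocked(html)
    if blocked:
        logger.error("Temporarily blocked by Facebook")
    return blocked


if __name__ == "__main__":
    # basicConfig attaches a stderr handler to the root logger,
    # so the error emitted above is actually displayed.
    logging.basicConfig(level=logging.ERROR)
    check_page("<html><title>You're Temporarily Blocked</title></html>")
```

Returning the boolean as well as logging means a caller can react to the block (for example by pausing) even if no log handler is configured.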

sunboy123 · 2021-04-21T07:49:26Z

Could this problem be solved by using an IP proxy?

neon-ninja · 2021-04-21T21:57:36Z

Yes

adarsh2104 · 2021-04-22T06:37:29Z

Even after using HTTP proxy rotation plus user agent rotation (fake user agent), I am still not able to prevent "You're Temporarily Blocked" from appearing in the HTML. I am not using any login credentials or cookies. The above 22 proxy addresses, which I am using for rotation, are valid: they were tested with the proxy checker package, and I verified that each proxy is successfully applied by sending a request to "https://ipinfo.io".
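For readers unfamiliar with the setup described here, round-robin rotation of proxies and user agents can be sketched roughly as follows. This is an illustration only, not the code used in this thread: the function name `rotation`, the proxy addresses, and the user agent strings are all placeholders. The `{"http": ..., "https": ...}` mapping is the format the `requests` library accepts for its `proxies` argument.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Tuple


def rotation(
    proxies: List[str], user_agents: List[str]
) -> Iterator[Tuple[Dict[str, str], Dict[str, str]]]:
    """Yield (proxies, headers) pairs endlessly, round-robin over both lists."""
    for proxy, agent in zip(cycle(proxies), cycle(user_agents)):
        yield {"http": proxy, "https": proxy}, {"User-Agent": agent}


if __name__ == "__main__":
    # Placeholder values; substitute real, tested proxies and user agents.
    pairs = rotation(
        ["http://10.0.0.1:8080", "http://10.0.0.2:8080"],
        ["ExampleAgent/1.0", "ExampleAgent/2.0"],
    )
    for _ in range(3):
        proxy_map, headers = next(pairs)
        print(proxy_map["http"], headers["User-Agent"])
```

Each pair would be passed to whatever HTTP call is being made, one pair per request.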

neon-ninja · 2021-04-22T22:48:38Z

I think you must be scraping too hard anonymously. Either reduce the number of requests you're making per hour or pass cookies
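Reducing the request rate could look something like the sketch below, which spaces out calls by a minimum interval. The helper `throttled` and its parameters are hypothetical, and `fetch` stands in for whatever function actually retrieves a page; no safe interval is known from this thread, so the value used is purely illustrative.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled(
    items: Iterable[T],
    fetch: Callable[[T], R],
    min_interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[R]:
    """Call fetch(item) for each item, starting calls at least min_interval seconds apart."""
    last_start: Optional[float] = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the interval since the previous call.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield fetch(item)


if __name__ == "__main__":
    # str.upper is a stand-in for a real page fetch; 2 seconds is an arbitrary interval.
    for result in throttled(["reidandtaylor", "zodiacclothing"], str.upper, 2.0):
        print(result)
```

The `sleep` and `clock` parameters exist only so the pacing can be checked without real waiting.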

ghost · 2021-05-09T19:54:31Z

I have the same issue: #245
Changing IP helps, but after some time it starts returning nothing again.

I guess it's how FB works: if you visit a page with a new browser or in incognito mode it may work, but after some time it stops working and requires you to log in to view the page, even for public pages.

enaserianhanzaei · 2021-05-25T19:30:27Z

@adarsh2104

Could you please explain how you added the proxy? I have no idea why it doesn't work for me. I'm getting this error:

HTTPConnectionPool(host='178.212.54.137', port=8080): Max retries exceeded with url: http://ifconfig.co/ (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f800ab2de48>, 'Connection to 178.212.54.137 timed out. (connect timeout=5)'))

Many thanks,

neon-ninja · 2021-05-25T23:07:20Z

This proxy seems to trip some sort of Cloudflare protection on ifconfig.co - give this a try - 43cecdd

vcuspinera · 2022-11-10T17:56:14Z

Hi adarsh2104,
I had the same problem until I noticed that the FacebookScraper class has a method for logging into your Facebook account: login(self, email: str, password: str). Look at line 959 of the Python script of the facebook_scraper library.

So, in general what you should do is something like this:

# Import the scraper class from the library
from facebook_scraper import FacebookScraper

# Create a scraper instance
my_scrapy = FacebookScraper()

# Log in to Facebook
my_scrapy.login(email="write_your_email_here", password="write_your_password_here")

# Now we have no problem getting posts from "reidandtaylor"
posts_bk = my_scrapy.get_posts("reidandtaylor", pages=3)

# Print only the first three posts
for i, post in enumerate(posts_bk):
    if i >= 3:
        break
    print(post, "\n")

neon-ninja mentioned this issue May 14, 2021

indentify cookies/user ban #255

Closed

adarsh2104 closed this as completed May 26, 2024
