Graph API with Full-permission Token Approach

I wrote a simple script that retrieves post data from any Page or Group by querying the Facebook Graph API with a full-permission token. My implementation of this approach needs only 130 lines of code (100 without comments) and a few built-in Python functions.

👉 Demo: https://www.youtube.com/watch?v=Q4oAsz__e_M

I. Usage

python scraper.py
  1. COOKIE (Most important setup):

    This script needs your COOKIE to work. You can get it by following these steps:

    Note: Don't use document.cookie, because it only returns cookies that are accessible via JavaScript and leaves out any cookie marked as HttpOnly.
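
    To illustrate how a copied cookie string is typically handed to an HTTP client, here is a minimal sketch. The helper names `parse_cookie_header` and `build_headers` are hypothetical and the cookie names and values are placeholders; the actual scraper.py may handle COOKIE differently.

```python
def parse_cookie_header(raw: str) -> dict:
    """Split a raw Cookie header value into a name -> value mapping."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        # Skip empty fragments (e.g. a trailing ";") and malformed pairs.
        if not part or "=" not in part:
            continue
        name, value = part.split("=", 1)  # values may themselves contain "="
        cookies[name.strip()] = value.strip()
    return cookies


def build_headers(raw_cookie: str) -> dict:
    """Return request headers that carry the whole cookie string unchanged."""
    return {"Cookie": raw_cookie.strip()}


# Placeholder value, not a real session cookie.
COOKIE = "name_a=111; name_b=abc=def; "
print(sorted(parse_cookie_header(COOKIE)))  # cookie names that were found
```

    Parsing the string first is an easy way to check that the HttpOnly cookies you expected were actually copied before you send any request.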

  2. LIMIT and MAX_POSTS:

    For Page, you can only read a maximum of 100 feed posts with the limit field:

    • If you try to read more than that you will get an error message to not exceed 100.
    • The API will return approximately 600 ranked, published posts per year.

    For Group, the documentation does not mention a limit. From my experiments:

    • For a simple query (such as fields=message): I can request up to 1850 posts (LIMIT=1850).
    • For a complex query (like the fields below): LIMIT=300 works fine. Larger numbers sometimes work, but most of the time they return errors.
    • Therefore, I recommend querying at most 300 posts at a time for Group.

    Note: If the data retrieved is too large, you can receive this error message: "Please reduce the amount of data you're asking for, then retry your request".
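
    The limits above can be sketched as a small helper that clamps LIMIT and splits MAX_POSTS into per-request batches. This assumes MAX_POSTS is the total number of posts to collect, which the text does not state; the function names are hypothetical and not taken from scraper.py.

```python
PAGE_MAX_LIMIT = 100    # cap for Page feeds, as stated above
GROUP_SAFE_LIMIT = 300  # empirical recommendation for Groups, as stated above


def effective_limit(requested: int, is_group: bool) -> int:
    """Clamp a requested LIMIT to the cap that applies to the target."""
    cap = GROUP_SAFE_LIMIT if is_group else PAGE_MAX_LIMIT
    return max(1, min(requested, cap))


def batch_sizes(max_posts: int, limit: int) -> list:
    """Split max_posts into request sizes of at most `limit` each."""
    sizes = []
    remaining = max_posts
    while remaining > 0:
        size = min(limit, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


limit = effective_limit(500, is_group=False)
print(limit, batch_sizes(250, limit))
```

    If the "Please reduce the amount of data" error still appears, lowering the value passed to `effective_limit` is the first thing to try.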

  3. POST_FIELDS and COMMENT_FIELDS:

    You can customize the fields you want to get from the Post or even Comment objects of Page and Group:

    Note: A User or Page can only query their own reactions. Other Users' or Pages' reactions are unavailable due to privacy concerns.
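
    As a rough illustration, the sketch below combines post fields and comment fields into a single Graph API fields value using nested field expansion. The helper `build_fields` and the chosen field names are assumptions for the example; the real POST_FIELDS and COMMENT_FIELDS in scraper.py may be defined differently.

```python
def build_fields(post_fields, comment_fields=None, comment_limit=10):
    """Join post fields and nest comment fields under the comments edge."""
    fields = list(post_fields)
    if comment_fields:
        nested = ",".join(comment_fields)
        # Nested request form: edge.limit(n){field1,field2}
        fields.append("comments.limit(%d){%s}" % (comment_limit, nested))
    return ",".join(fields)


# Illustrative field choices only.
POST_FIELDS = ["message", "created_time"]
COMMENT_FIELDS = ["message", "from"]

print(build_fields(POST_FIELDS, COMMENT_FIELDS, comment_limit=5))
```

    Requesting fewer fields, or a smaller comment limit, shrinks each response and makes larger LIMIT values more likely to succeed.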

  4. Other settings:

    • SLEEP: The time (in seconds) to wait between each request to get LIMIT posts.
    • PAGE_OR_GROUP_URL: The URL of the Page or Group you want to crawl.

    Note: The output file contains one post per line.
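
    A minimal sketch of how SLEEP and the one-post-per-line output could fit together is shown below. It assumes each post is serialized as a JSON object, which the text does not confirm; `write_posts` is a hypothetical helper and `batches` stands in for the results of successive requests.

```python
import json
import time


def write_posts(batches, out, sleep_seconds=0.0):
    """Write every post as one JSON line and pause after each batch."""
    count = 0
    for batch in batches:
        for post in batch:
            # json.dumps escapes embedded newlines, so each post stays on one line.
            out.write(json.dumps(post, ensure_ascii=False) + "\n")
            count += 1
        if sleep_seconds > 0:
            # Wait before the next batch is pulled from the iterable.
            time.sleep(sleep_seconds)
    return count
```

    For example, `write_posts(batches, open("posts.jsonl", "w", encoding="utf-8"), sleep_seconds=2)` would write all batches to a file (the file name is illustrative). Note that this sketch also sleeps once after the final batch.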

II. Recommendation

I have learned a lot from this repo. It's a Node.js tool for automatically downloading Facebook media, with various features:

  • View album information (name, number of photos, link, ...)
  • Download timeline album of a FB page: this kind of album is hidden, containing all the photos so far in a FB page, like this album.
  • Download any kind of albums: user's, group's, or page's.
  • Download all photos/videos on the wall of an object (user/group/page).
  • It also provides scripts to extract album_id / user_id / group_id / page_id.

The only disadvantage is that this repo's description and instructions are in Vietnamese, my native language. You can use your browser's translation feature to read them, or watch its instruction video for more information. Hopefully, the author will provide the description and instructions in English in the future.