alexrutherford/facebook_scraping

Python script that queries the Facebook API, finds public pages through a keyword search, and extracts all posts and, where available, the comments on those posts. Content that matches a keyword is stored.

Some fairly involved logic handles bad requests caused by API downtime, key expiration or overuse.
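The handling of bad requests could look roughly like the sketch below. It is not the script's actual code: the function name `fetch_with_retries`, its parameters and the wait times are all hypothetical, and `fetch` stands for any callable that returns an object with a `status_code` attribute (for example `requests.get`).

```python
import time


def fetch_with_retries(fetch, url, max_attempts=5, wait_seconds=60, sleep=time.sleep):
    """Call fetch(url) until it returns HTTP 200 or the attempts run out.

    Hypothetical helper: fetch is any callable returning an object with a
    status_code attribute, such as requests.get.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            response = fetch(url)
        except Exception as error:  # e.g. a connection error during API downtime
            response = None
            print(f"attempt {attempt}: request failed ({error})")
        if response is not None and response.status_code == 200:
            return response
        if attempt < max_attempts:
            # Wait longer after each failure to ease off an overused API.
            sleep(wait_seconds * attempt)
    # Give up, e.g. so the caller can stop and be restarted with a fresh key.
    return None
```

Retrying helps with downtime and overuse, but not with an expired key. Returning `None` after the last attempt lets the caller stop cleanly so the run can be restarted later.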

Allows for restarts at a particular page.

## Outputs

  1. Produces a log file log.csv which records every API call and its time. Useful for debugging and estimating usage limits.
  2. Produces an output file out.csv with all content that matches search terms of interest. If restarting, new content is appended to this file; otherwise any previous content is overwritten.
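The two output behaviours above can be sketched as follows. This is an illustration under assumptions, not the script's own code: the helper names `open_output` and `log_call` are hypothetical, and the log row layout (timestamp, then URL) is a guess at what "every API call and its time" looks like on disk.

```python
import csv
import time


def open_output(path, restarting):
    """Open the output file: append on a restart, otherwise start fresh."""
    mode = "a" if restarting else "w"
    return open(path, mode, newline="", encoding="utf-8")


def log_call(log_path, url):
    """Append one row (timestamp, URL) to the log file for an API call."""
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([time.strftime("%Y-%m-%d %H:%M:%S"), url])
```

Counting the rows of such a log within a time window gives a rough estimate of how many calls were made before a usage limit was hit.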

## Arguments

  1. To restart at a specific page (for example, if the token expired midway through looping over a large collection of search results), run with the page ID as a single argument. The script skips over other pages until it reaches the matching page ID.

  2. To restart at a specific page and a specific set of posts on that page, call with two arguments. The first is the page ID as above and the second is the API link to that set of posts. This link is recorded in the log file and can be used to restart when midway through a specific page of posts from a given page.
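The restart arguments described above can be sketched like this. The function names `parse_restart_args` and `pages_from_restart` are hypothetical and only illustrate the skip-until-match behaviour; the actual script may structure this differently.

```python
def parse_restart_args(argv):
    """Return (page_id, posts_link) from the command-line arguments.

    Either value is None when the corresponding argument was not given.
    """
    args = argv[1:]
    if len(args) > 2:
        raise SystemExit("expected at most two arguments: PAGE_ID and POSTS_LINK")
    page_id = args[0] if len(args) >= 1 else None
    posts_link = args[1] if len(args) == 2 else None
    return page_id, posts_link


def pages_from_restart(page_ids, restart_id):
    """Yield page IDs, skipping everything before restart_id if one is given."""
    skipping = restart_id is not None
    for page_id in page_ids:
        if skipping and page_id != restart_id:
            continue  # not yet at the page to restart from
        skipping = False
        yield page_id
```

In a script, `parse_restart_args(sys.argv)` would supply the restart point, and the posts link (if present) would replace the first request for that page's posts.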

## Requirements

Requires the requests library.
