Facebook Scraping Tool

Grafari

Grafari is a Facebook Graph Search scraping tool. It was developed by 11 master's students as part of a lecture on Modelling of Information Systems at the University of Applied Sciences Hamburg. You can find more information in the wiki, although it is probably in German.

Facebook changes its page layout regularly, so this might not work anymore. Most likely a <div> tag has been added, removed or renamed.

The point of this (besides passing the lab) is to show that if a bunch of students can do this in 3 months, every marketing and advertising company, as well as intelligence services (even with no direct access to Facebook servers), can do it as well.

DISCLAIMER

Scraping Facebook is forbidden according to the Automated Data Collection Terms (April 15th, 2010). This is in no way an encouragement or a request to scrape Facebook, but merely an academic proof of concept. We also don't offer a service. Use this software at your own risk. We are not responsible for what you do with this software.

Technology

To divide the work, we split into frontend, backend and image recognition teams. Queries can combine terms with AND and OR. To limit run time and prevent abuse, each query is limited to 12 people. It is, however, possible to reverse-engineer a bit more, as shown by the facebook-uid-scraper Chrome plugin by autoclick.us.
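
As a rough illustration of AND/OR queries and the 12-person limit, here is a minimal sketch. It is not Grafari's actual parser: the function names (`parseQuery`, `matches`, `search`), the `attributes` field, the rule that AND binds tighter than OR, and the idea of filtering an in-memory list are all assumptions made for the example.

```javascript
// Hypothetical sketch, not Grafari's real code.
const MAX_PEOPLE = 12; // the per-query limit mentioned above

// Split a query into OR-groups of AND-terms (assumes AND binds tighter than OR).
function parseQuery(query) {
  return query
    .split(/\s+OR\s+/)
    .map((group) =>
      group
        .split(/\s+AND\s+/)
        .map((term) => term.trim())
        .filter(Boolean)
    )
    .filter((group) => group.length > 0);
}

// A person matches if every term of at least one OR-group is among their attributes.
function matches(person, groups) {
  return groups.some((group) =>
    group.every((term) => person.attributes.includes(term))
  );
}

// Filter an in-memory list (illustration only) and cap the result.
function search(people, query) {
  const groups = parseQuery(query);
  return people.filter((p) => matches(p, groups)).slice(0, MAX_PEOPLE);
}
```

With this sketch, `parseQuery("lives in Hamburg AND likes Jazz OR works at HAW")` yields two groups: one with two AND-terms and one with a single term.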

Frontend

The frontend is a web page that lets the user enter a query in (more or less) natural language. It sends the request to the backend, parses the response, displays, sorts and filters the results, and keeps a search history. Each profile photo can also be queried for image recognition tags.
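
The sorting, filtering and search history could look roughly like the following sketch. The response shape is not documented here, so the `name` and `city` fields, the function names and the history limit of 10 are assumptions for illustration only.

```javascript
// Hypothetical sketch; field names and function names are assumptions.

// Keep only profiles from the given city (if any) and sort them by one field.
function filterAndSort(profiles, { city } = {}, sortKey = "name") {
  return profiles
    .filter((p) => !city || p.city === city) // filter() copies, so the input is not mutated
    .sort((a, b) => String(a[sortKey]).localeCompare(String(b[sortKey])));
}

// Put the newest query first, drop duplicates, and cap the history length.
function addToHistory(history, query, limit = 10) {
  return [query, ...history.filter((q) => q !== query)].slice(0, limit);
}
```

Both helpers return new arrays instead of modifying their arguments, which keeps the displayed list and the stored history easy to reason about.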

Backend

The backend parses the request, queries Facebook, parses the returned Graph Search page, caches the result in a Redis database and offers the data via a REST API.

  • node.js
  • Redis
  • npm
  • zombie.js as a headless browser
  • restify for building the REST API
  • node modules: async, fbgraph, nodemon, redis, request
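
The request flow described in this section (look up the cache, otherwise fetch and parse, then store the result) can be sketched as a cache-aside wrapper. This is an assumption-laden illustration, not the project's code: a plain `Map` stands in for Redis, `fetchResults` is a hypothetical placeholder for the scraping and parsing step, and the `search:` key prefix is invented for the example.

```javascript
// Hypothetical sketch of a cache-aside lookup; a Map stands in for Redis.
function createSearchService(fetchResults, cache = new Map()) {
  return async function search(query) {
    // Normalise the query so trivially different spellings share one cache entry.
    const key = `search:${query.trim().toLowerCase()}`;

    if (cache.has(key)) {
      return { source: "cache", results: JSON.parse(cache.get(key)) };
    }

    // fetchResults is a placeholder for "query Facebook and parse the page".
    const results = await fetchResults(query);

    // Redis stores strings, so results are serialised as JSON here as well.
    cache.set(key, JSON.stringify(results));
    return { source: "fresh", results };
  };
}
```

A REST handler would then call the returned `search` function and send its `results` back as JSON. Swapping the `Map` for a real Redis client would mean replacing `has`/`get`/`set` with that client's asynchronous equivalents.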

Image recognition

  • Imagga
  • included in both the backend (accessible via the REST API) and the frontend

Screenshots
