NISH1001/instagramcrawler

instagramcrawler

A crawler for Instagram.

A project for crawling Instagram. The crawler logs in to a user account and fetches general information about each of that account's followers and following, such as:

  • Full Name
  • Username
  • Biography text
  • Followers count
  • Following count

The dumped data has the following shape:

{
    "full_name"     : "Full Name",
    "username"      : "username",
    "followers"     : [ user1_data, user2_data, ...],
    "following"     : [ user1_data, user2_data, ...]
}
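As a rough illustration of that shape, here is a minimal Python sketch that assembles such a dump. It is not taken from the project's code. The top-level keys come from the format above, while the helper names (`make_user_record`, `build_dump`) and the per-user keys `biography`, `followers_count` and `following_count` are assumptions chosen to mirror the field list.

```python
import json


def make_user_record(full_name, username, biography="",
                     followers_count=0, following_count=0):
    """Build one per-user entry (key names here are assumed, not the project's)."""
    return {
        "full_name": full_name,
        "username": username,
        "biography": biography,
        "followers_count": followers_count,
        "following_count": following_count,
    }


def build_dump(full_name, username, followers, following):
    """Assemble the top-level dump using the keys shown in the README."""
    return {
        "full_name": full_name,
        "username": username,
        "followers": list(followers),
        "following": list(following),
    }


def dump_to_json(dump):
    """Serialize the dump as indented JSON text."""
    return json.dumps(dump, indent=4, ensure_ascii=False)
```

Calling `dump_to_json(build_dump("Full Name", "username", [record], []))` with a `record` from `make_user_record` yields JSON text with the same four top-level keys as the example above.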

Since Instagram pages are rendered dynamically and its API is sandboxed (limited), Selenium is used to automate the login, click on the followers/following links and extract all the usernames.

Finally, Scrapy is used to crawl all the relevant data for the usernames collected.
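The hand-off between the two stages can be sketched in plain Python: link targets gathered in the browser are reduced to unique usernames, which are then turned into profile URLs for the spider to request. This is only an illustrative sketch, not the project's implementation. It assumes profile links look like `https://www.instagram.com/<username>/`, and the helper names (`username_from_href`, `unique_usernames`, `profile_url`) are hypothetical.

```python
from urllib.parse import urlparse

# Assumed base URL for profile pages.
BASE_URL = "https://www.instagram.com"


def username_from_href(href):
    """Return the username if the link path has exactly one segment, else None."""
    segments = [s for s in urlparse(href).path.split("/") if s]
    if len(segments) != 1:
        return None  # e.g. post links such as /p/<id>/ are skipped
    return segments[0]


def unique_usernames(hrefs):
    """Extract usernames from links, keeping first-seen order and dropping repeats."""
    seen = set()
    result = []
    for href in hrefs:
        name = username_from_href(href)
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


def profile_url(username):
    """Build the (assumed) profile URL that a spider could request."""
    return f"{BASE_URL}/{username}/"
```

Deduplicating before crawling avoids requesting the same profile twice when an account appears in both the followers and the following lists.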


Dependencies

The project uses Python 3 along with Scrapy and Selenium:

pip install scrapy
pip install selenium

Usage

Run the Scrapy spider with:

scrapy crawl instaspider

The spider, instaspider.py, also requires the path to a JSON file containing your login credentials. The JSON format is:

{
    "USERNAME"  :   "your_user_name",
    "PASSWORD"  :   "your password"
}
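How instaspider.py reads this file is not shown here. As a hedged sketch, a file in this format could be loaded and validated as follows; the function name `load_credentials` and the error handling are assumptions, and only the `USERNAME` and `PASSWORD` keys come from the format above.

```python
import json

# Keys required by the credentials format shown above.
REQUIRED_KEYS = ("USERNAME", "PASSWORD")


def load_credentials(path):
    """Read the credentials JSON file and return (username, password)."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        # Fail early with a clear message instead of a bare lookup error later.
        raise KeyError(f"missing keys in {path}: {', '.join(missing)}")
    return data["USERNAME"], data["PASSWORD"]
```

Because the file holds a plain-text password, it is sensible to keep it out of version control.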
