instagramcrawler

A crawler for Instagram.

A project for crawling Instagram. The crawler logs in to a user account and fetches general information about each of that account's followers and following, such as:

Full Name
Username
Biography text
Followers count
Following count

The dumped data looks like this:

{
    "full_name"     : "Full Name",
    "username"      : "username",
    "followers"     : [ user1_data, user2_data, ...],
    "following"     : [ user1_data, user2_data, ...]
}
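As an illustration of this shape, a record could be assembled and serialized with the standard library roughly as follows. This is a minimal sketch, not the project's code: the helper name `build_dump` and the per-user keys `biography`, `followers_count`, and `following_count` are assumptions, since the README lists the fields but not their key names.

```python
import json


def build_dump(full_name, username, followers, following):
    """Assemble one record in the shape shown above."""
    return {
        "full_name": full_name,
        "username": username,
        "followers": list(followers),
        "following": list(following),
    }


# Hypothetical per-user entry. Only "full_name" and "username" appear in the
# README; the remaining key names are assumed for illustration.
example_user = {
    "full_name": "Example User",
    "username": "example_user",
    "biography": "Example biography text",
    "followers_count": 10,
    "following_count": 20,
}

record = build_dump("Full Name", "username", [example_user], [])
dump_text = json.dumps(record, indent=4)
```

Calling `print(dump_text)` would show the record as indented JSON.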

Since Instagram's pages are fully dynamic and its API is sandboxed (limited), Selenium is used to automate the login, click the followers/following links, and extract all the usernames.
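The username-extraction step might be sketched as below. This is an assumption-laden illustration, not the project's implementation: the function names `username_from_href` and `collect_usernames` are hypothetical, the login and click steps are omitted, and the CSS selector is left as a parameter because the README does not say which elements are matched. It also assumes profile links have the form `https://www.instagram.com/<username>/`.

```python
from urllib.parse import urlparse


def username_from_href(href):
    """Return the username for a link whose path has exactly one segment
    (assumed profile form: /<username>/), or None for any other link.

    Note: other single-segment paths would also be accepted by this check.
    """
    segments = [s for s in urlparse(href or "").path.split("/") if s]
    return segments[0] if len(segments) == 1 else None


def collect_usernames(driver, link_selector):
    """Collect unique usernames, in page order, from matched anchors.

    `driver` is expected to behave like a Selenium WebDriver. The string
    "css selector" is the value of selenium's By.CSS_SELECTOR, used here
    so the sketch does not need selenium installed to be imported.
    """
    seen = []
    for element in driver.find_elements("css selector", link_selector):
        name = username_from_href(element.get_attribute("href"))
        if name and name not in seen:
            seen.append(name)
    return seen
```

Keeping the URL parsing separate from the driver calls means the parsing can be checked without launching a browser.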

Finally, Scrapy is used to crawl all the relevant data for the usernames collected.

Dependencies

The project uses Python 3 along with Scrapy and Selenium:

pip install scrapy

pip install selenium

Usage

Run the Scrapy spider with:

scrapy crawl instaspider

The crawler instaspider.py also requires a path to a JSON file holding the login credentials. The JSON format is:

{
    "USERNAME"  :   "your_user_name",
    "PASSWORD"  :   "your password"
}
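A file in this format could be read and validated along these lines. This is a minimal sketch under stated assumptions: `load_credentials` is a hypothetical helper, and how the spider actually receives the file path is not described here, so the path is simply taken as an argument.

```python
import json


def load_credentials(path):
    """Read the credentials file and return (username, password)."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    # Fail early with a clear message if either expected key is absent.
    missing = [key for key in ("USERNAME", "PASSWORD") if key not in data]
    if missing:
        raise KeyError(f"missing keys in {path}: {', '.join(missing)}")
    return data["USERNAME"], data["PASSWORD"]
```

Since the file holds a plain-text password, it is worth keeping it out of version control, for example by listing it in .gitignore.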
