Scrape handle and name from posts from Instagram based on #hashtag

johnsliao/instagram_influencer_scraper

Instagram Influencer Scraper

What is this

This script uses Selenium WebDriver to scrape the @handle and display name of the accounts behind the top Instagram posts for a set of #hashtags.

How it works

  1. Logs in to Instagram
  2. Finds the top Instagram posts for each provided #hashtag
  3. Navigates to each @handle and saves the @handle and name to a file

Set up

Requirements:

  • Python 3.x
  • pip

Steps:

  1. Run pip install -r requirements.txt
  2. Download chromedriver and place it in the root directory.
  3. Create two files named influencers and tags (no extension) in the root directory.
  4. Set the IG_USERNAME and IG_PASSWORD environment variables.
  5. Run $ python app.py
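
As a rough illustration of steps 3 and 4, the sketch below reads the two environment variables and the tags file using only the standard library. The function names load_credentials and load_tags are hypothetical and are not taken from app.py; stripping a leading # from a tag is also an assumption, added only for tolerance.

```python
import os
from pathlib import Path


def load_credentials(env=os.environ):
    """Return (username, password) from IG_USERNAME / IG_PASSWORD.

    Hypothetical helper: the real app.py may read these differently.
    """
    try:
        return env["IG_USERNAME"], env["IG_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None


def load_tags(path="tags"):
    """Return the hashtags listed in the tags file, one per line."""
    tags = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        # Assumption: blank lines are skipped and a leading '#' is tolerated.
        tag = line.strip().lstrip("#")
        if tag:
            tags.append(tag)
    return tags
```

Failing early with a clear message when a variable is missing is a design choice of this sketch; it avoids launching the browser only to fail at the login step.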

The tags file should list one hashtag per line, for example:

gaming
mensfashion

Results are stored in the influencers file, one handle,name pair per line. For example:

pewdiepie,PewDiePie
markiplier,Markiplier

Configuration

  1. MAX_HANDLE_ATTEMPTS (default: 25). The number of posts the script will scrape in a single run.
  2. MINIMUM_FOLLOWER_COUNT (default: 10000). The minimum number of followers an account must have to be recorded.
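
To make the effect of the two settings concrete, here is a minimal sketch of how they might gate which accounts are recorded. The function select_influencers and the (handle, name, followers) tuple shape are hypothetical, and treating the follower threshold as inclusive (>=) is an assumption about the script's behavior.

```python
from itertools import islice

MAX_HANDLE_ATTEMPTS = 25        # default from the README
MINIMUM_FOLLOWER_COUNT = 10000  # default from the README


def select_influencers(profiles,
                       max_attempts=MAX_HANDLE_ATTEMPTS,
                       min_followers=MINIMUM_FOLLOWER_COUNT):
    """Filter an iterable of (handle, name, followers) tuples.

    Hypothetical helper: only the first max_attempts profiles are
    examined, and only those meeting the follower threshold are kept.
    """
    selected = []
    for handle, name, followers in islice(profiles, max_attempts):
        # Assumption: an account with exactly min_followers qualifies.
        if followers >= min_followers:
            selected.append((handle, name))
    return selected
```

Note that in this sketch the cap counts posts examined, not accounts saved, which matches the README's wording that the setting limits how many posts are scraped.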

Features

  • Duplicate @handles are not saved to the influencers file
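
One way the duplicate check could work is to read the handles already present in the influencers file before appending a new line. The helpers known_handles and save_influencer below are hypothetical names for illustration and may not match how app.py implements this.

```python
from pathlib import Path


def known_handles(path="influencers"):
    """Return the set of handles already stored in the influencers file."""
    file = Path(path)
    if not file.exists():
        return set()
    lines = file.read_text(encoding="utf-8").splitlines()
    # Each line is "handle,name"; the handle is everything before the first comma.
    return {line.split(",", 1)[0] for line in lines if line.strip()}


def save_influencer(handle, name, path="influencers"):
    """Append "handle,name" unless the handle is already recorded.

    Returns True if a line was written, False if it was a duplicate.
    """
    if handle in known_handles(path):
        return False
    with open(path, "a", encoding="utf-8") as out:
        out.write(f"{handle},{name}\n")
    return True
```

Re-reading the file on every save keeps the sketch simple; for long runs, loading the set once and updating it in memory would avoid the repeated reads.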

Please note

  • Emojis and special characters in names are dropped when saving to the influencers file
  • Commas in names are replaced with a space
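
The two rules above could be approximated as follows. The README does not define "special characters", so this sketch assumes it means any non-ASCII character; the function name clean_name and the final whitespace trim are likewise assumptions.

```python
def clean_name(name):
    """Make a display name safe for the comma-separated influencers file.

    Hypothetical helper, not taken from app.py.
    """
    # Commas would break the "handle,name" format, so replace each with a space.
    name = name.replace(",", " ")
    # Assumption: "emojis/special characters" means anything outside ASCII.
    name = name.encode("ascii", "ignore").decode("ascii")
    # Assumption: trim whitespace left behind by removed characters.
    return name.strip()
```

Under the non-ASCII assumption, accented letters are dropped as well, so a name such as "José" would be stored as "Jos"; a narrower filter would be needed to keep them.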
