Skip to content

Scrape the data of members of the linkedin social network

Notifications You must be signed in to change notification settings

RashidCodes/Scrapers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Scrape the data of Linkedin members

Please note that this activity is a breach of terms and conditions associated with the usage of LinkedIn. This code was created solely for educational purposes. Your account WILL BE RESTRICTED if suspicious activity is detected.

Data

The scraper returns some personal information of the member, their educational qualifications, work experiences and certifications.

Requirements

  • A Chrome Webdriver: The version should be same as the version of your google chrome browser.

  • An HTML parser eg. lxml

  • Selenium

  • BeautifulSoup


Usage

1. Install the modules in the requirements.txt file.
>>> pip install -r requirements.txt
  1. Make sure the correct version of ChromeDriver is downloaded.

  2. Edit the try.py file and run it.


Scraping data from a single profile

>>> import uuid
>>> profile_link = "https://www.linkedin.com/in/alexnalmpantis/?originalSubdomain=uk"
>>> Id = str(uuid.uuid4())

>>> alexander = ScrapeLinkedin("yourEmail@email.com", "yourPassword", profile_link, Id)
>>> alexander.scrape()
Entering your username...
Entering your password...
Signing you in...
Successfully signed in. Proceeding to another profile.
Scraping...
No certifications
Done scraping.

>>> alexander.get_name()
'Alexander Nalmpantis'

>>> alexander.get_location()
'London, Greater London, United Kingdom'

>>> alexander.get_education()
[('d10a208c-2375-421d-b450-34e1341ecb21',
  'City University London',
  "Master's degree Data Science"),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'Harvard Business School',
  'Core: Credential of Readiness'),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'Technologiko Ekpaideutiko Idrima, Kavalas',
  'Bachelor of Engineering (BEng) Petroleum and Natural Gas Engineering')]

>>> alexander.get_experience()
[('d10a208c-2375-421d-b450-34e1341ecb21',
  'BP',
  None,
  'Lead Data Scientist',
  'Dec 2018',
  'Present',
  None),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'Deloitte',
  None,
  'Manager - Analytics & Modelling',
  'Apr 2017',
  'Dec 2018',
  'London, United Kingdom'),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'IHS Consulting',
  None,
  'Business Analyst - Upstream Consulting',
  'Oct 2014',
  'Mar 2017',
  'London, United Kingdom'),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'GlobalData',
  None,
  'Oil & Gas Analyst',
  'Jan 2014',
  'Oct 2014',
  'London, United Kingdom'),
 ('d10a208c-2375-421d-b450-34e1341ecb21',
  'Chatzikosmas Co',
  None,
  'Production Engineer',
  'Sep 2008',
  'Oct 2012',
  'Greece')]

>>> alexander.save()
Successfully saved scientist
Successfully saved experiences
Successfully saved qualifications
Successfully saved certifications

Scraping data from multiple profiles

The data is automatically stored in an SQLite database.

>>> unscraped_urls = ["member1_url", "member2_url"]
>>> scrape_multiple = ScrapeLinkedinM("yourEmail@email.com", "yourPassword", unscraped_urls)
>>> scrape_multiple.scrape_save()
Entering your username...
Entering your password...
Signing you in...
Successfully signed in.
Scraping member1_url...
No certifications
Done scraping
Saving all credentials...
Successfully saved scientist
Successfully saved experiences
Successfully saved qualifications
Sleeping for approximately 3 minutes

About

Scrape the data of members of the linkedin social network

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages