PeterQiu0516/WebCrawler

Web-Crawler

Basic web crawling and data processing in Python (using selenium and openpyxl) and R (using rvest, xml2, and xlsx).

Work done during the first day of an internship as a data analyst at AssetPro in Dec. 2019.

WebCrawler.py

This script gives a basic example of how to use selenium's webdriver to crawl fund IDs from a funding company's website, www.ifund.com.hk. You can crawl other data from other webpages by following a similar webdriver usage pattern.

Author: Changyuan Qiu
Latest Update: Nov. 12, 2020
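The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the contents of WebCrawler.py: it assumes the Selenium 4 API (`By` locators), and the function names, the `https://` scheme, the CSS selector, and the output filename are all hypothetical placeholders that you would replace after inspecting the real page.

```python
def clean_fund_ids(raw_texts):
    """Strip whitespace, drop empty strings, and remove duplicates, keeping order."""
    seen = set()
    fund_ids = []
    for text in raw_texts:
        fund_id = text.strip()
        if fund_id and fund_id not in seen:
            seen.add(fund_id)
            fund_ids.append(fund_id)
    return fund_ids


def crawl_fund_ids(url, css_selector):
    """Open the page in Chrome and return the cleaned text of matching elements."""
    # Imported here so clean_fund_ids can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # needs a Chrome driver that selenium can find
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return clean_fund_ids(element.text for element in elements)
    finally:
        driver.quit()  # always close the browser, even if the crawl fails


def save_fund_ids(fund_ids, path):
    """Write one fund ID per row to an .xlsx file using openpyxl."""
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    sheet.title = "Fund IDs"
    sheet.append(["fund_id"])  # header row
    for fund_id in fund_ids:
        sheet.append([fund_id])
    workbook.save(path)
```

A run might look like `ids = crawl_fund_ids("https://www.ifund.com.hk", "td.fund-id")` followed by `save_fund_ids(ids, "fund_ids.xlsx")`, where `td.fund-id` is an assumed selector and not taken from the actual site.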

Build:

Make sure that the latest versions of selenium and openpyxl are installed on your computer.

Apart from selenium and openpyxl, you also need to download ChromeDriver from

https://sites.google.com/a/chromium.org/chromedriver/downloads

and add it to your PATH before running this script.
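Before running the script, it may help to confirm that these prerequisites are in place. The following is a small sketch using only the Python standard library (Python 3.8 or later); the function names are illustrative, and it assumes the driver executable is named `chromedriver`.

```python
import shutil
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_on_path(executable="chromedriver"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(executable)


def preflight_report():
    """Collect the status of each prerequisite named in the build notes."""
    report = {pkg: installed_version(pkg) for pkg in ("selenium", "openpyxl")}
    report["chromedriver"] = find_on_path("chromedriver")
    return report
```

Calling `preflight_report()` returns a dictionary in which any `None` value points to a missing package or a driver that is not on the PATH.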
