Basic web crawling and data processing in Python (using selenium and openpyxl) and R (using rvest, xml2 and xlsx).
Work done during the first day of an internship as a data analyst in Dec. 2019 @AssetPro.
This script gives a basic example of how to use webdriver to crawl fund IDs from a funding company's website, www.ifund.com.hk. You can crawl data from other web pages by following a similar pattern of webdriver usage.
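The webdriver pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original script: the helper names `clean_ids`, `crawl_fund_ids` and `save_ids` are hypothetical, and the CSS selector passed to `crawl_fund_ids` must be chosen by inspecting the target page, since the page structure of www.ifund.com.hk is not described here.

```python
def clean_ids(raw_texts):
    """Strip whitespace, drop empty strings, and remove duplicates (order kept)."""
    seen = set()
    ids = []
    for text in raw_texts:
        fund_id = text.strip()
        if fund_id and fund_id not in seen:
            seen.add(fund_id)
            ids.append(fund_id)
    return ids


def crawl_fund_ids(url, css_selector):
    """Open the page in Chrome and collect the text of every matching element.

    css_selector is a placeholder supplied by the caller; it is an assumption,
    not something taken from the real site.
    """
    # Imported lazily so clean_ids can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # needs a Chrome driver that selenium can locate
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return clean_ids(element.text for element in elements)
    finally:
        driver.quit()  # always close the browser, even if scraping fails


def save_ids(ids, path):
    """Write the IDs to a one-column Excel sheet with openpyxl."""
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["Fund ID"])  # header row
    for fund_id in ids:
        sheet.append([fund_id])
    workbook.save(path)
```

A call such as `save_ids(crawl_fund_ids(url, selector), "fund_ids.xlsx")` would then chain the two steps, where `url`, `selector` and the output file name are whatever fits the page being crawled.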
Contact: peterqiu@umich.edu
Make sure that the latest versions of selenium and openpyxl are installed on your computer.
Apart from selenium and openpyxl, you also need to download ChromeDriver from https://sites.google.com/a/chromium.org/chromedriver/downloads and add it to the PATH before executing this script.
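Before running the script, it can help to confirm that the driver is actually visible on the PATH. The sketch below uses only the Python standard library. The function names `find_driver` and `driver_status` are hypothetical, and the executable name `chromedriver` is an assumption about how the downloaded file is named on your system.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def driver_status(name="chromedriver"):
    """Build a short human-readable message about the PATH lookup."""
    path = find_driver(name)
    if path is None:
        return f"{name} was not found on PATH"
    return f"{name} found at {path}"


print(driver_status())
```

If the message reports that the driver was not found, re-check the directory you added to the PATH and open a new terminal so the change takes effect.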