I created this repository as a portfolio piece for a data extraction proposal I submitted on Upwork. The project uses the Scrapy library for Python to download data from the target websites and parse it to meet the client's requirements. The parsed data is then transformed into a DataFrame and saved to CSV using the pandas library. I hope this project is useful to you for learning data extraction with Python, Scrapy, and pandas. Regards!
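The parse-then-export flow described above can be sketched like this; the field names (`name`, `ein`, `city`) are hypothetical placeholders, not the actual fields from the client's requirements:

```python
import pandas as pd

# Items as a Scrapy spider's parse() callback might yield them
# (the fields shown here are assumptions for illustration)
items = [
    {"name": "Helping Hands", "ein": "12-3456789", "city": "Austin"},
    {"name": "Green Futures", "ein": "98-7654321", "city": "Denver"},
]

# Transform the parsed items into a DataFrame and save them as CSV
df = pd.DataFrame(items)
df.to_csv("organization.csv", index=False)
```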
First, clone the repo!
git clone https://github.com/hendrapaiton/guidestar.git
Second, create a virtual environment in the project and install the dependencies.
python3 -m venv venv
source ./venv/bin/activate # on most Linux and macOS systems
.\venv\Scripts\activate # on Windows
pip install scrapy pandas
Third, run the organization spider.
scrapy crawl organization
Last but not least, wait for the process to finish until the "organization.csv" file is created.
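Once the crawl finishes, you can inspect the output with pandas. The snippet below writes a tiny two-row sample file first so it runs on its own; in practice the crawl produces organization.csv, and the column names here are hypothetical:

```python
from pathlib import Path

import pandas as pd

# Write a small sample so this snippet is self-contained;
# normally 'scrapy crawl organization' creates this file
Path("organization.csv").write_text("name,city\nAcme Org,Austin\nGreen Futures,Denver\n")

# Load the exported CSV and take a quick look at the results
df = pd.read_csv("organization.csv")
print(df.head())
print(len(df), "rows")
```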