SME Crawler

Welcome to the git repository for the SME crawler project. For historical reasons, the repository is named Kemenperin crawler rather than SME crawler, although it can now crawl multiple websites. The supported websites, along with the statistics for each site, are listed in this Google Sheet.

Crawling Result

The results of crawling the Kemenperin site are stored in the crawled-data directory, while the results from the other sites are stored in crawled-data-1.

kemenperin-crawler

Exporter crawler for the Kemenperin site, built using NodeJS with axios + cheerio.

Prerequisites

NodeJS version >= 8

Preparation

$ npm install

Generating data

Crawl raw data into csv

$ npm start

The data will be written to data.csv.
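As an illustration of how a crawled record can be serialized into one row of data.csv, here is a minimal sketch. The function name `toCsvLine` and the sample fields are hypothetical and not taken from main.js; the quoting follows the usual CSV convention of wrapping fields that contain commas, quotes, or line breaks in double quotes and doubling any embedded quotes.

```javascript
// Hypothetical helper, not the actual code in main.js.
// Serializes an array of field values into a single CSV line.
function toCsvLine(fields) {
  return fields
    .map((value) => {
      // Treat null/undefined as an empty field.
      const text = value == null ? '' : String(value);
      // Quote the field only when it contains a comma, quote, or line break.
      return /[",\r\n]/.test(text)
        ? '"' + text.replace(/"/g, '""') + '"'
        : text;
    })
    .join(',');
}

// Example usage with made-up values:
console.log(toCsvLine(['PT Example', 'Jl. Sudirman No. 1, Jakarta', '021-0000000']));
```

Addresses often contain commas, so quoting them keeps each record on a single, correctly split row.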

Generating Heatmap

These two steps generate the data needed to produce the heatmap.

Get latitude and longitude from address

$ node geocoder.js

Write .geojson file from lat and long

$ node transformer.js
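The transformation step can be sketched roughly as follows. This is not the code in transformer.js; the function name `toFeatureCollection` and the record shape `{ name, lat, lng }` are assumptions made for illustration. One detail that is easy to get wrong: GeoJSON stores coordinates as [longitude, latitude], in that order.

```javascript
// Hypothetical sketch; the real transformer.js may use different field names.
// Converts geocoded records into a GeoJSON FeatureCollection of points.
function toFeatureCollection(records) {
  const features = records
    // Skip records whose address could not be geocoded.
    .filter((r) => Number.isFinite(r.lat) && Number.isFinite(r.lng))
    .map((r) => ({
      type: 'Feature',
      // GeoJSON positions are [longitude, latitude].
      geometry: { type: 'Point', coordinates: [r.lng, r.lat] },
      properties: { name: r.name },
    }));
  return { type: 'FeatureCollection', features };
}

// Example usage with made-up coordinates:
const sample = [
  { name: 'PT Example', lat: -6.2, lng: 106.8 },
  { name: 'Not geocoded', lat: null, lng: null },
];
console.log(JSON.stringify(toFeatureCollection(sample), null, 2));
```

Dropping records without coordinates keeps the resulting .geojson file valid for map tools that reject null positions.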

Indonetwork Crawler

The source code of the Indonetwork crawler is in the scrapy_indonetwork directory.

Telpon Info Crawler

The Telpon Info crawler is contained in telponinfo.js. To run it, use:

$ npm run telpon

Analytics

To get more visibility into the crawling results, you can use analytics.js to analyze the CSVs. It simply counts the number of records in each CSV produced by the crawlers. To run it, use the following command:

$ npm run analytics
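The kind of counting described above can be sketched as follows. This is not the code in analytics.js; the function name `countCsvRows` is hypothetical, and the sketch assumes that each CSV starts with a header line and that no field contains an embedded line break.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical sketch; analytics.js may count records differently.
// Returns an object mapping each CSV file name in `dir` to its number of data rows.
function countCsvRows(dir) {
  const counts = {};
  for (const file of fs.readdirSync(dir)) {
    if (path.extname(file).toLowerCase() !== '.csv') continue;
    const text = fs.readFileSync(path.join(dir, file), 'utf8');
    // Ignore blank lines, such as a trailing newline at the end of the file.
    const lines = text.split(/\r?\n/).filter((line) => line.trim() !== '');
    // Assumption: the first non-blank line is a header, so it is not counted.
    counts[file] = Math.max(lines.length - 1, 0);
  }
  return counts;
}

// Example usage, assuming the crawled-data directory exists:
if (fs.existsSync('crawled-data')) {
  console.log(countCsvRows('crawled-data'));
}
```

If the CSVs can contain quoted fields with line breaks, a proper CSV parser would be needed instead of splitting on newlines.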
