Before using this tool for the first time, create a config.json file inside the conf folder. Use conf/config.example.json as a reference, or simply copy it.
You can set up an Anti-Captcha API key to skip captcha checks. To generate a key, follow the instructions at https://anti-captcha.com
You can choose which fields to export. The complete list is:
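The copy step can also be scripted. The snippet below is a minimal sketch, not part of the tool: the helper name `clone_example_config` is hypothetical, and it only assumes the conf/config.example.json and conf/config.json paths mentioned above.

```python
import shutil
from pathlib import Path


def clone_example_config(conf_dir="conf"):
    """Copy config.example.json to config.json unless config.json already exists.

    Hypothetical helper for illustration; only the two file names come
    from this README.
    """
    conf_path = Path(conf_dir)
    example = conf_path / "config.example.json"
    target = conf_path / "config.json"
    if not target.exists():
        # Never overwrite a config the user has already edited.
        shutil.copyfile(example, target)
    return target


if __name__ == "__main__":
    print(f"Config ready at {clone_example_config()}")
```

Run it from the project root, then edit conf/config.json to add your own values.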
- "Denominazione"
- "Sede legale"
- "Attività"
- "Sede operativa"
- "Indirizzo web"
- "Posta elettronica"
- "Commercio elettronico"
- "Chi siamo"
- "Cosa facciamo"
- "Classe di fatturato"
- "Canali di vendita"
- "Marchi"
- "Principali paesi di export"
- "Certificazioni"
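Field names must match the list above exactly, including accents and capitalisation. As a rough sketch, a selection could be checked like this before running the scraper. The helper `unknown_fields` is hypothetical, and how the fields are actually stored in config.json is not shown here, so see conf/config.example.json for the real layout.

```python
# Complete list of exportable fields, copied from the list above.
AVAILABLE_FIELDS = [
    "Denominazione",
    "Sede legale",
    "Attività",
    "Sede operativa",
    "Indirizzo web",
    "Posta elettronica",
    "Commercio elettronico",
    "Chi siamo",
    "Cosa facciamo",
    "Classe di fatturato",
    "Canali di vendita",
    "Marchi",
    "Principali paesi di export",
    "Certificazioni",
]


def unknown_fields(selected):
    """Return the selected names that are not exportable fields, in order."""
    return [name for name in selected if name not in AVAILABLE_FIELDS]


if __name__ == "__main__":
    # "Telefono" is a made-up name used only to show a rejected entry.
    print(unknown_fields(["Denominazione", "Telefono"]))
```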
You can also set a scraping mode. The available modes are described in the next section.
You can choose one of the following scraping modes:
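- search_by_name (search in the company name; "Ricercando nel Nome" on the website)
- search_by_desc (search in the business description; "Ricercando nella Descrizione attività" on the website)
- with_dash (companies with a showcase on infoimprese.it; "con la Vetrina su infoimprese.it" on the website)
- with_cert (companies with a quality certification; "con certificazione di qualità" on the website)
- with_dash (companies that practice e-commerce; "che praticano e-commerce" on the website)
- with_email (companies that have an e-mail address; "che possiedono l'e-mail" on the website)
- with_website (companies that have a website; "che hanno il sito internet" on the website)
- with_export (companies that export; "che svolgono attività di export" on the website)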
usage: main.py [-h] -q QUERY [-m MODE] [-l LOCATION] [-o OUTPUT]
The arguments are:
- query: the keyword to search for (required)
- location: where you want to search
- mode: one of the scraping modes listed above
- output: the CSV file where the data is stored
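For reference, the usage line above matches what an argparse parser along these lines would produce. This is only a sketch, not the actual main.py: the long option names (`--query`, `--mode`, `--location`, `--output`) and the help texts are assumptions, and no default values are set because this README does not state them.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented usage line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Only -q is mandatory, as shown by the missing brackets in the usage line.
    parser.add_argument("-q", "--query", required=True, help="keyword to search for")
    parser.add_argument("-m", "--mode", help="scraping mode")
    parser.add_argument("-l", "--location", help="where to search")
    parser.add_argument("-o", "--output", help="CSV file for storing data")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

A typical invocation would then look like `main.py -q pizza -l Roma -m with_email -o result.csv`, where the query, location, and file name are example values.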
Enjoy :)
You can use exec.bat to get a very basic GUI.
Disclaimer: Please note that this is a research project. I am by no means responsible for any usage of this tool.