This is an example Maven project written in Java 8.
- Jsoup 1.11.1
- HttpClient 4.5.3
- Commons CLI 1.4
- Log4j 4.5.3
- JUnit 4.12
- Mockito 2.9.0
- Maven 3.3.9
- JRE 8
- Scrape multiple pages
- Save the scraped website data as an XML file
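
The XML export feature listed above could be sketched roughly as follows, using only the JDK's built-in `javax.xml` APIs. This is an illustration, not the project's actual code: the `Item` class, its `name` and `price` fields, and the `items`/`item` element names are all assumptions made for the example.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlExportSketch {

    // Hypothetical record shape; the real project's fields may differ.
    static final class Item {
        final String name;
        final String price;

        Item(String name, String price) {
            this.name = name;
            this.price = price;
        }
    }

    // Builds a DOM tree from the items and serializes it to an XML string.
    static String toXml(List<Item> items) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("items"); // assumed root element name
            doc.appendChild(root);
            for (Item item : items) {
                Element el = doc.createElement("item");
                appendText(doc, el, "name", item.name);
                appendText(doc, el, "price", item.price);
                root.appendChild(el);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build XML output", e);
        }
    }

    // Adds <tag>text</tag> under parent; special characters are escaped on output.
    private static void appendText(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        System.out.println(toXml(Arrays.asList(new Item("Example title", "19.99"))));
    }
}
```

Building the document through the DOM API rather than string concatenation means characters such as `&` and `<` in scraped text are escaped by the serializer.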
git clone https://github.com/d4ptak/webscraper.git
cd webscraper
mvn clean install
cd target
java -jar webscraper-1.0.jar -u "https://www.ceneo.pl/Filmy_Blu-ray/Gatunek:Sensacyjne;m80;n100.htm"
usage: webscraper
 -d,--debug           debug mode
 -h,--help            print this message
 -p,--profile <arg>   input profile name: ceneo-list (default), ceneo-box
 -t,--type <arg>      output type: xml (default)
 -u,--url <arg>       source url (required)
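
To make the defaults and the required option in the help text concrete, here is a dependency-free sketch of how these options might be interpreted. The project itself lists Commons CLI for this job; the hand-rolled parser below is a stand-in for illustration only, and the class name `CliOptionsSketch` and the map keys are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class CliOptionsSketch {

    // Parses the options from the help text into a simple key/value map.
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        // Defaults as stated in the help text.
        opts.put("profile", "ceneo-list");
        opts.put("type", "xml");
        opts.put("debug", "false");
        opts.put("help", "false");

        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-d": case "--debug":
                    opts.put("debug", "true");
                    break;
                case "-h": case "--help":
                    opts.put("help", "true");
                    break;
                case "-p": case "--profile":
                    opts.put("profile", value(args, ++i));
                    break;
                case "-t": case "--type":
                    opts.put("type", value(args, ++i));
                    break;
                case "-u": case "--url":
                    opts.put("url", value(args, ++i));
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        // Assumption: the URL is mandatory unless only help was requested.
        if (!opts.containsKey("url") && !"true".equals(opts.get("help"))) {
            throw new IllegalArgumentException("Missing required option: -u,--url");
        }
        return opts;
    }

    // Returns the argument following an option, or fails if it is absent.
    private static String value(String[] args, int index) {
        if (index >= args.length) {
            throw new IllegalArgumentException("Missing argument for " + args[index - 1]);
        }
        return args[index];
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[] {"-u", "https://example.com", "-p", "ceneo-box"}));
    }
}
```

With only `-u` supplied, the profile falls back to `ceneo-list` and the output type to `xml`, matching the defaults shown in the usage text.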