Venom Tutorial

Your preferred focused crawler based on Venom. Now in a convenient package to quickly get your projects up to speed!

Bundled in this pack is a tutorial package to help get you up and sprinting. If you don't require a tutorial, you can access a fully functional example here. For more information, read the tutorial guide below.

Overview

Check out our main Venom page for more information.

Quick links

Website | Wiki | API Reference | PreferredAI

Tutorial

venom-tutorial includes a set of tutorials designed to get you from 0 to 100 quickly.

There are 7 exercises located in the package ai.preferred.crawler.example.tutorial. Alongside the exercises is a test suite that automatically checks your code and provides hints on errors, so you always know whether your code works.

For more information you can visit our Wiki.

TutorialCrawler.java

  • Exercise 1: Creating a crawler with default settings.
  • Exercise 2: Creating a fetcher that includes three (3) validators.
  • Exercise 3: Creating a session store with PAPER_LIST_KEY.
  • Exercise 4: Creating a crawler that uses a specified fetcher and session.
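
To illustrate the idea behind Exercise 3, here is a minimal, dependency-free sketch of a typed key/value store shared across a crawl. The `Key` and `Store` classes below are hypothetical stand-ins written for this example, not the Venom classes; only the name PAPER_LIST_KEY comes from the exercise, and storing titles as strings is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SessionSketch {

  // Hypothetical typed key: each instance identifies one stored value of type T.
  static final class Key<T> {
    private final String name;
    Key(String name) { this.name = name; }
    @Override public String toString() { return name; }
  }

  // Hypothetical store: built once, then only read by the code that handles pages.
  static final class Store {
    private final Map<Key<?>, Object> values;
    private Store(Map<Key<?>, Object> values) { this.values = values; }

    // The cast is safe because put() only accepts a value matching the key's type.
    @SuppressWarnings("unchecked")
    <T> T get(Key<T> key) { return (T) values.get(key); }

    static Builder builder() { return new Builder(); }

    static final class Builder {
      private final Map<Key<?>, Object> values = new HashMap<>();
      <T> Builder put(Key<T> key, T value) { values.put(key, value); return this; }
      Store build() { return new Store(new HashMap<>(values)); }
    }
  }

  // Assumption: the paper list holds plain title strings.
  static final Key<List<String>> PAPER_LIST_KEY = new Key<>("paperList");

  static List<String> collectTitles() {
    List<String> papers = new ArrayList<>();
    Store store = Store.builder().put(PAPER_LIST_KEY, papers).build();
    // Anything holding the store can look the list up by key and append to it.
    store.get(PAPER_LIST_KEY).add("A paper title");
    store.get(PAPER_LIST_KEY).add("Another paper title");
    return papers;
  }

  public static void main(String[] args) {
    System.out.println(collectTitles());
  }
}
```

The typed key means the lookup needs no cast at the call site, which is the main reason to prefer it over a plain string-keyed map.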

TutorialValidator.java

  • Exercise 5: Creating a validator that validates a page.
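
As a rough picture of what page validation involves, the sketch below chains three checks and stops at the first failure. The `PageValidator` interface, the `Status` values, and the `class="paper"` marker are all assumptions made for this example; they are not the Venom API, and the real exercise defines its own rules.

```java
import java.util.Arrays;
import java.util.List;

public class ValidatorSketch {

  enum Status { VALID, INVALID_STATUS_CODE, INVALID_CONTENT }

  // Hypothetical stand-in for a validator: inspects one fetched page.
  interface PageValidator {
    Status isValid(int statusCode, String body);
  }

  // Check 1: the server answered with 200 OK.
  static final PageValidator STATUS_OK =
      (code, body) -> code == 200 ? Status.VALID : Status.INVALID_STATUS_CODE;

  // Check 2: the body is present and not blank.
  static final PageValidator NOT_EMPTY =
      (code, body) -> body != null && !body.trim().isEmpty()
          ? Status.VALID : Status.INVALID_CONTENT;

  // Check 3: the body contains the markup we expect (assumed marker).
  static final PageValidator HAS_PAPER_MARKER =
      (code, body) -> body != null && body.contains("class=\"paper\"")
          ? Status.VALID : Status.INVALID_CONTENT;

  static final List<PageValidator> CHAIN =
      Arrays.asList(STATUS_OK, NOT_EMPTY, HAS_PAPER_MARKER);

  // Runs the validators in order and returns the first non-VALID status.
  static Status validate(List<PageValidator> validators, int statusCode, String body) {
    for (PageValidator validator : validators) {
      Status status = validator.isValid(statusCode, body);
      if (status != Status.VALID) {
        return status;
      }
    }
    return Status.VALID;
  }

  public static void main(String[] args) {
    System.out.println(validate(CHAIN, 200, "<div class=\"paper\">Title</div>"));
    System.out.println(validate(CHAIN, 404, "Not found"));
    System.out.println(validate(CHAIN, 200, "<html></html>"));
  }
}
```

Ordering the cheap, general checks first keeps the page-specific check simple, since it only ever sees responses that already passed the others.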

TutorialHandler.java

  • Exercise 6: Parsing the response from the crawl.
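
For a feel of what parsing a response can look like, here is a small sketch that pulls paper titles out of an HTML string. The markup (`<a class="paper-title">`) and the `parseTitles` helper are assumptions for illustration only. A regular expression is used purely to keep the sketch free of dependencies; a proper HTML parser is usually the more robust choice for real pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HandlerSketch {

  // Assumed markup: each title sits inside <a class="paper-title" ...>...</a>.
  private static final Pattern TITLE =
      Pattern.compile("<a class=\"paper-title\"[^>]*>([^<]*)</a>");

  // Hypothetical helper: returns the trimmed text of every matching link.
  static List<String> parseTitles(String html) {
    List<String> titles = new ArrayList<>();
    Matcher matcher = TITLE.matcher(html);
    while (matcher.find()) {
      titles.add(matcher.group(1).trim());
    }
    return titles;
  }

  public static void main(String[] args) {
    String html = "<ul>"
        + "<li><a class=\"paper-title\" href=\"/1\">First</a></li>"
        + "<li><a class=\"paper-title\" href=\"/2\"> Second </a></li>"
        + "</ul>";
    System.out.println(parseTitles(html));
  }
}
```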

TutorialCrawler.java

  • Exercise 7: Putting it all together.

Test Suite

Easily find out what went wrong by running the tests included with the exercises.

Run this command in the project folder:

mvn test

Or use your IDE to run the JUnit tests.

License

Apache License 2.0
