NAME

LibCPV::Categorizer - Class for Hierarchical CPV-Number Categorizing via AI::Categorizer.

SYNOPSIS

  my $doc_set = LibCPV::Categorizer::DocumentSet->new({
    dirname => '/path/to/docset/dir',
  });
  $doc_set->add_docs_from_dir;

  my $categorizer = LibCPV::Categorizer->new({
    document_set    => $doc_set,
    learner_rootdir => '/path/to/learner/output/dir',
  });
  $categorizer->train;

INTRODUCTION

LibCPV::Categorizer builds on AI::Categorizer. Because AI::Categorizer does not do hierarchical categorization itself, we added our own hierarchy scheme based on the semantics of CPV numbers.

For an introduction to CPV numbers, see http://simap.eu.int/EN/pub/src/welcome.htm.

In LibCPV::Categorizer we try to use consistent wording. Here are the most important terms:

learner - An AI::Categorizer instance used to learn (or train).

category - A CPV number, which is simply an 8-digit number. CPV numbers are built hierarchically: the first 2 digits together form one level of accuracy, and each following digit forms another accuracy level. We derive the term "group" from this definition of accuracy levels.
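
The accuracy levels described above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: cpv_level_prefixes is a hypothetical helper, not part of LibCPV::Categorizer, and 12345678 is an arbitrary 8-digit value rather than a real CPV number. It assumes that the group at each level can be identified by the leading digits of the number up to that level.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LibCPV::Categorizer: return one
# prefix per accuracy level of an 8-digit CPV number.
sub cpv_level_prefixes {
    my ($cpv) = @_;
    die "expected an 8-digit CPV number, got '$cpv'\n"
        unless $cpv =~ /\A\d{8}\z/;

    # The first level spans the first two digits; every further digit
    # adds one more level, so the prefix lengths run from 2 to 8.
    return map { substr($cpv, 0, $_) } 2 .. 8;
}

my @prefixes = cpv_level_prefixes('12345678');
print join(', ', @prefixes), "\n";
# prints: 12, 123, 1234, 12345, 123456, 1234567, 12345678
```

Under this reading, an 8-digit number has seven accuracy levels, and two numbers fall into the same group at a given level when their prefixes of that length are equal.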

Exercise some Affe dance.

Quite funky Zomtec

Bla.

Fasel.

Bummer!

moo foo bar

Kram.

Some code examples

  # a verbatim block
  sub cut { 42 }
  my $foo = cut();
  sub affe {
          do_something_strong($foo, $zomtec, @tiger);
          print STDERR $foo, "\n";
  }

  # another verbatim block after a single empty line
  # although that is not the only reason for confusion
  affe();
  sub kram {
          foo($kram);
  }

If all possible 8-digit CPV numbers were used, the tree would have one
root-level learner categorizing into 99 categories, then 99 learners at
the next level, each categorizing into 9 categories. Each of the 99
top-level categories would therefore contain 9 learners at the level
below, and every following level adds 9 more learners for each category.
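
The learner counts described above can be tallied with a short script. This is a sketch, not code from LibCPV::Categorizer: learners_per_level is a hypothetical name, and it assumes that the root learner decides the first two digits and that one further level of learners decides each of the digits 3 to 8.

```perl
use strict;
use warnings;

# Sketch only: count the learners on each level of a fully populated tree.
# Assumption: 99 choices for the first two digits, 9 choices for each of
# the remaining six digits.
sub learners_per_level {
    my @counts   = (1);   # the single root-level learner
    my $learners = 99;    # one learner per two-digit category
    for my $digit (3 .. 8) {
        push @counts, $learners;
        $learners *= 9;   # each of the 9 categories needs a learner one level down
    }
    return @counts;
}

my @counts = learners_per_level();
my $total  = 0;
$total += $_ for @counts;

printf "level %d: %d learners\n", $_ + 1, $counts[$_] for 0 .. $#counts;
print "total: $total\n";
# last line printed: total: 6576571
```

With these assumptions the counts per level are 1, 99, 891, 8019, 72171, 649539 and 5845851, which shows how quickly a fully populated tree grows.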

POD ERRORS

Hey! The above document had some coding errors, which are explained below:

Around line 43:

Unknown directive: =func

Around line 47:

Unknown directive: =method

Around line 51:

Unknown directive: =method

Around line 55:

Unknown directive: =func

Around line 59:

Unknown directive: =attr