NUTCH-2419 Some URL filters and normalizers do not respect command-line override for rule file #526

sebastian-nagel
Contributor

  • fix urlfilter-domain, urlfilter-domainblacklist, urlfilter-prefix and urlfilter-suffix
  • always prefer the configured rule file (urlfilter.domain.file, urlfilter.domainblacklist.file, urlfilter.prefix.file, urlfilter.suffix.file) over the file defined in plugin.xml
  • remove constructors taking rule file as argument (used only in unit tests and now obsolete because we can override the rule file via configuration)
  • update Java API doc comments
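
The precedence rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `RuleFileSelector`, `selectRuleFile` and the file names are hypothetical, and `java.util.Properties` stands in for the real configuration object. Only the property name `urlfilter.domain.file` is taken from the description.

```java
import java.util.Properties;

/** Hypothetical helper illustrating the precedence rule; not the actual Nutch code. */
class RuleFileSelector {

  /**
   * Returns the rule file to load: the configured property wins if it is set
   * and non-blank, otherwise the file attribute from plugin.xml is used.
   *
   * @param conf                stand-in for the job configuration
   * @param propertyName        e.g. "urlfilter.domain.file"
   * @param pluginAttributeFile file name declared in plugin.xml, may be null
   */
  static String selectRuleFile(Properties conf, String propertyName,
      String pluginAttributeFile) {
    String configured = conf.getProperty(propertyName);
    if (configured != null && !configured.trim().isEmpty()) {
      return configured.trim();
    }
    // No override configured: fall back to the plugin.xml attribute.
    return pluginAttributeFile;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // Placeholder file names, chosen only for this example.
    String fromPluginXml = "plugin-rules.txt";

    // Without an override the plugin.xml value is used.
    System.out.println(selectRuleFile(conf, "urlfilter.domain.file", fromPluginXml));

    // With the property set, the configured file takes precedence.
    conf.setProperty("urlfilter.domain.file", "override-rules.txt");
    System.out.println(selectRuleFile(conf, "urlfilter.domain.file", fromPluginXml));
  }
}
```

With selection driven by configuration alone, a unit test can point a filter at its test rule file by setting the property, which is why constructors taking a rule file are no longer needed.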

…ne override for rule file

- simplify selection of rule file (from property or attribute in plugin.xml)
sebastian-nagel merged commit 9139d6e into apache:master on May 14, 2020
sebastian-nagel deleted the NUTCH-2419-urlfilter-rule-file-precedence branch on May 17, 2020 12:05