- update based on README revs from branch-1.3 to bring up to date

git-svn-id: https://svn.apache.org/repos/asf/nutch/trunk@983324 13f79535-47bb-0310-9956-ffa450edef68
commit 4c87f337336d5035681fc89a63e59384e2c51be5 1 parent c2cef7c
@chrismattmann chrismattmann authored
Showing with 4 additions and 28 deletions.
  1. +4 −28 README.txt
32 README.txt
@@ -1,32 +1,8 @@
Apache Nutch README
-Important note: Due to licensing issues we cannot provide two libraries that
-are normally provided with PDFBox (jai_core.jar, jai_codec.jar), the parser
-library we use for parsing PDF files. If you encounter unexpected problems when
-working with PDF files please
-
-1. download the two missing libraries from:
- http://pdfbox.cvs.sourceforge.net/viewvc/pdfbox/pdfbox/external/
-
-2. Put them to directory src/plugin/parse-pdf/lib
-3. follow the instructions in file src/plugin/parse-pdf/plugin.xml
-4. Rebuild nutch.
-
-
-
-Interesting files include:
-
-
- docs/api/index.html
- Javadocs for the Nutch software.
-
- CHANGES.txt
- Log of changes to Nutch.
-
-
For the latest information about Nutch, please visit our website at:
- http://lucene.apache.org/nutch/
+ http://nutch.apache.org
and our wiki, at:
@@ -34,7 +10,7 @@ and our wiki, at:
To get started using Nutch read Tutorial:
- http://lucene.apache.org/nutch/tutorial.html
+ http://wiki.apache.org/nutch/NutchTutorial
Export Control
@@ -55,6 +31,6 @@ Section 740.13) for both object code and source code.
The following provides more details on the included cryptographic software:
-Apache Nutch uses the PDFBox API in its parse-pdf plugin for extracting textual content
-and metadata from encrypted PDF files. See http://incubator.apache.org/pdfbox/ for more
+Apache Nutch uses the PDFBox API in its parse-tika plugin for extracting textual content
+and metadata from encrypted PDF files. See http://pdfbox.apache.org for more
details on PDFBox.