From c60ca918368443f4d964a0bbf11720f221cfaff4 Mon Sep 17 00:00:00 2001
From: Dawid Weiss <dawid.weiss@carrotsearch.com>
Date: Fri, 8 Aug 2014 11:13:49 +0200
Subject: [PATCH] Added a simple readme.

---
 README.txt | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 README.txt

diff --git a/README.txt b/README.txt
new file mode 100644
index 0000000..675aad8
--- /dev/null
+++ b/README.txt
@@ -0,0 +1,33 @@
+
+folder2index
+------------
+
+Converts PDF, TXT or HTML documents to a Lucene index (for use with the Carrot2 Clustering Workbench).
+
+Quick usage guide
+-----------------
+
+- install Apache Maven.
+
+- run:
+  mvn clean package
+
+- cd target
+
+- prepare a folder FOO with your PDF, HTML or plain text files. Prepare an empty folder BAR
+  for the index.
+
+- run:
+
+  java -jar folder2index-0.0.2.jar --folder FOO --index BAR --use-tika
+
+The index will be created in BAR. Download and open the Carrot2 Workbench:
+
+http://project.carrot2.org/download.html
+
+Select Lucene as the document source and pick the correct fields for the title, content and URL (pick file path as
+the URL field).
+
+http://download.carrot2.org/head/manual/index.html#section.getting-started.lucene
+
+Set the remaining input options (the number of results to cluster and a query; use *:* to match all documents) and run the clustering.