How Palmetto can be used

Adrian Wilke edited this page Apr 26, 2018 · 13 revisions

If you are using Palmetto for an experiment or something similar that leads to a publication, please cite the paper "Exploring the Space of Topic Coherence Measures" that you can find on the project website.

There are three ways to use Palmetto.

As web service

You only want to evaluate your topics or word sets? Then you can simply write a client for the REST interface of our web service. The coherence for a word set can be requested using a URL of the form

http://palmetto.aksw.org/palmetto-webapp/service/<coherence>?words=<words>

where <words> are the space-separated words and <coherence> is the name of the coherence. At the moment, the following values can be used:

  • ca
  • cp
  • cv
  • npmi
  • uci
  • umass

The response contains the coherence.

If you want to request the C_V coherence for the word set "cake","apple","banana","cherry","chocolate", the URL should look like this

http://palmetto.aksw.org/palmetto-webapp/service/cv?words=cake%20apple%20banana%20cherry%20chocolate

and the response should be text/plain like

0.5678879445677241

An alternative URL that can be used is

http://palmetto.aksw.org/palmetto-webapp/service/calculate?coherence=<coherence>&words=<words>

Note that it is recommended to send GET requests because of recent problems that seemed to be caused by POST requests (#10, #11).
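The two URL forms above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function names (`coherence_url`, `calculate_url`, `fetch_coherence`) are hypothetical and not part of Palmetto, and the sketch assumes the service is reachable and answers with a single number as text/plain, as in the example response above.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL and coherence names are taken from the text above.
BASE_URL = "http://palmetto.aksw.org/palmetto-webapp/service"
COHERENCES = ("ca", "cp", "cv", "npmi", "uci", "umass")


def _check(coherence):
    if coherence not in COHERENCES:
        raise ValueError(f"unknown coherence: {coherence!r}")


def coherence_url(coherence, words):
    """Build a URL of the form <base>/<coherence>?words=<words>."""
    _check(coherence)
    # The words are joined by spaces, which are percent-encoded as %20.
    return f"{BASE_URL}/{coherence}?words={quote(' '.join(words))}"


def calculate_url(coherence, words):
    """Build the alternative <base>/calculate?coherence=...&words=... URL."""
    _check(coherence)
    query = urlencode(
        {"coherence": coherence, "words": " ".join(words)},
        quote_via=quote,  # encode spaces as %20 instead of '+'
    )
    return f"{BASE_URL}/calculate?{query}"


def fetch_coherence(coherence, words, timeout=30):
    """Send a GET request and parse the plain-text response as a float."""
    # urlopen without a request body issues a GET request.
    with urlopen(coherence_url(coherence, words), timeout=timeout) as response:
        return float(response.read().decode("utf-8").strip())
```

Calling `coherence_url("cv", ["cake", "apple", "banana", "cherry", "chocolate"])` reproduces the example URL shown above; `fetch_coherence` with the same arguments would send the request.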

Python client

Thanks to Ivan Ermilov, there is a Python client available at https://github.com/earthquakesan/palmetto-py

As Java program

You would like to use Palmetto locally? No problem, it can be built as a runnable jar.

1. Download and extract the index

You will have to download a Lucene index containing the preprocessed Wikipedia from here. By extracting the files you should get a wikipedia_bd directory and a wikipedia_bd.histogramm file. Note that the file has to be in the same directory as the wikipedia_bd directory.

There is a Dutch index that has been created by van der Zwaan, Marx and Kamps. It can be downloaded here.

2. Download the program

You can either download the runnable jar file from here or check out the master branch and build it yourself using

cd palmetto
mvn clean compile assembly:single

3. Run Palmetto

The program can be started using

java -jar palmetto-0.1.0-jar-with-dependencies.jar <some-path>/wikipedia_bd <coherence> <topics-file>

You have to insert the path to the wikipedia_bd directory (the program will assume that the histogramm file can be found under <some-path>/wikipedia_bd.histogramm). The last two parameters are the coherence type and a file containing your topics (see below).

Coherences

At the moment, there are six common coherence types that you can run directly with this jar.

  • C_A
  • C_P
  • C_V
  • NPMI
  • UCI
  • UMass

Topics file

The file containing your topics should have one single topic per line. In every line the top words of your topic are listed, separated by a single space. Your file should look like this:

company sell corporation own acquire purchase buy business sale owner
age population household female family census live average median income
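If your topics are held in memory as lists of words, a file in this format can be produced with a few lines of Python. This is only a sketch: `format_topics` and `write_topics_file` are hypothetical helper names, and the check that rejects words containing whitespace is an assumption derived from the single-space separator described above.

```python
def format_topics(topics):
    """Return the topics as text: one topic per line, words separated by one space."""
    lines = []
    for words in topics:
        if not words:
            raise ValueError("a topic needs at least one word")
        for word in words:
            # A word containing whitespace would be read as several words.
            if not word or any(ch.isspace() for ch in word):
                raise ValueError(f"invalid word: {word!r}")
        lines.append(" ".join(words))
    return "\n".join(lines) + "\n"


def write_topics_file(path, topics):
    """Write the formatted topics to the given path as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_topics(topics))
```

For example, `write_topics_file("topics.txt", [["company", "sell", "corporation"], ["age", "population", "household"]])` writes a two-line file.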

Output

The jar simply prints the coherence of each topic.
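If you want to call the jar from a script, the command line described in step 3 can be wrapped as follows. This is a sketch under assumptions: the helper names are hypothetical, the coherence name is passed through unchanged, and the output is returned as raw text because its exact layout is not specified here.

```python
import os
import subprocess


def histogram_path(index_dir):
    """Path at which the histogramm file is expected, next to the index directory."""
    return index_dir.rstrip("/\\") + ".histogramm"


def palmetto_command(jar, index_dir, coherence, topics_file):
    """Argument list matching: java -jar <jar> <index> <coherence> <topics-file>."""
    return ["java", "-jar", jar, index_dir, coherence, topics_file]


def run_palmetto(jar, index_dir, coherence, topics_file):
    """Check the index layout, run the jar, and return whatever it prints."""
    if not os.path.isdir(index_dir):
        raise FileNotFoundError(f"index directory not found: {index_dir}")
    if not os.path.isfile(histogram_path(index_dir)):
        raise FileNotFoundError(f"missing file: {histogram_path(index_dir)}")
    result = subprocess.run(
        palmetto_command(jar, index_dir, coherence, topics_file),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Checking for the histogramm file up front gives a clearer error than letting the program fail after start-up.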

As Java library

You want to include Palmetto in your own project? You can check out the latest stable version using

git clone -b v0.1.1 https://github.com/dice-group/Palmetto.git

install it locally using

cd Palmetto
mvn clean install

and add it as a Maven dependency

  	<dependency>
  		<groupId>org.aksw</groupId>
  		<artifactId>palmetto</artifactId>
  		<version>0.1.1</version>
  	</dependency>

Another way is to download the necessary files from here:

  • palmetto-0.1.0.jar
  • palmetto-0.1.0-javadoc.jar (optional)
  • palmetto-0.1.0-sources.jar (optional)

If you are using Maven, you can install these files to your local repository using

mvn install:install-file -Dfile=./target/palmetto-0.1.0.jar -Dpackaging=jar -Djavadoc=./target/palmetto-0.1.0-javadoc.jar -Dsources=./target/palmetto-0.1.0-sources.jar

If you want to know how to use the coherence inside your source code, you should 1) read the paper to understand the parts a coherence comprises and 2) take a look at the org.aksw.palmetto.Palmetto class.
