OpenRefine Developer's Guide
Guide for OpenRefine Developers
Build and Run OpenRefine from an IDE (easier)
At the command line, go to a directory not under your Eclipse workspace directory and check out the source:
git clone https://github.com/OpenRefine/OpenRefine.git
Then in Eclipse, invoke the Import ... command
Pick Existing Projects into Workspace
Locate the directory where you've checked out OpenRefine. Eclipse should detect the sub-projects "grefine", "grefine-server", etc.
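The first step above asks for a checkout directory that is not under your Eclipse workspace. As a rough sketch, a check like the following could be run before cloning. The `is_inside` helper and the `WORKSPACE` variable are hypothetical names introduced only for this illustration, and the default workspace path is an assumption; adjust it to wherever your workspace actually lives.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if directory $1 is the same as,
# or nested inside, directory $2. Both paths are resolved with `pwd -P`
# so that symbolic links do not hide the nesting.
is_inside() {
    child=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    parent=$(cd "$2" 2>/dev/null && pwd -P) || return 1
    case "$child/" in
        "$parent"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# WORKSPACE is an assumed variable; the default below is only a guess.
WORKSPACE=${WORKSPACE:-${HOME:-}/workspace}

if is_inside "$PWD" "$WORKSPACE"; then
    echo "Current directory is inside $WORKSPACE; clone somewhere else."
else
    echo "OK to clone here: git clone https://github.com/OpenRefine/OpenRefine.git"
fi
```

If the workspace directory does not exist, the helper returns non-zero, so the script simply reports that cloning here is fine.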
Note that the steps above are already packaged for you in the Refine.launch file in the server/IDEs/eclipse folder of the OpenRefine source code. To use it, right-click that file, select "Run As...", then click "Refine"; this should run OpenRefine with the above parameters already set. After the first execution, the launch configuration should be added automatically to your list of Run launches, so you can find it there.
- make sure to import the Eclipse project rather than starting one from scratch (either with the "Import" command in the File menu, or by importing directly via svn)
- in the "run" configuration, in the "(x)= arguments" tab, you need to place this line in the VM configuration (VM Options in NetBeans) text area:
- apply, close the dialog, then click the green run button and voilà, you should see it run and open your default browser.
If you want OpenRefine logs to be more verbose, append this to the line above
Build and Run OpenRefine from the command line using Apache Ant (Fairly Easy and recommended for Windows users)
Build and Run OpenRefine from the command line (harder but more comprehensive)
On Mac OS X and Linux, the OpenRefine build system requires a Unix shell; any shell compatible with /bin/sh will do. On Windows, you can use the Windows command line (go to Start, then Run, type cmd, and click OK). The refine.bat script on Windows supports only a subset of the functionality supported by the refine shell script.
NOTE: The Ant build is not designed to be run separately. It expects things to be set up by the shell script/batch file.
To see what functions are supported by OpenRefine's build system, type
to get a list of them.
NOTE: When running from the Windows command line, you should omit the ./
Since OpenRefine is composed of two parts, a server and an in-browser UI, the testing system reflects that:
- on the server side, it's powered by TestNG and the unit tests are written in Java;
To run the complete testing harness, simply type
The above command requires Python and Curl to be installed on your machine and available in the path (you might already have them without knowing it, so just try it once and see what errors you get). Note that you don't need to have Windmill installed: the script will install it for you and build a proper Python virtualenv, so you don't have to worry about version collisions or polluting your existing Python installation.
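Rather than waiting for the test run to fail, the two prerequisites can be checked up front. The sketch below is only an illustration: `missing_tools` is a hypothetical helper, and the executable name `python` is an assumption (some systems only provide `python3`).

```shell
#!/bin/sh
# Hypothetical helper: print the name of every given command that cannot
# be found on the PATH, one per line. Returns non-zero if any are missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# The executable names below are assumptions; adjust them for your system.
if missing=$(missing_tools python curl); then
    echo "Python and Curl were found on the PATH."
else
    echo "Missing from PATH:" $missing
fi
```

The check only confirms that the commands are reachable; it does not verify their versions.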
If you want to run only the server-side portion of the tests, run
or, if you want to run only the client-side portion of the tests, run
If you get an error saying "No default or local profile has been set" from Windmill, add the following two lines to your ~/.windmill/prefs.py file:
Testing in Eclipse
You can also run the server-side part of the tests directly from Eclipse. To do that you need to have the TestNG launcher plugin installed. If you don't have it, you can get it by installing new software from this update URL
Once the TestNG launching plugin is installed in your Eclipse, right-click on the source folder, select "Run As -> TestNG Test". This should open a new tab with the TestNG launcher running the OpenRefine tests.
Testing on Windows
In addition to Python, the Python for Windows extensions are required to run Windmill on Windows. Without them, Windmill fails to respond and/or gives an "ImportError: No module named win32api".
Curl is also required, but you can install it when you install Cygwin (there is a module for it).
Building Distributions (Kits)
The Refine build system uses Apache Ant to automate the creation of the installation packages for the different operating systems. The packaging is currently optimized to run on Mac OS X, which is the only platform capable of creating the packages for all three operating systems that we support.
To build the distributions type
./refine dist <version>
where 'version' is the release version (for example, 2.6b1).
On Windows, OpenRefine requires the Java JDK to be installed. Once it is installed, you need to set the JAVA_HOME environment variable (please ensure it points to the JDK, and not the JRE) and the ANT_HOME environment variable (this is set up in a similar way to JAVA_HOME, but should point to the Tools/Ant directory in OpenRefine's source code). You may need to reboot your machine after setting these environment variables.
Open the Windows command line. Navigate to your installation directory, then run OpenRefine (the ./ should be omitted when running any of the Refine commands from the Windows command line):
cd C:\path_to_installation_directory
refine
If you receive the message "Could not find the main class: com.google.refine.Refine. Program will exit.", this means that the JAVA_HOME variable is likely not set correctly.
If your Linux distro doesn't have Java installed and you have support for apt-get (say, Debian or Ubuntu) you can read this helpful page. Make sure that JAVA_HOME points to your JDK (Java developer kit), the one that contains the compiler, and not the JRE (Java runtime environment), which is all you need to run a Java program but is not enough to build one.
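One quick way to tell a JDK from a JRE is to look for the compiler. The sketch below assumes a Unix-like layout in which the JDK ships an executable `bin/javac`; `is_jdk_home` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given directory looks like a JDK,
# that is, it contains an executable bin/javac (the Java compiler).
# A JRE-only directory, a missing directory, or an empty argument returns non-zero.
is_jdk_home() {
    [ -n "$1" ] && [ -x "$1/bin/javac" ]
}

if is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME points to a JDK: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or does not contain bin/javac: ${JAVA_HOME:-<unset>}"
fi
```

This only checks that the compiler is present; it says nothing about whether the JDK version is one that OpenRefine supports.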