Skip to content
This repository has been archived by the owner on Jan 16, 2021. It is now read-only.

Commit

Permalink
Readme
Browse files Browse the repository at this point in the history
  • Loading branch information
Rainer Simon committed Jan 20, 2014
1 parent 95742d7 commit 609bfe5
Show file tree
Hide file tree
Showing 2 changed files with 739 additions and 13 deletions.
78 changes: 65 additions & 13 deletions README.md
Expand Up @@ -6,27 +6,79 @@ A Web-based tool for validating & correcting geo-resolution results.

## Installation

...
* Install the [Play Framework v2.2.1](http://www.playframework.com/download).
* Recogito depends on the _scalagios-core_ and _scalagios-gazetteer_ utility libraries from the [Scalagios](http://github.com/pelagios/scalagios)
project. These are not yet available through a Maven repository. You need to add them manually to a `lib` folder in the project root.
* Start Recogito using `play start` or `play "start {portnumber}"`.

* To create an Eclipse project, type `play eclipse`
* To create an Eclipse project with dependency's sources attached, type `play` to enter the Play console, and then `eclipse with-source=true`
## Importing Documents

To work with Recogito, you first need to import data. The _admin_ area includes a facility to upload a ZIP file with document metadata and
accompanying UTF-8 plaintext files.

Document metadata must be provided as a JSON file. The name of the file can be chosen arbitrarily. The only requirement is that it has a
`.json` extension. The JSON structure defines the document's __title__, __description__ and __source__ properties, as well as the __parts__
the document consists of, and where in the ZIP file the text for the document (or it's parts) are located.

A possible ZIP folder structure is e.g.:

```
isidore.json
texts/Isidore_Book IX.txt
texts/Isidore_Book XIII.txt
texts/Isidore_Book XIV.txt
```
The corresponding JSON should look like this:

```javascript
{
"title": "Isidore of Seville, Etymyologiae (English)",
"parts": [{
"title": "Book IX",
"text": "texts/Isidore_Book IX.txt"
},{
"title": "Book XIII",
"text": "texts/Isidore_Book XIII.txt"
},{
"title": "Book XIV",
"text": "texts/Isidore_Book XIV.txt"
}]
}
```

The ZIP file can also contain data for multiple documents. In this case, each document must be defined in a separate JSON file.

## Importing Data
## Importing Annotations

The recogito database will be empty when you first start it. Data is imported in two steps:
You can also import annotations as CSV files. E.g. if you want to upload automatically generated annotations before you start
manual annotation, or to restore the results of previous work. In general the order of the columns is irrelevant, it is only
necessary to use the correct pre-defined column labels.

* __Document metadata__ is imported as a JSON file. The JSON file can include metadata for
multiple documents.

* __Annotations__ are imported as CSV files. A CSV file does not need to contain all annotations
for a specific document (i.e. there can be multiple CSV files for one document). However, each
CVS file must belong to exactly one document (i.e. there can't be annotations for different documents
in one CSV file.)
An example CSV file is shown below:

```
gdoc_part;status;toponym;offset;gazetteer_uri;
Book IX;NOT_VERIFIED;Greek;1647;http://pleiades.stoa.org/places/59649;
Book IX;NOT_VERIFIED;Athens;1795;http://pleiades.stoa.org/places/579885;
Book IX;NOT_VERIFIED;Greece;1842;;
Book IX;NOT_VERIFIED;Italy;2287;http://pleiades.stoa.org/places/456048;
Book IX;NOT_VERIFIED;Salii;2371;http://pleiades.stoa.org/places/99034;
```

If the annotations pertain to a document that has parts, the ``gdoc_part`` column must match the name of the part. The other
columns match with the fields in the Recogito data model, and are (generally) optional.

## API

...

## Hacking on Recogito

...

* To create an Eclipse project, type `play eclipse`
* To create an Eclipse project with dependency's sources attached, type `play` to enter the Play console, and then `eclipse with-source=true`

## License

GPL v3
Recogito is licensed under the [GNU General Public License v3.0](http://www.gnu.org/licenses/gpl.html).

0 comments on commit 609bfe5

Please sign in to comment.