# GATE Slave

The GATE Slave is a module that makes it possible to run anything in a Java GATE process from Python and to exchange documents between Python and Java.

One possible use of this is to run an existing GATE pipeline on a Python GateNLP document.

This works by having the Python module communicate with a Java process over a socket connection. 
Java calls made on the Python side are sent over to Java, executed there, and the result is sent back to Python. 
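
To illustrate the request/response idea described above, here is a toy sketch that uses only the Python standard library. It is not the protocol the GATE Slave actually uses: the JSON message format, the helper names `serve_once` and `call_remote`, and the two available operations are all invented for this illustration.

```python
import json
import socket
import threading


def serve_once(server_sock):
    """Accept one connection, execute one request, send back the result."""
    conn, _ = server_sock.accept()
    with conn:
        request = json.loads(conn.makefile("r", encoding="utf-8").readline())
        # The "remote side" executes the requested operation ...
        operations = {"upper": str.upper, "length": len}
        result = operations[request["method"]](request["arg"])
        # ... and sends the result back over the same connection.
        conn.sendall((json.dumps({"result": result}) + "\n").encode("utf-8"))


def call_remote(host, port, method, arg):
    """Send one call to the remote side and wait for its result."""
    with socket.create_connection((host, port)) as sock:
        message = json.dumps({"method": method, "arg": arg}) + "\n"
        sock.sendall(message.encode("utf-8"))
        reply = sock.makefile("r", encoding="utf-8").readline()
        return json.loads(reply)["result"]


# Bind to port 0 so the operating system picks an unused port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()
answer = call_remote("127.0.0.1", port, "upper", "gate")
worker.join()
server.close()
print(answer)  # prints GATE
```

With the real GATE Slave the calling side is the Python `gatenlp` package and the executing side is the Java process, so none of this plumbing has to be written by hand.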

For this to work, GATE and Java have to be installed on the machine that runs the GATE Slave.

The easiest way to run this is by first manually starting the GATE Slave in the Java GATE GUI and then 
connecting to it from the Python side. 

## Manually starting the GATE Slave from GATE

1. Start GATE
2. Load the Python plugin using the CREOLE Plugin Manager
3. Create a new Language Resource: "PythonSlaveLr"

When creating the PythonSlaveLr, the following initialization parameters can be specified:
* `authToken`: this is used to prevent other processes from connecting to the slave. You can either specify 
  a string here or, with `useAuthToken` set to `true`, let GATE choose a random one and display it in the 
  message pane after the resource has been created. 
* `host`: the host name or address to bind to. The default, 127.0.0.1, makes the slave visible only on the same
  machine. To make it visible to other machines, use the host name or IP address on the network
  or use 0.0.0.0.
* `logActions`: if this is set to true, the actions requested by the Python process are logged to the message pane. 
* `port`: the port number to use. Each slave requires its own port number, so if more than one slave is running
  on a machine, they need to use different, unused port numbers. 
* `useAuthToken`: if this is set to false, no auth token is generated or used, and the connection can be 
  established by any process connecting to that port number. 
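
When choosing values for `port` and `authToken`, a small helper along the following lines may be handy. This is only a sketch using the Python standard library: `port_is_free` and `new_auth_token` are hypothetical names that are not part of `gatenlp` or GATE, and the check only reflects the state of the port at the moment it is run.

```python
import socket
import uuid


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding failed, so something else is most likely using the port.
            return False
    return True


def new_auth_token():
    """Return a random string that can be used as an auth token (hypothetical helper)."""
    return str(uuid.uuid4())
```

For example, `port_is_free(25334)` could be called before entering 25334 as the `port` of a second slave on the same machine.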

A GATE Slave started via the PythonSlaveLr keeps running until the resource is deleted or GATE is closed.


## Using the GATE Slave from Python

Once the PythonSlaveLr resource has been created, it is ready to be used by a Python program:


In [1]:
from gatenlp.gateslave import GateSlave

To connect to an already running slave process, the parameter `start=False` must be specified. 
In addition, the auth token must be provided, as well as the port and host if they differ from the defaults.

In [2]:
gs = GateSlave(start=False, auth_token="841e634a-d1f0-4768-b763-a7738ddee003")

The GateSlave instance can now be used to run arbitrary Java methods on the Java side. 
It also provides a number of useful methods directly (see [PythonDoc for gateslave](https://gatenlp.github.io/python-gatenlp/pythondoc/gatenlp/gateslave.html)):
* `gs.load_gdoc(filepath, mimetype=None)`: load a GATE document on the Java side and return it to Python
* `gs.save_gdoc(gatedocument, filepath, mimetype=None)`: save a GATE document on the Java side
* `gs.gdoc2pdoc(gatedocument)`: convert the Java GATE document to a Python GateNLP document and return it
* `gs.pdoc2gdoc(doc)`: convert the Python GateNLP document to a Java GATE document and return it
* `gs.del_gdoc(gatedocument)`: remove a Java GATE document on the Java side (this is necessary to release memory)
* `gs.load_pdoc(filepath, mimetype=None)`: load a document on the Java side using the file format specified via the mime type and return it as a Python GateNLP document
* `gs.log_actions(trueorfalse)`: switch logging of actions on the slave side on or off

In addition, a larger number of utility methods is available through `gs.slave` (see the 
[PythonSlave source code](https://github.com/GateNLP/gateplugin-Python/blob/master/src/main/java/gate/plugin/python/PythonSlave.java)). Here are a few examples:

* `loadMavenPlugin(group, artifact, version)`: make the plugin identified by the given Maven coordinates available
* `loadPipelineFromFile(filepath)`: load the pipeline/controller from the given file path and return it
* `loadDocumentFromFile(filepath)`: load a GATE document from the file and return it
* `loadDocumentFromFile(filepath, mimetype)`: load a GATE document from the file using the format corresponding to the given mime type and return it
* `saveDocumentToFile(gatedocument, filepath, mimetype)`: save the document to the file, using the format corresponding to the mime type
* `createDocument(content)`: create a new document from the given String content and return it
* `run4Document(pipeline, document)`: run the given pipeline on the given document
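
The methods listed above can be combined into a small helper that loads a file on the Java side, runs a pipeline on it, copies the result to Python and then releases the Java document again. The following is a minimal sketch: `annotate_file` is a hypothetical name, not part of `gatenlp`, and it assumes that `gs` is a connected GateSlave instance and `pipeline` a pipeline object already loaded on the Java side.

```python
def annotate_file(gs, pipeline, filepath, mimetype=None):
    """Load a file on the Java side, run a pipeline on it, return a Python document."""
    gdoc = gs.load_gdoc(filepath, mimetype)
    try:
        gs.slave.run4Document(pipeline, gdoc)
        return gs.gdoc2pdoc(gdoc)
    finally:
        # Release the Java-side document even if running the pipeline fails.
        gs.del_gdoc(gdoc)
```

Putting `gs.del_gdoc` into the `finally` clause is a design choice made here so that Java documents do not pile up in memory when many files are processed in a loop.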



In [3]:
# Create a new Java document from a string
gdoc1 = gs.slave.createDocument("This is a 💩 document. It mentions Barack Obama and George Bush and New York.")
gdoc1

JavaObject id=o0

In [4]:
# you can call the API methods for the document directly from Python
print(gdoc1.getName())
print(gdoc1.getFeatures())

GATE Document_00014
{'gate.SourceURL': 'created from String'}


In [5]:
# so far the document only "lives" in the Java process. In order to copy it to Python, it has to be converted
# to a Python GateNLP document:
pdoc1 = gs.gdoc2pdoc(gdoc1)
pdoc1.text

'This is a 💩 document. It mentions Barack Obama and George Bush and New York.'

In [6]:
# Let's load ANNIE on the Java side and run it on that document:
# First we have to load the ANNIE plugin:
gs.slave.loadMavenPlugin("uk.ac.gate.plugins", "annie", "8.6")

In [7]:
# now load the prepared ANNIE pipeline from the plugin
pipeline = gs.slave.loadPipelineFromPlugin("uk.ac.gate.plugins","annie", "/resources/ANNIE_with_defaults.gapp")
pipeline.getName()

'ANNIE'

In [8]:
# run the pipeline on the document and convert it to a GateNLP Python document and display it
gs.slave.run4Document(pipeline, gdoc1)
pdoc1 = gs.gdoc2pdoc(gdoc1)
pdoc1

## Manually starting the GATE Slave from Python

After installation of Python `gatenlp`, the command `gatenlp-gate-slave` is available. 

You can run `gatenlp-gate-slave --help` to get help information:

```
usage: gatenlp-gate-slave [-h] [--port PORT] [--host HOST] [--auth AUTH]
                          [--noauth] [--gatehome GATEHOME]
                          [--platform PLATFORM] [--log_actions] [--keep]

Start Java GATE Slave

optional arguments:
  -h, --help           show this help message and exit
  --port PORT          Port (25333)
  --host HOST          Host to bind to (127.0.0.1)
  --auth AUTH          Auth token to use (generate random)
  --noauth             Do not use auth token
  --gatehome GATEHOME  Location of GATE (environment variable GATE_HOME)
  --platform PLATFORM  OS/Platform: windows or linux (autodetect)
  --log_actions        If slave actions should be logged
  --keep               Prevent shutting down the slave
```

For example, to start a GATE slave like the one started via the PythonSlaveLr above, this time re-using the same
auth token and switching on logging of the actions:

```
gatenlp-gate-slave --auth 841e634a-d1f0-4768-b763-a7738ddee003 --log_actions
```
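
If the command should be launched from a Python script rather than typed into a shell, the argument list can be assembled programmatically. The sketch below only uses options shown in the help output above; `slave_command` is a hypothetical helper name, and actually starting the process (commented out) requires `gatenlp`, Java and GATE to be installed.

```python
def slave_command(port=None, host=None, auth=None, log_actions=False, keep=False):
    """Build the argument list for the gatenlp-gate-slave command."""
    cmd = ["gatenlp-gate-slave"]
    if port is not None:
        cmd += ["--port", str(port)]
    if host is not None:
        cmd += ["--host", host]
    if auth is not None:
        cmd += ["--auth", auth]
    if log_actions:
        cmd.append("--log_actions")
    if keep:
        cmd.append("--keep")
    return cmd


cmd = slave_command(auth="841e634a-d1f0-4768-b763-a7738ddee003", log_actions=True)
print(" ".join(cmd))
# To actually start the slave as a child process:
# import subprocess
# proc = subprocess.Popen(cmd)
```

Options that are left at `None` or `False` are omitted, so the command falls back to the defaults listed in the help text.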

The Python program can then connect to the slave in the same way as before:


In [9]:
gs = GateSlave(start=False, auth_token="841e634a-d1f0-4768-b763-a7738ddee003")
gs

<gatenlp.gateslave.GateSlave at 0x7f987fabcf50>

A GATE slave started that way keeps running until it is interrupted from the keyboard using "Ctrl-C" or 
until the Python side sends the "close" request:

In [10]:
gs.close()
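
To make sure `close()` is called even when an error occurs in between, the connection can be wrapped with `contextlib.closing` from the standard library, which works with any object that has a `close()` method. This is a sketch, not something `gatenlp` prescribes: `run_with_slave` is a hypothetical helper, and `make_slave` stands for any function that returns a connected GateSlave instance.

```python
from contextlib import closing


def run_with_slave(make_slave, work):
    """Create a slave connection, run work(gs), and always call gs.close()."""
    with closing(make_slave()) as gs:
        return work(gs)


# Possible usage (requires gatenlp and a running slave):
# from gatenlp.gateslave import GateSlave
# name = run_with_slave(
#     lambda: GateSlave(start=False, auth_token="841e634a-d1f0-4768-b763-a7738ddee003"),
#     lambda gs: gs.slave.createDocument("Some text").getName(),
# )
```

Since the "close" request ends a slave started with `gatenlp-gate-slave`, this pattern fits best when the slave is meant to be used by a single Python program.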