Separate client from server #40

markrogersjr · 2021-05-12T22:21:36Z

For some distributed computations, the client must be instantiated separately from the server. It must be created within a separate environment since the server cannot be serialized.

Language Tool Python lumps the client and server into a single object so that when the LanguageTool object is created, the server starts. The same object is used to perform queries. This design precludes the distributed computations mentioned above.

Consider the design that Stanford NLP uses, which is to run a server in a separate process and create a client within any Python process.

jxmorris12 · 2021-05-24T00:05:08Z

Thanks for the comments @markrogersjr. Can you explain a bit more? In this package, the server is basically a child process spawned from the Python process that creates the LanguageTool object. Are you saying to start them separately? I don't understand the difference you're describing between this package and Stanford NLP.

markrogersjr · 2021-05-24T00:33:31Z

I guess a concrete suggestion would be to define a server class and a client class. You can instantiate the server class anywhere in your code and it will kick off the LanguageTool server with the specified host and port. Then you can have a client query said server from anywhere else, referencing only the host and port, not the server object. Hope that helps.

I tried using your package but ended up creating my own thin wrapper to suit my needs.
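To make the suggestion concrete, here is a minimal sketch of a client that holds nothing but the server's address. `CheckClient` and its methods are hypothetical names, not part of language_tool_python; the sketch assumes a LanguageTool HTTP server is already listening and accepts `language` and `text` form fields on `/v2/check`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class CheckClient:
    """Hypothetical client that knows only the server's host and port."""

    def __init__(self, host='localhost', port=8081):
        self.url = f'http://{host}:{port}/v2/check'

    def build_request(self, text, language='en-US'):
        # Form-encode the fields so quotes, '&' and '=' in the text are safe.
        body = urlencode({'language': language, 'text': text}).encode()
        return Request(self.url, data=body, method='POST')

    def check(self, text, language='en-US'):
        # Needs a server already running at the configured host and port.
        with urlopen(self.build_request(text, language)) as response:
            return json.loads(response.read().decode())
```

Because the object stores only a URL string and never touches the server process, it should be cheap to create inside each worker of a distributed job.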

Bachstelze · 2021-05-26T07:30:39Z

Could we use a Docker image, e.g. https://github.com/silvio/docker-languagetool, and connect to it?

markrogersjr · 2021-05-26T07:34:47Z

No, that is overkill and probably won't work. Here's a snippet I use from one of my private repos:

java_server.py

import os
import json
from json.decoder import JSONDecodeError
from time import sleep
from subprocess import call, check_output, CalledProcessError
from traceback import print_exception
from uuid import uuid4

COMMANDS = {
    'pid': 'fuser {port}/tcp',
    'stop': 'kill {pid}',
}


class Cluster:

    def start(self):
        self.servers = [self.server_class(port) for port in self.ports]

    def stop(self):
        for server in self.servers:
            server.stop()

    def __init__(self, ports, server_class):
        self.ports = ports
        self.server_class = server_class
        self.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Server:

    def start(self):
        if self.is_running():
            return
        cmd = self.start_command.format(port=self.port)
        call(cmd, shell=True, cwd=self.path)
        sleep(3)

    def is_running(self):
        try:
            self.get_pid()
        except CalledProcessError:
            return False
        return True

    def get_pid(self):
        cmd = COMMANDS['pid'].format(port=self.port)
        pid = check_output(cmd, shell=True).decode().strip()
        return pid

    def stop(self):
        try:
            pid = self.get_pid()
            cmd = COMMANDS['stop'].format(pid=pid)
            call(cmd, shell=True)
        except CalledProcessError:
            return

    def __init__(self, port, start_command, path):
        self.port = port
        self.start_command = start_command
        self.path = os.environ[path]
        self.start()
        
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Client:

    def start(self):
        pass

    def stop(self):
        pass

    def __init__(self, port, query_command, path):
        self.port = port
        self.query_command = query_command
        self.path = os.environ[path]
        self.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True

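    def submit_query(self, query):
        # Write the query to a temp file so the shell can read it back with
        # $(cat ...) instead of the query being escaped into the command.
        query = query.strip()
        path = f'/tmp/q{uuid4().hex}'
        with open(path, 'w') as f:
            f.write(query)
        query = f'$(cat {path})'
        cmd = self.query_command.format(query=query, port=self.port)
        try:
            stdout = check_output(cmd, shell=True, cwd=self.path).decode()
            response = json.loads(stdout)
        except (CalledProcessError, JSONDecodeError):
            # A failed request or non-JSON output counts as "no response".
            response = None
        finally:
            # Always remove the temp file, even on an unexpected error.
            os.remove(path)
        return response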

grammar_checker.py

from java_server import Cluster, Server, Client


COMMANDS = {
    'start': 'java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port {port} --allow-origin "*" &',
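    # Double quotes let the shell expand the $(cat ...) substitution built in
    # Client.submit_query (single quotes would send it literally), and
    # --data-urlencode escapes '&', '=' and similar characters in the text.
    'query': 'curl --data language=en-US --data-urlencode "text={query}" http://localhost:{port}/v2/check',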
}
DEFAULT_PORT = 8081
DEFAULT_PATH = 'LANGUAGE_TOOL_PATH'


class GrammarCheckerCluster(Cluster):

    def __init__(self, ports):
        Cluster.__init__(self, ports, GrammarCheckerServer)


class GrammarCheckerServer(Server):

    def __init__(self, port=DEFAULT_PORT):
        Server.__init__(self, port, COMMANDS['start'], DEFAULT_PATH)


class GrammarCheckerClient(Client):

    def __init__(self, port=DEFAULT_PORT):
        Client.__init__(self, port, COMMANDS['query'], DEFAULT_PATH)

    def check(self, sentences):
        query = ' '.join(sentences)
        response = self.submit_query(query)
        if response is None:
            return None
        invalid = {d['sentence'] for d in response['matches']}
        outputs = [sentence not in invalid for sentence in sentences]
        return outputs
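To show what `check` returns without needing a running server, here is a small sketch of the same mapping applied to a hand-written response. `flag_sentences` and `fake_response` are made up for illustration; the only field assumed is the `sentence` key on each match, which is what `check` reads.

```python
def flag_sentences(sentences, response):
    """Return one boolean per sentence: True if no match refers to it."""
    if response is None:
        return None
    invalid = {match['sentence'] for match in response['matches']}
    return [sentence not in invalid for sentence in sentences]


# Hand-written stand-in for a /v2/check response; only the field that
# the mapping reads is included.
fake_response = {
    'matches': [
        {'sentence': 'helo darknes my old frend'},
    ],
}
sentences = ['helo darknes my old frend', 'This sentence is fine.']
print(flag_sentences(sentences, fake_response))  # [False, True]
```

One caveat: the mapping only works if the `sentence` string reported by the server is exactly equal to the input sentence, so it depends on the server splitting the joined text at the same boundaries.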

Bachstelze · 2021-05-26T07:43:40Z

OK, this looks straightforward.
But why wouldn't it work to run the Docker image and use it as a remote server, e.g. LanguageTool('en-US', remote_server='localhost:8010')?

markrogersjr · 2021-05-26T07:46:35Z

I'm not sure exactly what LanguageTool(...) means (python class?) but Docker would be an unnecessary layer of complexity IMO. As shown in the above snippet, running the language tool server with a unix command in the background is sufficient.
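For reference, the same background launch can be sketched without a shell or a trailing `&` by using `subprocess.Popen`, which returns as soon as the process is started. `build_server_command` and `start_server` are hypothetical helper names; the JAR, main class, flags, and the `LANGUAGE_TOOL_PATH` variable are taken from the snippet above.

```python
import os
import subprocess


def build_server_command(port=8081):
    # Same JAR, main class, and flags as the 'start' command in the snippet.
    return [
        'java', '-cp', 'languagetool-server.jar',
        'org.languagetool.server.HTTPServer',
        '--port', str(port),
        '--allow-origin', '*',
    ]


def start_server(port=8081, path_var='LANGUAGE_TOOL_PATH'):
    # Popen does not wait for the process, so the server runs in the
    # background; the caller keeps the handle to stop it later.
    return subprocess.Popen(build_server_command(port), cwd=os.environ[path_var])
```

Keeping the returned handle means the server can be stopped with `server.terminate()` followed by `server.wait()`, rather than looking up the PID by port.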

Bachstelze · 2021-05-26T07:59:36Z

Yes, it is the Python class from this package. So the complete connection from a separate client is:

import language_tool_python
# test the connection with https://github.com/silvio/docker-languagetool
# with version 2.5.3 https://github.com/jxmorris12/language_tool_python
tool = language_tool_python.LanguageTool('en-US', remote_server='localhost:8010')

In many use cases you already have Docker in your stack anyway, so it could be a possible solution for the Java tasks in #38.

markrogersjr · 2021-05-26T08:06:06Z

This doesn't address the original issue.

jxmorris12 · 2022-01-12T14:28:29Z

Hi @markrogersjr - I made a few changes to support this. You can pass a remote_server parameter to the LanguageTool() initialization, which I think should do what you want. You can try something like this:

server

>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US', host='0.0.0.0')
>>> tool._url
'http://0.0.0.0:8081/v2/'

client

>>> import language_tool_python
>>> lang_tool = language_tool_python.LanguageTool("en-US", remote_server='http://0.0.0.0:8081')
>>> lang_tool.check('helo darknes my old frend')
[Match({'ruleId': 'UPPERCASE_SENTENCE_START', 'message': 'This sentence does not start with an uppercase letter.', 'replacements': ['Helo'], 'offsetInContext': 0, 'context': 'helo darknes my old frend', 'offset': 0, 'errorLength': 4, 'category': 'CASING', 'ruleIssueType': 'typographical', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['darkness', 'darkens', 'darkies'], 'offsetInContext': 5, 'context': 'helo darknes my old frend', 'offset': 5, 'errorLength': 7, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['friend', 'trend', 'Fred', 'freed', 'Freud', 'Friend', 'fend', 'fiend', 'frond', 'rend', 'fr end'], 'offsetInContext': 20, 'context': 'helo darknes my old frend', 'offset': 20, 'errorLength': 5, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'})]
>>>
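For the distributed case from the original report, one possible pattern is to create the client lazily inside each worker instead of sending it from the parent. This is only a sketch: `get_client`, `make_language_tool`, and `check_text` are hypothetical helpers, and the URL assumes the server from the example above is running.

```python
import os

_clients = {}  # maps (process id, server URL) -> client created in that process


def get_client(server_url, factory):
    """Return this process's client for server_url, creating it on first use."""
    key = (os.getpid(), server_url)
    if key not in _clients:
        _clients[key] = factory(server_url)
    return _clients[key]


def make_language_tool(server_url):
    # Imported lazily so the package is only needed where text is checked.
    import language_tool_python
    return language_tool_python.LanguageTool('en-US', remote_server=server_url)


def check_text(text, server_url='http://0.0.0.0:8081'):
    # Intended to be called inside a worker process, e.g. through a pool's map.
    return get_client(server_url, make_language_tool).check(text)
```

Keying the cache on the process ID means a forked worker does not reuse a client inherited from its parent; each process builds its own on the first call.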

jxmorris12 mentioned this issue Dec 14, 2021

Is there any solution to speed up the grammar checking/correction? #50

Closed

jxmorris12 closed this as completed Jan 12, 2022
