Separate client from server #40

Closed
markrogersjr opened this issue May 12, 2021 · 9 comments

Comments

@markrogersjr

For some distributed computations, the client must be instantiated separately from the server. It must be created within a separate environment since the server cannot be serialized.

language_tool_python lumps the client and server into a single object: creating a LanguageTool object starts the server, and the same object is then used to perform queries. This design precludes the distributed computations mentioned above.

Consider the design that Stanford NLP uses, which is to run a server in a separate process and create a client within any Python process.
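
To illustrate the serialization problem, here is a minimal sketch that is not taken from this package. It assumes a throwaway Python child process as a stand-in for the Java server and a plain `(host, port)` tuple as a stand-in for client state; `can_pickle` and `demo` are hypothetical names. On POSIX systems a `subprocess.Popen` handle typically cannot be pickled, while an address can.

```python
import pickle
import subprocess
import sys


def can_pickle(obj) -> bool:
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


def demo() -> dict:
    # A short-lived child process stands in for the LanguageTool Java server.
    proc = subprocess.Popen([sys.executable, "-c", "pass"])
    try:
        handle_ok = can_pickle(proc)
    finally:
        proc.wait()  # reap the child so it does not linger
    # Hypothetical client state: just the address of a running server.
    address_ok = can_pickle(("localhost", 8081))
    return {"process_handle": handle_ok, "address": address_ok}


if __name__ == "__main__":
    print(demo())
```

An object that owns the server process inherits this limitation, which is why a client that carries only an address is easier to ship to workers.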

@jxmorris12
Owner

Thanks for the comments, @markrogersjr. Can you explain a bit more? In this package, the server is basically a child process spawned from the Python process. Are you suggesting starting them separately? I don't understand the difference you're describing between this package and Stanford NLP.

@markrogersjr
Author

I guess a concrete suggestion would be to define a server class and a client class. You can instantiate the server class anywhere in your code, and it will kick off the LanguageTool server with the specified host and port. Then you can have a client query that server from anywhere else, referencing only the host and port, not the server object. Hope that helps.

I tried using your package but ended up creating my own thin wrapper to suit my needs.
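
A rough sketch of the client half of that suggestion, assuming a LanguageTool HTTP server is already listening on the given host and port. The names `build_check_request` and `GrammarClient` are hypothetical; the `/v2/check` endpoint and the `language` and `text` form fields are the ones used by the curl command later in this thread.

```python
import json
import urllib.parse
import urllib.request


def build_check_request(host, port, text, language="en-US"):
    """Return (url, body) for a POST to the LanguageTool /v2/check endpoint."""
    url = f"http://{host}:{port}/v2/check"
    body = urllib.parse.urlencode({"language": language, "text": text}).encode()
    return url, body


class GrammarClient:
    """Hypothetical client that knows only a host and port, never a server object."""

    def __init__(self, host="localhost", port=8081, timeout=10.0):
        self.host = host
        self.port = port
        self.timeout = timeout

    def check(self, text, language="en-US"):
        url, body = build_check_request(self.host, self.port, text, language)
        # Passing data makes urlopen issue a POST request.
        with urllib.request.urlopen(url, data=body, timeout=self.timeout) as resp:
            return json.loads(resp.read().decode())["matches"]
```

Because such a client holds only plain values, it can be created inside any process that can reach the server.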

@Bachstelze

Could we use a Docker image, e.g. https://github.com/silvio/docker-languagetool, and connect to it?

@markrogersjr
Author

markrogersjr commented May 26, 2021

No, that is overkill and probably won't work. Here's a snippet I use from one of my private repos:

java_server.py

import os
import json
from json.decoder import JSONDecodeError
from time import sleep
from subprocess import call, check_output, CalledProcessError
from traceback import print_exception
from uuid import uuid4

COMMANDS = {
    'pid': 'fuser {port}/tcp',
    'stop': 'kill {pid}',
}


class Cluster:

    def start(self):
        self.servers = [self.server_class(port) for port in self.ports]

    def stop(self):
        for server in self.servers:
            server.stop()

    def __init__(self, ports, server_class):
        self.ports = ports
        self.server_class = server_class
        self.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Server:

    def start(self):
        if self.is_running():
            return
        cmd = self.start_command.format(port=self.port)
        call(cmd, shell=True, cwd=self.path)
        sleep(3)

    def is_running(self):
        try:
            self.get_pid()
        except CalledProcessError:
            return False
        return True

    def get_pid(self):
        cmd = COMMANDS['pid'].format(port=self.port)
        pid = check_output(cmd, shell=True).decode().strip()
        return pid

    def stop(self):
        try:
            pid = self.get_pid()
            cmd = COMMANDS['stop'].format(pid=pid)
            call(cmd, shell=True)
        except CalledProcessError:
            return

    def __init__(self, port, start_command, path):
        self.port = port
        self.start_command = start_command
        self.path = os.environ[path]
        self.start()
        
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Client:

    def start(self):
        pass

    def stop(self):
        pass

    def __init__(self, port, query_command, path):
        self.port = port
        self.query_command = query_command
        self.path = os.environ[path]
        self.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True

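    def submit_query(self, query):
        # Write the query to a temp file and let the shell read it back with
        # $(cat ...), so quotes in the text never appear in the command line.
        # The query template must put {query} inside double quotes for the
        # substitution to be expanded by the shell.
        query = query.strip()
        path = f'/tmp/q{uuid4().hex}'
        with open(path, 'w') as f:
            f.write(query)
        cmd = self.query_command.format(query=f'$(cat {path})', port=self.port)
        try:
            stdout = check_output(cmd, shell=True, cwd=self.path).decode()
            response = json.loads(stdout)
        except (CalledProcessError, JSONDecodeError):
            # The request failed or the server returned non-JSON output.
            response = None
        finally:
            # Always clean up the temp file, even on unexpected errors.
            os.remove(path)
        return response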

grammar_checker.py

from java_server import Cluster, Server, Client


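COMMANDS = {
    'start': 'java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port {port} --allow-origin "*" &',
    # {query} is filled with $(cat <tempfile>), so it must sit inside double
    # quotes: the shell does not expand $(...) inside single quotes.
    # --data-urlencode also URL-encodes the text (spaces, '&', '%', ...).
    'query': 'curl --silent --data "language=en-US" --data-urlencode "text={query}" http://localhost:{port}/v2/check',
}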
DEFAULT_PORT = 8081
DEFAULT_PATH = 'LANGUAGE_TOOL_PATH'


class GrammarCheckerCluster(Cluster):

    def __init__(self, ports):
        Cluster.__init__(self, ports, GrammarCheckerServer)


class GrammarCheckerServer(Server):

    def __init__(self, port=DEFAULT_PORT):
        Server.__init__(self, port, COMMANDS['start'], DEFAULT_PATH)


class GrammarCheckerClient(Client):

    def __init__(self, port=DEFAULT_PORT):
        Client.__init__(self, port, COMMANDS['query'], DEFAULT_PATH)

    def check(self, sentences):
        query = ' '.join(sentences)
        response = self.submit_query(query)
        if response is None:
            return None
        invalid = {d['sentence'] for d in response['matches']}
        outputs = [sentence not in invalid for sentence in sentences]
        return outputs
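
The `sleep(3)` in `Server.start` above is a fixed guess at startup time. One possible alternative, sketched here with a hypothetical `wait_for_port` helper, polls the port until it accepts TCP connections or a deadline passes. An accepted connection only shows that something is listening, so treat it as a heuristic rather than proof that the server is fully initialized.

```python
import socket
import time


def wait_for_port(port, host="localhost", timeout=30.0, interval=0.2):
    """Poll until host:port accepts a TCP connection; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the attempt timed out).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In the snippet above, a call such as `wait_for_port(self.port)` could stand in for the `sleep(3)` line.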

@Bachstelze

OK, this looks straightforward.
But why wouldn't it work to run the Docker image and use it as a remote server, e.g. LanguageTool('en-US', remote_server='localhost:8010')?

@markrogersjr
Author

I'm not sure exactly what LanguageTool(...) means (a Python class?), but Docker would be an unnecessary layer of complexity IMO. As shown in the above snippet, running the LanguageTool server in the background with a Unix command is sufficient.

@Bachstelze

Yes, it is the Python class from this package. So the complete connection from a separate client is:

import language_tool_python
# test the connection with https://github.com/silvio/docker-languagetool
# with version 2.5.3 https://github.com/jxmorris12/language_tool_python
tool = language_tool_python.LanguageTool('en-US', remote_server='localhost:8010')

In many use cases you already have Docker in your stack, so this could be a possible solution for the Java tasks in #38.

@markrogersjr
Author

This doesn't address the original issue.

@jxmorris12
Owner

Hi @markrogersjr - I made a few changes to support this. You can pass a remote_server parameter to the LanguageTool() initialization, which I think should do what you want. You can try something like this:

server

>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US', host='0.0.0.0')
>>> tool._url
'http://0.0.0.0:8081/v2/'

client

>>> import language_tool_python
>>> lang_tool = language_tool_python.LanguageTool("en-US", remote_server='http://0.0.0.0:8081')
>>> lang_tool.check('helo darknes my old frend')
[Match({'ruleId': 'UPPERCASE_SENTENCE_START', 'message': 'This sentence does not start with an uppercase letter.', 'replacements': ['Helo'], 'offsetInContext': 0, 'context': 'helo darknes my old frend', 'offset': 0, 'errorLength': 4, 'category': 'CASING', 'ruleIssueType': 'typographical', 'sentence': 'helo darknes my old frend'}),
 Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['darkness', 'darkens', 'darkies'], 'offsetInContext': 5, 'context': 'helo darknes my old frend', 'offset': 5, 'errorLength': 7, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'}),
 Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['friend', 'trend', 'Fred', 'freed', 'Freud', 'Friend', 'fend', 'fiend', 'frond', 'rend', 'fr end'], 'offsetInContext': 20, 'context': 'helo darknes my old frend', 'offset': 20, 'errorLength': 5, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'})]
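
Building on the client example above, one way this could fit the distributed case from the original post is to send only the server URL to each worker and construct the client there. This is a sketch, not something the package provides: `default_client`, `check_in_worker`, and the `make_client` parameter are hypothetical, and the default factory assumes `language_tool_python` is installed and a server is reachable at the given URL.

```python
def default_client(server_url):
    # Imported lazily so this module loads even where the package is absent.
    import language_tool_python
    return language_tool_python.LanguageTool("en-US", remote_server=server_url)


def check_in_worker(server_url, text, make_client=default_client):
    """Create a client inside the worker and return the number of matches.

    Only `server_url` and `text` (plain strings) cross the process boundary,
    so no server object ever has to be serialized.
    """
    client = make_client(server_url)
    return len(client.check(text))
```

With `concurrent.futures.ProcessPoolExecutor`, a call such as `pool.submit(check_in_worker, 'http://0.0.0.0:8081', text)` would then ship only the two strings to the worker. Creating a client on every call keeps the sketch short; a real worker would probably cache it.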
