Separate client from server #40
Thanks for the comments @markrogersjr -- can you explain a bit more? In this package, the server is basically spawned as a child process. You're saying to start them separately? I don't understand the difference you're trying to describe between this package and Stanford NLP.
I guess a concrete suggestion would be to define a server class and a client class. You can instantiate the server class anywhere in your code and it will kick off the LanguageTool server with the specified host and port. Then you can have a client query said server anywhere else, referencing only the host and port, not the server object. Hope that helps. I tried using your package but ended up creating my own thin wrapper to suit my needs.
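A minimal sketch of that split, with a toy HTTP server standing in for the LanguageTool server (all names here are illustrative, not part of this package):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class _Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'ok')

    def log_message(self, *args):
        pass  # silence per-request logging

class Server:
    """Owns the listening socket; instantiate it once, anywhere."""
    def __init__(self, host='127.0.0.1', port=0):  # port 0: OS picks a free port
        self.httpd = HTTPServer((host, port), _Handler)
        self.port = self.httpd.server_address[1]
        threading.Thread(target=self.httpd.serve_forever, daemon=True).start()

    def stop(self):
        self.httpd.shutdown()

class Client:
    """Holds only host and port; never touches the Server object."""
    def __init__(self, host='127.0.0.1', port=8081):
        self.url = f'http://{host}:{port}/'

    def query(self):
        with urllib.request.urlopen(self.url) as resp:
            return resp.read().decode()

server = Server()                  # kicks off the server
client = Client(port=server.port)  # knows only host and port
result = client.query()
server.stop()
print(result)  # -> ok
```

Because the client stores nothing but an address, it can be created in any process that can reach the host and port.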
Could we use a Docker image, e.g. https://github.com/silvio/docker-languagetool, and connect with it?
No, that is overkill and probably won't work. Here's a snippet I use from one of my private repos:
```python
# java_server.py
import os
import json
from json.decoder import JSONDecodeError
from time import sleep
from subprocess import call, check_output, CalledProcessError
from traceback import print_exception
from uuid import uuid4

COMMANDS = {
    'pid': 'fuser {port}/tcp',  # report the PID listening on a TCP port
    'stop': 'kill {pid}',
}


class Cluster:
    """A pool of servers, one per port."""

    def __init__(self, ports, server_class):
        self.ports = ports
        self.server_class = server_class
        self.start()

    def start(self):
        self.servers = [self.server_class(port) for port in self.ports]

    def stop(self):
        for server in self.servers:
            server.stop()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Server:
    """Starts a server process on `port` unless one is already listening."""

    def __init__(self, port, start_command, path):
        self.port = port
        self.start_command = start_command
        self.path = os.environ[path]  # `path` names an environment variable
        self.start()

    def start(self):
        if self.is_running():
            return
        cmd = self.start_command.format(port=self.port)
        call(cmd, shell=True, cwd=self.path)
        sleep(3)  # give the server a moment to bind the port

    def is_running(self):
        try:
            self.get_pid()
        except CalledProcessError:
            return False
        return True

    def get_pid(self):
        cmd = COMMANDS['pid'].format(port=self.port)
        return check_output(cmd, shell=True).decode().strip()

    def stop(self):
        try:
            pid = self.get_pid()
            call(COMMANDS['stop'].format(pid=pid), shell=True)
        except CalledProcessError:
            return

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True


class Client:
    """Queries a running server; holds only a port and a shell command."""

    def __init__(self, port, query_command, path):
        self.port = port
        self.query_command = query_command
        self.path = os.environ[path]
        self.start()

    def start(self):
        pass

    def stop(self):
        pass

    def submit_query(self, query):
        # Write the query to a temp file and substitute it via $(cat ...),
        # which sidesteps shell-quoting issues in the query text.
        query = query.strip()
        path = f'/tmp/q{uuid4().hex}'
        with open(path, 'w') as f:
            f.write(query)
        query = f'$(cat {path})'
        cmd = self.query_command.format(query=query, port=self.port)
        try:
            stdout = check_output(cmd, shell=True, cwd=self.path).decode()
            response = json.loads(stdout)
        except (CalledProcessError, JSONDecodeError):
            response = None
        os.remove(path)
        return response

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        if exc_type is not None:
            print_exception(exc_type, exc_value, traceback)
        return True
```
```python
from java_server import Cluster, Server, Client

COMMANDS = {
    # Launch the LanguageTool HTTP server in the background.
    'start': 'java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port {port} --allow-origin "*" &',
    # Query it with curl.
    'query': "curl --data 'language=en-US&text={query}' http://localhost:{port}/v2/check",
}
DEFAULT_PORT = 8081
DEFAULT_PATH = 'LANGUAGE_TOOL_PATH'  # env var pointing at the LanguageTool directory


class GrammarCheckerCluster(Cluster):
    def __init__(self, ports):
        Cluster.__init__(self, ports, GrammarCheckerServer)


class GrammarCheckerServer(Server):
    def __init__(self, port=DEFAULT_PORT):
        Server.__init__(self, port, COMMANDS['start'], DEFAULT_PATH)


class GrammarCheckerClient(Client):
    def __init__(self, port=DEFAULT_PORT):
        Client.__init__(self, port, COMMANDS['query'], DEFAULT_PATH)

    def check(self, sentences):
        """Return a list of booleans: True where a sentence had no matches."""
        query = ' '.join(sentences)
        response = self.submit_query(query)
        if response is None:
            return None
        invalid = {d['sentence'] for d in response['matches']}
        return [sentence not in invalid for sentence in sentences]
```
OK, this looks straightforward.
I'm not sure exactly what LanguageTool(...) means (a Python class?), but Docker would be an unnecessary layer of complexity IMO. As shown in the snippet above, running the LanguageTool server with a Unix command in the background is sufficient.
Yes, it is the Python class from this package. So the complete connection to a separate client is:
In many use cases you have Docker in your stack anyway. Therefore it could be a possible solution for the Java tasks in #38.
This doesn't address the original issue. |
Hi @markrogersjr - I made a few changes to support this.

server:

```python
>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US', host='0.0.0.0')
>>> tool._url
'http://0.0.0.0:8081/v2/'
```

client:

```python
>>> import language_tool_python
>>> lang_tool = language_tool_python.LanguageTool("en-US", remote_server='http://0.0.0.0:8081')
>>> lang_tool.check('helo darknes my old frend')
[Match({'ruleId': 'UPPERCASE_SENTENCE_START', 'message': 'This sentence does not start with an uppercase letter.', 'replacements': ['Helo'], 'offsetInContext': 0, 'context': 'helo darknes my old frend', 'offset': 0, 'errorLength': 4, 'category': 'CASING', 'ruleIssueType': 'typographical', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['darkness', 'darkens', 'darkies'], 'offsetInContext': 5, 'context': 'helo darknes my old frend', 'offset': 5, 'errorLength': 7, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['friend', 'trend', 'Fred', 'freed', 'Freud', 'Friend', 'fend', 'fiend', 'frond', 'rend', 'fr end'], 'offsetInContext': 20, 'context': 'helo darknes my old frend', 'offset': 20, 'errorLength': 5, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'})]
```
For some distributed computations, the client must be instantiated separately from the server. It must be created within a separate environment since the server cannot be serialized.
Language Tool Python lumps the client and server into a single object so that when the LanguageTool object is created, the server starts. The same object is used to perform queries. This design precludes the distributed computations mentioned above.
Consider the design that Stanford NLP uses, which is to run a server in a separate process and create a client within any Python process.
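The serialization constraint can be demonstrated directly: an object holding a live child process cannot be pickled, while a client that stores only an address can. A toy sketch (class names here are illustrative, not from this package):

```python
import pickle
import subprocess
import sys

class BundledServer:
    """Mimics the lumped design: holds a live child process."""
    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, '-c', 'import time; time.sleep(30)'])

    def stop(self):
        self.proc.kill()

class ThinClient:
    """Holds only an address, so it can be shipped to worker processes."""
    def __init__(self, host, port):
        self.host, self.port = host, port

server = BundledServer()
client = ThinClient('0.0.0.0', 8081)

# The thin client round-trips through pickle without trouble.
assert pickle.loads(pickle.dumps(client)).port == 8081

# The bundled server does not: its process handle carries
# OS-level state (locks, file handles) that pickle rejects.
try:
    pickle.dumps(server)
    picklable = True
except Exception:
    picklable = False
server.stop()
assert picklable is False
```

This is why a distributed job can broadcast only host and port to workers, each of which then builds its own client.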