added to readme, replaced jsonschema dependency
Oren Baldinger committed Apr 11, 2019
1 parent 66cf85f commit 372cb40
Showing 4 changed files with 76 additions and 6 deletions.
70 changes: 69 additions & 1 deletion README.md
@@ -1,6 +1,6 @@
# Python JSON-NLP Module

(C) 2019 by [Damir Cavar], Oren Baldinger, Maanvitha Gongalla, Anurag Kumar, Murali Kammili
(C) 2019 by [Damir Cavar], [Oren Baldinger], Maanvitha Gongalla, Anurag Kumar, Murali Kammili

Brought to you by the [NLP-Lab.org]!

@@ -24,12 +24,80 @@ To install this package, run the following command:

You might have to use *pip3* on some systems.

## Validation

[JSON-NLP] is based on a schema, built by [NLP-Lab.org], to comprehensively and concisely represent linguistic annotations.
We provide a validator to help ensure that generated JSON validates against the schema:

result = MyPipeline().process(text="I am a sentence")
assert pyjsonnlp.validation.is_valid(result)

## Conversion

To enable interoperability with other annotation formats, we support conversions between them.
Note that conversion can be lossy if the two formats do not annotate to the same depth.
Currently we have a [CoNLL-U] to [JSON-NLP] converter that covers most annotations:

pyjsonnlp.conversion.parse_conllu(conllu_text)

This functionality is still a work in progress.

## Pipeline

[JSON-NLP] provides a simple `Pipeline` interface that should be implemented for embedding into a microservice:

from collections import OrderedDict

class MockPipeline(pyjsonnlp.pipeline.Pipeline):
    @staticmethod
    def process(text='', coreferences=False, constituents=False, dependencies=False, expressions=False,
                **kwargs) -> OrderedDict:
        return OrderedDict()

The provided keyword arguments should be used to toggle on or off processing components within the method.
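
As a rough illustration of that toggling, the sketch below gates each optional component on its keyword argument. It is not the package's own implementation: `SketchPipeline` is a hypothetical stand-in that deliberately does not subclass `pyjsonnlp.pipeline.Pipeline` so that it runs without the package installed, the whitespace tokenizer is a placeholder, and the output keys other than `tokenList` are illustrative rather than taken from the JSON-NLP schema.

```python
from collections import OrderedDict


class SketchPipeline:
    """Hypothetical stand-in with the same process() signature as MockPipeline.

    A real pipeline would subclass pyjsonnlp.pipeline.Pipeline instead.
    """

    @staticmethod
    def process(text='', coreferences=False, constituents=False,
                dependencies=False, expressions=False, **kwargs) -> OrderedDict:
        result = OrderedDict()

        # Tokens are always produced; whitespace splitting is only a placeholder
        # for a real tokenizer.
        result['tokenList'] = [
            {'id': i, 'text': tok}
            for i, tok in enumerate(text.split(), start=1)
        ]

        # Each optional component runs only when its flag is switched on.
        # The empty lists stand in for the output of real processing components,
        # and the key names are illustrative, not schema-accurate.
        if coreferences:
            result['coreferences'] = []
        if constituents:
            result['constituents'] = []
        if dependencies:
            result['dependencies'] = []
        if expressions:
            result['expressions'] = []

        return result


output = SketchPipeline.process(text="I am a sentence", dependencies=True)
```

Only the requested component appears in `output` next to the token list; components whose flags stay `False` are skipped entirely, which is the point of exposing them as keyword arguments.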

## Microservice

The next step is the [JSON-NLP] Microservice class, which comes with a pre-built [Flask] implementation:

from pyjsonnlp.microservices.flask_server import FlaskMicroservice

app = FlaskMicroservice(__name__, MyPipeline(), base_route='/')

We recommend creating a `server.py` with the `FlaskMicroservice` class, which extends the [Flask] app. A corresponding WSGI file would contain:

from mypipeline.server import app as application

To disable a pipeline component (such as phrase structure parsing), add:

application.constituents = False

The full list of properties that can be enabled or disabled is:
- constituents
- dependencies
- coreference
- expressions

The microservice exposes the following URIs:
- /constituents
- /dependencies
- /coreference
- /expressions
- /token_list

Each of these URIs is a shortcut that disables the other components of the parse. In all cases, `tokenList` will be included in the `JSON-NLP` output. An example URL is:

http://localhost:5000/dependencies?text=I am a sentence

Text is provided to the microservice with the `text` parameter, via either `GET` or `POST`. If you pass `url` as a parameter, the microservice will scrape that URL and process the text of the web page.
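
When calling the service from code rather than a browser, the spaces in `text` need to be encoded. The following sketch builds both request styles with the Python standard library only. The helper names `build_get_url` and `build_post_request` are hypothetical, the base address assumes a local development server as in the example URL, and sending the `POST` parameters as a form-encoded body is an assumption about how the server reads them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed local development address, matching the example URL above.
BASE = "http://localhost:5000"


def build_get_url(endpoint: str, **params) -> str:
    """Return a GET URL whose query parameters are safely encoded."""
    return f"{BASE}{endpoint}?{urlencode(params)}"


def build_post_request(endpoint: str, **params) -> Request:
    """Return an unsent POST request carrying form-encoded parameters.

    Assumption: the server accepts form-encoded POST bodies.
    """
    body = urlencode(params).encode("utf-8")
    return Request(f"{BASE}{endpoint}", data=body, method="POST")


get_url = build_get_url("/dependencies", text="I am a sentence")
post_req = build_post_request("/dependencies", text="I am a sentence")
```

Nothing is sent here. With a server running, `get_url` or `post_req` could be passed to `urllib.request.urlopen` to obtain the response.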

Other parameters specific to your pipeline implementation can be passed as well:

http://localhost:5000?lang=en&constituents=0&text=I am a sentence.


[Damir Cavar]: http://damir.cavar.me/ "Damir Cavar"
[Oren Baldinger]: https://oren.baldinger.me/ "Oren Baldinger"
[NLP-Lab.org]: http://nlp-lab.org/ "NLP-Lab.org"
[JSON-NLP]: https://github.com/dcavar/JSON-NLP "JSON-NLP"
[Flair]: https://github.com/zalandoresearch/flair "Flair"
2 changes: 1 addition & 1 deletion pyjsonnlp/__init__.py
@@ -14,7 +14,7 @@
from typing import List

name = "pyjsonnlp"
__version__ = "0.2.4"
__version__ = "0.2.5"


def get_base() -> OrderedDict:
2 changes: 1 addition & 1 deletion pyjsonnlp/validation.py
@@ -13,7 +13,7 @@
from typing import List, Tuple

from pyjsonnlp.pipeline import Pipeline
from jsonschema import Draft7Validator, ValidationError
from jsonschemanlplab import Draft7Validator, ValidationError
from pyjsonnlp import remove_empty_fields


8 changes: 5 additions & 3 deletions setup.py
@@ -6,7 +6,8 @@

setuptools.setup(
name="pyjsonnlp",
version="0.2.4",
version='0.2.5',
python_requires='>=3.6',
author="Damir Cavar, Oren Baldinger, Maanvitha Gongalla, Anurag Kumar, Murali Kammili",
author_email="damir@cavar.me",
description="The Python JSON-NLP package",
@@ -16,13 +17,14 @@
packages=setuptools.find_packages(),
install_requires=[
'conllu>=1.2.3',
'jsonschema>=3.0.1',
'jsonschemanlplab>=3.0.1.1',
'flask',
'iso639',
'bs4'
],
classifiers=[
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.6",
"Programming Language :: Python :: 3.7",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
],
