brochington/sematle


Sematle NLU API

Sematle NLU is an API that makes a no-holds-barred attempt to convert plain English into structured data. The project currently uses no ML; instead it relies on linguistic analysis libraries such as OpenCog's link-grammar and Duckling. The focus is not only on identifying objects but also on defining the connections between them.

Status

The project is functional but at a very early stage. Documentation is still sparse, but in general the following object types are supported:

  • Agents: people, e.g. "John Smith"
  • Entities: more or less nouns, e.g. "vitamin A"
  • Temporal: time, e.g. "today"
  • Actions: verbs, e.g. "eat"
  • Events: usually an action that happens at a given time, e.g. "John Smith eats at 2pm"
  • Queries: questions are detected, but connections to other objects are not yet supported.

TODO:

  • Locations
  • Relations
  • Logic

Example

Request

curl --location --request POST '<api-endpoint>/text-to-json' \
--header 'Authorization: Bearer <auth_token>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "sentences": ["Jane Smith baked a cake for Thomas on January 21 , 1990"]
}'
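
The same request can also be built from Python using only the standard library. This is a minimal sketch, not part of the project: `build_request` is a hypothetical helper, and the base URL `http://localhost:8088` is an assumption based on the `tcp_port` in the example config.

```python
import json
import urllib.request


def build_request(base_url, auth_token, sentences):
    """Build (but do not send) a POST request for the text-to-json endpoint."""
    body = json.dumps({"sentences": sentences}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/text-to-json",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + auth_token,
            "Content-Type": "application/json",
        },
    )


# Assumed values for illustration only; substitute your own endpoint and token.
request = build_request(
    "http://localhost:8088",
    "<auth_token>",
    ["Jane Smith baked a cake for Thomas on January 21 , 1990"],
)

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.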

Response

{
    "sema_sentences": [
        {
            "agents": [
                {
                    "agent_type": "person",
                    "symbol": "$1",
                    "properties": [
                        {
                            "first_name": "jane"
                        },
                        {
                            "last_name": "smith"
                        }
                    ]
                },
                {
                    "agent_type": "person",
                    "symbol": "$2",
                    "properties": [
                        {
                            "name": "thomas"
                        }
                    ]
                }
            ],
            "entities": [
                {
                    "entity_type": "cake",
                    "symbol": "$4",
                    "properties": []
                }
            ],
            "locations": [],
            "temporal": [
                {
                    "temporal_type": "absolute",
                    "symbol": "$5",
                    "text": "on January 21 , 1990",
                    "properties": [
                        {
                            "iso": "1990-01-21T00:00:00.000-08:00"
                        }
                    ]
                }
            ],
            "relations": [],
            "actions": [
                {
                    "action_type": "bake",
                    "symbol": "$3",
                    "properties": [
                        {
                            "agent": "$1"
                        },
                        {
                            "patient": "$4"
                        },
                        {
                            "recipient": "$2"
                        }
                    ]
                }
            ],
            "events": [
                {
                    "event_type": "event",
                    "symbol": "$6",
                    "properties": [
                        {
                            "occurs": "$5"
                        },
                        {
                            "action": "$3"
                        }
                    ]
                }
            ],
            "queries": []
        }
    ]
}
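
In the response above, objects refer to each other by symbol: the "bake" action names "$1" as its agent, and the event points at the action "$3" and the temporal "$5". A client will usually want to follow those references. The sketch below shows one way to do that in Python, assuming the response shape shown above; the helper names (`index_by_symbol`, `merge_properties`, `resolve_references`) are hypothetical and not part of the API, and `sample` is a trimmed copy of the example response.

```python
def index_by_symbol(sema_sentence):
    """Map each symbol (e.g. "$1") to its object, across all categories."""
    index = {}
    for objects in sema_sentence.values():
        for obj in objects:
            index[obj["symbol"]] = obj
    return index


def merge_properties(obj):
    """Flatten the list of single-key property dicts into one dict."""
    merged = {}
    for prop in obj["properties"]:
        merged.update(prop)
    return merged


def resolve_references(sema_sentence, obj):
    """Return {role: referenced object} for every property holding a known symbol."""
    index = index_by_symbol(sema_sentence)
    resolved = {}
    for role, value in merge_properties(obj).items():
        # Only string values that match a known symbol are treated as references.
        if isinstance(value, str) and value in index:
            resolved[role] = index[value]
    return resolved


# Trimmed copy of the first entry of "sema_sentences" from the example response.
sample = {
    "agents": [
        {"agent_type": "person", "symbol": "$1",
         "properties": [{"first_name": "jane"}, {"last_name": "smith"}]},
        {"agent_type": "person", "symbol": "$2",
         "properties": [{"name": "thomas"}]},
    ],
    "entities": [{"entity_type": "cake", "symbol": "$4", "properties": []}],
    "temporal": [
        {"temporal_type": "absolute", "symbol": "$5",
         "text": "on January 21 , 1990",
         "properties": [{"iso": "1990-01-21T00:00:00.000-08:00"}]},
    ],
    "actions": [
        {"action_type": "bake", "symbol": "$3",
         "properties": [{"agent": "$1"}, {"patient": "$4"}, {"recipient": "$2"}]},
    ],
    "events": [
        {"event_type": "event", "symbol": "$6",
         "properties": [{"occurs": "$5"}, {"action": "$3"}]},
    ],
}

bake_roles = resolve_references(sample, sample["actions"][0])
event_roles = resolve_references(sample, sample["events"][0])
```

Here `bake_roles["agent"]` is the "jane smith" agent object, `bake_roles["patient"]` is the cake entity, and `event_roles["occurs"]` is the temporal object for January 21, 1990.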

Dependencies

At the moment this project requires a running Duckling server. The easiest way to get started is to create a server using the Dockerfile included in the root of the project.

Config

Example config:

Config(
  logging_directive: "actix_web=info",
  tcp_port: 8088,
  allowed_origins: [
    "http://localhost:8088"
  ],
  graceful_shutdown_timeout_sec: 3,
  max_payload_size_bytes: 1024,
  database_connection_pool_size: 5,
  database_connection_timeout_sec: 3,
  database_url: "", // not used at the moment, but will be used in the future.
  use_jwt_auth: false, // set to true to use JWT auth.
  jwt_secret: "<secret>", // used for JWT auth, if turned on.
  data_path: "<path to project>/sema-api/sema-api/data",
  duckling_url: "<duckling-url>/parse",
)
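
With `max_payload_size_bytes` set to 1024 as in the example config, a client may want to check a request body before sending it. The sketch below is an assumption-laden illustration: it assumes the limit applies to the UTF-8 encoded JSON body, which this README does not confirm, and `payload_fits` is a hypothetical helper name.

```python
import json

# Value taken from the example config; adjust to match your own config.
MAX_PAYLOAD_SIZE_BYTES = 1024


def payload_fits(sentences, limit=MAX_PAYLOAD_SIZE_BYTES):
    """Return True if the UTF-8 encoded JSON body is within the limit.

    Assumes the server measures the raw request body; this is not confirmed.
    """
    body = json.dumps({"sentences": sentences}).encode("utf-8")
    return len(body) <= limit


short_ok = payload_fits(["Jane Smith baked a cake for Thomas on January 21 , 1990"])
long_ok = payload_fits(["word " * 500])
```

If a batch of sentences does not fit, splitting it across several requests is one option; raising the limit in the config is another.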

Installation

The easiest way to get started is to run the project inside a Docker container. The project includes a Dockerfile for building an image. After building the image, pass either a CONFIG or CONFIG_PATH environment variable to the container when you start it.

You will need to build the link-grammar library. From the project root, run:

$ ./install_link_grammar.sh

Development

Make sure you have a recent version of Rust installed.

# from project root
cargo install cargo-watch
cd sema-api
cargo watch -x run --clear --no-gitignore

This should start the server and watch for changes to the project.

Help

Feel free to create an issue or open a pull request!