
Added Wit.ai document type determiner (#63)

* wit.ai configurations

* working search type and query

* final wit.ai commit

* final wit.ai commit

* changed settings configurations

* added new search function

* pep8 changes

* removed .idea file

* new search url with wit

* added in line comment

* added search parameter url

* removed new url, modified to use doc type instead

* removed commented line

* removed find_search_doc used doc_exist instead

* updated read me with wit details

* added images to gitignore

* correcting grammatical error

* integrated wit.ai into runquery function

* removed commented line

* Update .gitignore
Awinja-Andela authored and DavidLemayian committed Oct 28, 2017
1 parent f6c4b0b commit a6aeeb94d90168eea3faf54599f2f4b1aee0f202
@@ -55,6 +55,60 @@ $ nosetests healthtools/tests/nhif_inpatient.py
$ nosetests healthtools/tests/nhif_outpatient.py
$ nosetests healthtools/tests/nhif_outpatient_cs.py
```
## Wit.ai
Wit.ai is a simple natural language processing API that helps developers turn speech and text into actionable data. In healthtools.API, we use it to separate the doc_type from the query param in the text sent to the API.
For example, sending `dr mary` to wit.ai returns:

```
{
  "msg_id": "0csXmLxhRzgowQFJI",
  "_text": "dr mary",
  "entities": {
    "doctors": [
      {
        "confidence": 0.96676127393653,
        "value": "dr",
        "type": "value"
      }
    ],
    "query": [
      {
        "suggested": true,
        "confidence": 0.99538476989175,
        "value": "mary",
        "type": "value"
      }
    ]
  }
}
```
Here the doc_type has been identified as `doctors` and the query param as `mary`.
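
As a rough illustration of how those two values could be pulled out of a response shaped like the one above, here is a minimal sketch. The helper name `split_entities` is hypothetical and not part of healthtools.API. It assumes that any entity key other than `query` names the doc_type, and it converts underscores to hyphens because the function added in this commit notes that wit.ai returns hyphens as underscores.

```python
import json

# The sample response from above, trimmed to the fields the sketch reads.
SAMPLE_RESPONSE = json.loads("""
{
  "_text": "dr mary",
  "entities": {
    "doctors": [{"confidence": 0.96676127393653, "value": "dr", "type": "value"}],
    "query": [{"suggested": true, "confidence": 0.99538476989175,
               "value": "mary", "type": "value"}]
  }
}
""")


def split_entities(response):
    """Return (doc_type, query) from a wit.ai-style response dict (hypothetical helper)."""
    entities = response.get("entities", {})
    # Join the values of the `query` entity into the search string.
    query = " ".join(item["value"] for item in entities.get("query", []))
    # Assumption: the first entity that is not `query` names the doc_type.
    doc_types = [name for name in entities if name != "query"]
    doc_type = doc_types[0].replace("_", "-") if doc_types else None
    return doc_type, query


doc_type, query = split_entities(SAMPLE_RESPONSE)
print(doc_type, query)  # doctors mary
```

Returning `None` for the doc_type when no entity matches keeps the result a two-item tuple, so a caller can always unpack it.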
To configure wit.ai:
- Sign up on the wit.ai website with either your GitHub or Facebook account.
- Create an app with the name you prefer to use.
- Detect your first entities:

![Alt text](images/teach.gif?raw=true "Title")

- Improve the detection:

This is done by continuously training your app to identify keywords or sentences. Input a message and check its prediction: if it is wrong, correct it; if it is right, validate it.

![Alt text](images/training.gif?raw=true "Title")

- Configure the server access token in the environment:

The token can be found in the Settings tab, under API Details.
![Alt text](images/at.png?raw=true "Title")

To set it in the environment, run `export WIT_ACCESS_TOKEN='Token'`, replacing `Token` with your server access token.
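
If the variable is missing, `os.getenv("WIT_ACCESS_TOKEN")` returns `None`, and the problem only shows up later when a request is made. The sketch below shows one way to fail early with a clear message. The function name `load_wit_token` is hypothetical and not part of healthtools.API.

```python
import os


def load_wit_token(environ=os.environ):
    """Return the wit.ai token, or raise a clear error if it is not set (hypothetical helper)."""
    token = environ.get("WIT_ACCESS_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "WIT_ACCESS_TOKEN is not set; run export WIT_ACCESS_TOKEN='Token' first"
        )
    return token


# Usage with an explicit mapping, so the example does not depend on the real environment.
print(load_wit_token({"WIT_ACCESS_TOKEN": "example-token"}))  # example-token
```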

The following endpoint runs wit.ai on healthtools.API:

| URL Endpoint | HTTP Method | Functionality | Parameter |
|--------------|-------------|---------------|-----------|
| /search/wit | GET | Search Query | q=[query] |
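
As a small illustration of the table above, the sketch below builds the request URL for a query. The base URL `http://localhost:8000` and the helper name `build_wit_search_url` are assumptions made for the example; substitute the host where your instance of healthtools.API is running.

```python
from urllib.parse import urlencode

# Assumption: a locally running instance; replace with your own host.
BASE_URL = "http://localhost:8000"


def build_wit_search_url(query, base_url=BASE_URL):
    """Return the GET URL for /search/wit with the query URL-encoded (hypothetical helper)."""
    return "{}/search/wit?{}".format(base_url.rstrip("/"), urlencode({"q": query}))


print(build_wit_search_url("dr mary"))
# http://localhost:8000/search/wit?q=dr+mary
```

Encoding the query with `urlencode` keeps spaces and special characters in the search text from breaking the URL.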


## Contributing

@@ -57,7 +57,6 @@

}


def get_docs():
    return DOCUMENTS

@@ -1,3 +1,7 @@
from wit import Wit
from nested_lookup import nested_lookup
from healthtools.settings import WIT_ACCESS_TOKEN

from healthtools.documents import DOCUMENTS, doc_exists

from healthtools.search import elastic, nurses
@@ -7,6 +11,14 @@ def run_query(query, doc_type=None):

    search_type = None

    if doc_type == 'wit':
        doc_type, query = determine_doc_type_using_wit(query)
        # Compare strings with != ; `is not` tests object identity, not equality
        if doc_type != 'nurses':
            search_type = 'elastic'
        else:
            search_type = 'nurses'
        return doc_type, search_type

    doc_type, search_type = determine_doc_type(query, doc_type)

    if not doc_type:
@@ -53,3 +65,25 @@ def remove_keywords(query):
        if query.startswith(keyword + ' '):
            return query.replace(keyword, '', 1).strip()
    return query


"""
Wit.ai predicts the doc name the user is trying to search for.
The prediction depends on how the wit.ai app has been trained.
wit.ai is used when the query is made using /search/wit?q=<query>
"""

def determine_doc_type_using_wit(query, doc_type=None):
    """
    This returns the doc name and the query.
    The response contains 2 keys, one being the doc name and the other the query.
    wit.ai returns all hyphens as underscores.
    """
    client = Wit(access_token=WIT_ACCESS_TOKEN)
    message_text = query
    resp = client.message(message_text)
    query = ''.join(nested_lookup('value', resp['entities']['query']))
    doc_type = ''.join([var for var in (resp['entities'].keys()) if var != 'query'])
    doc_type = doc_type.replace("_", "-")  # changes underscore to hyphen
    if doc_exists(doc_type):
        return doc_type, query
    # Return a pair even when no known doc type matched, so callers can unpack it
    return None, query
@@ -56,3 +56,9 @@
}

SLACK_URL = os.getenv('HTOOLS_SLACK_URL')

################################################################################

# Wit.ai Access Token

WIT_ACCESS_TOKEN = os.getenv("WIT_ACCESS_TOKEN")
@@ -2,18 +2,17 @@

from healthtools.search import run_query


blueprint = Blueprint('search_api', __name__)


@blueprint.route('/search', methods=['GET'], strict_slashes=False)
@blueprint.route('/search/<doc_type>', methods=['GET'], strict_slashes=False)
def index(doc_type=None):
    query = request.args.get('q')

    result, doc_type = run_query(query, doc_type)

    # Error with run_query (run_query returns false)
-    if(not result):
+    if not result:
        return jsonify({
            'result': {'hits': [], 'total': 0},
            'doc_type': doc_type,
BIN +243 KB images/at.png
Binary file not shown.
BIN +884 KB images/teach.gif
Binary file not shown.
BIN +492 KB images/training.gif
Binary file not shown.
