
Added Dockerfiles for both yake and yake-server (Rest API)

To build both images, cd to ./docker and run ./build.sh
Added @arianpasquali's gist for the REST API, but with bind to 0.0.0.0:5000 to accept incoming connections from outside the Docker container (with 127.0.0.1 it would not respond to requests from the host, for example)
Overhauled README to have the 3 installation scenarios clearly explained
Added script to test out the REST API (./test_rest_api_on_docker.sh in ./docker)
Added credits for DevOps work (see AUTHORS.rst) ;-)
Next step is to add automated builds of the images on Docker Hub when you push to the repo...
silvae86 committed Jan 30, 2019
1 parent 439579c commit 2f4253a8fc92f2e3c973ce1d06e06004be521588
@@ -14,3 +14,8 @@ Contributors
------------

None yet. Why not be the first?


DevOps - Docker
----------------
* João Rocha da Silva <joaorosilva@gmail.com>
111 README.md
@@ -55,12 +55,56 @@ YAKE! Collection-independent Automatic Keyword Extractor
Proceedings of the 40th European Conference on Information Retrieval (ECIR'18), Grenoble, France. March 26 – 29
https://link.springer.com/chapter/10.1007/978-3-319-76941-7_80

## Requirements
## Installing YAKE!

Python3
There are three installation alternatives.

- To run YAKE! from the command line (say, to integrate it into a script), you can use our [simple YAKE! Docker container](#cli-image).
- To run YAKE! as a RESTful API that *runs in the background*, say to integrate it into a web application, you can use our [RESTful API server image](#rest-api-image).
- To install YAKE! straight "on the metal", or to integrate it into your Python app, you can [install it and its dependencies](#standalone-installation).

<a name="cli-image"></a>
### Option 1. YAKE as a CLI utility inside a Docker container

First, install Docker. Ubuntu users, please see our [script below](#installing-docker) for a complete installation script.

Then, run:

```bash
docker run feupinfolab/yake:latest -ti "Caffeine is a central nervous system (CNS) stimulant of the methylxanthine class.[10] It is the world's most widely consumed psychoactive drug. Unlike many other psychoactive substances, it is legal and unregulated in nearly all parts of the world. There are several known mechanisms of action to explain the effects of caffeine. The most prominent is that it reversibly blocks the action of adenosine on its receptor and consequently prevents the onset of drowsiness induced by adenosine. Caffeine also stimulates certain portions of the autonomic nervous system."
```
*Example text from Wikipedia*
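
The same container can be driven from a script. The sketch below is a minimal illustration, assuming Docker is installed and using the image name and the `-ti` flag from the command above; `build_yake_command` and `extract_keywords_via_docker` are hypothetical helper names, not part of YAKE!.

```python
import shlex
import subprocess

# Image name and the -ti flag are taken from the command above.
YAKE_IMAGE = "feupinfolab/yake:latest"


def build_yake_command(text, image=YAKE_IMAGE):
    """Return the argv list that runs the YAKE! CLI image on `text`."""
    # --rm removes the container once yake exits; everything after the
    # image name is passed on to the command inside the container.
    return ["docker", "run", "--rm", image, "-ti", text]


def extract_keywords_via_docker(text, image=YAKE_IMAGE):
    """Run the container and return whatever yake prints to stdout."""
    completed = subprocess.run(
        build_yake_command(text, image),
        check=True,               # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text
    )
    return completed.stdout


if __name__ == "__main__":
    sample = "Caffeine is a central nervous system (CNS) stimulant."
    # Show the shell-quoted command without actually starting a container.
    print(" ".join(shlex.quote(arg) for arg in build_yake_command(sample)))
```

Passing the text as a single list element avoids shell quoting problems with punctuation such as brackets or apostrophes.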

<a name="rest-api-image"></a>
### Option 2. REST API Server in a Docker container

This option gives you a local mirror of the original YAKE! REST API, which is available [here](https://boiling-castle-88317.herokuapp.com).

```bash
docker run -d -p 5000:5000 feupinfolab/yake-rest:latest
```

After it starts up, the container will run in the background, at http://127.0.0.1:5000. To access the YAKE! API documentation, go to http://127.0.0.1:5000/apidocs/.
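
Because the container starts in the background, a script may want to wait until the server actually answers. The following is a sketch only, assuming the `/apidocs/` page mentioned above is reachable from the host; `wait_for_server`, the attempt count and the delay are illustrative choices, not part of YAKE!.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, max_attempts=60, delay=1.0):
    """Poll `url` until it answers; return True on success, False on timeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout or HTTP error status: not ready yet.
            if attempt < max_attempts:
                time.sleep(delay)
    return False
```

For example, `wait_for_server("http://127.0.0.1:5000/apidocs/")` returns `True` as soon as the page is served and `False` if it never answers within the allowed attempts.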

## Installation
You can test the RESTful API using `curl`:

```bash
curl 'http://127.0.0.1:5000/yake/' \
-XPOST \
-H 'Accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
--data 'text=Coffee%20is%20a%20brewed%20drink%20prepared%20from%20roasted%20coffee%20beans%2C%20the%20seeds%20of%20berries%20from%20certain%20Coffea%20species.%20The%20genus%20Coffea%20is%20native%20to%20tropical%20Africa%20(specifically%20having%20its%20origin%20in%20Ethiopia%20and%20Sudan)%20and%20Madagascar%2C%20the%20Comoros%2C%20Mauritius%2C%20and%20R%C3%A9union%20in%20the%20Indian%20Ocean.%5B2%5D%20Coffee%20plants%20are%20now%20cultivated%20in%20over%2070%20countries%2C%20primarily%20in%20the%20equatorial%20regions%20of%20the%20Americas%2C%20Southeast%20Asia%2C%20Indian%20subcontinent%2C%20and%20Africa.%20The%20two%20most%20commonly%20grown%20are%20C.%20arabica%20and%20C.%20robusta.%20Once%20ripe%2C%20coffee%20berries%20are%20picked%2C%20processed%2C%20and%20dried.%20Dried%20coffee%20seeds%20(referred%20to%20as%20%22beans%22)%20are%20roasted%20to%20varying%20degrees%2C%20depending%20on%20the%20desired%20flavor.%20Roasted%20beans%20are%20ground%20and%20then%20brewed%20with%20near-boiling%20water%20to%20produce%20the%20beverage%20known%20as%20coffee.&language=en&max_ngram_size=4&number_of_keywords=10'
```
*Example text from Wikipedia*
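
The same request can be issued from Python. This is a minimal sketch using only the standard library, assuming the endpoint and form fields shown in the `curl` call above; `build_yake_request` and `extract_keywords` are hypothetical helper names, not part of YAKE!.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and field names mirror the curl call above; adjust the host and
# port if the container is published elsewhere.
YAKE_URL = "http://127.0.0.1:5000/yake/"


def build_yake_request(text, language="en", max_ngram_size=4,
                       number_of_keywords=10, url=YAKE_URL):
    """Build (but do not send) the form-encoded POST request."""
    form = urllib.parse.urlencode({
        "text": text,
        "language": language,
        "max_ngram_size": max_ngram_size,
        "number_of_keywords": number_of_keywords,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=form,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def extract_keywords(text, **options):
    """Send the request and decode the JSON body returned by the server."""
    with urllib.request.urlopen(build_yake_request(text, **options)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`urlencode` takes care of the percent-encoding that the `curl` example spells out by hand, so the text can be passed as an ordinary string.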

<a name="standalone-installation"></a>
### Option 3. Standalone Installation (for development or integration)

#### Requirements

Python3

#### Installation

To install Yake using pip:

@@ -70,9 +114,7 @@ To upgrade using pip:

pip install git+https://github.com/LIAAD/yake --upgrade

## Usage

### Command line
#### Usage (Command line)

How to use it on your favorite command line

@@ -91,7 +133,7 @@ How to use it on your favorite command line
-v, --verbose
--help Show this message and exit.

### Python
### Usage (Python)

How to use it on Python

@@ -122,19 +164,58 @@ How to use it on Python

## Related projects

### yake-dockerfile

https://github.com/feup-infolab/yake-dockerfile - Dockerfile for building an image for this package.

Credits to https://github.com/silvae86
### Dockerfiles

https://github.com/feup-infolab/yake-dockerfile - Dockerfile for building an image for this package.
https://github.com/feup-infolab/yake-rest-dockerfile - Dockerfile for building an image of the RESTful API version of this package.

### `pke` - python keyphrase extraction

https://github.com/boudinfl/pke - `pke` is an **open source** python-based **keyphrase extraction** toolkit. It
provides an end-to-end keyphrase extraction pipeline in which each component can
be easily modified or extended to develop new models. `pke` also allows for
easy benchmarking of state-of-the-art keyphrase extraction models, and
ships with supervised models trained on the SemEval-2010 dataset (http://aclweb.org/anthology/S10-1004).

Credits to https://github.com/boudinfl

<a name="installing-docker"></a>
## How to install Docker

Here is the "just copy and paste" installation script for Docker on Ubuntu. Enjoy.

```bash
# Install dependencies
sudo apt-get update
sudo apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common
# Add Docker repo
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
# Install Docker
sudo apt-get install -y docker-ce
# Start Docker Daemon
sudo service docker start
# Add yourself to the Docker user group, otherwise docker will complain that
# it does not know if the Docker Daemon is running
sudo usermod -aG docker ${USER}
# Install docker-compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
source ~/.bashrc
docker-compose --version
echo "Done!"
```
@@ -0,0 +1,28 @@
FROM library/python:3.7.1-alpine

# change to temp dir
WORKDIR /temp

# install git and build-base (GCC, etc.)
RUN apk update && apk upgrade && \
apk add --no-cache bash git openssh && \
apk add build-base

RUN pip install flasgger

# install requirements first to engage docker cache
RUN wget https://raw.githubusercontent.com/feup-infolab/yake/master/requirements.txt -O requirements.txt
RUN pip install -r requirements.txt

# install yake via pip
RUN pip install git+https://github.com/liaad/yake.git

# Copy server startup script
COPY ./yake-rest-api.py /temp

# Expose server port
ENV SERVER_PORT 5000
EXPOSE "$SERVER_PORT"

# set default command
CMD [ "python", "yake-rest-api.py" ]
@@ -0,0 +1,82 @@
"""
Credits @arianpasquali
https://gist.githubusercontent.com/arianpasquali/16b2b0ab2095ee35dbede4dd2f4f8520/raw/ba4ea7da0d958fc4b1b2e694f45f17cc71d8238d/yake_rest_api.py
The simple example serving YAKE as a rest api.
instructions:
pip install flasgger
pip install git+https://github.com/LIAAD/yake
python yake_rest_api.py
open http://127.0.0.1:5000/apidocs/
"""

from flask import Flask, jsonify, request

from flasgger import Swagger
import yake

app = Flask(__name__)
app.config['SWAGGER'] = {
    'title': 'Yake API sample'
}
Swagger(app)

@app.route('/yake/',methods=['POST'])
def handle_yake():
    """Example endpoint return a list of keywords using YAKE
    ---
    parameters:
      - name: text
        in: formData
        type: string
        description: text
        required: true
      - name: language
        in: formData
        type: string
        description: language
        required: true
        default: "en"
        enum: ["pt", "en", "es", "fr", "it", "de" ]
      - name: max_ngram_size
        in: formData
        type: integer
        description: max size of ngram
        required: true
        default: 4
      - name: number_of_keywords
        in: formData
        type: integer
        description: number of keywords to return
        required: true
        default: 10
    responses:
      200:
        description: Extract keywords from input text
    """
    print(request.form)
    text = request.form["text"]
    language = request.form["language"]
    max_ngram_size = int(request.form["max_ngram_size"])
    number_of_keywords = int(request.form["number_of_keywords"])

    my_yake = yake.KeywordExtractor(lan=language,
                                    n=max_ngram_size,
                                    top=number_of_keywords,
                                    dedupLim=0.8,
                                    windowsSize=2
                                    )

    keywords = my_yake.extract_keywords(text)
    result = [{"ngram":x[1] ,"score":x[0]} for x in keywords]
    return jsonify(result)



if __name__ == "__main__":
    app.run(host='0.0.0.0', debug=True)
@@ -0,0 +1,15 @@
FROM library/python:3.7.1-alpine

# change to temp dir
WORKDIR /temp

# install git and build-base (GCC, etc.)
RUN apk update && apk upgrade && \
apk add --no-cache bash git openssh && \
apk add build-base

# install yake via pip
RUN pip install git+https://github.com/liaad/yake.git

# set default command
ENTRYPOINT ["yake"]
@@ -0,0 +1,22 @@
#!/usr/bin/env bash
if [ $# -eq 0 ]
then
    tag='latest'
else
    tag=$1
fi

INITIAL_DIR=$(pwd)
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

cd "$DIR/Dockerfiles/yake-server"
docker build -t "feupinfolab/yake-server:$tag" .
# Run detached so the script can go on to build the second image
docker run -d -p 5000:5000 "feupinfolab/yake-server:$tag"

cd "$DIR/Dockerfiles/yake"
docker build -t "feupinfolab/yake:$tag" .
docker run -d "feupinfolab/yake:$tag"

docker ps -a

cd "$INITIAL_DIR"
@@ -0,0 +1,41 @@
#!/usr/bin/env bash
YAKE_PORT="5000"

function wait_for_server_to_boot_on_port()
{
    local ip=$1
    local port=$2

    if [[ $ip == "" ]]; then
        ip="127.0.0.1"
    fi
    local attempts=0
    local max_attempts=60

    echo "Waiting for server on $ip:$port to boot up..."

    response=$(curl -s "$ip:$port")
    echo "$response"

    # Poll the API docs page: the app defines no route for "/", so probing
    # the root with --fail would keep failing even once the server is up.
    until curl --output /dev/null --silent --fail "http://$ip:$port/apidocs/"; do
        if (( attempts >= max_attempts )); then
            echo "Server on $ip:$port failed to start after $max_attempts attempts"
            return 1
        fi
        attempts=$((attempts+1))
        echo "waiting... (${attempts}/${max_attempts})"
        sleep 1
    done

    echo "Server on $ip:$port started successfully at attempt (${attempts}/${max_attempts})"
}

wait_for_server_to_boot_on_port "127.0.0.1" "$YAKE_PORT"

# The server in this commit exposes POST /yake/ and reads form fields
curl "http://127.0.0.1:${YAKE_PORT}/yake/" \
    --header "Content-Type: application/x-www-form-urlencoded" \
    --header "Accept: application/json" \
    --request "POST" \
    --data-urlencode "text=Caffeine is a central nervous system (CNS) stimulant of the methylxanthine class.[10] It is the world's most widely consumed psychoactive drug. Unlike many other psychoactive substances, it is legal and unregulated in nearly all parts of the world. There are several known mechanisms of action to explain the effects of caffeine. The most prominent is that it reversibly blocks the action of adenosine on its receptor and consequently prevents the onset of drowsiness induced by adenosine. Caffeine also stimulates certain portions of the autonomic nervous system." \
    --data "language=en" \
    --data "max_ngram_size=3" \
    --data "number_of_keywords=30"
