This repository contains a Flask application that serves GenParse requests from a GCP instance.
- Create a GCP instance (ideally with at least one GPU).
- Set up a static external IP address for this instance:
  - Promote the VM's ephemeral external address to a static one by following the instructions here.
  - Allow HTTP and HTTPS traffic.
  - Expose ports `8888` and `9999`. In `probcomp-caliban`, this can be done by adding the `caliban` and `expose-port-9999` network tags.
- SSH into the GCP instance.
- Clone this repository.
- Set up `genparse` and download the server dependencies. To do this, you can either:
  - run `./setup.sh`, which will create a conda environment with `genparse` and the repository's dependencies; or
  - set up `genparse` and this repository manually:
    - Set up `genparse` by running the following (we recommend installing `genparse` in a conda environment with Python 3.10):

      ```
      git clone git@github.com:probcomp/genparse.git
      cd genparse
      make env
      ```

    - Install Flask and Waitress with `pip install flask waitress`.
- Create a service file and start the server with `sh start.sh path-to-genparse-conda-env`. For example:

  ```
  sh start.sh /home/lebrunb/miniconda3/envs/genparse
  ```
- Check the status of the server with `sudo systemctl status genparse-server-app.service`.
  - Error and log files are written to `src/genparse_server/log`. Check these logs after initializing.
  - Note: You may need to authenticate with Hugging Face. To do so, run `huggingface-cli login`.
- You can restart the server with `sudo systemctl restart genparse-server-app.service`.
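If your project does not provide the `caliban` and `expose-port-9999` network tags, the port-exposure step above can be sketched with `gcloud` firewall rules. The rule name and target tag below are illustrative assumptions, not names defined by this repository:

```shell
# Open ports 8888 (inference server) and 9999 (restart app) for
# instances carrying an illustrative genparse-server tag.
gcloud compute firewall-rules create allow-genparse \
  --allow=tcp:8888,tcp:9999 \
  --direction=INGRESS \
  --target-tags=genparse-server

# Attach the tag to your VM (substitute your instance name).
gcloud compute instances add-tags INSTANCE_NAME --tags=genparse-server
```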
This repository also contains a Flask application for remotely restarting the GenParse server in case of failure. This application is located in `src/restart_service_app` and can be accessed at `<STATIC-IP>:9999/` in a browser. It is started when you run `start.sh`, and you can check its status with `sudo systemctl status restart-service-app.service`.
Once set up, the server should accept requests made to `<STATIC-IP>:8888/infer`. For example, in Python:
```python
import requests

url = 'http://<STATIC-IP>:8888/infer'
headers = {'Content-Type': 'application/json'}
data = {
    "prompt": "",
    "method": "smc-standard",
    "n_particles": 5,
    "lark_grammar": "start: \"Sequential Monte Carlo is \" ( \"good\" | \"bad\" )",
    "proposal_name": "character",
    "proposal_args": {},
    "max_tokens": 25
}
response = requests.post(url, json=data, headers=headers)
```

Or with `curl`:

```shell
curl -X POST <STATIC-IP>:8888/infer \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "",
    "method": "smc-standard",
    "n_particles": 5,
    "lark_grammar": "start: \"Sequential Monte Carlo is \" ( \"good\" | \"bad\" )",
    "proposal_name": "character",
    "proposal_args": {},
    "max_tokens": 25
  }'
```
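For scripted use, the request above can be wrapped in a small helper. This is a sketch using only the Python standard library; the server address and the assumption that `/infer` returns a JSON body are properties of your deployment, not guaranteed by this README:

```python
import json
from urllib import request, error

def infer(server, payload, timeout=30):
    """POST an inference request to the GenParse server and return the
    decoded JSON response. `server` is e.g. 'http://<STATIC-IP>:8888'."""
    req = request.Request(
        f"{server}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except error.URLError as exc:
        # A failure here may mean the server is down; see the restart
        # app at <STATIC-IP>:9999/ described above.
        raise RuntimeError(f"inference request failed: {exc}") from exc
```

Raising on connection errors (rather than returning `None`) makes failures visible to calling scripts, which is when the restart app becomes useful.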