Adding e2e test #47

Closed
wants to merge 1 commit into base: nyt-untouched
+164,630 −4

Adding e2e test

mtlynch committed Jul 27, 2018
commit 5876e039a6e5dd36373c94bd793c83d7457034a6
@@ -0,0 +1,8 @@
---
sudo: required
dist: trusty
language: python
python:
- "2.7"
services: docker
script: ./docker_build
@@ -13,8 +13,6 @@ RUN git clone https://github.com/mtlynch/crfpp.git && \
cd ..
# Install ingredient-phrase-tagger.
RUN git clone https://github.com/NYTimes/ingredient-phrase-tagger && \
    cd ingredient-phrase-tagger && \
    python setup.py install
ADD . /ingredient-phrase-tagger
WORKDIR /ingredient-phrase-tagger
RUN python setup.py install
@@ -0,0 +1,46 @@
#!/bin/sh
# Exit build script on first failure
set -e
# Echo commands to stdout.
set -x
COUNT_TRAIN=20000
COUNT_TEST=2000
OUTPUT_DIR=$(mktemp -d)
ACTUAL_CRF_TRAINING_FILE="${OUTPUT_DIR}/training_data.crf"
ACTUAL_CRF_TESTING_FILE="${OUTPUT_DIR}/testing_data.crf"
ACTUAL_CRF_MODEL_FILE="${OUTPUT_DIR}/model.crfmodel"
ACTUAL_TESTING_OUTPUT_FILE="${OUTPUT_DIR}/testing_output"
ACTUAL_EVAL_OUTPUT_FILE="${OUTPUT_DIR}/eval_output"
bin/generate_data \
  --data-path=nyt-ingredients-snapshot-2015.csv \
  --count=$COUNT_TRAIN \
  --offset=0 > "$ACTUAL_CRF_TRAINING_FILE"
bin/generate_data \
  --data-path=nyt-ingredients-snapshot-2015.csv \
  --count=$COUNT_TEST \
  --offset=$COUNT_TRAIN > "$ACTUAL_CRF_TESTING_FILE"
crf_learn template_file "$ACTUAL_CRF_TRAINING_FILE" "$ACTUAL_CRF_MODEL_FILE"
crf_test \
  -m "$ACTUAL_CRF_MODEL_FILE" \
  "$ACTUAL_CRF_TESTING_FILE" > "$ACTUAL_TESTING_OUTPUT_FILE"
python bin/evaluate.py "$ACTUAL_TESTING_OUTPUT_FILE" > "$ACTUAL_EVAL_OUTPUT_FILE"
# Check against golden output.
GOLDEN_DIR=tests/golden
GOLDEN_CRF_TRAINING_FILE="${GOLDEN_DIR}/training_data.crf"
GOLDEN_CRF_TESTING_FILE="${GOLDEN_DIR}/testing_data.crf"
GOLDEN_TESTING_OUTPUT_FILE="${GOLDEN_DIR}/testing_output"
GOLDEN_EVAL_OUTPUT_FILE="${GOLDEN_DIR}/eval_output"
diff --context=2 "$GOLDEN_CRF_TRAINING_FILE" "$ACTUAL_CRF_TRAINING_FILE"
diff --context=2 "$GOLDEN_CRF_TESTING_FILE" "$ACTUAL_CRF_TESTING_FILE"
diff --context=2 "$GOLDEN_TESTING_OUTPUT_FILE" "$ACTUAL_TESTING_OUTPUT_FILE"
diff "$GOLDEN_EVAL_OUTPUT_FILE" "$ACTUAL_EVAL_OUTPUT_FILE"
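For readers who want to see the golden-output check outside the shell, here is a minimal Python sketch of the same idea. It is an illustration only and not part of this commit: `diff_against_golden` and `read_lines` are hypothetical helpers, and the sketch uses the standard library's `difflib.context_diff` with two lines of context to approximate `diff --context=2`. Unlike `diff` under `set -e`, it returns the differences instead of aborting the script.

```python
import difflib


def read_lines(path):
    """Read a file as a list of lines, keeping line endings."""
    with open(path) as f:
        return f.readlines()


def diff_against_golden(golden_lines, actual_lines, context=2):
    """Return context-diff lines; an empty list means the outputs match."""
    return list(
        difflib.context_diff(
            golden_lines,
            actual_lines,
            fromfile="golden",
            tofile="actual",
            n=context,
        )
    )
```

A caller could pass `read_lines("tests/golden/eval_output")` and the lines of the freshly generated eval output, then fail the build if the returned list is non-empty.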
@@ -0,0 +1,21 @@
#!/bin/bash
# Exit on first failing command.
set -e
# Echo commands to console.
set -x
IMAGE_NAME="ingredient-phrase-tagger-image"
CONTAINER_NAME="ingredient-phrase-tagger-container"
docker build \
  --tag "$IMAGE_NAME" \
  .
docker run \
  --tty \
  --detach \
  --name "$CONTAINER_NAME" \
  "$IMAGE_NAME"
docker exec "$CONTAINER_NAME" ./build.sh
@@ -0,0 +1,10 @@
Sentence-Level Stats:
correct: 1493
total: 1999
% correct: 74.6873436718
Word-Level Stats:
correct: 10402
total: 11450
% correct: 90.8471615721
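The percentages in this golden output can be sanity-checked against the counts. The sketch below is not taken from `bin/evaluate.py`, which is not shown in this diff. It assumes the percentage is simply `100 * correct / total`, and that the printed precision is Python 2's default `str()` of a float (12 significant digits), which the Python 2.7 setting in the CI config makes plausible. `percent_correct` and `format_like_python2_str` are hypothetical names.

```python
def percent_correct(correct, total):
    """Percentage of correct items, as a float."""
    return 100.0 * correct / total


def format_like_python2_str(value):
    """Format a float with 12 significant digits, as Python 2's str() does."""
    return "%.12g" % value


# Counts copied from the golden eval output above.
sentence_pct = format_like_python2_str(percent_correct(1493, 1999))
word_pct = format_like_python2_str(percent_correct(10402, 11450))

print(sentence_pct)  # 74.6873436718
print(word_pct)      # 90.8471615721
```

Under those assumptions both values agree with the `% correct` lines, so the counts and percentages in the golden file are internally consistent.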