Make jiant pip installable (#887)
* Create initial package for PyPI (pip) distribution

* Working PyPI package - can run tutorial experiment

This works for the tutorial experiment, but is still a work-in-progress.

* Add pip dependencies - matched environment.yml as closely as possible
* Move config into jiant package

* PEP 8 long strings changed to multi-line

* Update CI demo.conf path

* Use locally installed pip package for CircleCI

* Update call to ipdb

* Update config paths in non-python files

* Document release and distribution

* Name fix.

* Add note to 'Contributing'
davidbenton authored and sleepinyourhat committed Aug 25, 2019
1 parent 3f19d61 commit b11d710
Showing 43 changed files with 834 additions and 681 deletions.
39 changes: 24 additions & 15 deletions .circleci/config.yml
@@ -3,7 +3,7 @@
 jobs:
   test:
     docker:
-      - image: continuumio/miniconda3:4.6.14
+      - image: circleci/python:3.7.4

     working_directory: ~/repo

@@ -14,39 +14,48 @@
     steps:
       # Step 1: obtain repo from GitHub
       - checkout
-      # Step 2: style check
+      # Step 2: restore pip dependencies from cache, keyed on branch and content of setup.py
+      - restore_cache:
+          key: deps1-{{ .Branch }}-{{ checksum "setup.py" }}
+      # Step 3: style check
       - run:
           name: check style
           command: |
-            pip install black
-            black . --check --exclude "/(\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|_build|buck-out|build|dist|jiant/modules/cove)/"
-      # Step 3: get data needed for demo.sh
+            pip install --user black
+            black . --check --exclude "/(\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|_build|buck-out|build|dist|jiant/modules/cove|venv)/"
+      # Step 4: get data needed for demo.sh
       - run:
           name: download demo data
           command: |
             python scripts/download_glue_data.py --data_dir $JIANT_DATA_DIR --tasks MRPC,STS,SST
             python scripts/download_superglue_data.py --data_dir $JIANT_DATA_DIR --tasks CB
-      # Step 4: create virtual env and install dependencies
+      # Step 5: create virtual env and install dependencies
       - run:
           name: install dependencies
           command: |
-            apt-get update
-            apt-get install -y cmake build-essential gcc g++
-            conda env create -q -f environment.yml
-      # Step 5: demo.sh run test
+            python3 -m venv venv
+            source venv/bin/activate
+            pip install -e ~/repo
+            pip install nose2
+      # Step 6: cache dependencies for future CI jobs
+      - save_cache:
+          key: deps1-{{ .Branch }}-{{ checksum "setup.py" }}
+          paths:
+            - "venv"
+      # Step 7: demo.sh run test
       - run:
           name: running demo.sh
           command: |
-            source activate jiant
+            source venv/bin/activate
             python -m nltk.downloader perluniprops nonbreaking_prefixes punkt
-            python main.py --config_file config/demo.conf
-            python main.py --config_file config/demo.conf --overrides "do_pretrain = 0, do_target_task_training = 0, load_model = 1"
-      # Step 6: run tests
+            python main.py --config_file jiant/config/demo.conf
+            python main.py --config_file jiant/config/demo.conf --overrides "do_pretrain = 0, do_target_task_training = 0, load_model = 1"
+      # Step 8: run tests
       - run:
           name: run tests
           command: |
+            source venv/bin/activate
             mkdir ~/repo/test_output
-            source activate jiant
             nose2 -v
 workflows:
   version: 2
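
Note on the cache key used above: `deps1-{{ .Branch }}-{{ checksum "setup.py" }}` means a cached `venv` is reused only while both the branch name and the contents of `setup.py` stay the same. The following is a minimal Python sketch of that idea, not CircleCI's implementation. The function name `cache_key` and the hex SHA-256 encoding are assumptions made for illustration; CircleCI's own digest encoding may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(branch: str, setup_py: Path, prefix: str = "deps1") -> str:
    """Build an illustrative cache key from a branch name and a file's content hash.

    Hypothetical helper: mirrors the shape of the key template in config.yml,
    using a hex SHA-256 digest as a stand-in for CircleCI's checksum.
    """
    digest = hashlib.sha256(setup_py.read_bytes()).hexdigest()
    return f"{prefix}-{branch}-{digest}"


with tempfile.TemporaryDirectory() as tmp:
    setup_py = Path(tmp) / "setup.py"

    # Same branch, original dependency list.
    setup_py.write_text("install_requires = ['torch']\n")
    key_before = cache_key("master", setup_py)

    # Editing setup.py changes the digest, so the old cache entry no longer matches.
    setup_py.write_text("install_requires = ['torch', 'nltk']\n")
    key_after = cache_key("master", setup_py)

print(key_before != key_after)
```

Because the key depends on `setup.py` alone, a dependency change made only in `environment.yml` would not invalidate the cached `venv` under this scheme.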
8 changes: 8 additions & 0 deletions .gitignore
@@ -8,3 +8,11 @@ user_config.sh
 .ipynb_checkpoints/
 perluniprops/
 .DS_Store
+
+# package/distribution artifacts
+build/
+dist/
+.eggs/
+*.egg-info/
+*.egg
+pip-wheel-metadata/
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
+recursive-include jiant/config/ *.conf
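
Note on the `MANIFEST.in` rule above: `recursive-include jiant/config/ *.conf` asks the source distribution to pick up every file matching `*.conf` anywhere under `jiant/config/`. The sketch below approximates that selection with `pathlib` so the effect is easy to check locally. It is an illustration under assumptions, not how setuptools evaluates the rule; the helper `conf_files` and the file names other than `demo.conf` are hypothetical.

```python
import tempfile
from pathlib import Path


def conf_files(repo_root: Path) -> list:
    """List *.conf files under jiant/config/, recursively, relative to the repo root.

    Hypothetical helper approximating what the recursive-include rule selects.
    """
    config_dir = repo_root / "jiant" / "config"
    return sorted(
        p.relative_to(repo_root).as_posix()
        for p in config_dir.rglob("*.conf")
        if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a small throwaway tree; only demo.conf is a name taken from the commit.
    (root / "jiant" / "config" / "examples").mkdir(parents=True)
    (root / "jiant" / "config" / "demo.conf").write_text("")
    (root / "jiant" / "config" / "examples" / "nested.conf").write_text("")
    (root / "jiant" / "config" / "notes.txt").write_text("")  # not matched by *.conf
    matched = conf_files(root)

print(matched)
```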
20 changes: 18 additions & 2 deletions README.md
@@ -23,9 +23,9 @@ Our official documentation is here: https://jiant.info/documentation#/


 ## Running
-To run an experiment, make a config file similar to `config/demo.conf` with your model configuration. In addition, you can use the `--overrides` flag to override specific variables. For example:
+To run an experiment, make a config file similar to `jiant/config/demo.conf` with your model configuration. In addition, you can use the `--overrides` flag to override specific variables. For example:
 ```sh
-python main.py --config_file config/demo.conf \
+python main.py --config_file jiant/config/demo.conf \
     --overrides "exp_name = my_exp, run_name = foobar, d_hid = 256"
 ```
 will run the demo config, but output to `$JIANT_PROJECT_PREFIX/my_exp/foobar`.
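
Note on the `--overrides` example above: the string is a comma-separated list of `key = value` pairs, and `exp_name` and `run_name` together determine the output directory under `$JIANT_PROJECT_PREFIX`. The sketch below illustrates that relationship only. jiant's actual override parser is not part of this commit view, so `parse_overrides` and `run_dir` are hypothetical stand-ins that keep every value as a string.

```python
import posixpath


def parse_overrides(overrides: str) -> dict:
    """Split 'k = v, k2 = v2' into a dict, ignoring commas inside double quotes.

    Hypothetical, simplified stand-in for a real config-override parser.
    """
    pairs, current, in_quotes = [], [], False
    for ch in overrides:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            pairs.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        pairs.append("".join(current))

    result = {}
    for pair in pairs:
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        result[key.strip()] = value.strip().strip('"')
    return result


def run_dir(project_prefix: str, overrides: dict) -> str:
    """Join the prefix, experiment name, and run name into the output path."""
    return posixpath.join(project_prefix, overrides["exp_name"], overrides["run_name"])


parsed = parse_overrides("exp_name = my_exp, run_name = foobar, d_hid = 256")
print(run_dir("/tmp/jiant_prefix", parsed))
```

The prefix `/tmp/jiant_prefix` is a placeholder for whatever `$JIANT_PROJECT_PREFIX` is set to in a given environment.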
@@ -80,6 +80,22 @@ For any PR, make sure to update any existing `conf` files, tutorials, and script

 For PRs that typical users will need to be aware of, include make a matching PR to the [documentation](https://github.com/nyu-mll/jiant-site/edit/master/documentation/README.md). We will merge that documentation PR once the original PR is merged in _and pushed out in a release_. (Proposals for better ways to do this are welcome.)

+For PRs that change package dependencies, update both `environment.yml` (used for conda) and `setup.py` (used by pip, and in automatic CircleCI tests).
+
+## Releases
+
+Releases are identified using git tags and distributed via PyPI for pip installation. After passing CI tests and creating a new git tag for a release, it can be uploaded to PyPI by running:
+
+```bash
+# create distribution
+python setup.py sdist bdist_wheel
+
+# upload to PyPI
+python -m twine upload dist/*
+```
+
+More details can be found in [setup.py](setup.py).
+

 ## License

2 changes: 1 addition & 1 deletion gcp/kubernetes/README.md
@@ -117,7 +117,7 @@ The `run_batch.sh` script handles creating an appropriate config on the fly; see
 ```sh
 export JIANT_PATH="/nfs/jiant/home/$USER/jiant"
 ./run_batch.sh <job_name> \
-"python $JIANT_PATH/main.py --config_file $JIANT_PATH/config/demo.conf --overrides 'run_name = kubernetes-demo, target_tasks = \"wnli,commitbank\"'"
+"python $JIANT_PATH/main.py --config_file $JIANT_PATH/jiant/config/demo.conf --overrides 'run_name = kubernetes-demo, target_tasks = \"wnli,commitbank\"'"
 ```

 You should see your job as `<job_name>` in Kubernetes Engine -> Workloads, and can monitor status, resource usage, and logs from that page.
6 changes: 3 additions & 3 deletions gcp/kubernetes/run_batch.sh
@@ -10,7 +10,7 @@
 # Example usage:
 # export JIANT_PATH="/nfs/jiant/home/$USER/jiant"
 # ./run_batch.sh <job_name> "python $JIANT_PATH/main.py \
-# --config_file $JIANT_PATH/config/demo.conf \
+# --config_file $JIANT_PATH/jiant/config/demo.conf \
 # --notify <your_email_address>"
 #
 # You can specify additional arguments as flags:
@@ -21,7 +21,7 @@
 #
 # For example:
 # ./run_batch.sh -p demos -m k80 jiant-demo \
-# "python $JIANT_PATH/main.py --config_file $JIANT_PATH/config/demo.conf"
+# "python $JIANT_PATH/main.py --config_file $JIANT_PATH/jiant/config/demo.conf"
 #
 # will run as job name 'demos.jiant-demo' and write results to /nfs/jsalt/exp/demos
 #
@@ -32,7 +32,7 @@ GPU_TYPE="p100"
 PROJECT_NAME="$USER"
 NOTIFY_EMAIL=""

-# Get the NFS path from the Kubernetes config, so that it doesn't need to be
+# Get the NFS path from the Kubernetes config, so that it doesn't need to be
 # hardcoded here.
 pushd $(dirname $0)/templates
 NFS_EXP_DIR=$(jsonnet -S -e "local env = import 'jiant_env.libsonnet'; env.nfs_exp_dir")
