This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821)

* implement TorchImageLoader

* implement ResnetBackbone

* add resize + normalize to image loader

* finalize FasterRcnnRegionDetector

* pin torchvision

* fix VQAv2Reader

* add box mask field

* dataset reader fixes

* fix model tests

* doc fixes

* add threshold parameters to FasterRcnnRegionDetector

* address @dirkgr comments

* mask fixes

* shape comments

* add some more comments

* cache answers_by_question_id

* implement LocalCacheResource

* fix

* add read-only option to cache

* fix

* simplify data loader

* make featurizer and detector optional in readers

* Cache in memory

* back pressure is important I guess

* merge

* Updated configs

* Fixes the way we apply masks

* Use more of Jiasen's real settings

* Upgrade the from_huggingface config

* Switch back to the images on corpnet

* Fix random seeds

* Bigger model needs smaller batch size

* Adds ability to selectively ignore one input

* address some comments

* format + lint

* fixes

* Bring back bert-base configs

* fix error handling

* fix test

* fix typo

* use lock when possible

Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
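
The "back pressure" item in the commit message refers to a general idea: a producer should stall when the consumer falls behind, so that prepared batches do not pile up in memory. The sketch below illustrates only that idea with a bounded standard-library queue and a thread. It is not the MultiProcessDataLoader code from this commit, and the names `produce` and `consume_all` are hypothetical.

```python
import queue
import threading


def produce(batches, out_queue):
    """Push batches into a bounded queue; put() blocks while the queue is full."""
    for batch in batches:
        out_queue.put(batch)  # this blocking call is the back pressure
    out_queue.put(None)  # sentinel: no more batches


def consume_all(num_batches=20, max_queue_size=4):
    """Drain the queue and record the largest queue size observed."""
    q = queue.Queue(maxsize=max_queue_size)
    worker = threading.Thread(
        target=produce, args=(range(num_batches), q), daemon=True
    )
    worker.start()

    received = []
    max_seen = 0
    while True:
        max_seen = max(max_seen, q.qsize())
        item = q.get()
        if item is None:
            break
        received.append(item)

    worker.join()
    return received, max_seen
```

With `max_queue_size=4`, the producer can never be more than four items ahead of the consumer, regardless of how many batches there are in total.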
epwalsh and dirkgr committed Dec 17, 2020
1 parent 3da8e62 commit c4e3f77
Showing 46 changed files with 1,441 additions and 1,479 deletions.
4 changes: 1 addition & 3 deletions .github/workflows/ci.yml
@@ -293,10 +293,8 @@ jobs:
# Check the install instructions on https://pytorch.org/ to keep these up-to-date.
if [[ $CUDA == '10.2' ]]; then
echo "DOCKER_TORCH_VERSION='torch==1.7.0 torchvision==0.8.1'" >> $GITHUB_ENV;
-echo "DOCKER_DETECTRON_VERSION='detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html'" >> $GITHUB_ENV;
elif [[ $CUDA == '11.0' ]]; then
echo "DOCKER_TORCH_VERSION='torch==1.7.0+cu110 torchvision==0.8.1+cu110 -f https://download.pytorch.org/whl/torch_stable.html'" >> $GITHUB_ENV;
-echo "DOCKER_DETECTRON_VERSION='detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html'" >> $GITHUB_ENV;
else
echo "Unhandled CUDA version $CUDA";
exit 1;
@@ -314,7 +312,7 @@ jobs:
- name: Build image
run: |
-make docker-image DOCKER_IMAGE_NAME="$DOCKER_IMAGE_NAME" DOCKER_TORCH_VERSION="$DOCKER_TORCH_VERSION" DOCKER_DETECTRON_VERSION="$DOCKER_DETECTRON_VERSION"
+make docker-image DOCKER_IMAGE_NAME="$DOCKER_IMAGE_NAME" DOCKER_TORCH_VERSION="$DOCKER_TORCH_VERSION"
- name: Test image
run: |
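
The CUDA-to-requirement selection in the ci.yml hunk above can be restated as a small lookup. This is a sketch for illustration only: the two requirement strings are copied from the workflow, while the names `TORCH_REQUIREMENTS` and `docker_torch_version` are hypothetical and do not appear in the repository.

```python
# Requirement strings copied from the ci.yml hunk; the mapping itself is
# an illustrative restatement of the shell if/elif/else chain.
TORCH_REQUIREMENTS = {
    "10.2": "torch==1.7.0 torchvision==0.8.1",
    "11.0": (
        "torch==1.7.0+cu110 torchvision==0.8.1+cu110 "
        "-f https://download.pytorch.org/whl/torch_stable.html"
    ),
}


def docker_torch_version(cuda: str) -> str:
    """Return the pip requirement string for a CUDA version, or fail loudly."""
    try:
        return TORCH_REQUIREMENTS[cuda]
    except KeyError:
        # Mirrors the workflow's "Unhandled CUDA version" branch (exit 1).
        raise ValueError(f"Unhandled CUDA version {cuda}") from None
```

After this commit there is no matching detectron2 entry to keep in sync, since torchvision is pinned in the same requirement string as torch.
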
8 changes: 3 additions & 5 deletions Dockerfile
@@ -18,12 +18,10 @@ LABEL com.nvidia.volumes.needed="nvidia_driver"

WORKDIR /stage/allennlp

-# Install torch and detectron2 first. This build arg should be in the form of a version requirement,
+# Install torch ecosystem first. This build arg should be in the form of a version requirement,
# like 'torch==1.7' or 'torch==1.7+cu102 -f https://download.pytorch.org/whl/torch_stable.html'.
ARG TORCH
RUN pip install --no-cache-dir ${TORCH}
-ARG DETECTRON
-RUN pip install --no-cache-dir ${DETECTRON}

# Installing AllenNLP's dependencies is the most time-consuming part of building
# this Docker image, so we make use of layer caching here by adding the minimal files
@@ -32,11 +30,11 @@ COPY allennlp/version.py allennlp/version.py
COPY setup.py .
RUN touch allennlp/__init__.py \
&& touch README.md \
-&& pip install --no-cache-dir -e .[vision]
+&& pip install --no-cache-dir -e .

# Now add the full package source and re-install just the package.
COPY allennlp allennlp
-RUN pip install --no-cache-dir --no-deps -e .[vision]
+RUN pip install --no-cache-dir --no-deps -e .

WORKDIR /app/

9 changes: 3 additions & 6 deletions Dockerfile.test
@@ -17,12 +17,10 @@ LABEL com.nvidia.volumes.needed="nvidia_driver"

WORKDIR /stage/allennlp

-# Install torch and detectron2 first. This build arg should be in the form of a version requirement,
+# Install torch ecosystem first. This build arg should be in the form of a version requirement,
# like 'torch==1.7' or 'torch==1.7+cu102 -f https://download.pytorch.org/whl/torch_stable.html'.
ARG TORCH
RUN pip install --no-cache-dir ${TORCH}
-ARG DETECTRON
-RUN pip install --no-cache-dir ${DETECTRON}

# Installing AllenNLP's dependencies is the most time-consuming part of building
# this Docker image, so we make use of layer caching here by adding the minimal files
@@ -32,11 +30,10 @@ COPY setup.py .
COPY dev-requirements.txt .
RUN touch allennlp/__init__.py \
&& touch README.md \
-&& grep -Ev 'detectron' dev-requirements.txt \
-| pip install --no-cache-dir -e .[vision] -r /dev/stdin
+&& pip install --no-cache-dir -e . -r dev-requirements.txt

# Now add the full package source and re-install just the package.
COPY . .
-RUN pip install --no-cache-dir --no-deps -e .[vision]
+RUN pip install --no-cache-dir --no-deps -e .

ENTRYPOINT ["make"]
9 changes: 1 addition & 8 deletions Makefile
@@ -14,10 +14,8 @@ DOCKER_TAG = latest
DOCKER_IMAGE_NAME = allennlp/allennlp:$(DOCKER_TAG)
DOCKER_TEST_IMAGE_NAME = allennlp/test:$(DOCKER_TAG)
DOCKER_TORCH_VERSION = 'torch==1.7.0 torchvision==0.8.1'
-DOCKER_DETECTRON_VERSION = 'detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html'
# Our self-hosted runner currently has CUDA 11.0.
DOCKER_TEST_TORCH_VERSION = 'torch==1.7.0+cu110 torchvision==0.8.1+cu110 -f https://download.pytorch.org/whl/torch_stable.html'
-DOCKER_TEST_DETECTRON_VERSION = 'detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html'
DOCKER_RUN_CMD = docker run --rm \
-v $$HOME/.allennlp:/root/.allennlp \
-v $$HOME/.cache/huggingface:/root/.cache/huggingface \
@@ -87,10 +85,7 @@ install :
# Due to a weird thing with pip, we may need egg-info before running `pip install -e`.
# See https://github.com/pypa/pip/issues/4537.
python setup.py install_egg_info
-# Install allennlp as editable and all dependencies except detectron since it requires torch to already be installed.
-grep -Ev 'detectron' dev-requirements.txt | pip install --upgrade --upgrade-strategy eager -e .[vision] -r /dev/stdin
-# Now install detectron.
-grep -E 'detectron' dev-requirements.txt | pip install --upgrade -r /dev/stdin
+pip install --upgrade --upgrade-strategy eager -e . -r dev-requirements.txt

#
# Documention helpers.
@@ -149,7 +144,6 @@ docker-image :
--pull \
-f Dockerfile \
--build-arg TORCH=$(DOCKER_TORCH_VERSION) \
---build-arg DETECTRON=$(DOCKER_DETECTRON_VERSION) \
-t $(DOCKER_IMAGE_NAME) .

.PHONY : docker-run
@@ -162,7 +156,6 @@ docker-test-image :
--pull \
-f Dockerfile.test \
--build-arg TORCH=$(DOCKER_TEST_TORCH_VERSION) \
---build-arg DETECTRON=$(DOCKER_TEST_DETECTRON_VERSION) \
-t $(DOCKER_TEST_IMAGE_NAME) .

.PHONY : docker-test-run
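
After this change the image build takes a single `TORCH` build argument. Because that value contains spaces (several requirements plus an optional `-f` index URL), it has to reach Docker as one argument. The sketch below assembles the command as an argument list to make that explicit. It assumes the Makefile recipe invokes `docker build` with the flags visible in the hunks above; the helper name `docker_build_command` is hypothetical.

```python
import shlex


def docker_build_command(image_name, torch_requirement, dockerfile="Dockerfile"):
    """Build the argv for an image build with a single TORCH build arg."""
    return [
        "docker", "build",
        "--pull",
        "-f", dockerfile,
        # One list element, so the spaces inside the requirement string
        # are not split into separate arguments.
        "--build-arg", f"TORCH={torch_requirement}",
        "-t", image_name,
        ".",
    ]


if __name__ == "__main__":
    cmd = docker_build_command(
        "allennlp/allennlp:latest", "torch==1.7.0 torchvision==0.8.1"
    )
    # Print a shell-safe rendering; nothing is executed here.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run` would avoid shell quoting entirely; the Makefile achieves the same effect by wrapping the variable's value in single quotes.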
4 changes: 1 addition & 3 deletions README.md
@@ -150,10 +150,8 @@ to distribute as a plugin, see the [subcommand API docs](https://docs.allennlp.o

AllenNLP requires Python 3.6.1 or later and [PyTorch](https://pytorch.org/).
It's recommended that you install the PyTorch ecosystem **before** installing AllenNLP by following the instructions on [pytorch.org](https://pytorch.org/).
-If you intend to utilize the vision features of this library, you'll also need to install [detectron2](https://github.com/facebookresearch/detectron2), which requires PyTorch to be installed first.

-The preferred way to install AllenNLP is via `pip`. Just run `pip install allennlp`. Or, if you want the vision
-features of the library, run `pip install allennlp[vision]`.
+The preferred way to install AllenNLP is via `pip`. Just run `pip install allennlp`.

> ⚠️ If you're using Python 3.7 or greater, you should ensure that you don't have the PyPI version of `dataclasses` installed after running the above command, as this could cause issues on certain platforms. You can quickly check this by running `pip freeze | grep dataclasses`. If you see something like `dataclasses=0.6` in the output, then just run `pip uninstall -y dataclasses`.
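
The `dataclasses` check described in the README note can also be done from Python instead of `pip freeze | grep`. This is a minimal sketch, assuming Python 3.8 or later for `importlib.metadata`; the helper names `installed_version` and `needs_dataclasses_uninstall` are hypothetical.

```python
import sys
from importlib import metadata  # standard library from Python 3.8


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def needs_dataclasses_uninstall(python_version=sys.version_info,
                                backport_version=None) -> bool:
    """True when the PyPI `dataclasses` backport is present on Python 3.7+."""
    return python_version >= (3, 7) and backport_version is not None


if __name__ == "__main__":
    version = installed_version("dataclasses")
    if needs_dataclasses_uninstall(backport_version=version):
        print(f"dataclasses=={version} found; run: pip uninstall -y dataclasses")
    else:
        print("No PyPI dataclasses backport detected.")
```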
