This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Merge remote-tracking branch 'origin/master' into vision
dirkgr committed Oct 6, 2020
2 parents f1e46fd + 39ddb52 commit e39a5f6
Showing 54 changed files with 1,381 additions and 345 deletions.
5 changes: 1 addition & 4 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -11,10 +11,7 @@ assignees: ''
Please fill this template entirely and do not erase any of it.
We reserve the right to close incomplete bug reports without a response.
If you can't fill in the checklist then it's likely that this is a question, not a bug,
in which case it probably belongs on our discourse forum instead:
https://discourse.allennlp.org/
If you have a question rather than a bug, please ask on [Stack Overflow](https://stackoverflow.com/questions/tagged/allennlp) rather than posting an issue here.
-->

## Checklist
10 changes: 10 additions & 0 deletions .github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,10 @@
---
name: Question
about: Ask a question
title: ''
labels: 'question'
assignees: ''

---

Please ask questions on [Stack Overflow](https://stackoverflow.com/questions/tagged/allennlp) rather than on GitHub. We monitor and triage questions on Stack Overflow with the AllenNLP label and questions there are more easily searchable for others.
6 changes: 3 additions & 3 deletions .github/workflows/master.yml
@@ -42,7 +42,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ['3.6', '3.7']
python: ['3.7', '3.8']

steps:
- uses: actions/checkout@v2
@@ -110,7 +110,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ['3.6', '3.7']
python: ['3.7', '3.8']

steps:
- uses: actions/checkout@v2
@@ -230,7 +230,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ['3.6', '3.7']
python: ['3.7', '3.8']

steps:
- name: Setup Python
4 changes: 2 additions & 2 deletions .github/workflows/pull_request.yml
@@ -49,7 +49,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ['3.6', '3.7']
python: ['3.7', '3.8']

steps:
- uses: actions/checkout@v2
@@ -118,7 +118,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ['3.6', '3.7']
python: ['3.7', '3.8']

steps:
- uses: actions/checkout@v2
43 changes: 41 additions & 2 deletions CHANGELOG.md
@@ -29,19 +29,58 @@ data loaders. Those are coming soon.

### Added

- Added a `build-vocab` subcommand that can be used to build a vocabulary from a training config file.
- Added `tokenizer_kwargs` argument to `PretrainedTransformerMismatchedIndexer`.
- Added `tokenizer_kwargs` and `transformer_kwargs` arguments to `PretrainedTransformerMismatchedEmbedder`.
- Added official support for Python 3.8.
- Added a script: `scripts/release_notes.py`, which automatically prepares markdown release notes from the
CHANGELOG and commit history.
- Added a flag `--predictions-output-file` to the `evaluate` command, which tells AllenNLP to write the
predictions from the given dataset to the file as JSON lines.
- Added the ability to ignore certain missing keys when loading a model from an archive. This is done
by adding a class-level variable called `authorized_missing_keys` to any PyTorch module that a `Model` uses.
If defined, `authorized_missing_keys` should be a list of regex string patterns.
- Added `FBetaMultiLabelMeasure`, a multi-label Fbeta metric. This is a subclass of the existing `FBetaMeasure`.
- Added ability to pass additional keyword arguments to `cached_transformers.get()`, which will be passed on to `AutoModel.from_pretrained()`.
- Added an `overrides` argument to `Predictor.from_path()`.
- Added a `cached-path` command.
- Added a function `inspect_cache` to `common.file_utils` that prints useful information about the cache. This can also
be used from the `cached-path` command with `allennlp cached-path --inspect`.
- Added a function `remove_cache_entries` to `common.file_utils` that removes any cache entries matching the given
glob patterns. This can be used from the `cached-path` command with `allennlp cached-path --remove some-files-*`.
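
The glob-based cache removal described in the last item can be sketched with nothing but the standard library. This is an illustrative assumption about the mechanism, not the actual `common.file_utils.remove_cache_entries` implementation (which also handles lock files and extraction directories):

```python
import fnmatch
import os
import tempfile

def remove_cache_entries(cache_dir: str, patterns: list) -> int:
    """Delete every file in `cache_dir` whose name matches one of the
    glob patterns, and return how many bytes were freed. Illustrative
    sketch only."""
    freed = 0
    for name in os.listdir(cache_dir):
        if any(fnmatch.fnmatch(name, p) for p in patterns):
            path = os.path.join(cache_dir, name)
            freed += os.path.getsize(path)
            os.remove(path)
    return freed

# Usage: populate a throwaway cache dir, then remove matching entries.
cache = tempfile.mkdtemp()
for name in ["some-files-1.bin", "some-files-2.bin", "keep.bin"]:
    with open(os.path.join(cache, name), "wb") as f:
        f.write(b"x" * 10)
freed = remove_cache_entries(cache, ["some-files-*"])
print(freed)                     # 20 bytes freed
print(sorted(os.listdir(cache)))  # ['keep.bin']
```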

### Changed

- Subcommands that don't require plugins will no longer cause plugins to be loaded or have an `--include-package` flag.
- Allow overrides to be JSON string or `dict`.
- `transformers` dependency updated to version 3.1.0.
- When `cached_path` is called on a local archive with `extract_archive=True`, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to `cached_path` will know to re-extract the archive.
- Removed the `truncation_strategy` parameter to `PretrainedTransformerTokenizer`. The way we're calling the tokenizer, the truncation strategy has no effect anyway.
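
The extraction-directory change above keys the directory on both the archive's path and its modification time. A minimal sketch of that idea, assuming a hash-based naming scheme (the real `cached_path` naming may differ):

```python
import hashlib
import os
import tempfile

def extraction_dir(cache_root: str, archive_path: str) -> str:
    """Derive a cache subdirectory unique to both the archive's absolute
    path and its mtime, so a changed archive maps to a fresh directory.
    Illustrative sketch of the idea, not AllenNLP's exact scheme."""
    mtime = os.path.getmtime(archive_path)
    key = f"{os.path.abspath(archive_path)}-{mtime}".encode()
    return os.path.join(cache_root, hashlib.sha256(key).hexdigest() + "-extracted")

cache_root = tempfile.mkdtemp()
fd, archive = tempfile.mkstemp(suffix=".tar.gz")
os.close(fd)
first = extraction_dir(cache_root, archive)
# Changing the archive's mtime yields a different extraction directory.
os.utime(archive, (0, 0))
second = extraction_dir(cache_root, archive)
print(first != second)  # True
```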

### Removed

- Removed `common.util.is_master` function.

### Fixed

- Class decorators now displayed in API docs.
- Fixed up the documentation for the `allennlp.nn.beam_search` module.
- Ignore `*args` when constructing classes with `FromParams`.
- Ensured some consistency in the types of the values that metrics return.
- Fix a PyTorch warning by explicitly providing the `as_tuple` argument (leaving
it as its default value of `False`) to `Tensor.nonzero()`.
- Remove temporary directory when extracting model archive in `load_archive`
at end of function rather than via `atexit`.
- Fixed a bug where using `cached_path()` offline could return a cached resource's lock file instead
of the cache file.
- Fixed a bug where `cached_path()` would fail if passed a `cache_dir` with the user home shortcut `~/`.
- Fixed a bug in our doc building script where markdown links did not render properly
if the "href" part of the link (the part inside the `()`) was on a new line.
- Changed how gradients are zeroed out with an optimization. See [this video from NVIDIA](https://www.youtube.com/watch?v=9mS1fIYj1So)
at around the 9 minute mark.
- Fixed a bug where parameters to a `FromParams` class that are dictionaries wouldn't get logged
when an instance is instantiated with `from_params`.
- Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.

## [v1.1.0](https://github.com/allenai/allennlp/releases/tag/v1.1.0) - 2020-09-08

2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -74,7 +74,7 @@ When you're ready to contribute code to address an open issue, please follow the
upstream https://github.com/allenai/allennlp.git (fetch)
upstream https://github.com/allenai/allennlp.git (push)

Finally, you'll need to create a Python 3.6 or 3.7 virtual environment suitable for working on AllenNLP. There a number of tools out there that making working with virtual environments easier, but the most direct way is with the [`venv` module](https://docs.python.org/3.7/library/venv.html) in the standard library.
Finally, you'll need to create a Python 3 virtual environment suitable for working on AllenNLP. There are a number of tools that make working with virtual environments easier, but the most direct way is with the [`venv` module](https://docs.python.org/3.7/library/venv.html) in the standard library.

Once your virtual environment is activated, you can install your local clone in "editable mode" with

2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,7 +1,7 @@
# This Dockerfile creates an environment suitable for downstream usage of AllenNLP.
# It's built from a wheel installation of allennlp.

FROM python:3.7
FROM python:3.8

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
2 changes: 1 addition & 1 deletion Dockerfile.test
@@ -1,6 +1,6 @@
# Used to build an image for running tests.

FROM python:3.7
FROM python:3.8

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
9 changes: 5 additions & 4 deletions README.md
@@ -29,8 +29,9 @@

- [Website](https://allennlp.org/)
- [Guide](https://guide.allennlp.org/)
- [Forum](https://discourse.allennlp.org)
- [Documentation](https://docs.allennlp.org/) ( [latest](https://docs.allennlp.org/latest/) | [stable](https://docs.allennlp.org/stable/) | [master](https://docs.allennlp.org/master/) )
- [Forum](https://discourse.allennlp.org)
- [Stack Overflow](https://stackoverflow.com/questions/tagged/allennlp)
- [Contributing Guidelines](CONTRIBUTING.md)
- [Officially Supported Models](https://github.com/allenai/allennlp-models)
- [Pretrained Models](https://github.com/allenai/allennlp-models/blob/master/allennlp_models/pretrained.py)
@@ -49,7 +50,7 @@ created a couple of template repositories that you can use as a starting place:
* If you'd prefer to use python code to configure your experiments and run your training loop, use
[this template](https://github.com/allenai/allennlp-template-python-script). There are a few
things that are currently a little harder in this setup (loading a saved model, and using
distributed training), but except for those its functionality is equivalent to the config files
distributed training), but otherwise it's functionally equivalent to the config files
setup.

In addition, there are external tutorials:
@@ -105,12 +106,12 @@ We support AllenNLP on Mac and Linux environments. We presently do not support W
#### Setting up a virtual environment

[Conda](https://conda.io/) can be used to set up a virtual environment with the
version of Python required for AllenNLP. If you already have a Python 3.6 or 3.7
version of Python required for AllenNLP. If you already have a Python 3
environment you want to use, you can skip to the 'installing via pip' section.

1. [Download and install Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html).

2. Create a Conda environment with Python 3.7:
2. Create a Conda environment with Python 3.7 (3.6 or 3.8 would work as well):

```
conda create -n allennlp python=3.7
59 changes: 29 additions & 30 deletions RELEASE_PROCESS.md
@@ -1,63 +1,62 @@
# AllenNLP GitHub and PyPI Release Process

This document describes the procedure for releasing new versions of the core library.
Most of the heavy lifting is actually done on GitHub Actions.
All you have to do is ensure the version in `allennlp/version.py` matches the target release version
and then trigger a GitHub release with the right tag.

> ❗️ This assumes you are using a clone of the main repo with the remote `origin` pointed
to `git@github.com:allenai/allennlp.git` (or the `HTTPS` equivalent).

The format of the tag should be `v{VERSION}`, i.e. the intended version of the release preceded by a `v`.
So for the version `1.0.0` release the tag will be `v1.0.0`.

To make things easier, start by setting the tag to an environment variable, `TAG`.
Then you can copy and paste the commands below without worrying about mistyping the tag.

## Steps

1. Update `allennlp/version.py` (if needed) with the correct version and the `CHANGELOG.md` so that everything under the "Unreleased" section is now under a section corresponding to this release. Then commit and push these changes with:
1. Set the environment variable `TAG`, which should be of the form `v{VERSION}`.

For example, if the version of the release is `1.0.0`, you should set `TAG` to `v1.0.0`:

```bash
export TAG='v1.0.0'
```
git commit -a -m "Prepare for release $TAG"
git push

Or if you use `fish`:

```fish
set -x TAG 'v1.0.0'
```

At this point `echo $TAG` should exactly match the output of `./scripts/get_version.py current`.

2. Then add the tag in git to mark the release:
2. Update `allennlp/version.py` with the correct version. Then check that the output of

```
git tag $TAG -m "Release $TAG"
python scripts/get_version.py current
```

3. Push the tag to the main repo.
matches the `TAG` environment variable.

3. Update the `CHANGELOG.md` so that everything under the "Unreleased" section is now under a section corresponding to this release.

4. Commit and push these changes with:

```
git commit -a -m "Prepare for release $TAG" && git push
```

5. Then add the tag in git to mark the release:

```
git push --tags origin master
git tag $TAG -m "Release $TAG" && git push --tags
```

4. Find the tag you just pushed [on GitHub](https://github.com/allenai/allennlp/tags) and
click edit. Now copy over the latest section from the [`CHANGELOG.md`](https://raw.githubusercontent.com/allenai/allennlp/master/CHANGELOG.md). And finally, add a section called "Commits" with the output of a command like the following:
6. Find the tag you just pushed [on GitHub](https://github.com/allenai/allennlp/tags), click edit, then copy over the output of:

```bash
OLD_TAG=$(git describe --always --tags --abbrev=0 $TAG^)
git log $OLD_TAG..$TAG --oneline
```

```fish
set -x OLD_TAG (git describe --always --tags --abbrev=0 $TAG^)
git log $OLD_TAG..$TAG --oneline
python scripts/release_notes.py
```

On a Mac, for example, you can just pipe the above command into `pbcopy`.

5. Click "Publish Release", and if this is a pre-release make sure you check that box.
7. Check the box "This is a pre-release" if the release is a release candidate (ending with `rc*`). Otherwise leave it unchecked.

That's it! GitHub Actions will handle the rest.
8. Click "Publish Release". GitHub Actions will then handle the rest, including publishing the package to PyPI and the Docker image to Docker Hub.


6. After publishing the release for the core repo, follow the same process to publish a release for the `allennlp-models` repo.
9. After the [GitHub Actions workflow](https://github.com/allenai/allennlp/actions?query=workflow%3AMaster+event%3Arelease) finishes, follow the same process to publish a release for the `allennlp-models` repo.
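
The tag convention used throughout these steps (`v{VERSION}`, with an `rc*` suffix marking a release candidate) can be checked mechanically. A small illustrative helper, not part of the repo's scripts:

```python
import re

def is_prerelease(tag: str) -> bool:
    """Return True if `tag` names a release candidate (e.g. 'v1.0.0rc1'),
    False for a full release (e.g. 'v1.0.0'). Raises ValueError if the
    tag does not match the 'v{VERSION}' shape at all."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)(rc\d+)?", tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return match.group(4) is not None

print(is_prerelease("v1.0.0"))     # False
print(is_prerelease("v1.0.0rc1"))  # True
```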


## Fixing a failed release
66 changes: 46 additions & 20 deletions allennlp/commands/__init__.py
@@ -1,10 +1,13 @@
import argparse
import logging
from typing import Any, Optional
import sys
from typing import Any, Optional, Tuple, Set

from overrides import overrides

from allennlp import __version__
from allennlp.commands.build_vocab import BuildVocab
from allennlp.commands.cached_path import CachedPath
from allennlp.commands.evaluate import Evaluate
from allennlp.commands.find_learning_rate import FindLearningRate
from allennlp.commands.predict import Predict
@@ -46,28 +49,54 @@ def add_argument(self, *args, **kwargs):
super().add_argument(*args, **kwargs)


def create_parser(prog: Optional[str] = None) -> argparse.ArgumentParser:
def parse_args(prog: Optional[str] = None) -> Tuple[argparse.ArgumentParser, argparse.Namespace]:
"""
Creates the argument parser for the main program.
Creates the argument parser for the main program and uses it to parse the args.
"""
parser = ArgumentParserWithDefaults(description="Run AllenNLP", prog=prog)
parser.add_argument("--version", action="version", version=f"%(prog)s {__version__}")

subparsers = parser.add_subparsers(title="Commands", metavar="")

for subcommand_name in sorted(Subcommand.list_available()):
subcommand_class = Subcommand.by_name(subcommand_name)
subcommand = subcommand_class()
subparser = subcommand.add_subparser(subparsers)
subparser.add_argument(
"--include-package",
type=str,
action="append",
default=[],
help="additional packages to include",
)
subcommands: Set[str] = set()

def add_subcommands():
for subcommand_name in sorted(Subcommand.list_available()):
if subcommand_name in subcommands:
continue
subcommands.add(subcommand_name)
subcommand_class = Subcommand.by_name(subcommand_name)
subcommand = subcommand_class()
subparser = subcommand.add_subparser(subparsers)
if subcommand_class.requires_plugins:
subparser.add_argument(
"--include-package",
type=str,
action="append",
default=[],
help="additional packages to include",
)

# Add all default registered subcommands first.
add_subcommands()

# If we need to print the usage/help, or the subcommand is unknown,
# we'll call `import_plugins()` to register any plugin subcommands first.
argv = sys.argv[1:]
plugins_imported: bool = False
if not argv or argv == ["--help"] or argv[0] not in subcommands:
import_plugins()
plugins_imported = True
# Add subcommands again in case one of the plugins has a registered subcommand.
add_subcommands()

# Now we can parse the arguments.
args = parser.parse_args()

if not plugins_imported and Subcommand.by_name(argv[0]).requires_plugins: # type: ignore
import_plugins()

return parser
return parser, args


def main(prog: Optional[str] = None) -> None:
@@ -77,17 +106,14 @@ def main(prog: Optional[str] = None) -> None:
work for them, unless you use the ``--include-package`` flag or you make your code available
as a plugin (see [`plugins`](./plugins.md)).
"""
import_plugins()

parser = create_parser(prog)
args = parser.parse_args()
parser, args = parse_args(prog)

# If a subparser is triggered, it adds its work as `args.func`.
# So if no such attribute has been added, no subparser was triggered,
# so give the user some help.
if "func" in dir(args):
# Import any additional modules needed (to register custom classes).
for package_name in args.include_package:
for package_name in getattr(args, "include_package", []):
import_module_and_submodules(package_name)
args.func(args)
else:
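
The lazy plugin-loading pattern in this diff — register built-in subcommands first, and only pay the import cost of plugins when the requested subcommand might come from one — can be sketched in plain `argparse` terms. All names below are illustrative stand-ins, not AllenNLP's actual API:

```python
import argparse

# Built-in subcommands are always available; plugin subcommands are only
# merged in when needed, standing in for `import_plugins()`.
BUILTIN_COMMANDS = {"train": lambda args: "training",
                    "evaluate": lambda args: "evaluating"}
PLUGIN_COMMANDS = {"my-plugin-cmd": lambda args: "plugin running"}

def parse_and_run(argv):
    commands = dict(BUILTIN_COMMANDS)
    # Load plugins only if we must: no args, help requested, or the
    # subcommand is unknown (it might be registered by a plugin).
    if not argv or argv == ["--help"] or argv[0] not in commands:
        commands.update(PLUGIN_COMMANDS)
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command")
    for name in sorted(commands):
        subparsers.add_parser(name)
    args = parser.parse_args(argv)
    return commands[args.command](args) if args.command else None

print(parse_and_run(["train"]))          # built-ins run without plugins
print(parse_and_run(["my-plugin-cmd"]))  # unknown name triggers plugin load
```

The design choice mirrors the diff: the common case (a built-in subcommand) never imports plugins, which keeps startup fast, while unknown or help invocations fall back to the full registry.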
