feat: CLI and Python argument to install model requirements before interaction (#1603)
IgnatovFedor committed Nov 21, 2022
1 parent bb06448 commit c1c1441
Showing 7 changed files with 62 additions and 57 deletions.
44 changes: 19 additions & 25 deletions README.md
@@ -93,24 +93,26 @@ evaluate and infer it:

#### GPU requirements

To run supported DeepPavlov models on GPU you should have [CUDA](https://developer.nvidia.com/cuda-toolkit) compatible
with used GPU and [library PyTorch version](deeppavlov/requirements/pytorch.txt).
By default, DeepPavlov installs models requirements from PyPI. PyTorch from PyPI could not support your device CUDA
capability. To run supported DeepPavlov models on GPU you should have [CUDA](https://developer.nvidia.com/cuda-toolkit)
compatible with used GPU and [PyTorch version](deeppavlov/requirements/pytorch.txt) required by DeepPavlov models.
See [docs](https://docs.deeppavlov.ai/en/master/intro/quick_start.html#using-gpu) for details.

### Command line interface (CLI)

To get predictions from a model interactively through CLI, run

```bash
python -m deeppavlov interact <config_path> [-d]
python -m deeppavlov interact <config_path> [-d] [-i]
```

* `-d` downloads required data -- pretrained model files and embeddings
(optional).
* `-d` downloads required data - pretrained model files and embeddings (optional).
* `-i` installs model requirements (optional).

You can train it in the same simple way:

```bash
python -m deeppavlov train <config_path> [-d]
python -m deeppavlov train <config_path> [-d] [-i]
```

Dataset will be downloaded regardless of whether there was `-d` flag or not.
@@ -122,10 +124,11 @@ The data format is specified in the corresponding model doc page.
There are even more actions you can perform with configs:

```bash
python -m deeppavlov <action> <config_path> [-d]
python -m deeppavlov <action> <config_path> [-d] [-i]
```

* `<action>` can be
* `install` to install model requirements (same as `-i`),
* `download` to download model's data (same as `-d`),
* `train` to train the model on the data specified in the config file,
* `evaluate` to calculate metrics on the same dataset,
@@ -136,6 +139,7 @@ python -m deeppavlov <action> <config_path> [-d]
*<file_path>* if `-f <file_path>` is specified.
* `<config_path>` specifies path (or name) of model's config file
* `-d` downloads required data
* `-i` installs model requirements


### Python
@@ -145,33 +149,26 @@ To get predictions from a model interactively through Python, run
```python
from deeppavlov import build_model

model = build_model(<config_path>, download=True)
model = build_model(<config_path>, install=True, download=True)

# get predictions for 'input_text1', 'input_text2'
model(['input_text1', 'input_text2'])
```

* where `download=True` downloads required data from web -- pretrained model
files and embeddings (optional),
* `<config_path>` is path to the chosen model's config file (e.g.
`"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"`) or
`deeppavlov.configs` attribute (e.g.
where
* `install=True` installs model requirements (optional),
* `download=True` downloads required data from web - pretrained model files and embeddings (optional),
* `<config_path>` is model name (e.g. `'ner_ontonotes_bert_mult'`), path to the chosen model's config file (e.g.
`"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"`), or `deeppavlov.configs` attribute (e.g.
`deeppavlov.configs.ner.ner_ontonotes_bert_mult` without quotation marks).

You can train it in the same simple way:

```python
from deeppavlov import train_model

model = train_model(<config_path>, download=True)
model = train_model(<config_path>, install=True, download=True)
```

* `download=True` downloads pretrained model, therefore the pretrained
model will be, first, loaded and then train (optional).

Dataset will be downloaded regardless of whether there was ``-d`` flag or
not.

To train on your own data you need to modify dataset reader path in the
[train config doc](http://docs.deeppavlov.ai/en/master/intro/config_description.html#train-config).
The data format is specified in the corresponding model doc page.
@@ -181,12 +178,9 @@ You can also calculate metrics on the dataset specified in your config file:
```python
from deeppavlov import evaluate_model

model = evaluate_model(<config_path>, download=True)
model = evaluate_model(<config_path>, install=True, download=True)
```


## License

DeepPavlov is Apache 2.0 - licensed.

##
11 changes: 7 additions & 4 deletions deeppavlov/__init__.py
@@ -26,13 +26,16 @@


# TODO: make better
def train_model(config: [str, Path, dict], download: bool = False, recursive: bool = False) -> Chainer:
train_evaluate_model_from_config(config, download=download, recursive=recursive)
def train_model(config: [str, Path, dict], install: bool = False,
download: bool = False, recursive: bool = False) -> Chainer:
train_evaluate_model_from_config(config, install=install, download=download, recursive=recursive)
return build_model(config, load_trained=True)


def evaluate_model(config: [str, Path, dict], download: bool = False, recursive: bool = False) -> dict:
return train_evaluate_model_from_config(config, to_train=False, download=download, recursive=recursive)
def evaluate_model(config: [str, Path, dict], install: bool = False,
download: bool = False, recursive: bool = False) -> dict:
return train_evaluate_model_from_config(config, to_train=False, install=install,
download=download, recursive=recursive)


# check version
2 changes: 1 addition & 1 deletion deeppavlov/_meta.py
@@ -1,4 +1,4 @@
__version__ = '1.0.0'
__version__ = '1.0.1'
__author__ = 'Neural Networks and Deep Learning lab, MIPT'
__description__ = 'An open source library for building end-to-end dialog systems and training chatbots.'
__keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot']
6 changes: 4 additions & 2 deletions deeppavlov/core/commands/infer.py
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import pickle
import sys
from itertools import islice
from logging import getLogger
@@ -24,15 +23,18 @@
from deeppavlov.core.common.params import from_params
from deeppavlov.core.data.utils import jsonify_data
from deeppavlov.download import deep_download
from deeppavlov.utils.pip_wrapper import install_from_config

log = getLogger(__name__)


def build_model(config: Union[str, Path, dict], mode: str = 'infer',
load_trained: bool = False, download: bool = False) -> Chainer:
load_trained: bool = False, install: bool = False, download: bool = False) -> Chainer:
"""Build and return the model described in corresponding configuration file."""
config = parse_config(config)

if install:
install_from_config(config)
if download:
deep_download(config)
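
The ordering this hunk adds to `build_model` (install requirements first, then download data) can be sketched in isolation. This is only an illustration under stated assumptions: `prepare`, `installer`, and `downloader` are hypothetical names standing in for the real `install_from_config` and `deep_download` calls, and nothing is actually installed or downloaded.

```python
from typing import Callable, Dict, List


def prepare(config: Dict, *, install: bool = False, download: bool = False,
            installer: Callable[[Dict], None],
            downloader: Callable[[Dict], None]) -> None:
    """Run the optional setup steps in the order used by the diff."""
    # Hypothetical stand-in for install_from_config(config)
    if install:
        installer(config)
    # Hypothetical stand-in for deep_download(config)
    if download:
        downloader(config)


calls: List[str] = []
prepare({"name": "demo"}, install=True, download=True,
        installer=lambda cfg: calls.append("install"),
        downloader=lambda cfg: calls.append("download"))
print(calls)
```

Both flags default to `False`, so existing callers that pass neither keep their previous behaviour.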

4 changes: 4 additions & 0 deletions deeppavlov/core/commands/train.py
@@ -24,6 +24,7 @@
from deeppavlov.core.data.data_learning_iterator import DataLearningIterator
from deeppavlov.core.data.utils import get_all_elems_from_json
from deeppavlov.download import deep_download
from deeppavlov.utils.pip_wrapper import install_from_config

log = getLogger(__name__)

@@ -70,12 +71,15 @@ def train_evaluate_model_from_config(config: Union[str, Path, dict],
iterator: Union[DataLearningIterator, DataFittingIterator] = None, *,
to_train: bool = True,
evaluation_targets: Optional[Iterable[str]] = None,
install: bool = False,
download: bool = False,
start_epoch_num: Optional[int] = None,
recursive: bool = False) -> Dict[str, Dict[str, float]]:
"""Make training and evaluation of the model described in corresponding configuration file."""
config = parse_config(config)

if install:
install_from_config(config)
if download:
deep_download(config)

5 changes: 3 additions & 2 deletions deeppavlov/deep.py
@@ -41,6 +41,7 @@
parser.add_argument("-b", "--batch-size", dest="batch_size", default=None, help="inference batch size", type=int)
parser.add_argument("-f", "--input-file", dest="file_path", default=None, help="Path to the input file", type=str)
parser.add_argument("-d", "--download", action="store_true", help="download model components")
parser.add_argument("-i", "--install", action="store_true", help="install model requirements")

parser.add_argument("--folds", help="number of folds", type=int, default=5)

@@ -67,6 +68,8 @@ def main():
args = parser.parse_args()
pipeline_config_path = find_config(args.config_path)

if args.install or args.mode == 'install':
install_from_config(pipeline_config_path)
if args.download or args.mode == 'download':
deep_download(pipeline_config_path)

@@ -95,8 +98,6 @@ def main():
rabbit_virtualhost=args.rabbit_virtualhost)
elif args.mode == 'predict':
predict_on_stream(pipeline_config_path, args.batch_size, args.file_path)
elif args.mode == 'install':
install_from_config(pipeline_config_path)
elif args.mode == 'crossval':
if args.folds < 2:
log.error('Minimum number of Folds is 2')
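
The effect of the `deep.py` change is that the `-i` flag and the `install` action now go through the same early check, ahead of the download check. A minimal sketch of that dispatch follows. It is not the project's actual parser: `build_parser` and `planned_steps` are hypothetical helpers, the list of modes is shortened for illustration, and the function only reports which steps would run.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Simplified parser; the real deep.py defines more modes and options.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("mode", choices=["install", "download", "train", "interact"])
    parser.add_argument("config_path")
    parser.add_argument("-d", "--download", action="store_true",
                        help="download model components")
    parser.add_argument("-i", "--install", action="store_true",
                        help="install model requirements")
    return parser


def planned_steps(argv: List[str]) -> List[str]:
    """Return the steps that would run, in order, for the given arguments."""
    args = build_parser().parse_args(argv)
    steps: List[str] = []
    if args.install or args.mode == "install":
        steps.append("install")
    if args.download or args.mode == "download":
        steps.append("download")
    if args.mode not in ("install", "download"):
        steps.append(args.mode)
    return steps


print(planned_steps(["interact", "some_config", "-d", "-i"]))
```

Handling `install` next to `download` at the top of `main()` is what lets the commit drop the separate `elif args.mode == 'install'` branch further down.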
47 changes: 24 additions & 23 deletions docs/intro/quick_start.rst
@@ -26,9 +26,8 @@ Before making choice of an interface, install model's package requirements
python -m deeppavlov install <config_path>
* where ``<config_path>`` is path to the chosen model's config file (e.g.
``deeppavlov/configs/classifiers/insults_kaggle_bert.json``) or just name without
`.json` extension (e.g. ``insults_kaggle_bert``)
* where ``<config_path>`` is model name without ``.json`` extension (e.g. ``insults_kaggle_bert``) or path to the
chosen model's config file (e.g. ``deeppavlov/configs/classifiers/insults_kaggle_bert.json``)


Command line interface (CLI)
@@ -38,19 +37,18 @@ To get predictions from a model interactively through CLI, run

.. code:: bash
python -m deeppavlov interact <config_path> [-d]
python -m deeppavlov interact <config_path> [-d] [-i]
* ``-d`` downloads required data -- pretrained model files and embeddings
(optional).
* ``-d`` downloads required data -- pretrained model files and embeddings (optional).
* ``-i`` installs model requirements (optional).

You can train it in the same simple way:

.. code:: bash
python -m deeppavlov train <config_path> [-d]
python -m deeppavlov train <config_path> [-d] [-i]
Dataset will be downloaded regardless of whether there was ``-d`` flag or
not.
Dataset will be downloaded regardless of whether there was ``-d`` flag or not.

To train on your own data, you need to modify dataset reader path in the
`train section doc <configuration.html#Train-config>`__. The data format is
@@ -60,9 +58,10 @@ There are even more actions you can perform with configs:

.. code:: bash
python -m deeppavlov <action> <config_path> [-d]
python -m deeppavlov <action> <config_path> [-d] [-i]
* ``<action>`` can be
* ``install`` to install model requirements (same as ``-i``),
* ``download`` to download model's data (same as ``-d``),
* ``train`` to train the model on the data specified in the config file,
* ``evaluate`` to calculate metrics on the same dataset,
@@ -71,10 +70,11 @@ There are even more actions you can perform with configs:
</integrations/rest_api>`),
* ``risesocket`` to run a socket API server (see :doc:`docs
</integrations/socket_api>`),
* ``predict`` to get prediction for samples from `stdin` or from
`<file_path>` if ``-f <file_path>`` is specified.
* ``predict`` to get prediction for samples from ``stdin`` or from
``<file_path>`` if ``-f <file_path>`` is specified.
* ``<config_path>`` specifies path (or name) of model's config file
* ``-d`` downloads required data
* ``-i`` installs model requirements


Python
@@ -86,13 +86,15 @@ To get predictions from a model interactively through Python, run
from deeppavlov import build_model
model = build_model(<config_path>, download=True)
model = build_model(<config_path>, install=True, download=True)
# get predictions for 'input_text1', 'input_text2'
model(['input_text1', 'input_text2'])
* where ``download=True`` downloads required data from web -- pretrained model
files and embeddings (optional),
where

* ``install=True`` installs model requirements (optional),
* ``download=True`` downloads required data from web -- pretrained model files and embeddings (optional),
* ``<config_path>`` is path to the chosen model's config file (e.g.
``"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"``) or
``deeppavlov.configs`` attribute (e.g.
@@ -104,13 +106,12 @@ You can train it in the same simple way:
from deeppavlov import train_model
model = train_model(<config_path>, download=True)
model = train_model(<config_path>, install=True, download=True)
* ``download=True`` downloads pretrained model, therefore the pretrained
model will be, first, loaded and then trained (optional).

Dataset will be downloaded regardless of whether there was ``-d`` flag or
not.
Dataset will be downloaded regardless of whether there was ``-d`` flag or not.

To train on your own data, you need to modify dataset reader path in the
`train section doc <configuration.html#Train-config>`__. The data format is
@@ -122,7 +123,7 @@ You can also calculate metrics on the dataset specified in your config file:
from deeppavlov import evaluate_model
model = evaluate_model(<config_path>, download=True)
model = evaluate_model(<config_path>, install=True, download=True)
Using GPU
@@ -173,10 +174,10 @@ You can find a list of our out-of-the-box models `below <#out-of-the-box-pretrai
Docker images
~~~~~~~~~~~~~

You can run DeepPavlov models in :doc:`riseapi </integrations/rest_api>` mode
via Docker without installing DP. Both your CPU and GPU (we support NVIDIA graphic
processors) can be utilised, please refer our `CPU <https://hub.docker.com/r/deeppavlov/base-cpu>`_
and `GPU <https://hub.docker.com/r/deeppavlov/base-gpu>`_ Docker images run instructions.
You can run DeepPavlov models in :doc:`riseapi </integrations/rest_api>` mode or start Jupyter server
via Docker without installing DeepPavlov. Both your CPU and GPU (we support NVIDIA graphic
processors) can be utilised, please refer our `Docker <https://hub.docker.com/r/deeppavlov/deeppavlov>`_
images run instructions.


Out-of-the-box pretrained models
