feat: CLI and Python argument to install model requirements before interaction (#1603)
deeppavlov · Nov 21, 2022 · commit c1c1441 · 1 parent bb06448
Showing 7 changed files with 62 additions and 57 deletions.
diff --git a/README.md b/README.md
@@ -93,24 +93,26 @@ evaluate and infer it:
 
 #### GPU requirements
 
-To run supported DeepPavlov models on GPU you should have [CUDA](https://developer.nvidia.com/cuda-toolkit) compatible
-with used GPU and [library PyTorch version](deeppavlov/requirements/pytorch.txt).
+By default, DeepPavlov installs model requirements from PyPI. The PyTorch build from PyPI may not support your
+device's CUDA capability. To run supported DeepPavlov models on GPU, you need [CUDA](https://developer.nvidia.com/cuda-toolkit)
+compatible with your GPU and the [PyTorch version](deeppavlov/requirements/pytorch.txt) required by DeepPavlov models.
+See [docs](https://docs.deeppavlov.ai/en/master/intro/quick_start.html#using-gpu) for details.
 
 ### Command line interface (CLI)
 
 To get predictions from a model interactively through CLI, run
 
 ```bash
-python -m deeppavlov interact <config_path> [-d]
+python -m deeppavlov interact <config_path> [-d] [-i]
 ```
 
-* `-d` downloads required data -- pretrained model files and embeddings
-  (optional).
+* `-d` downloads required data - pretrained model files and embeddings (optional).
+* `-i` installs model requirements (optional).
 
 You can train it in the same simple way:
 
 ```bash
-python -m deeppavlov train <config_path> [-d]
+python -m deeppavlov train <config_path> [-d] [-i]
 ```
 
 Dataset will be downloaded regardless of whether there was `-d` flag or not.
@@ -122,10 +124,11 @@ The data format is specified in the corresponding model doc page.
 There are even more actions you can perform with configs:
 
 ```bash
-python -m deeppavlov <action> <config_path> [-d]
+python -m deeppavlov <action> <config_path> [-d] [-i]
 ```
 
 * `<action>` can be
+    * `install` to install model requirements (same as `-i`),
     * `download` to download model's data (same as `-d`),
     * `train` to train the model on the data specified in the config file,
     * `evaluate` to calculate metrics on the same dataset,
@@ -136,6 +139,7 @@ python -m deeppavlov <action> <config_path> [-d]
       *<file_path>* if `-f <file_path>` is specified.
 * `<config_path>` specifies path (or name) of model's config file
 * `-d` downloads required data
+* `-i` installs model requirements
 
 
 ### Python
@@ -145,33 +149,26 @@ To get predictions from a model interactively through Python, run
 ```python
 from deeppavlov import build_model
 
-model = build_model(<config_path>, download=True)
+model = build_model(<config_path>, install=True, download=True)
 
 # get predictions for 'input_text1', 'input_text2'
 model(['input_text1', 'input_text2'])
 ```
-
-* where `download=True` downloads required data from web -- pretrained model
-  files and embeddings (optional),
-* `<config_path>` is path to the chosen model's config file (e.g.
-  `"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"`) or
-  `deeppavlov.configs` attribute (e.g.
+where
+* `install=True` installs model requirements (optional),
+* `download=True` downloads required data from web - pretrained model files and embeddings (optional),
+* `<config_path>` is model name (e.g. `'ner_ontonotes_bert_mult'`), path to the chosen model's config file (e.g.
+  `"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"`),  or `deeppavlov.configs` attribute (e.g.
   `deeppavlov.configs.ner.ner_ontonotes_bert_mult` without quotation marks).
 
 You can train it in the same simple way:
 
 ```python
 from deeppavlov import train_model 
 
-model = train_model(<config_path>, download=True)
+model = train_model(<config_path>, install=True, download=True)
 ```
 
-* `download=True` downloads pretrained model, therefore the pretrained
-model will be, first, loaded and then train (optional).
-
-Dataset will be downloaded regardless of whether there was ``-d`` flag or
-not.
-
 To train on your own data you need to modify dataset reader path in the
 [train config doc](http://docs.deeppavlov.ai/en/master/intro/config_description.html#train-config).
 The data format is specified in the corresponding model doc page. 
@@ -181,12 +178,9 @@ You can also calculate metrics on the dataset specified in your config file:
 ```python
 from deeppavlov import evaluate_model 
 
-model = evaluate_model(<config_path>, download=True)
+model = evaluate_model(<config_path>, install=True, download=True)
 ```
 
-
 ## License
 
 DeepPavlov is Apache 2.0 - licensed.
-
-##
diff --git a/deeppavlov/__init__.py b/deeppavlov/__init__.py
@@ -26,13 +26,16 @@
 
 
 # TODO: make better
-def train_model(config: [str, Path, dict], download: bool = False, recursive: bool = False) -> Chainer:
-    train_evaluate_model_from_config(config, download=download, recursive=recursive)
+def train_model(config: [str, Path, dict], install: bool = False,
+                download: bool = False, recursive: bool = False) -> Chainer:
+    train_evaluate_model_from_config(config, install=install, download=download, recursive=recursive)
     return build_model(config, load_trained=True)
 
 
-def evaluate_model(config: [str, Path, dict], download: bool = False, recursive: bool = False) -> dict:
-    return train_evaluate_model_from_config(config, to_train=False, download=download, recursive=recursive)
+def evaluate_model(config: [str, Path, dict], install: bool = False,
+                   download: bool = False, recursive: bool = False) -> dict:
+    return train_evaluate_model_from_config(config, to_train=False, install=install,
+                                            download=download, recursive=recursive)
 
 
 # check version
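
Note on the hunk above: the wrappers only forward the new `install` flag, and the actual work happens in `train_evaluate_model_from_config`, which installs requirements before downloading data. The following is a minimal, self-contained sketch of that flow, not the library's implementation. The two helper functions are hypothetical stubs standing in for `install_from_config` and `deep_download`, and the `calls` list exists only to make the ordering visible. The sketch annotates `config` with `Union[str, Path, dict]`, as `build_model` does, because the list literal `[str, Path, dict]` used in this file runs without error but is not a valid type hint for static checkers.

```python
from pathlib import Path
from typing import List, Union

calls: List[str] = []  # records the order of side effects, for illustration only


def install_from_config(config) -> None:
    """Hypothetical stub for deeppavlov.utils.pip_wrapper.install_from_config."""
    calls.append("install")


def deep_download(config) -> None:
    """Hypothetical stub for deeppavlov.download.deep_download."""
    calls.append("download")


def train_evaluate_model_from_config(config: Union[str, Path, dict], *,
                                     to_train: bool = True,
                                     install: bool = False,
                                     download: bool = False) -> dict:
    # Same ordering as in the commit: requirements first, then model data.
    if install:
        install_from_config(config)
    if download:
        deep_download(config)
    calls.append("train" if to_train else "evaluate")
    return {}


def evaluate_model(config: Union[str, Path, dict], install: bool = False,
                   download: bool = False) -> dict:
    # The wrapper forwards the flags and installs nothing itself.
    return train_evaluate_model_from_config(config, to_train=False,
                                            install=install, download=download)


evaluate_model("some_config", install=True, download=True)
print(calls)  # ['install', 'download', 'evaluate']
```

Both flags default to `False`, so existing callers that pass neither keep their previous behaviour.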

diff --git a/deeppavlov/_meta.py b/deeppavlov/_meta.py
@@ -1,4 +1,4 @@
-__version__ = '1.0.0'
+__version__ = '1.0.1'
 __author__ = 'Neural Networks and Deep Learning lab, MIPT'
 __description__ = 'An open source library for building end-to-end dialog systems and training chatbots.'
 __keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot']

diff --git a/deeppavlov/core/commands/infer.py b/deeppavlov/core/commands/infer.py
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import json
-import pickle
 import sys
 from itertools import islice
 from logging import getLogger
@@ -24,15 +23,18 @@
 from deeppavlov.core.common.params import from_params
 from deeppavlov.core.data.utils import jsonify_data
 from deeppavlov.download import deep_download
+from deeppavlov.utils.pip_wrapper import install_from_config
 
 log = getLogger(__name__)
 
 
 def build_model(config: Union[str, Path, dict], mode: str = 'infer',
-                load_trained: bool = False, download: bool = False) -> Chainer:
+                load_trained: bool = False, install: bool = False, download: bool = False) -> Chainer:
     """Build and return the model described in corresponding configuration file."""
     config = parse_config(config)
 
+    if install:
+        install_from_config(config)
     if download:
         deep_download(config)
 

diff --git a/deeppavlov/core/commands/train.py b/deeppavlov/core/commands/train.py
@@ -24,6 +24,7 @@
 from deeppavlov.core.data.data_learning_iterator import DataLearningIterator
 from deeppavlov.core.data.utils import get_all_elems_from_json
 from deeppavlov.download import deep_download
+from deeppavlov.utils.pip_wrapper import install_from_config
 
 log = getLogger(__name__)
 
@@ -70,12 +71,15 @@ def train_evaluate_model_from_config(config: Union[str, Path, dict],
                                      iterator: Union[DataLearningIterator, DataFittingIterator] = None, *,
                                      to_train: bool = True,
                                      evaluation_targets: Optional[Iterable[str]] = None,
+                                     install: bool = False,
                                      download: bool = False,
                                      start_epoch_num: Optional[int] = None,
                                      recursive: bool = False) -> Dict[str, Dict[str, float]]:
     """Make training and evaluation of the model described in corresponding configuration file."""
     config = parse_config(config)
 
+    if install:
+        install_from_config(config)
     if download:
         deep_download(config)
 

diff --git a/deeppavlov/deep.py b/deeppavlov/deep.py
@@ -41,6 +41,7 @@
 parser.add_argument("-b", "--batch-size", dest="batch_size", default=None, help="inference batch size", type=int)
 parser.add_argument("-f", "--input-file", dest="file_path", default=None, help="Path to the input file", type=str)
 parser.add_argument("-d", "--download", action="store_true", help="download model components")
+parser.add_argument("-i", "--install", action="store_true", help="install model requirements")
 
 parser.add_argument("--folds", help="number of folds", type=int, default=5)
 
@@ -67,6 +68,8 @@ def main():
     args = parser.parse_args()
     pipeline_config_path = find_config(args.config_path)
 
+    if args.install or args.mode == 'install':
+        install_from_config(pipeline_config_path)
     if args.download or args.mode == 'download':
         deep_download(pipeline_config_path)
 
@@ -95,8 +98,6 @@ def main():
                              rabbit_virtualhost=args.rabbit_virtualhost)
     elif args.mode == 'predict':
         predict_on_stream(pipeline_config_path, args.batch_size, args.file_path)
-    elif args.mode == 'install':
-        install_from_config(pipeline_config_path)
     elif args.mode == 'crossval':
         if args.folds < 2:
             log.error('Minimum number of Folds is 2')
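
Note on the `deep.py` hunks above: moving the `install` check ahead of the mode dispatch means `-i` works with any action, and `install` as an action becomes a shorthand for `-i`. The sketch below illustrates that dispatch under stated assumptions. `build_parser` and `planned_steps` are hypothetical helpers, only a subset of the real modes is listed, and the function returns the planned steps instead of running them.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Only a subset of the real modes, for illustration.
    parser.add_argument("mode", choices=["install", "download", "train", "interact"])
    parser.add_argument("config_path")
    parser.add_argument("-d", "--download", action="store_true", help="download model components")
    parser.add_argument("-i", "--install", action="store_true", help="install model requirements")
    return parser


def planned_steps(argv: List[str]) -> List[str]:
    """Return the steps the CLI would run, in order (hypothetical helper)."""
    args = build_parser().parse_args(argv)
    steps = []
    # Mirrors the commit: the flag and the mode both trigger installation.
    if args.install or args.mode == "install":
        steps.append("install")
    if args.download or args.mode == "download":
        steps.append("download")
    if args.mode not in ("install", "download"):
        steps.append(args.mode)
    return steps


print(planned_steps(["interact", "ner_ontonotes_bert_mult", "-d", "-i"]))
# ['install', 'download', 'interact']
```

Because the checks are independent `if` statements, not an `elif` chain, `install <config_path> -d` would both install requirements and download data in this sketch.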

diff --git a/docs/intro/quick_start.rst b/docs/intro/quick_start.rst
@@ -26,9 +26,8 @@ Before making choice of an interface, install model's package requirements
         
         python -m deeppavlov install <config_path>
 
-    * where ``<config_path>`` is path to the chosen model's config file (e.g.
-      ``deeppavlov/configs/classifiers/insults_kaggle_bert.json``) or just name without
-      `.json` extension (e.g. ``insults_kaggle_bert``)
+    * where ``<config_path>`` is model name without ``.json`` extension (e.g. ``insults_kaggle_bert``) or path to the
+      chosen model's config file (e.g. ``deeppavlov/configs/classifiers/insults_kaggle_bert.json``)
 
 
 Command line interface (CLI)
@@ -38,19 +37,18 @@ To get predictions from a model interactively through CLI, run
 
     .. code:: bash
         
-        python -m deeppavlov interact <config_path> [-d]
+        python -m deeppavlov interact <config_path> [-d] [-i]
 
-    * ``-d`` downloads required data -- pretrained model files and embeddings
-      (optional).
+    * ``-d`` downloads required data -- pretrained model files and embeddings (optional).
+    * ``-i`` installs model requirements (optional).
 
 You can train it in the same simple way:
 
     .. code:: bash
         
-        python -m deeppavlov train <config_path> [-d]
+        python -m deeppavlov train <config_path> [-d] [-i]
 
-    Dataset will be downloaded regardless of whether there was ``-d`` flag or
-    not.
+    Dataset will be downloaded regardless of whether there was ``-d`` flag or not.
 
     To train on your own data, you need to modify dataset reader path in the
     `train section doc <configuration.html#Train-config>`__. The data format is
@@ -60,9 +58,10 @@ There are even more actions you can perform with configs:
 
     .. code:: bash
         
-        python -m deeppavlov <action> <config_path> [-d]
+        python -m deeppavlov <action> <config_path> [-d] [-i]
 
     * ``<action>`` can be
+        * ``install`` to install model requirements (same as ``-i``),
         * ``download`` to download model's data (same as ``-d``),
         * ``train`` to train the model on the data specified in the config file,
         * ``evaluate`` to calculate metrics on the same dataset,
@@ -71,10 +70,11 @@ There are even more actions you can perform with configs:
           </integrations/rest_api>`),
         * ``risesocket`` to run a socket API server (see :doc:`docs
           </integrations/socket_api>`),
-        * ``predict`` to get prediction for samples from `stdin` or from
-          `<file_path>` if ``-f <file_path>`` is specified.
+        * ``predict`` to get prediction for samples from ``stdin`` or from
+          ``<file_path>`` if ``-f <file_path>`` is specified.
     * ``<config_path>`` specifies path (or name) of model's config file
     * ``-d`` downloads required data
+    * ``-i`` installs model requirements
 
 
 Python
@@ -86,13 +86,15 @@ To get predictions from a model interactively through Python, run
         
         from deeppavlov import build_model
 
-        model = build_model(<config_path>, download=True)
+        model = build_model(<config_path>, install=True, download=True)
 
         # get predictions for 'input_text1', 'input_text2'
         model(['input_text1', 'input_text2'])
 
-    * where ``download=True`` downloads required data from web -- pretrained model
-      files and embeddings (optional),
+where
+
+    * ``install=True`` installs model requirements (optional),
+    * ``download=True`` downloads required data from web -- pretrained model files and embeddings (optional),
     * ``<config_path>`` is path to the chosen model's config file (e.g.
       ``"deeppavlov/configs/ner/ner_ontonotes_bert_mult.json"``) or
       ``deeppavlov.configs`` attribute (e.g.
@@ -104,13 +106,12 @@ You can train it in the same simple way:
         
         from deeppavlov import train_model 
 
-        model = train_model(<config_path>, download=True)
+        model = train_model(<config_path>, install=True, download=True)
 
     * ``download=True`` downloads pretrained model, therefore the pretrained
       model will be, first, loaded and then trained (optional).
 
-    Dataset will be downloaded regardless of whether there was ``-d`` flag or
-    not.
+    Dataset will be downloaded regardless of whether there was ``-d`` flag or not.
 
     To train on your own data, you need to modify dataset reader path in the
     `train section doc <configuration.html#Train-config>`__. The data format is
@@ -122,7 +123,7 @@ You can also calculate metrics on the dataset specified in your config file:
         
         from deeppavlov import evaluate_model 
 
-        model = evaluate_model(<config_path>, download=True)
+        model = evaluate_model(<config_path>, install=True, download=True)
 
 
 Using GPU
@@ -173,10 +174,10 @@ You can find a list of our out-of-the-box models `below <#out-of-the-box-pretrai
 Docker images
 ~~~~~~~~~~~~~
 
-You can run DeepPavlov models in :doc:`riseapi </integrations/rest_api>` mode
-via Docker without installing DP. Both your CPU and GPU (we support NVIDIA graphic
-processors) can be utilised, please refer our `CPU <https://hub.docker.com/r/deeppavlov/base-cpu>`_
-and `GPU <https://hub.docker.com/r/deeppavlov/base-gpu>`_ Docker images run instructions.
+You can run DeepPavlov models in :doc:`riseapi </integrations/rest_api>` mode or start a Jupyter server
+via Docker without installing DeepPavlov. Both CPU and GPU (we support NVIDIA graphics
+processors) can be used; please refer to the run instructions for our `Docker <https://hub.docker.com/r/deeppavlov/deeppavlov>`_
+images.
 
 
 Out-of-the-box pretrained models