Downloading requirements #93

ohld · 2018-03-12T07:38:24Z

I was trying to install DeepPavlov and ran into a problem while following the installation steps.

The steps reference download.py in the root folder, but there is no such file there; it is at deeppavlov/download.py:

python download.py [-all]

Even when I run that file directly, it fails with the following error:

(env) root@mysexyhost:~/work/ipavlov/DeepPavlov# python3 deeppavlov/download.py
/home/ubuntu/work/ipavlov/env/local/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-03-12 07:34:11.490 ERROR in 'deeppavlov.core.models.serializable'['log'] at line 54: LOGGER ERROR: Can not initialise deeppavlov.core.models.serializable logger, logging to the stderr. Error traceback:
Traceback (most recent call last):
  File "/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/core/common/log.py", line 32, in get_logger
    with open(log_config_path) as log_config_json:
TypeError: invalid file: PosixPath('/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/log_config.json')
2018-03-12 07:34:11.491 ERROR in 'deeppavlov.core.models.keras_model'['log'] at line 54: LOGGER ERROR: Can not initialise deeppavlov.core.models.keras_model logger, logging to the stderr. Error traceback:
Traceback (most recent call last):
  File "/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/core/common/log.py", line 32, in get_logger
    with open(log_config_path) as log_config_json:
TypeError: invalid file: PosixPath('/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/log_config.json')
Traceback (most recent call last):
  File "deeppavlov/download.py", line 24, in <module>
    from deeppavlov.core.data.utils import download, download_decompress
  File "/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/__init__.py", line 1, in <module>
    import deeppavlov.core.models.keras_model
  File "/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/core/models/keras_model.py", line 39, in <module>
    class KerasModel(NNModel, metaclass=TfModelMeta):
  File "/home/ubuntu/work/ipavlov/DeepPavlov/deeppavlov/core/models/keras_model.py", line 143, in KerasModel
    sample_weight_mode=None, weighted_metrics=None, target_tensors=None):
  File "/home/ubuntu/work/ipavlov/env/local/lib/python3.5/site-packages/overrides/overrides.py", line 70, in overrides
    method.__name__)
AssertionError: No super class method found for "load"
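
A note on the two "TypeError: invalid file: PosixPath(...)" entries in the log above: they are consistent with running on Python 3.5 (as the site-packages path suggests), where the built-in open() does not accept pathlib.Path objects; that support arrived in Python 3.6. The following is a minimal sketch of one possible workaround, not the project's actual fix. The helper name read_json_config and the demo file contents are hypothetical, chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def read_json_config(config_path):
    """Load a JSON file given either a str or a pathlib.Path.

    Wrapping the path in str() keeps open() working on Python 3.5,
    where passing a Path raises "TypeError: invalid file".
    (Hypothetical helper, not part of DeepPavlov.)
    """
    with open(str(config_path), encoding="utf-8") as config_file:
        return json.load(config_file)


# Demonstration with a throwaway file. The file name mirrors the
# log_config.json from the traceback, but the contents are made up.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "log_config.json"
    demo_path.write_text('{"version": 1}', encoding="utf-8")
    config = read_json_config(demo_path)

print(config)
```

Running under Python 3.6 or later should avoid this particular TypeError without any code change, since open() accepts path-like objects there. The final AssertionError raised from the overrides package is a separate failure that this sketch does not address.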



…ry root (#94)
* docs: correct paths to scripts and configs to be relative to repository root fixes #93
* docs: set paths in basic examples to be relative to the project root
* docs: run deep.py as a python module in examples

* release 0.0.3 (#150) * feat: tests can be run from project root (#86) * refactor: instead of juggling global random states use instances of Random for datasets * test(): add test for interacting with custom queries After refactoring, it is possible to easily add list of query-response pairs for every model (config), which will be used to compare pretrained model output with expected output. Initial lists added for error_model and ner. Also URL for downloading pretrained ner_conll2003_model added IP-1344 #done * Update docs from master (#96) * fixed grammar and style * Update README.md * fix grammar & style * fix grammar & style * fix grammar&style in Intent classification README * doc: add supported platform notes * docs: correct paths to scripts and configs to be relative to repository root (#94) * docs: correct paths to scripts and configs to be relative to repository root fixes #93 * docs: set paths in basic examples to be relative to the project root * docs: run deep.py as a python module in examples * doc: add notes for python 3.5 * test(): change downloading to temp dir (#97) * feat: assert python version is 3.6 or higher * Rename dataset to dataset_iterator and other renames (#103) * refactor: rename 'dataset' to 'dataset_iterator' * refactor: rename dataset readers and iterators * refactor: classification iterator and reader * fix: dialog_iterator * test: fix downloading procedure (#108) * Feature/tf layers to core (#67) * feat: layers moved to core * feat: attention added * fix: highway/skip connections for different dimensionality of units are fixed * feat: NER now supports core layers * fix: minor docstrings fixes * feat: CuDNN GRU and LSTM added * feat: Bidirectional CuDNN GRU and LSTM added * feat: stacked bi-rnn refactored * fix: fixed arguments order in rnn * fix: remove duplicate mult_att * chore: merge with dev * fix: backward forward bug in cudnnrnn * refactor: use single fasttext module, clean dependencies * fix: add error when n_classes is 
zero * feat: add fastText model usage instead of fasttext * fix: emb_module default fastText * chore: embedding fixed in configs * chore: change new models names * feat: change intent embeddings in gobot configs * chore: fastText to fasttext, new model, change intents in gobot configs * chore: new url on new fasttext embeddings * fix: delete dowload all true * fix: add url of old embedding file * fix: delete comma * fix: delete old embedding file from urls * fix: delete pyfasttext from requirements, fasttext_embedder * fix: change pyfasttext embeddings from gobot * fix: delete from requirements * fix: delete gensim from fasttext_embedder * fix: simplify requirements * fix: fix dim in gobot_all config * refactor: remove redundant parameter 'emb_module' * feat: use wiki.en.bin embeddings in gobot_all * feat: check saved model params and fix lowercase for interact * fix: lowercase text while interact * feat: check saved model params * fix: rm extra configs * feat: add support for classification data in csv/json formats (#115) * feat: add support for csv/json classification datasets * feat: add tests for snips and samples * fix: gobot_all config fix * feat: add REST API for all models * Moved telegram_utils -> utils; Refactored telegram_ui.py * Moved telegram_utils -> utils: modified deeppavlov/deep.py * Fixed getting model name with get_main_component() in telegram_ui.py * chaner.py: minor fix in get_main_component() * Added riseapi launch mode * README.md: added riseapi mode reference * Updated README.MD and fixed requirements.txt * minor fixes in README.md * Fixes in utils/server * refactor: change endpoint names * feat: add SteamSpacyTokenizer * refactor: remove duplicating from script naming * refactor: outline detokenize() meth in utils, because it should be used by all tokenizers and doesn't depend on tokenize() * feat: add streaming spaCy tokenizer * refactor: DELETE original spaCy tokenizer, rename stream_spacy to spacy * refactor: rename tokenoizer scripts 
back * fix: wrong grammar * feat: include spacy_tokenizer import * feat: replace old SpacyTokenizer with new StreamSpacyTokenizer * feat: ability to manage lowercasing from class constructor, typing improvements * fix: update go-bot configs, so they would work with StreamSpacyTokenizer the same as with the old tokenizer * feat: add optional logging to the spacy tokenizer * docs: update docstrings * refactor: replace custom logger with deppavlov's, pep8 style update * refactor: uotline ngramize() cause it is independent from tokenizer classes * refactor: return original JSON formatting * fix: add **kwargs to __init__() * chore: update .gitignore * refactor: more stable and consistent code * feat: add TravisCI integration * build(): add TravisCI integration * build(): add TravisCI integration * feat: add ranking model * feat: add ranking model to deeppavlov * feat: add download of dataset and embedding_model * feat: adapt to new deeppavlov interfaces * refactor: use pathlib where available in the ranking model * feat: add saving and loading responses saving with np.save * feat: add saving and loading response embeddings saving with np.save, use response embeddings to calculate predictions in __call__ function * feat: add interact regime * feat: add interact_pred_num parameter * refactor: change parameter default value, change check if the file with embeddings model exists * fix: fix non-string keys in EmbeddingDict class * feat: add parameters dict for autotests * feat: add tests support * feat: add context embeddings vocabulary (it is used in interact regime to predict the most similar contexts) * chore: change shuffle parameter default value to True in batch_generator * refactor: change config to chainer representation * fix: bug fix in urls.py file * refactor: remove emb_vocab_file saving, move build_tok2int_vocab and make_ints funcs to InsuranceDict class, add set_embeddings and reset_embeddings funcs in RankingModel * feat: add initial documentation * refactor: 
remove idx2int vocabulary, add vocabularies saving * change config parameters default values, remove examples in tests * feat: add table in documentation * fix: fix bug in urls.py * refactor: remove paths from config * feat: add documentation * feat: add True in tests * feat: add documentation * refactor: move init/load in the load function. * refactor: change parameters in config * feat: add logging * feat: add more logging * feat: add documentation, change parameters values in config * fix: add genesis for ranking model * fix: requirements installation order that caused setup.py error * refactor: train script * feat: add documentation * feat: models parameters check for ner * feat: parameters check added to ner * feat: parameters check added to slotfill * chore: minor clean-up * fix: fix conll-2003 model file names and archive names * refactor: remove blank line * feat: allow to stop training after n batches (#127) * fix: many minor fixes * fix: fix mark_done data_path * refactor: rename ranking_dataset to ranking_iterator.py and move it to the dataset_iterators folder * fix: fix embedding matrix construction, change epochs num default parameter value * refactor: rename registered name and name of the class * refactor: rename files and classes * refactor: change dataset downlaod * feat: add insurance embeddings and datasets in urls.py * refactor: change batch data representation (#131) * feat: install tensorflow-gpu * feat: add SQUAD model * feat: add SQuAD dataset reader * feat: add dataset, preprocessing, config * feat: add VocabEmbedder for chars and tokens * feat&fix: add model realization * feat: add training support, answer postprocessing * fix: predicted answer extraction from context * fix: dropout mask * feat: true_answer is a list of answers now * merge with dev * docs: add some docstrings * refactor: renaming variables * docs: add README.md * feat: add support of multiple inputs and outputs in interact mode * docs: upd README.md * fix: bugs after merge 
with dev * fix: turn on training vocabs * fix: remove keep_prob multiplier for dropout mask * fix: add short contexts support * docs: upd README.md * feat: chainer returns batch of tuples instead of tuple of batches * docs: upd squad README.md * docs: upd squad README.md * feat: add link to pretrained SQuAD model * fix: SQuAD model url * feat: add embeddings downloading and upd config * feat: add variable scope for optimizer * refactor: do not override __init__ method for squad_iterator * fix: ensure that directory exists before saving SquadVocabEmbedder * style: upd names in config and docs * chore: remove main.py used for debugging * docs: upd README.md * fix: change batch_size to fix possible OOM * test: add possibility to interact with several input query * chore: add max_batches to squad config * docs: upd README.md * fix(ranking_network): wrap y as np.array * fix: fix training stop for pytest * style: add license header * fix: refactor training stop for pytest * test: specify pytest_max_batches * feat: use all pytest keys and not only max_batches (#134) * fix: remove result stringification * feat: add GPU_only and Slow marks for tests * feat: add SQuAD dataset reader * feat: add dataset, preprocessing, config * feat: add VocabEmbedder for chars and tokens * feat&fix: add model realization * feat: add training support, answer postprocessing * fix: predicted answer extraction from context * fix: dropout mask * feat: true_answer is a list of answers now * merge with dev * docs: add some docstrings * refactor: renaming variables * docs: add README.md * feat: add support of multiple inputs and outputs in interact mode * docs: upd README.md * fix: bugs after merge with dev * fix: turn on training vocabs * fix: remove keep_prob multiplier for dropout mask * fix: add short contexts support * docs: upd README.md * feat: chainer returns batch of tuples instead of tuple of batches * docs: upd squad README.md * docs: upd squad README.md * feat: add link to pretrained 
SQuAD model * fix: SQuAD model url * feat: add embeddings downloading and upd config * feat: add variable scope for optimizer * refactor: do not override __init__ method for squad_iterator * fix: ensure that directory exists before saving SquadVocabEmbedder * style: upd names in config and docs * chore: remove main.py used for debugging * docs: upd README.md * fix: change batch_size to fix possible OOM * test: add possibility to interact with several input query * chore: add max_batches to squad config * docs: upd README.md * fix(ranking_network): wrap y as np.array * fix: fix training stop for pytest * style: add license header * fix: refactor training stop for pytest * test: specify pytest_max_batches * test: add couple of marks for selecting tests * test: make Travis running only fast tests without GPU * fix: ranking config works in interactbot * fix: add downloading nltk punkt for tokenization (#140) * feat: bot start message for intents does not say anything about dstc2 (#142) * feat: interactbot command works with pipes that require multiple inputs (#137) * build: change TravisCI script (#143) * feat: add Glove embedder (#138) * feat: glove embedder added * feat: embeddings added to NER network * feat: dataset and embeddings are added to urls.py for downloading * fix: char embeddings added to pretrained embeddings * feat: embedder return list of embeddings instead zero padded np array * feat: capitalization added * feat: config modified according to new features * feat: double dense added to input parameters * feat:config parameters updated * chore: fix urls for conll NER, ontonotes model url added * feat: pytest_max_batches added for faster tran check * feat: ontonotes tests added * feat: test conll max batches added * Update README.md * feat: add seq2seq go bot * fix: lowercase text while interact * feat: check saved model params * fix: rm extra configs * feat: add kvret dataset_reader * feat: add kvret_dataset_iterator * fix: add configerror * fix: dirty 
fix for dialog data to be lowercased * feat: check np.int and int in Vocabulary * feat: seq2seqbot works for train and infer * feat: add bleu-metric * feat: add simple seq2seq_go_bot config * fix: fix inference and load() * feat: add variable scope for optimizer * feat: add support of multiple inputs and outputs in interact mode * fix: fix padding * feat: tokenizer argument in Vocabulary * feat: chainer returns batch of tuples instead of tuple of batches * fix: spacy_tokenizer returns [['']] for batch with empty string and add alpha_only argument * feat: add per_item_bleu * feat: train seq2seq_go_bot on utterance batches * feat: tokenize y_true * feat: fit kb_entries knowledge base * feat: add split tokenizer * feat: standartize tokenizers output * feat: normalize kb entities * feat: db_columns, db_items in each sample * fix: go_bot configs (for new vocab) and loading of network * style: minor restyling * feat: add config for infer * feat: add config for infer * feat: add seq2seq_go_bot pretrained model * feat: update telegram start and help messages * style: minor styling * docs: add simple readme * doc: remove red ... 
blocks * doc: change Dataset to DatasetIterator * doc: update list of configs * doc: update package structure * doc: add notes about dataset element in config * feat: add squad model description to README.md * doc: add config specification for seq2seq_go_bot * fix: lowercase text while interact * feat: check saved model params * fix: rm extra configs * feat: add kvret dataset_reader * feat: add kvret_dataset_iterator * fix: add configerror * fix: dirty fix for dialog data to be lowercased * feat: check np.int and int in Vocabulary * feat: seq2seqbot works for train and infer * feat: add bleu-metric * feat: add simple seq2seq_go_bot config * fix: fix inference and load() * feat: add variable scope for optimizer * feat: add support of multiple inputs and outputs in interact mode * fix: fix padding * feat: tokenizer argument in Vocabulary * feat: chainer returns batch of tuples instead of tuple of batches * fix: spacy_tokenizer returns [['']] for batch with empty string and add alpha_only argument * feat: add per_item_bleu * feat: train seq2seq_go_bot on utterance batches * feat: tokenize y_true * feat: fit kb_entries knowledge base * feat: add split tokenizer * feat: standartize tokenizers output * feat: normalize kb entities * feat: db_columns, db_items in each sample * fix: go_bot configs (for new vocab) and loading of network * style: minor restyling * feat: add config for infer * feat: add config for infer * feat: add seq2seq_go_bot pretrained model * feat: update telegram start and help messages * style: minor styling * docs: add simple readme * docs: add seq2seq_go_bot in main readme * docs: small fix * docs: add config specification for seq2seq_go_bot * chore: remove install.py (#151) * feat: add support for batches in go-bot * feat: batching v1 * feat: bow_encoder is optional * fix: probs calculation for use_action_mask=true * refactor: do not feed inital_state during train * feat: feed sequence lengths in dynamic_rnn * refactor: rename go_bot.py -> bot.py * 
Update README.md * feat: add Ontonotes NER with Senna * feat: Ontonotes NER added * chore: train part removed from config * fix: readme dataset_iterator fixed, json removed from striong * feat: raw version of test added * fix: test modes * fix: folder name in ontonotes config and download path now consistent * fix: skip tests * Revert "feat: add Ontonotes NER with Senna" (#160) This reverts commit ae91d8f. * Add custom slot_filler. * Add custom go_bot configuration gobot_my.json. * Import my_slot_filler.slotfill to register simple_slotfiller. * Add new configuration files for gobot_my and gobot_simple. * Build data_reader for pharma_bot. * Fix bugs about reading data. * Rename modules * Implement the data reader templates * Remove the normalize function because it is not needed. Assuming the slot names are not changed. * feat: add dstc2 with api calls * fix: episode_done fix * fix: mv db_result from y to x * feat: if made api_call respond with next prediction * feat: create dstc2_v2 with api_calls * refactor: change interact_db_result logic and fix debug output * feat: add learning rate polynomial decay * refactor: rm moved to core modules * feat: add Sqlite3Database * feat: rm db_result_during_interaction & add database in go_bot * refactor: logging * fix: import fix * feat: add threshold for levenshtein score * fix: fix l2 regularization * feat: add features to state tracker * feat: add db context features * feat: new config for intents model with wiki.en.bin * docs: readme for intents add info about two models with different embeddings for DSTC 2 * feat: configs use dstc2_v2 * fix: minor db fix * feat: add variational dropout & fix logging * feat: raw slotfiller moved to the separate folder and inherited from Serializable * feat: orthodox slotfill moved to slotfill folder * refactor: external functions to class methods * feat: config for raw slotfiller added * feat: slotfilling config is simplified by dstc_ner config reference * chore: unnecessary imports 
removed * chore: fix outdated imports * docs: simple description for raw slotfill * fix: fix action mask for api-calls * refactor: dict to tuple in database * feat: add training for database * feat: add optimizer configuration * docs: fix github links * docs: add new network parameters and database * docs: add -d description * feat: add threshold to slotfill configs * fix: attention over intents work * docs: add comparison with external models * docs: dstc2_v2 vs dstc2 * docs: fix dstc2_reader * refactor: add api call action as a config parameter * docs: add template and database doc * feat: template type from str to class * docs: database class * feat: update configs * feat: fix templates * refactor: rm extra files * refactor: database.py -> sqlite_database.py * feat: mv dropout to dense layer, update configs, fix dropout_rate * feat: update gobot_dstc2_best model * feat: dropout on attentioned embeddings * feat: add intents_dstc2_big tests * feat: update gobot_dstc2 model * feat: retrain gobot_dstc2 * refactor: rm unused code * docs: update examples and metrics * fix: remove training from slotfill

yoptar added a commit that referenced this issue Mar 12, 2018

docs: correct paths to scripts and configs to be relative to repository root fixes #93

b3e4e43

This was referenced Mar 12, 2018

docs: correct paths to scripts and configs to be relative to repository root #94

Merged

Say in README that python3.5 is not supported #95

Closed

seliverstov closed this as completed in #94 Mar 12, 2018
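
For anyone landing here with the same traceback: `TypeError: invalid file: PosixPath(...)` is what Python 3.5 raises when the built-in `open()` is given a `pathlib.Path`. `open()` only accepts path objects from Python 3.6 onward. Below is a minimal sketch of two ways to deal with that. The helper names `read_json_config` and `require_python` are hypothetical and are not DeepPavlov's actual code. The commit log above does list "assert python version is 3.6 or higher", which is in the spirit of the second helper.

```python
import json
import sys
import tempfile
from pathlib import Path


def read_json_config(config_path):
    """Load a JSON file from a str or pathlib.Path (hypothetical helper)."""
    # str() keeps this working on Python 3.5, where open() rejects Path objects.
    with open(str(config_path), encoding="utf-8") as config_file:
        return json.load(config_file)


def require_python(minimum=(3, 6)):
    """Fail early with a readable message (hypothetical helper)."""
    if sys.version_info < minimum:
        raise RuntimeError(
            "Python {}.{} or higher is required".format(*minimum)
        )


if __name__ == "__main__":
    require_python()
    # Write a throwaway config file so the example is self-contained.
    demo_path = Path(tempfile.mkdtemp()) / "log_config.json"
    demo_path.write_text('{"version": 1}', encoding="utf-8")
    print(read_json_config(demo_path))
```

Converting with `str()` keeps older interpreters working, while the version check gives up on them but fails with a clear message instead of a `TypeError` during import.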
