From af2e07ed17bff2ed8ba69d4d216ae7764af39524 Mon Sep 17 00:00:00 2001 From: Tomasz Kornuta Date: Tue, 28 May 2019 13:16:46 -0700 Subject: [PATCH 1/3] wip --- README.md | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index ee9cd50..85a3788 100644 --- a/README.md +++ b/README.md @@ -11,14 +11,30 @@ ## Description -PyTorchPipe (PTP) fosters the development of computational _pipelines_ and comparison of diverse neural network-based models. +PyTorchPipe (PTP) is component-oriented framework that fosters the development of computational _multi-modal pipelines_ and comparison of diverse neural network-based models. PTP frames training and testing procedures as _pipelines_ consisting of many components communicating through data streams. -Each such a stream can consist of several components, including one problem instance (providing batches of data), (zero-or-more) trainable models and (any number of) additional components providing required transformations and computations. +Each such pipeline can consist of several components, including one problem instance (providing batches of data), any number of trainable components (models) and additional components providing required transformations and computations. As a result, the training & testing procedures are no longer pinned to a specific problem or model, and built-in mechanisms for compatibility checking (handshaking), configuration management & statistics collection facilitate running diverse experiments. -In its core, to _accelerate the computations_ on their own, PTP relies on PyTorch and extensively uses its mechanisms for distribution of computations on CPUs/GPUs. +At its core, PTP relies on PyTorch to _accelerate the computations_ and extensively uses its mechanisms for distributing computations across CPUs/GPUs, including multi-threaded data loaders and multi-GPU data parallelism.
+More importantly, the models are agnostic to those and one indicates whether to use them in configuration files (data loaders) or by passing run-time arguments (--gpu). + +**Datasets:** PTP focuses on multi-modal perpeption combining vision and language. Currently it offers the following _Problems_ from both domains: + + * ImageCLEF VQA-Med 2019 (Visual Question Answering) + * MNIST (Image Classification) + * WiLY (Language Identification) + * WikiText-2 / WikiText-103 (Language Modelling) + * ANKI (Machine Translation) + +Aside from providing batches of samples, the Problem class will automatically download the files associated with a given dataset (as long as the dataset is publicly available). + +**Model Zoo:** + + +**Workers:** ## Installation PTP relies on [PyTorch](https://github.com/pytorch/pytorch), so you need to install it first. Refer to the official installation [guide](https://github.com/pytorch/pytorch#installation) for its installation. It is easily installable via conda_, or you can compile it from source to optimize it for your machine. PTP is not (yet) available as a [pip](https://pip.pypa.io/en/stable/quickstart/) package, or on [conda](https://anaconda.org/pytorch/pytorch). However, we provide the `setup.py` script and recommend to use it for installation. First please clone the project repository:: ```console git clone git@github.com:IBM/pytorchpipe.git cd pytorchpipe/ ``` Then, install the dependencies by running:: -```console -python setup.py install -``` - -This command will install all dependencies via pip_. - --- -**NOTE** - -If you plan to develop and introduce changes, please call the following command instead:: - ```console python setup.py develop ``` -This will enable you to change the code of the existing components/workers and still be able to run them by calling the associated ``ptp-*`` commands. +This command will install all dependencies via pip, while still enabling you to change the code of the existing components/workers and run them by calling the associated ``ptp-*`` commands. More on this subject can be found in the setuptools documentation on [development mode](https://setuptools.readthedocs.io/en/latest/setuptools.html#development-mode).
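The component/data-stream design described above (components exchanging named streams, with handshaking verifying their compatibility) can be sketched in plain Python. All class and function names below are illustrative stand-ins, not PTP's actual API:

```python
# Illustrative sketch of the component/data-stream idea: components declare
# which streams they consume and produce, a "handshake" verifies that every
# input is satisfied, and a batch flows through the pipeline as a dict of
# named streams. These names do NOT come from PTP's real codebase.

class Component:
    def __init__(self, input_streams, output_streams):
        self.input_streams = set(input_streams)
        self.output_streams = set(output_streams)

    def __call__(self, streams):
        raise NotImplementedError


class Doubler(Component):
    """Toy stand-in for a model: reads 'inputs', writes 'predictions'."""

    def __init__(self):
        super().__init__(input_streams=["inputs"], output_streams=["predictions"])

    def __call__(self, streams):
        streams["predictions"] = [2 * x for x in streams["inputs"]]


def handshake(components):
    """Check each component's inputs are produced upstream (or by the problem)."""
    available = {"inputs"}  # streams provided by the problem instance
    for comp in components:
        missing = comp.input_streams - available
        if missing:
            raise ValueError(f"unsatisfied input streams: {missing}")
        available |= comp.output_streams


def run_pipeline(components, batch):
    streams = {"inputs": batch}
    for comp in components:
        comp(streams)
    return streams


pipeline = [Doubler()]
handshake(pipeline)                      # raises if streams do not line up
result = run_pipeline(pipeline, [1, 2, 3])
print(result["predictions"])             # [2, 4, 6]
```

A real PTP pipeline is assembled from configuration files rather than by hand, but the handshaking idea is the same: every component's input streams must be produced by an upstream component or by the problem instance.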
---- ## Maintainers From 34903d38bffb4a260446533816de3e54122a400d Mon Sep 17 00:00:00 2001 From: Tomasz Kornuta Date: Tue, 28 May 2019 14:44:16 -0700 Subject: [PATCH 2/3] wip on readme --- README.md | 48 ++++++++++-- .../components/models/attn_decoder_rnn..yml | 78 ------------------- 2 files changed, 43 insertions(+), 83 deletions(-) delete mode 100644 configs/default/components/models/attn_decoder_rnn..yml diff --git a/README.md b/README.md index 85a3788..3a72183 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,7 @@ ## Description -PyTorchPipe (PTP) is component-oriented framework that fosters the development of computational _multi-modal pipelines_ and comparison of diverse neural network-based models. +PyTorchPipe (PTP) is a component-oriented framework that facilitates the development of computational _multi-modal pipelines_ and comparison of diverse neural network-based models. PTP frames training and testing procedures as _pipelines_ consisting of many components communicating through data streams. Each such pipeline can consist of several components, including one problem inst @@ -19,9 +19,10 @@ Each such pipeline can consist of several components, including one problem inst As a result, the training & testing procedures are no longer pinned to a specific problem or model, and built-in mechanisms for compatibility checking (handshaking), configuration management & statistics collection facilitate running diverse experiments. At its core, PTP relies on PyTorch to _accelerate the computations_ and extensively uses its mechanisms for distributing computations across CPUs/GPUs, including multi-threaded data loaders and multi-GPU data parallelism.
-More importantly, the models are agnostic to those and one indicates whether to use them in configuration files (data loaders) or by passing run-time arguments (--gpu). +The models are _agnostic_ to those operations and one indicates whether to use them in configuration files (data loaders) or by passing adequate run-time arguments (--gpu). -**Datasets:** PTP focuses on multi-modal perpeption combining vision and language. Currently it offers the following _Problems_ from both domains: +**Datasets:** +PTP focuses on multi-modal reasoning combining vision and language. Currently it offers the following _Problems_ from both domains: * ImageCLEF VQA-Med 2019 (Visual Question Answering) * MNIST (Image Classification) @@ -30,17 +31,54 @@ More importantly, the models are agnostic to those and one indicates whether to * ANKI (Machine Translation) Aside from providing batches of samples, the Problem class will automatically download the files associated with a given dataset (as long as the dataset is publicly available). +The diversity of these problems demonstrates the flexibility of the framework; we are working on incorporating new ones into PTP. **Model Zoo:** - +What people typically define as a _model_ is decomposed in PTP into components, with _Model_ being a derived class that contains trainable elements. +Those components are loosely coupled and care only about the inputs they retrieve and the outputs they produce. +The framework offers full flexibility: it is up to the programmer to choose the _granularity_ of their components/models. +However, PTP provides several ready-to-use, out-of-the-box components, from general-purpose ones to very specialized ones: + + * Feed Forward Network (Fully Connected layers with activation functions and dropout, variable number of hidden layers, general usage) + * Torch Vision Wrapper (wrapping several models from Torch Vision, e.g.
VGG-16, ResNet-50, ResNet-152, DenseNet-121, general usage) + * Convnet Encoder (CNNs with ReLU and MaxPooling, can work with different sizes of images) + * LeNet-5 (classical baseline) + * Recurrent Neural Network (different cell types with activation functions and dropout, a single model can work as either encoder or decoder, general usage) + * Seq2Seq (Sequence to Sequence model, classical baseline) + * Attention Decoder (RNN-based decoder implementing Banadau-style attention, classical baseline) + * Sentence Embeddings (encodes words using embedding layer, general usage) + +Currently PTP offers the following models useful for multi-modal fusion and reasoning: + + * VQA Attention (simple question-driven attention over the image) + * Element Wise Multiplication (Multi-modal Low-rank Bilinear pooling, MLB) + * Multimodal Compact Bilinear Pooling (MCB) + * Multimodal Factorized Bilinear Pooling + * Relational Networks + +The framework also offers several components useful when working with text: + + * Sentence Tokenizer + * Sentence Indexer + * Sentence One Hot Encoder + * Label Indexer + * BoW Encoder + * Word Decoder + +and several general-purpose components, from tensor transformations (List to Tensor, Reshape Tensor, Reduce Tensor, Concatenate Tensor), to components calculating losses (NLL Loss) and statistics (Accuracy Statistics, Precision/Recall Statistics, BLEU Statistics, etc.), to viewers (Stream Viewer, Stream File Exporter, etc.). **Workers:** +PTP workers are Python scripts that are _agnostic_ to the problems/models/pipelines that they are supposed to work with.
+Currently the framework offers two main workers: + + * ptp-online-trainer (a flexible trainer creating separate instances of training and validation problems and training the models by feeding the created pipeline with batches of data, relying on the notion of an _episode_) + * ptp-processor (performing one pass over the samples returned by a given problem instance, useful for collecting scores on a test set, generating answers for submissions to competitions, etc.) ## Installation PTP relies on [PyTorch](https://github.com/pytorch/pytorch), so you need to install it first. -Refer to the official installation [guide](https://github.com/pytorch/pytorch#installation) for its installation. +Please refer to the official installation [guide](https://github.com/pytorch/pytorch#installation) for details. It is easily installable via conda_, or you can compile it from source to optimize it for your machine. PTP is not (yet) available as a [pip](https://pip.pypa.io/en/stable/quickstart/) package, or on [conda](https://anaconda.org/pytorch/pytorch). However, we provide the `setup.py` script and recommend to use it for installation. diff --git a/configs/default/components/models/attn_decoder_rnn..yml b/configs/default/components/models/attn_decoder_rnn..yml deleted file mode 100644 index f676809..0000000 --- a/configs/default/components/models/attn_decoder_rnn..yml +++ /dev/null @@ -1,78 +0,0 @@ -# This file defines the default values for the GRU decoder with attention. -#################################################################### -# 1. CONFIGURATION PARAMETERS that will be LOADED by the component.
-#################################################################### - -# Size of the hidden state (LOADED) -hidden_size: 100 - -# Wether to include the last hidden state in the outputs -output_last_state: False - -# Type of recurrent cell (LOADED) -# -> Only GRU is supported - -# Number of "stacked" layers (LOADED) -# -> Only a single layer is supported - -# Dropout rate (LOADED) -# Default: 0 (means that it is turned off) -dropout_rate: 0 - -# Prediction mode (LOADED) -# Options: -# * Dense (passes every activation through output layer) | -# * Last (passes only the last activation though output layer) | -# * None (all outputs are discarded) -prediction_mode: Dense - -# Enable FFN layer at the output of the RNN (before eventual feed back in the case of autoregression). -# Useful if the raw outputs of the RNN are needed, for attention encoder-decoder for example. -ffn_output: True - -# Length of generated output sequence (LOADED) -# User must set it per task, as it is task specific. -autoregression_length: 10 - -# If true, output of the last layer will be additionally processed with Log Softmax (LOADED) -use_logsoftmax: True - -streams: - #################################################################### - # 2. Keymappings associated with INPUT and OUTPUT streams. - #################################################################### - - # Stream containing batch of encoder outputs (INPUT) - inputs: inputs - - # Stream containing the inital state of the RNN (INPUT) - # The stream will be actually created only if `inital_state: Input` - input_state: input_state - - # Stream containing predictions (OUTPUT) - predictions: predictions - - # Stream containing the final output state of the RNN (output) - # The stream will be actually created only if `output_last_state: True` - output_state: output_state - -globals: - #################################################################### - # 3. Keymappings of variables that will be RETRIEVED from GLOBALS. 
- #################################################################### - - # Size of the input (RETRIEVED) - input_size: input_size - - # Size of the prediction (RETRIEVED) - prediction_size: prediction_size - - #################################################################### - # 4. Keymappings associated with GLOBAL variables that will be SET. - #################################################################### - - #################################################################### - # 5. Keymappings associated with statistics that will be ADDED. - #################################################################### - From 59b10ab219bc72bf1ace3b22449c9a31ab988afb Mon Sep 17 00:00:00 2001 From: Tomasz Kornuta Date: Tue, 28 May 2019 15:25:41 -0700 Subject: [PATCH 3/3] typos fix --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 3a72183..74c545e 100644 --- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ However, PTP provides several ready to use, out of the box components, from ones * LeNet-5 (classical baseline) * Recurrent Neural Network (different kernels with activation functions and dropout, a single model can work both as encoder or decoder, general usage) * Seq2Seq (Sequence to Sequence model, classical baseline) - * Attention Decoder (RNN-based decoder implementing Banadau-style attention, classical baseline) + * Attention Decoder (RNN-based decoder implementing Bahdanau-style attention, classical baseline) * Sencence Embeddings (encodes words using embedding layer, general usage) Currently PTP offers the following models useful for multi-modal fusion and reasoning: @@ -83,14 +83,14 @@ It is easily installable via conda_, or you can compile it from source to optimi PTP is not (yet) available as a [pip](https://pip.pypa.io/en/stable/quickstart/) package, or on [conda](https://anaconda.org/pytorch/pytorch). However, we provide the `setup.py` script and recommend to use it for installation. 
-First please clone the project repository:: +First, please clone the project repository: ```console git clone git@github.com:IBM/pytorchpipe.git cd pytorchpipe/ ``` -Then, install the dependencies by running:: +Next, install the dependencies by running: ```console python setup.py develop