Harsha/reorg #118

harsha-simhadri · 2019-08-17T17:19:48Z

This PR organizes the code into (a) edgeml_tf + examples/tf, and (b) edgeml_pytorch + examples/pytorch. All notebooks and scripts need to be tested for path breakage.

It also has a cleaned-up version of the multi-layer RNN and train_classifier.


metastableB

Hey Harsha,

I've added a few comments inline with the code.

metastableB · 2019-08-18T01:52:08Z

README.md


+### Organization
+ - The `edgeml_tf` directory contains the graphs and models in TensorFlow,


"Graphs and models" is not standard terminology. "Model" might even be wrong, because by "model" people usually mean the matrices, and those are not included. Could you use something like "implementation of inference and training routines of these algorithms in TensorFlow"?

How about "edgeml_tf contains a definition of these architectures" and "examples/tf contains example training runs"?

Yeah, that works. Just remove the word "model".

metastableB · 2019-08-18T01:53:35Z

README.md

+### Details and project pages
+For details, please see our
+ [project page](https://microsoft.github.io/EdgeML/),
+ [wiki](https://github.com/Microsoft/EdgeML/wiki/), and


De-linking from the wiki might be a good idea. We aren't maintaining it. It's redundant if we intend to move to microsoft.github.io/EdgeML anyway.

metastableB · 2019-08-18T01:56:48Z

edgeml_pytorch/README.md

+available in Tensorflow:
+
+1. [Bonsai](../docs/publications/Bonsai.pdf)
+2. [S-RNN](../docs/publications/srnn-??.pdf)


Not available yet. Please remove hyper-link.

metastableB · 2019-08-18T02:05:43Z

edgeml_pytorch/graph/rnn.py

-    [batchSize, timeSteps, inputDims] else
-    [timeSteps, batchSize, inputDims]
-    '''
+class RNNCell(nn.Module):


I'm curious: do we have to do anything special to the existing cells, the ones already native to torch, to make them ONNX compatible?

I think they should be fine to export to ONNX without this ugly contraption.

metastableB · 2019-08-18T02:30:54Z

setup.py

@@ -0,0 +1,11 @@
+import setuptools #enables develop
+
+setuptools.setup(


I don't think this will work. Did you test this?

For pip packages the structure is

    package_name/
        package_code/
        README.md
        LICENSE.md
        setup.py

So, for our case with two packages, we will have (within $EDGEML_HOME)

    edgeml_tf/
        edgeml_tf/    # The package code
        LICENSE
        README.md
        setup.py

and

    edgeml_pytorch/
        edgeml_pytorch/    # The package code
        LICENSE
        README.md
        setup.py
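To make the layout above concrete, here is a minimal sketch that builds both project directories in a temporary folder and checks that each one has its setup.py sitting next to the package directory. The helper names (`make_project`, `is_installable`), the `__init__.py` marker files, and the setup.py body with its placeholder version are assumptions for illustration, not the contents of the EdgeML repository.

```python
import tempfile
from pathlib import Path

# Hypothetical minimal setup.py body; the version is a placeholder,
# not the value used in the EdgeML repository.
SETUP_PY = '''import setuptools

setuptools.setup(
    name="{name}",
    version="0.0.0",
    packages=setuptools.find_packages(),
)
'''


def make_project(root: Path, project_dir: str, package: str) -> Path:
    """Create root/project_dir with setup.py, README.md, LICENSE and the package."""
    project = root / project_dir
    (project / package).mkdir(parents=True)
    (project / package / "__init__.py").write_text("")
    (project / "setup.py").write_text(SETUP_PY.format(name=package))
    (project / "README.md").write_text("")
    (project / "LICENSE").write_text("")
    return project


def is_installable(project: Path, package: str) -> bool:
    """True when setup.py and the package directory sit side by side."""
    has_setup = (project / "setup.py").is_file()
    has_package = (project / package / "__init__.py").is_file()
    return has_setup and has_package


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    tf_project = make_project(root, "edgeml_tf", "edgeml_tf")
    pt_project = make_project(root, "edgeml_pytorch", "edgeml_pytorch")
    both_ok = (is_installable(tf_project, "edgeml_tf")
               and is_installable(pt_project, "edgeml_pytorch"))
    # In this layout the repository root itself holds no setup.py.
    root_ok = is_installable(root, "edgeml_tf")

print(both_ok, root_ok)
```

In this sketch pip would be run from inside each project directory, which is why the check on the repository root is expected to fail.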

I overlooked this. I can change the name to setup_python since that needs to be released.

Do you want to release a TF package at this point?

moved this to edgeml_pytorch

Yeah, we don't have to release the tf package to PyPI. Let's just stick to pytorch.

But that is not what I meant by structural change. Renaming setup.py doesn't have anything to do with that.

The confusion here is because the parent directory has the same name as the package. Let me try to explain what I am trying to say with the old organization. Earlier we had the following structure:

    ${EDGEML_ROOT}/
    |
    |-- tf/                # <--- this directory is where you run pip from
        |-- edgeml_tf/     # <--- This is the actual package.
        |-- setup.py       # <--- This is the setup.py for that package.
        |-- README.md

Note how setup.py and the top-level directory of the package (that is, the first part of edgeml_tf.graph.*) are on the same level.

In the new organization, the tf folder corresponds to the edgeml_tf folder, and the tf/edgeml_tf folder corresponds to the edgeml_tf/edgeml_tf folder. The code will be in edgeml_tf/edgeml_tf/graph, edgeml_tf/edgeml_tf/trainer, etc.

Hope that makes sense. Here is an example for reference.

@SachinG007 Could you have a look at this? Testing should fail in a new environment with this setup.py.

Are you suggesting that we create another folder, $EDGEML_ROOT/pytorch, into which edgeml_pytorch goes?

Yes. For a few reasons.

Even though we are only releasing the pytorch package and want to move forward with pytorch, I still want to keep the tensorflow package structure in the repository. TF will definitely become useful sometime in the future for RNN-related work. One simple scenario that we will need immediately is testing/benchmarking tf and pytorch code within the same virtual environment. This will require local installations of both packages. For this to work, we need two setup.py files: one at $EDGEML_ROOT/pytorch/setup.py and one at $EDGEML_ROOT/tf/setup.py.

As of now, our edgeml repository is a collection of various sub-projects: C++, the tensorflow package, the pytorch package, and applications. In the future, other packages might pop up that require their own setup.py files. Let's be future-proof and isolate edgeml_pytorch in its own directory. Let's not put any sub-project-specific files in the top-level directory ($EDGEML_ROOT). Otherwise we will have to deal with this re-org again.

@adityakusupati Can you have a look here? Do you have other suggestions?

@metastableB, @harsha-simhadri I don't understand python packaging and the norms that well.

I completely agree with @metastableB on TF retaining its packaging structure until pytorch is viable for custom RNNs.

I again agree with you, given the possibility of more tools/directories/frameworks which might come up, it is better to isolate installation of each of the packages.

However, having said the above two, I tried the old packaging format that is here, and it works seamlessly for pytorch (I am not sure about TF, as I haven't tested it). @SachinG007 and @pushkalkatara, did the structure here ensure that the TF examples ran without issues?

@adityakusupati That is odd. The pytorch link should not work unless you run pip install ./ from ${EDGEML_ROOT}. If you try to install the package from within the edgeml_pytorch folder, pip will complain that setup.py is not found.

Anyway, @harsha-simhadri, the only thing you have to do is move $EDGEML_ROOT/edgeml_pytorch to $EDGEML_ROOT/pytorch/edgeml_pytorch/ and, similarly, move $EDGEML_ROOT/edgeml_tf/ to $EDGEML_ROOT/tf/edgeml_tf/. Having done that, move $EDGEML_ROOT/setup.py to $EDGEML_ROOT/pytorch/ and create a setup file for the tf directory.
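The move described above can be sketched as a small script. This is only an illustration run against a throwaway directory: the `reorganize` helper is hypothetical, the starting layout is a stand-in with empty files, and the setup file for the tf directory is deliberately not created because its contents are not specified in this thread.

```python
import shutil
import tempfile
from pathlib import Path


def reorganize(root: Path) -> None:
    """Move each package one level down, then put setup.py next to edgeml_pytorch."""
    for framework in ("pytorch", "tf"):
        package = f"edgeml_{framework}"
        target = root / framework
        target.mkdir()
        shutil.move(str(root / package), str(target / package))
    shutil.move(str(root / "setup.py"), str(root / "pytorch" / "setup.py"))
    # tf/setup.py still has to be written by hand; it is not created here.


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-in for the layout before the move (empty placeholder files).
    for package in ("edgeml_pytorch", "edgeml_tf"):
        (root / package).mkdir()
        (root / package / "__init__.py").write_text("")
    (root / "setup.py").write_text("import setuptools\n")

    reorganize(root)
    layout = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print("\n".join(layout))
```

In a real checkout, `git mv` would be the more natural tool so that file history is preserved; `shutil.move` is used here only to keep the sketch self-contained.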

OK, will create subfolders for TF and PyTorch. I won't create a TF setup file yet since I have not read through that code much.

I'll raise a PR for that this weekend. Thanks.

SachinG007 · 2019-08-18T05:12:44Z

@harsha-simhadri I ran the notebooks and scripts after the path changes. They run fine

harsha-simhadri · 2019-08-18T09:11:53Z

@mr-yamraj @adityakusupati @metastableB Please take a look at the new organization and updates.
Much work remains in terms of making variable naming homogeneous and fixing ONNX warnings.

@harsha-simhadri , the org structure looks good except the stray SRNN example https://github.com/microsoft/EdgeML/tree/harsha/reorg/pytorch/examples/SRNN

fixed

adityakusupati · 2019-08-18T19:57:35Z

> @harsha-simhadri I ran the notebooks and scripts after the path changes. They run fine

@SachinG007 Can you please follow up on Issue 119? Your testing should have caught this error.

mr-yamraj · 2019-08-20T14:17:55Z

In the examples folder, I think we should have another folder for data preparation.
Reasons to do this:

- At present, the same script is stored in different folders (for example, process_usps.py).
- The same processed data (for example, google-speech-data or usps) can be used to train models with the scripts present in SRNN, FastCells, Bonsai, and ProtoNN.

metastableB · 2019-08-20T14:25:44Z

> In examples folder, I think we should have another folder for data preparation.
> Reason to do this:
>
> At present, the same script is stored in different folders. (For ex: process_usps.py)
>
> And the same processed data (ex: google-speech-data or usps) can be used to train models with scripts present in SRNN, FastCells, Bonsai, ProtoNN.

I'm against code sharing in examples. While I agree that data preparation in the examples is pretty much the same and we could reuse the code, I would prefer keeping each example independent and isolated.

Let's say we decide to share pre-processing. It is likely that some future algorithm will require a slightly different pre-processing method for the same dataset, and to support this we will have to add more ifs and buts on top of the common data processing code, like:

- First go to the common data processing scripts and run them to download and process the data.
- Then come here and run this additional processing script on the extracted data to make it compatible with the current algorithm (and so forth).

It's better that we isolate each example and all code relating to it.

SachinG007 · 2019-08-20T15:24:08Z

Executing pip install -e . fails with:
ERROR: File "setup.py" not found.

adityakusupati · 2019-08-20T18:02:56Z

@SachinG007 the setup file is renamed, please do the needful.
@mr-yamraj , I agree with @metastableB about data preprocessing. Every point he makes is valid.

mr-yamraj · 2019-08-20T20:28:25Z

@metastableB @adityakusupati
I feel that users of EdgeML will try to use the examples in the following way (let's say, to train KWS models):
If I have to train a FastGRNN model, I will go inside the FastCell dir, run data preprocessing, then run the training script.
Then, if I want to try out an SRNN model, I will go inside the SRNN dir, run data preprocessing again, then run the training script.

Reorganisation Fixes

harsha-simhadri · 2019-08-21T09:32:15Z

> In examples folder, I think we should have another folder for data preparation.
> Reason to do this:
>
> At present, the same script is stored in different folders. (For ex: process_usps.py)
>
> And the same processed data (ex: google-speech-data or usps) can be used to train models with scripts present in SRNN, FastCells, Bonsai, ProtoNN.

@mr-yamraj Agree regarding data prep being another folder under examples. That is a future PR. Let's not wait on that for this PR.

harsha-simhadri · 2019-08-21T10:12:23Z

Will start another PR once the directory structure is updated per @metastableB's request.

Yash Gaurkar and others added 30 commits July 22, 2019 23:13

file structure changed and multi layer fastgrnn model class added wit…

085c8bc

…h rolling support.

version changed to 0.2.1

4b30755

training code added

9379189

Update train_classifier.py

8de1b8c

initial commit

e35aee8

moved config optimizer and lr out of fit function

c90b28e

added sparsity and hard thresholding support to train_classifier and …

b8b4dd7

…Fast(G)RNN cells and models

added sparsity and hard thresholding support to train_classifier and …

a2f1ddb

…Fast(G)RNN cells and models

added sparsity and hard thresholding support to train_classifier and …

e2c8a8f

…Fast(G)RNN cells and models

removing unnecessary code

39f8a17

added model size export

ce32db7

reorg code for sparsity

cd2a2fd

ensure model is in device after sparsifying

c1a3eae

ensure model is in device after sparsifying

a35778f

cleaned up FastGRNN code

bd7732a

cleaned up FastGRNN code

27ef35e

cleaned up FastGRNN code

a4deef9

cleaned up FastGRNN code

c2ac41c

renamed RNN classifier model class and get function

cd35fa9

minor edits

eadae5b

renaming num_weight_matrices to separate vars for num of U and W mats

92bf361

trying to generlize ONNX exporter

1329bfb

cleaning up fastmodel.py

d373338

model size call works

4db7109

Merge branch 'master' into harsha/pytorch

bf71343

moving params appropriately for gpu

8f3a558

loopified multi-layer RNN model

5c232c3

loopified multi-layer RNN model init

13a96e5

refactoring BaseRNN forward pass

086c09f

move EMI-RNN ipynbaway from root

455cb83

pushkalkatara added 6 commits August 18, 2019 01:08

Added relative links

64fe8d0

FastCells Doc Fix

1595a02

IHT Routine Fix #119

bd49385

FastCell GRNN example fix

08a3826

ProtoNN Example Fix

5371ff0

EMIRNN example fixes

c7a2cb7

pushkalkatara mentioned this pull request Aug 18, 2019

Testing pytorch implementations. #117

Closed

metastableB suggested changes Aug 18, 2019

removing pytorch folder in root

96957a0

harsha-simhadri added 2 commits August 18, 2019 15:03

updated readme. Moving Applications to applications, Tools to tools

51c5ae0

moving package creation setup to edgeml_pytorch

793995c

pushkalkatara and others added 7 commits August 20, 2019 03:33

remove reference from computational graph - .detach()

8a11890

detach followed by clone

414273d

fixing setup_python path and contents

ffe578a

adding num_biases

819976e

fixing base class constructor calls

3325298

Merge branch 'harsha/reorg' into harsha/reorg

dc9bb1b

resolve num_bias

9351a16

Merge pull request #120 from pushkalkatara/harsha/reorg

40663ca

Reorganisation Fixes

harsha-simhadri closed this Aug 21, 2019
