-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Harsha/reorg #118
Harsha/reorg #118
Conversation
…h rolling support.
…Fast(G)RNN cells and models
…Fast(G)RNN cells and models
…Fast(G)RNN cells and models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Harsha,
Few comments I've added inline with the code.
README.md
Outdated
|
||
### Organization | ||
- The `edgeml_tf` directory contains the graphs and models in TensorFlow, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
graphs and models is not a standard terminology. Mode might even be wrong because by model people usually mean the matrices and those are not included. Could you just use something like implementation of inference and training routines of these algorithms in tensorflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "edgeml_tf contains a definition of these architectures", and "examples/tf contains example training" runs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that works. Just remove the word model
README.md
Outdated
### Details and project pages | ||
For details, please see our | ||
[project page](https://microsoft.github.io/EdgeML/), | ||
[wiki](https://github.com/Microsoft/EdgeML/wiki/), and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-linking from the wiki might be a good idea. We aren't maintaining it. Its reddundant if we intent to move to microsofot.github.io/EdgeML anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
edgeml_pytorch/README.md
Outdated
available in Tensorflow: | ||
|
||
1. [Bonsai](../docs/publications/Bonsai.pdf) | ||
2. [S-RNN](../docs/publications/srnn-??.pdf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not available yet. Please remove hyper-link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes sir.
[batchSize, timeSteps, inputDims] else | ||
[timeSteps, batchSize, inputDims] | ||
''' | ||
class RNNCell(nn.Module): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious: do we have to do anything special to the existing cells, the ones already native to torch, to make it ONIX compatible ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they should be fine to export to ONNX without this ugly contraption.
setup.py
Outdated
@@ -0,0 +1,11 @@ | |||
import setuptools #enables develop | |||
|
|||
setuptools.setup( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will work. Did you test this?
For pip packages the structure is
package_name/
package_code/
README.md
LICENSE.md
setup.py
So, for our case with two packages, we will have (within $EDGEML_HOME)
edgeml_tf/
edgeml_tf/ # The package code
LICENSE
README.md
setup.py
and
edgeml_pytorch/
edgeml_pytorch/ # The package code
LICENSE
README.md
setup.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked this. I can change the name to setup_python since that needs to be released.
Do you want to release a TF package at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved this to edgeml_pytorch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we dont' have to release the tf package to pypi
. Lets just stick to pytorch
.
But that is not what I meant by structural change. Renaming setup.py
doesnt have anything to do with that.
The confusion here is because the parent directory has the same name as the package. Let me try to explain what I am trying to say with the old organization. Earlier we had the following structure:
${EDGEML_ROOT}/
|
|-- tf/ # <--- this directory is where you run pip from
|-- edgeml_tf/ # <--- This is the actual package.
|-- setup.py # <--- This is the setup.py for that package.
|-- README.md
Note how the setup.py
and the top level directory of the package, that is the first part of edgeml_tf.graph.*
, is on the same level.
In the new organization, the tf
folder corresponds to the edgeml_tf
folder and the tf/edgeml_tf
folder corresponds to edgeml_tf/edgeml_tf
folder. The code will be in edgeml_tf/edgeml_tf/graph
and edgeml_tf/edgeml_tf/trainer
etc.
Hope that makes sense. Here is an example for reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SachinG007 Could you have a look at this? Testing should fail in a new environment with this setup.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you suggesting that we create another folder $EDGEML_ROOT/pytorch into which edgeml_pytorch goes
Yes. For a few reasons.
-
Even though we are only releasing pytorch package and want to move forward with pytorch, I still want to keep the tensorflow package structure in the repository. TF will definitely become useful sometime in the future for RNN related work. One simple scenario that we will immediately need is testing/benchmarking tf and pytorch code within the same virtualenvironment. This will require local installations of both the packages. For this to work, we need two
setup.py
files; both of them need to be in$EDGEML_ROOT/pytorch/setup.py
and$EDGEML_ROOT/tf/setup.py
. -
As of now, our edgeml repository is a collection of various sub-projects; C++, tensorflow package, pytorch package, applications. In the future, other packages might pop up that would require its own
setup.py
files. Lets just be future proof and isolateedgeml_pytorch
to its own directory. Lets not put any sub-project specific files at the top level directory ($EDGEML_ROOT). Else we will have to deal with this re-org again.
@adityakusupati Can you have a look here? Do you have other suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@metastableB, @harsha-simhadri I don't understand python packaging and the norms that well.
- I completely agree with @metastableB on TF retaining its packaging structure until pytorch is viable for custom RNNs.
- I again agree with you, given the possibility of more tools/directories/frameworks which might come up, it is better to isolate installation of each of the packages.
However, having said the above two, I tried the old packaging format that is here and this works seamlessly for pytorch (I am not sure about TF as I haven't tested it) @SachinG007 and @pushkalkatara did the structure here ensure TF examples ran without issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adityakusupati That is odd. The pytoch link should not work, unless you run pip install ./
from ${EDGEML_ROOT}
. If you try to install the package from within the edgeml_pytorch
folder, pip will complain that setup.py
is not found.
Anyway, @harsha-simhadri the only thing you have to do is move $EDGEML_ROOT/edgeml_pytorch
to $EDGEML_ROOT/pytorch/edgeml_pytorch/
and similarly move the $EDGEML_ROOT/edgeml_tf/
to $EDGEML_ROOT/tf/edgeml_tf/
. Having done that, you then move $EDGEML_ROOT/setup.py
to $EDGEML_ROOT/pytorch/
. And create a setup file for the tf
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will create subfolders for TF and PyTorch. I wont create a TF set up file yet since I have not read through that code much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll raise a PR for that this weekend. Thanks.
@harsha-simhadri I ran the notebooks and scripts after the path changes. They run fine |
fixed |
@SachinG007 can you please follow-up on Issue 119. Your testing should have caught this error. |
In examples folder, I think we should have another folder for data preparation.
|
I'm against code sharing in examples. While I agree that data preparation in examples is pretty much the same and we can re-use the code, I would prefer keeping each example independent, isolated. Let say we decide to share pre-processing . It is likely that some future algorithm will require a slightly different pre-processing method for the same dataset, and to support this we will have to add more ifs and buts on top of the common data processing code like:
Its better we isolate examples and all code relating to it. |
ERROR: File "setup.py" not found. |
@SachinG007 the setup file is renamed, please do the needful. |
@metastableB @adityakusupati |
Reorganisation Fixes
@mr-yamraj Agree regarding data prep being another folder under example. That is a future PR. Lets not wait on that for this PR |
will start another PR once directory structure is updated to @metastableB 's request |
This PR organizes the code into (a) edge_tf + examples/tf, and (b) edgeml_pytorch + examples/pytorch. All notebooks and scripts need to be tested for disruptions in path.
Also has a cleaned up version of the multi-layer RNN and train_classifer.