[WIP] Added Graph.get_subnetwork for constructing subnetworks #491

Merged (4 commits) on Jan 31, 2020

mreppen
Contributor

@mreppen mreppen commented Jan 24, 2020

This is work on #487

This PR contains code for extracting a subnetwork from an existing network.

Some examples:
Original network:

let nn = Graph.(
  let inp = input [| 2 |] in
  let x1 = lambda ~name:"x1" ~out_shape:[|1|] (fun x -> Algodiff.Maths.get_slice [[]; [0]] x) inp in
  let x2 = lambda ~name:"x2" ~out_shape:[|1|] (fun x -> Algodiff.Maths.get_slice [[]; [1]] x) inp in
  let f1 = fully_connected ~name:"f1" 1 x1 in
  let f2 = fully_connected ~name:"f2" 1 x2 in
  let sum = add ~name:"sum" [| f1; f2 |] in
  sum |> get_network)
let f1nn = Graph.get_subnetwork ~in_names:[|"x1"|] (Graph.get_node nn "f1")

f1nn can then be used in Graph.model f1nn with input of shape [|1|]. Without ~in_names, it uses the inputs of nn, so in this case Graph.get_subnetwork (Graph.get_node nn "f1") has input shape [|2|].

In this example, Graph.get_subnetwork (Graph.get_node nn "sum") has the same structure as nn. An exception is raised if ~in_names does not contain all necessary inputs:

# Graph.get_subnetwork ~in_names:[|"x1"|] (Graph.get_node nn "sum");;
Exception:
Failure "Owl_neural_graph:get_subnetwork Subnetwork depends on input input_0".

Comments:

  • The code has 3 steps: 1. Recursively traverse the subgraph and create new nodes. 2. Sort the topology using the original network. 3. Connect the new nodes.
  • Currently does not copy the neurons. What are the use cases for copying versus not copying?
  • Currently only creates a subnetwork with one output. Extending to multi-outputs should be trivial.
  • Fails if ~in_names contains nodes that the output does not depend on, e.g. Graph.get_subnetwork ~in_names:[|"x1"; "x2"|] (Graph.get_node nn "f1"). Should this be adjusted to include x2 as an unconnected input?
  • Other approaches mentioned in "Chaining networks and extracting intermediate calculations in Neural.Graph" (#487) might still be neat to have.
  • I have coded naïvely, so likely one could add asserts and otherwise improve the code.
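The three steps listed above can be sketched on a toy graph. This is only an illustration, not the code in this PR: the `node` and `network` records, `find_node`, and the list-valued `in_names` are hypothetical stand-ins for Owl's real types, and the function takes an output name rather than a node.

```ocaml
(* Hypothetical toy types; Owl's real nodes also carry neurons and shapes. *)
type node = { name : string; mutable prev : node list }
type network = { topo : node array } (* assumed topologically sorted *)

let find_node nn name =
  match List.find_opt (fun n -> n.name = name) (Array.to_list nn.topo) with
  | Some n -> n
  | None -> failwith ("get_subnetwork: no node named " ^ name)

let get_subnetwork ?(in_names = []) nn out_name =
  let is_cut n = List.mem n.name in_names in
  let fresh = Hashtbl.create 16 in
  (* Step 1: walk backwards from the output, creating unconnected new nodes.
     Traversal stops at nodes listed in in_names. *)
  let rec collect n =
    if not (Hashtbl.mem fresh n.name) then begin
      if n.prev = [] && in_names <> [] && not (is_cut n) then
        failwith ("get_subnetwork: subnetwork depends on input " ^ n.name);
      Hashtbl.add fresh n.name { name = n.name; prev = [] };
      if not (is_cut n) then List.iter collect n.prev
    end
  in
  collect (find_node nn out_name);
  (* Step 2: reuse the ordering of the original network. *)
  let kept =
    List.filter (fun n -> Hashtbl.mem fresh n.name) (Array.to_list nn.topo)
  in
  (* Step 3: connect the new nodes; cut points stay without parents. *)
  List.iter
    (fun n ->
      if not (is_cut n) then begin
        let parents = List.map (fun p -> Hashtbl.find fresh p.name) n.prev in
        (Hashtbl.find fresh n.name).prev <- parents
      end)
    kept;
  { topo = Array.of_list (List.map (fun n -> Hashtbl.find fresh n.name) kept) }

(* Toy version of the example network from this PR. *)
let nn =
  let inp = { name = "input_0"; prev = [] } in
  let x1 = { name = "x1"; prev = [ inp ] } in
  let x2 = { name = "x2"; prev = [ inp ] } in
  let f1 = { name = "f1"; prev = [ x1 ] } in
  let f2 = { name = "f2"; prev = [ x2 ] } in
  let sum = { name = "sum"; prev = [ f1; f2 ] } in
  { topo = [| inp; x1; x2; f1; f2; sum |] }

let names net = Array.to_list (Array.map (fun n -> n.name) net.topo)
```

Like the PR, the sketch relies on node names being unique, and it reuses the original topological order instead of sorting again.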

@ryanrhymes added the enhancement and R&D (Core research and development) labels on Jan 24, 2020
@jzstark
Collaborator

jzstark commented Jan 29, 2020

Sorry for the delay. This new feature looks really good and helpful.

The subgraph function is similar to graph cutting in TensorFlow for a given node. It would be great if the inputs of a given output could be found automatically, so that users don't have to keep track of which inputs are required in a large graph.

One case I can think of for copying is when the network weights are changed further, such as fine-tuning the parameters in training. In that case, creating the subgraph as a separate network may be necessary.

Perhaps we could leave some notes in a comment on the function about things still to be done, such as multi-output support, assertions in the code, and other possible future improvements.

One last thing: Owl currently uses ocamlformat for a unified coding style, so running make format would be helpful.

@mreppen
Contributor Author

mreppen commented Jan 30, 2020

Thanks for the feedback. I have updated the formatting.

About the inputs: If you do not supply in_names, it takes the same inputs as the original network (trivial "autodetect"). Making it find the smallest set of inputs needed should not be a problem, but would require a little rewriting.
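As a rough illustration of what finding the smallest set of inputs could look like, here is a sketch on a toy graph. The `node` record and `required_inputs` are hypothetical names, not Owl's API; the idea is simply to collect the parentless nodes reachable from the chosen output.

```ocaml
(* Hypothetical toy node type: a name plus the list of parent nodes. *)
type node = { name : string; prev : node list }

(* Names of the original inputs that [out] depends on, in first-visit order.
   Assumes node names are unique. *)
let required_inputs out =
  let seen = Hashtbl.create 16 in
  let acc = ref [] in
  let rec visit n =
    if not (Hashtbl.mem seen n.name) then begin
      Hashtbl.add seen n.name ();
      match n.prev with
      | [] -> acc := n.name :: !acc (* a node without parents is an input *)
      | parents -> List.iter visit parents
    end
  in
  visit out;
  List.rev !acc

(* A toy network with two separate inputs. *)
let in_a = { name = "in_a"; prev = [] }
let in_b = { name = "in_b"; prev = [] }
let f1 = { name = "f1"; prev = [ in_a ] }
let f2 = { name = "f2"; prev = [ in_b ] }
let sum = { name = "sum"; prev = [ f1; f2 ] }
```

Here `required_inputs f1` only reports `in_a`, while `required_inputs sum` reports both inputs.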

Nevertheless, I find in_names useful, as it allows me to specify where I want the input nodes of the subnetwork. In the above example I can then input data after the slice instead of before.

One case against copying is that it leaves no way of obtaining a subnetwork with references, whereas not copying still leaves the door open to copying afterwards. I don't know if there is that much of a use case for references in the end, but if it makes sense, one could expose two different functions in the interface or offer another optional argument.
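The difference between the two options can be sketched with a toy neuron holding a single mutable weight. The types and the `by_reference` and `by_copy` helpers are hypothetical and only illustrate the trade-off; they are not part of Owl.

```ocaml
(* Hypothetical toy types: a neuron with one mutable weight. *)
type neuron = { mutable w : float }
type node = { name : string; neuron : neuron }

(* New node that shares the neuron with the original network. *)
let by_reference n = { name = n.name; neuron = n.neuron }

(* New node with its own copy of the weight. *)
let by_copy n = { name = n.name; neuron = { w = n.neuron.w } }

let original = { name = "f1"; neuron = { w = 1.0 } }
let shared = by_reference original
let copied = by_copy original

(* Simulate a training step on the original network. *)
let () = original.neuron.w <- 2.0
```

After the update, `shared.neuron.w` follows the original weight, while `copied.neuron.w` keeps the value it had when the copy was made. A referencing result can always be copied later, but a copy cannot be turned back into a reference.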

I will have a look at autodetecting the minimal set of inputs and doing multiple outputs. When there is a final set of features, I can think about where assertions would be useful.

Life is a bit busy the next two or so weeks, but I'll let you know when I have something.

@ryanrhymes
Member

@mreppen Looks good to me, great work! Shall we wait for the autodetection, or can that be submitted as another PR? I am fine with merging this one. @jzstark what's your opinion?

@mreppen
Contributor Author

mreppen commented Jan 30, 2020

@ryanrhymes I think I have a working version of autodetection, so it is worth waiting. I just need to have another look at it. However, the code (including the current PR) relies on names being unique, which is violated by Graph.inputs (see #493).

@ryanrhymes
Member

@mreppen ok, ping me when you think it is ready, thank you!

@mreppen
Contributor Author

mreppen commented Jan 31, 2020

@ryanrhymes, this should be a correct version with autodetection of inputs. I also renamed in_names to make_inputs, which hopefully better explains the function.

I am fine with merging at any point. I could make further changes here or in separate PRs.

Some things left on my list:

  • As the function returns a structure with references, would it make sense to append an underscore? Or have I misunderstood the Owl notation?
  • I have a version with multiple outputs too. Would that be a separate _array function like lambda_array, or should the current signature be rewritten?

Sidenote: As mentioned, this relies on #493. @ryanrhymes, you added a verified commit. What happens to that after a rebase on the master branch? In this case it does not matter much, but I'm curious to know.

@ryanrhymes ryanrhymes merged commit 334199a into owlbarn:master Jan 31, 2020
mseri added a commit to mseri/opam-repository that referenced this pull request Feb 25, 2020
CHANGES:

*  Fix bug in _squeeze_broadcast (owlbarn/owl#503)
*  Added the Dawson function (Ndarray + Matrix + Algodiff op) (owlbarn/owl#502)
*  Fix bug in reverse mode gradients of aiso operations and pow (owlbarn/owl#501)
*  Added poisson_rvs to Owl_distribution (owlbarn/owl#499)
*  Draw poisson RVs in Ndarray and Mat modules (owlbarn/owl#498)
*  Broadcast bug for higher order derivatives (owlbarn/owl#495)
*  add sem to dense ndarray and matrix (owlbarn/owl#497)
*  Avoid input duplication with Graph.model and multi-input nn (owlbarn/owl#494)
*  Added Graph.get_subnetwork for constructing subnetworks (owlbarn/owl#491)
*  Make Graph.inputs give unique names to inputs (owlbarn/owl#493)
*  modify nlp interfaces
*  Re-add removed DiffSharp acknowledgment (owlbarn/owl#486)
*  add pretty printer for hypothesis type
*  update lambda neuron (owlbarn/owl#485)
*  fix example due to owlbarn/owl#476
*  Extend base linalg functions to complex numbers (owlbarn/owl#479)
*  [breaking] use a separate module for algodiff instead of ndarray directly (owlbarn/owl#476)
*  temp workaround and unittest (owlbarn/owl#478)
*  [breaking] Interface files for base/dense and base/linalg (owlbarn/owl#472)
*  Port code to dune2 (owlbarn/owl#474)
*  [breaking]  interface files to simplify .mli files in owl/dense (owlbarn/owl#471)
*  Save and load Npy files (owlbarn/owl#470)
*  Owl: relax bounds on base and stdio (owlbarn/owl#469)
*  Merged tests for different AD operations into one big test + autoformat tests with ocamlformat (owlbarn/owl#468)

### 0.7.2 (2019-12-06)

* fourth order finite diff approx to grad
* changes to improve stability of sylv/discrete_lyap
* fix bug in concatenate function
* add mli for owl_base_linalg_generic
* Owl-base linalg routines: LU decomposition  (owlbarn/owl#465)
* bug fixes
* Update owl.opam

### 0.7.1 (2019-11-27)

* Add unit basis
* Fix issue owlbarn/owl#337 and owlbarn/owl#457 (owlbarn/owl#458)
* owl-base: drop seemingly unnecessary dependency on integers (owlbarn/owl#456)

### 0.7.0 (2019-11-14)

* Add unsafe network save (owlbarn/owl#429)
* Sketch Count-Min and Heavy-Hitters
* Various bugfixes
* Owl_io.marshal_to_file: use to_channel
* Do not create .owl folder when loading owl library
* Re-design of exceptions and replace asserts with verify
* Add OWL_DISABLE_LAPACKE_LINKING_FLAG
* Reorganise Algodiff module
* Add parameter support to Zoo
* Two new features in algodiff: eye and linsolve (triangular option) + improved stability of qr and chol
* Implemented solve triangular
* Added linsolve and lq reverse-mode differentiation
* Fix build on archlinux (pkg-config cblas)
* Add median and sort along in ndarray
* Improve stability of lyapunov gradient tests

### 0.6.0 (2019-07-17)

* Add unsafe network save (owlbarn/owl#429)
* Sketch Count-Min and Heavy-Hitters
* Various bugfixes
* Owl_io.marshal_to_file: use to_channel
* Do not create .owl folder when loading owl library
* Re-design of exceptions and replace asserts with verify
* Add OWL_DISABLE_LAPACKE_LINKING_FLAG
* Reorganise Algodiff module
* Add parameter support to Zoo
* Two new features in algodiff: eye and linsolve (triangular option) + improved stability of qr and chol
* Implemented solve triangular
* Added linsolve and lq reverse-mode differentiation
* Fix build on archlinux (pkg-config cblas)
* Add median and sort along in ndarray
* Improve stability of lyapunov gradient tests

### 0.5.0 (2019-03-05)

* Improve building and installation.
* Fix bugs and improve performance.
* Add more functions to Algodiff.
* Split plot module out as sub library.
* Split Tfgraph module out as sub library.

### 0.4.2 (2018-11-10)

* Optimise computation graph module.
* Add some core math functions.
* Fix bugs and improve performance.

### 0.4.1 (2018-11-01)

* Improve the APIs of Dataframe module.
* Add more functions in Utils module.

### 0.4.0 (2018-08-08)

* Fix some bugs and improve performance.
* Introduce computation graph into the functor stack.
* Optimise repeat and tile function in the core.
* Adjust the OpenCL library according to computation graph.
* Improve the API of Dataframe module.
* Add more implementation of convolution operations.
* Add dilated convolution functions.
* Add transposed convolution functions.
* Add more neurons into the Neural module.
* Add more unit tests for core functions.
* Move from `jbuilder` to `dune`
* Assuage many warnings

### 0.3.8 (2018-05-22)

* Add initial support for dataframe functionality.
* Add IO module for Owl's specific file operations.
* Add more helper functions in Array module in Base.
* Add core functions such as one_hot, slide, and etc.
* Fix normalisation neuron in neural network module.
* Fix building, installation, and publishing on OPAM.
* Fix broadcasting issue in Algodiff module.
* Support negative axes in some ndarray functions.
* Add more statistical distribution functions.
* Add another higher level wrapper for CBLAS module.

### 0.3.7 (2018-04-25)

* Fix some bugs and improve performance.
* Fix some docker files for automatic image building.
* Move more pure OCaml implementation to base library.
* Add a new math module to support complex numbers.
* Improve the configuration and building system.
* Improve the automatic documentation building system.
* Change template code into C header files.
* Add initial support for OpenMP with evaluation.
* Tidy up packaging using TOPKG.

### 0.3.6 (2018-03-22)

* Fix some bugs and improve performance.
* Add more unit tests for Ndarray module.
* Add version control in Zoo gist system.
* Add tensor contract operations in Ndarray.
* Add more documentation of various functions.
* Add support of complex numbers of convolution and pooling functions.
* Enhance Owl's own Array submodule in Utils module.
* Fix pretty printer for rank 0 ndarrays.
* Add functions to iterate slices in an ndarray.
* Adjust the structure of OpenCL module.

### 0.3.5 (2018-03-12)

* Add functions for numerical integration.
* Add functions for interoperation.
* Add several root-finding algorithms.
* Introduce Owl's exception module.
* Add more functions in Linalg module.
* Add native convolution function in core.
* Remove Eigen dependency of dense data structure.
* Fix some bugs and improve performance.

### 0.3.4 (2018-02-26)

* Update README, ACKNOWLEDGEMENT, and etc.
* Initial implementation of OpenCL Context module.
* Fix some bugs and improve the performance.
* Add Adam learning rate algorithm in Optimise module.
* Add a number of statistical functions into Stats.
* Enhance View functor and add more functions.
* Include and documentation of NLP modules.
* Add dockerfile for various linux distributions.

### 0.3.3 (2018-02-12)

* Fix some bugs and improve the performance.
* Integrate with Owl's documentation system.
* Add native C implementation of pooling operations.
* Add more operators in Operator module.
* Add more functions in Linalg module.
* Optimise the Base library.
* Add more unit tests.

### 0.3.2 (2018-02-08)

* Fix some bugs and improve the performance.
* Functorise many unit tests and add more tests.
* Rewrite the documentation migrate to Sphinx system.
* Migrate many pure OCaml code into Base library.
* Implement the initial version of Base library.

### 0.3.1 (2018-01-25)

* Design View module as an experimental module for Ndarray.
* Include Mersenne Twister (SFMT) to generate random numbers.
* Implement random number generator of various distributions.
* Implement native functions for maths and stats module.
* Include FFTPACK to provide native support for FFT functions.
* Minimise dependency, remove dependencies on Gsl and etc.
* Implement slicing and indexing as native C functions.
* Use new extended indexing operators for slicing functions.
* Refine ndarray fold function and introduce scan function.
* Reorganise the module structure in the source tree.
* Fix some bugs and enhance the performance of core functions.
* Add another 200+ unit tests.

### 0.3.0 (2017-12-05)

* Migrate to jbuilder building system.
* Unify Dense Ndarray and Matrix types.
* Split Toplevel out as a separate library.
* Redesign Zoo system for recursive importing.
* Simplify the module signature for Ndarray.
* Introduce functions in Ndarray module to support in-place modification.
* Introduce reduction functions to reduce an ndarray to a scalar value.
* Add Lazy functor to support lazy evaluation, dataflow, and incremental computing.
* Implement a new and more powerful pretty printer to support both ndarray and matrix.
* Fix bugs in the core module, improve the performance.

### 0.2.8 (2017-09-02)

* New Linalg module is implemented.
* New neural network module supports both single and double precision.
* New Optimise and Regression module is built atop of Algodiff.
* Experimental Zoo system is introduced as a separate library.
* Enhance core functions and fix some bugs.

### 0.2.7 (2017-07-11)

* Enhance basic math operations for complex numbers.
* Redesign Plot module and add more plotting functions.
* Add more hypothesis test functions in Owl.Stats module.
* Support both numerical and algorithmic differentiation in Algodiff.
* Support both matrices and n-dimensional arrays in Algodiff module.
* Support interoperation of different number types in Ext and new operators.
* Support more flexible slicing, tile, repeat, and concatenate functions.
* Support n-dimensional array of any types in Dense.Ndarray.Any module.
* Support simple feedforward and convolutional neural networks.
* Support experimental distributed and parallel computing.

### 0.2.0 (2017-01-20)

* Support both dense and sparse matrices.
* Support both dense and sparse n-dimensional arrays.
* Support both real and complex numbers.
* Support both single and double precisions.
* Add more vectorised operation: sin, cos, ceil, and etc.
* Add basic unit test framework for Owl.
* Add a couple of Topic modelling algorithms.

### 0.1.0 (2016-11-09)

* Initial architecture of Owl library.
* Basic support for double precision real dense matrices.
* Basic linear functions for dense matrices.
* Basic support for plotting functions.
* SI, MKS, CGS, and CGSM metric system.