
# Why Use Gorgonia? #

The main reason to use Gorgonia is developer comfort. If you're using a Go stack extensively, you now have the ability to create production-ready machine learning systems in an environment that you are already familiar and comfortable with.

ML/AI at large is usually split into two stages: the experimental stage, where one builds various models and tests and retests them; and the deployment stage, where a model, after being tested and played with, is deployed. This necessitates different roles, like data scientist and data engineer.

```
require gorgonia.org/gorgonia v0.9.3
```
The current stable version is 0.9.3.


## Versioning ##

We use [semver 2.0.0](http://semver.org/) for our versioning. Before 1.0, Gorgonia's APIs are expected to change quite a bit. The API is defined by the exported functions, variables, and methods. For the developers' sanity, there are minor differences from semver that we will apply prior to version 1.0. They are enumerated below:

* The MINOR number will be incremented every time there is a deleterious break in the API. This means any deletion, or any change in a function signature or interface methods, will lead to a change in the MINOR number.
* Additive changes will NOT change the MINOR version number prior to version 1.0. This means that if new functionality is added that does not break the way you use Gorgonia, there will not be an increment in the MINOR version. There will be an increment in the PATCH version.

## Go Version Support ##


# Usage #

Gorgonia works by creating a computation graph and then executing it. Think of it as a programming language, but one that is limited to mathematical functions and has no branching capability (no if/then, no loops). In fact, this is the dominant paradigm the user should get used to thinking in. The computation graph is an [AST](http://en.wikipedia.org/wiki/Abstract_syntax_tree).
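As a concrete (if trivial) illustration, here is a minimal sketch of that build-then-run split, modelled on Gorgonia's own examples and assuming the v0.9.x API: the graph for `z = x + y` is built first, and only then fed values and executed.

```go
package main

import (
	"fmt"
	"log"

	. "gorgonia.org/gorgonia"
)

func main() {
	g := NewGraph()

	// build the expression z = x + y; no values are involved yet
	x := NewScalar(g, Float64, WithName("x"))
	y := NewScalar(g, Float64, WithName("y"))
	z, err := Add(x, y)
	if err != nil {
		log.Fatal(err)
	}

	// create a VM to execute the program on the expression graph
	machine := NewTapeMachine(g)
	defer machine.Close()

	// only now do values enter the picture
	Let(x, 2.0)
	Let(y, 2.5)
	if err = machine.RunAll(); err != nil {
		log.Fatal(err)
	}

	fmt.Println(z.Value()) // 4.5
}
```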

Microsoft's [CNTK](https://github.com/Microsoft/CNTK), with its BrainScript, is perhaps the best at exemplifying the idea that building a computation graph and running it are different things, and that the user should be in different modes of thought when going about each.

Whilst Gorgonia's implementation doesn't enforce the separation of thought as far as CNTK's BrainScript does, the syntax does help a little bit.

You might note that it's a little more verbose than other packages of a similar nature.

The author would like to contend that this is a Good Thing - it shifts one's thinking towards a more machine-based way of thinking. It helps a lot in figuring out where things might go wrong.

Additionally, there is no support for branching - that is to say, there are no conditionals (if/else) and no loops. The aim is not to build a Turing-complete computer.

### VMs ###


## Differentiation ##

Gorgonia performs both symbolic and automatic differentiation. There are subtle differences between the two processes. The author has found that it's best to think of it this way: automatic differentiation is differentiation that happens at runtime, concurrently with the execution of the graph, while symbolic differentiation is differentiation that happens during the compilation phase.

Runtime, of course, refers to the execution of the expression graph, not the program's actual runtime.
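As a minimal sketch of the symbolic route (again assuming the v0.9.x API), `Grad` inserts the derivative into the graph as new nodes during this compilation phase; the automatic route would instead use a `LispMachine`, which computes derivatives as it executes the graph.

```go
package main

import (
	"fmt"
	"log"

	. "gorgonia.org/gorgonia"
)

func main() {
	g := NewGraph()
	x := NewScalar(g, Float64, WithName("x"))
	y := Must(Square(x)) // y = x²

	// symbolic differentiation: dy/dx becomes new nodes in the graph,
	// created before anything is executed
	grads, err := Grad(y, x)
	if err != nil {
		log.Fatal(err)
	}

	machine := NewTapeMachine(g)
	defer machine.Close()

	Let(x, 3.0)
	if err := machine.RunAll(); err != nil {
		log.Fatal(err)
	}
	fmt.Println(y.Value())        // 9
	fmt.Println(grads[0].Value()) // dy/dx = 2x = 6
}
```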

By the way, Gorgonia comes with nice-ish graph printing abilities. Here's an example:

![graph1](https://raw.githubusercontent.com/gorgonia/gorgonia/master/media/exprGraph_example2.png)

The graph is easy to read. The expression builds from the bottom up, while the derivations build from the top down. This way, the derivative of each node is roughly on the same level.

Red-outlined nodes are root nodes. Green-outlined nodes are leaf nodes. Nodes with a yellow background are input nodes. The dotted arrows indicate which node is the gradient node for the pointed-to node.
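To produce such a rendering yourself, a minimal sketch (assuming the `ToDot` method on `*ExprGraph`, which emits Graphviz's dot format) might look like this:

```go
package main

import (
	"io/ioutil"
	"log"

	. "gorgonia.org/gorgonia"
)

func main() {
	g := NewGraph()
	x := NewScalar(g, Float64, WithName("x"))
	y := NewScalar(g, Float64, WithName("y"))
	Must(Add(x, y))

	// write the graph in dot format; render it with:
	//   dot -Tpng exprGraph.dot -o exprGraph.png
	if err := ioutil.WriteFile("exprGraph.dot", []byte(g.ToDot()), 0644); err != nil {
		log.Fatal(err)
	}
}
```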

## CUDA ##

Gorgonia comes with CUDA support out of the box. However, usage is specialized. To use CUDA, you must build your application with the build tag `cuda`, like so:

```
go build -tags='cuda' .
```

Furthermore, there are some additional requirements:

1. [CUDA toolkit 9.0](https://developer.nvidia.com/cuda-toolkit) is required. Installing this installs the `nvcc` compiler, which is required to run your code with CUDA.
2. Be sure to follow the [post-installation steps](http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions).
3. `go install gorgonia.org/gorgonia/cmd/cudagen`. This installs the `cudagen` program.
4. Running `cudagen` will generate the relevant CUDA-related code for Gorgonia. Note that you will need a folder at `src\gorgonia.org\gorgonia\cuda modules\target`.
5. Only certain ops are supported on CUDA for now. They are implemented in a separate [`ops/nn` package](https://godoc.org/github.com/gorgonia/gorgonia/ops/nn).
6. `runtime.LockOSThread()` must be called in the main function where the VM is running. CUDA requires thread affinity, and therefore the OS thread must be locked.

Because `nvcc` only plays well with `gcc` version 6 and below (the current version is 7), this is also quite helpful: `sudo ln -s /path/to/gcc-6 /usr/local/cuda-9.0/bin/gcc`

### Example ###

So how do we use CUDA? Say we've got a file, `main.go`:
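The file itself is not reproduced here; the following is a minimal sketch of what it might look like, assuming `UseCudaFor` is the VM option that selects which ops run on CUDA, and folding in the `runtime.LockOSThread()` requirement from above:

```go
package main

import (
	"fmt"
	"log"
	"runtime"

	G "gorgonia.org/gorgonia"
	"gorgonia.org/tensor"
)

func main() {
	// CUDA requires thread affinity, so lock this goroutine to its OS thread
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	g := G.NewGraph()
	x := G.NewMatrix(g, tensor.Float32, G.WithName("x"),
		G.WithShape(100, 100), G.WithInit(G.GlorotU(1)))
	y := G.Must(G.Tanh(x)) // tanh is one of the CUDA-enabled ops

	// ask the VM to run tanh (and only tanh) on the GPU
	m := G.NewTapeMachine(g, G.UseCudaFor("tanh"))
	defer m.Close()

	if err := m.RunAll(); err != nil {
		log.Fatal(err)
	}
	fmt.Println(y.Value().Shape())
}
```

Built without the `cuda` tag, the same program simply runs on the CPU.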
If the program is to be run using CUDA, then this must be invoked:

```
go run -tags='cuda' .
```

And even so, only the `tanh` function uses CUDA.

### Rationale ###

The main reason for having such complicated requirements for using CUDA is quite simply performance-related. As Dave Cheney famously wrote, [cgo is not Go](https://dave.cheney.net/2016/01/18/cgo-is-not-go). To use CUDA, cgo is unfortunately required. And to use cgo, plenty of tradeoffs need to be made.

Therefore the solution was to nestle the CUDA-related code behind a build tag, `cuda`. That way, by default, no cgo is used (well, kind of - you could still use `cblas` or `blase`).
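As an illustration of that mechanism (the file and function names below are hypothetical, not Gorgonia's actual layout), a pair of files guarded by complementary build tags keeps the cgo-backed path out of default builds entirely:

```go
// File: backend_cuda.go (hypothetical)
// +build cuda

package main

// backendName reports the CUDA path; this file is only compiled
// when `go build -tags='cuda'` is used, so in the real library the
// cgo/CUDA code would live behind this tag.
func backendName() string { return "cuda" }
```

```go
// File: backend_nocuda.go (hypothetical)
// +build !cuda

package main

// backendName is the pure-Go fallback used by default builds,
// keeping cgo entirely out of the binary.
func backendName() string { return "pure Go" }
```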

The reason for requiring [CUDA toolkit 9.0](https://developer.nvidia.com/cuda-toolkit) is that there are many CUDA [Compute Capabilities](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities), and generating code for all of them would yield a huge binary for no good reason. Rather, users are encouraged to compile for their specific Compute Capabilities.

Lastly, the reason for requiring an explicit specification of which ops use CUDA is the cost of cgo calls. Additional work is currently being done to implement batched cgo calls, but until that is done, the solution is a keyhole "upgrade" of certain ops.

### `Op`s supported by CUDA ###

As of now, only the very basic simple ops support CUDA:

Elementwise unary operations:

Gorgonia's API is, as of right now, not considered stable. It will be stable from version 1.0 onwards.

# Roadmap #

Here are the goals for Gorgonia, sorted by importance:

- [ ] 80+% test coverage. Current coverage is 50% for Gorgonia and 80% for the `tensor` package.
- [ ] More advanced operations (like `einsum`). The current Tensor operators are pretty primitive.

The primary goal for Gorgonia is to be a *highly performant* machine learning/graph computation-based library that can scale across multiple machines. It should bring the appeal of Go (simple compilation and deployment process) to the ML world. It's a long way from there currently; however, the baby steps are already there.

The secondary goal for Gorgonia is to provide a platform for the exploration of non-standard deep-learning and neural network related things. This includes things like neo-Hebbian learning, corner-cutting algorithms, evolutionary algorithms, and the like.


# Contributing #
See also: [CONTRIBUTING.md](CONTRIBUTING.md)


## Contributors and Significant Contributors ##

All contributions are welcome. However, there is a new class of contributor, called Significant Contributors.

A Significant Contributor is one who has shown *deep understanding* of how the library works and/or its environs. Here are examples of what constitutes a Significant Contribution:


### Why are there seemingly random `runtime.GC()` calls in the tests? ###

The answer to this is simple: the design of the package uses CUDA in a particular way. Specifically, a CUDA device and context are tied to a `VM`, instead of to the package. This means that for every `VM` created, a different CUDA context is created per device. This way, all the operations will play nicely with other applications that may be using CUDA (this needs to be stress-tested, however).

The CUDA contexts are only destroyed when the `VM` gets garbage collected (with the help of a finalizer function). In the tests, about 100 `VM`s get created, and garbage collection for the most part can be considered random. This leads to cases where the GPU runs out of memory, as there are too many contexts in use.

Therefore, at the end of any test that may use the GPU, a `runtime.GC()` call is made to force garbage collection, freeing GPU memory.
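A minimal sketch of the pattern (a hypothetical test, not one of Gorgonia's actual tests):

```go
package gorgonia_test

import (
	"runtime"
	"testing"

	. "gorgonia.org/gorgonia"
)

// TestExample illustrates the teardown pattern: each VM owns its CUDA
// contexts, so the test forces a collection on exit to run the VM's
// finalizer and release GPU memory for the tests that follow.
func TestExample(t *testing.T) {
	defer runtime.GC() // runs last, after the VM is closed

	g := NewGraph()
	x := NewScalar(g, Float64, WithName("x"))
	Must(Square(x)) // some computation that, on a CUDA build, holds GPU memory

	m := NewTapeMachine(g)
	defer m.Close()

	Let(x, 2.0)
	if err := m.RunAll(); err != nil {
		t.Fatal(err)
	}
}
```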
