syncognite - A neural network library inspired by Stanford's CS231n course


A neural network library for convolutional, fully connected nets and RNNs in C++.

This library implements some of the assignments from Stanford's CS231n 2016 course by Andrej Karpathy, Fei-Fei Li, Justin Johnson and CS224d by Richard Socher as a C++ framework.

Current state: alpha

Features

  • Fully connected networks
  • Convolutional layers
  • Recurrent nets (RNNs)
  • Long short-term memory nets (LSTMs)
  • ReLU, Sigmoid, Tanh, SELU [1], resilu [2] nonlinearities
  • BatchNorm, SpatialBatchNorm, Dropout layers
  • Softmax, SVM loss
  • TemporalAffine and TemporalSoftmax layers for RNNs

[1]: "scaled exponential linear units" (SELUs), https://arxiv.org/abs/1706.02515

[2]: "resilu residual & relu nonlinearity + linearity" (linear skip connection combined with non-linearity), see the Appendix.

Sample

Model

Example: C++ definition of a deep convolutional net with batch-norm, dropout and fully connected layers:

LayerBlock lb(R"({"name":"DomsNet","bench":false,"init":"orthonormal"})"_json);

lb.addLayer("Convolution", "cv1", R"({"inputShape":[1,28,28],"kernel":[48,5,5],"stride":1,"pad":2})",{"input"});
lb.addLayer("BatchNorm","sb1","{}",{"cv1"});
lb.addLayer("Relu","rl1","{}",{"sb1"});
lb.addLayer("Dropout","doc1",R"({"drop":0.8})",{"rl1"});
lb.addLayer("Convolution", "cv2", R"({"kernel":[48,3,3],"stride":1,"pad":1})",{"doc1"});
lb.addLayer("Relu","rl2","{}",{"cv2"});
lb.addLayer("Convolution", "cv3", R"({"kernel":[64,3,3],"stride":2,"pad":1})",{"rl2"});
lb.addLayer("BatchNorm","sb2","{}",{"cv3"});
lb.addLayer("Relu","rl3","{}",{"sb2"});
lb.addLayer("Dropout","doc2",R"({"drop":0.8})",{"rl3"});
lb.addLayer("Convolution", "cv4", R"({"kernel":[64,3,3],"stride":1,"pad":1})",{"doc2"});
lb.addLayer("Relu","rl4","{}",{"cv4"});
lb.addLayer("Convolution", "cv5", R"({"kernel":[128,3,3],"stride":2,"pad":1})",{"rl4"});
lb.addLayer("BatchNorm","sb3","{}",{"cv5"});
lb.addLayer("Relu","rl5","{}",{"sb3"});
lb.addLayer("Dropout","doc3",R"({"drop":0.8})",{"rl5"});
lb.addLayer("Convolution", "cv6", R"({"kernel":[128,3,3],"stride":1,"pad":1})",{"doc3"});
lb.addLayer("Relu","rl6","{}",{"cv6"});

lb.addLayer("Affine","af1",R"({"hidden":1024})",{"rl6"});
lb.addLayer("BatchNorm","bn1","{}",{"af1"});
lb.addLayer("Relu","rla1","{}",{"bn1"});
lb.addLayer("Dropout","do1",R"({"drop":0.7})",{"rla1"});
lb.addLayer("Affine","af2",R"({"hidden":512})",{"do1"});
lb.addLayer("BatchNorm","bn2","{}",{"af2"});
lb.addLayer("Relu","rla2","{}",{"bn2"});
lb.addLayer("Dropout","do2",R"({"drop":0.7})",{"rla2"});
lb.addLayer("Affine","af3",R"({"hidden":10})",{"do2"});
lb.addLayer("Softmax","sm1","{}",{"af3"});

Training

json jo(R"({"verbose":true,"shuffle":true,"lr_decay":0.95,"epsilon":1e-8})"_json);
jo["epochs"]=(floatN)40.0;
jo["batch_size"]=50;
jo["learning_rate"]=(floatN)5e-4;
jo["regularization"]=(floatN)1e-8;

lb.train(X, y, Xv, yv, "Adam", jo);

floatN train_err, val_err, test_err;
train_err=lb.test(X, y, jo.value("batch_size", 50));
val_err=lb.test(Xv, yv, jo.value("batch_size", 50));
test_err=lb.test(Xt, yt, jo.value("batch_size", 50));

cerr << "Final results on MNIST after " << jo.value("epochs",(floatN)0.0) << " epochs:" << endl;
cerr << "      Train-error: " << train_err << " train-acc: " << 1.0-train_err << endl;
cerr << " Validation-error: " << val_err <<   "   val-acc: " << 1.0-val_err << endl;
cerr << "       Test-error: " << test_err <<  "  test-acc: " << 1.0-test_err << endl;

See mnisttest or cifar10test for complete examples.

A model that generates text via LSTMs can be defined with:

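// T (sequence length), VS (vocabulary size), BS (batch size), H (hidden size),
// clip and rnntype (name of the recurrent layer type) are defined by the
// surrounding program, see rnnreader.
json j0;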
string oName{"OH0"};
j0["inputShape"]=vector<int>{T};
j0["V"]=VS;
lb.addLayer("OneHot",oName,j0,{"input"});

int layer_depth=4;
string nName;
json j1;
j1["inputShape"]=vector<int>{VS,T};
j1["N"]=BS;
j1["H"]=H;
j1["forgetgateinitones"]=true;
j1["forgetbias"]=1.0;
j1["clip"]=clip;
for (auto l=0; l<layer_depth; l++) {
	nName="lstm"+std::to_string(l);
	lb.addLayer(rnntype,nName,j1,{oName});
	oName=nName;
}

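// Note: "af1" is not defined in this excerpt. It refers to a layer (such as a
// TemporalAffine layer) between the last LSTM layer and the softmax that maps
// the hidden state to the vocabulary size; see rnnreader for its definition.
json j11;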
j11["inputShape"]=vector<int>{VS,T};
lb.addLayer("TemporalSoftmax","sm1",j11,{"af1"});

See rnnreader for a complete example.

Dependencies:

  • C++11 compiler. Tested on Linux (clang, gcc, Intel icpc), macOS (clang on x86-64 and clang 12 on Apple silicon) and Raspberry Pi (ARM, gcc).
  • CMake build system.
  • HDF5 C++ API for model saving and sample data (packages hdf5 or libhdf5-dev).

Apple silicon beta notes for hdf5

  • macOS 11 (ARM) currently (as of 07-2020) requires building the HDF5 libraries from source.
  • Additionally, the Python dataset download tools require h5py, which currently also needs to be built from source for Apple silicon.

Optional dependencies:

  • CUDA, OpenCL, ViennaCL (experimental, optional for BLAS speedups)

External libraries that are included in the source tree:

  • Eigen v3.3 (eigen3), included in the source tree as a submodule in the default configuration.
  • nlohmann_json, already included in source tree (cpneural/nlohmann_json).

Build

syncognite uses the CMake build system.

Clone the repository:

git clone https://github.com/domschl/syncognite
cd syncognite
git submodule init
git submodule update    # This gets the in-tree Eigen3

Create a Build directory within the syncognite directory and configure the build:

mkdir Build && cd Build
# in syncognite/Build; the default generator creates Makefiles, but Ninja can also be used:
cmake ..        # or: cmake -G Ninja ..
# optionally use ccmake to configure options and paths:
ccmake ..

macOS users might want to configure for building with Xcode:

cmake -G Xcode ..

Build the project:

make
# or
ninja
# or (macOS) start Xcode and load the generated project file.

History

  • 2020-07-31: Apple ARM tested ok.
  • 2020-07-05: Tests with resilu (non-)linearity
  • 2018-03-02: Removed faulty RAN layer, switched to the official eigen3 GitHub mirror, fixes for stricter type-checking in eigen-dev.

Subprojects:

Things that should work:

  • testneural (cptest subproject, consistency tests for all layers using testdata and numerical differentials)
  • bench (benchmark subproject, benchmarks for all layers)
  • mnisttest (cpmnist subproject, MNIST handwritten digit recognition with a convolutional network, requires dataset download.)
  • cifar10test (cpcifar10 subproject, cifar10 image recognition with a convolutional network, requires dataset download.)
  • rnnreader (rnnreader subproject, text generation via RNN/LSTMs, similar to char-rnn.)

Appendix

Resilu (non-) linearity

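See the Jupyter notebook for visualizations and further discussion of the resilu function.

$$rsi(x) = \frac{x}{1-e^{-x}} \qquad (1)$$

can be rewritten as:

$$rsi(x) = \frac{x}{e^{x}-1} + x \qquad (2)$$

and can thus be interpreted as a residual combination of linearity and non-linearity via addition.

Since $\frac{x}{e^{x}-1}$ shows a phase-transition instability at $x=0$ (the quotient takes the form $0/0$), a Taylor approximation is used near $x=0$.

Both quotients, in (1) and in (2), have ReLU-type limits if $x$ in the exponent is replaced by $x/a$ and the constant $a>0$ becomes small: the quotient in (1) tends to $\mathrm{relu}(x)$, the quotient in (2) to $\mathrm{relu}(-x)$.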
