simple XOR #2
I looked at your code and can't seem to find anything wrong with it. Maybe there's something I'm not getting about the Matrix organization. I also tried the example given in README.md and can't get that to give accurate results. The following is what I get when I examine the data. Am I missing something?
[Epoch 0, batch 0] Loss = 0.306261
Sorry for my late reply. The questions both of you asked are mostly about DNN theory, not the program itself.

For @OmarJay1's observation: this is in fact expected. In my example, x and y were simulated independently, so they have no real correlation. The model uses random noise to predict random noise, so the poor accuracy is genuine.

For @katzb123's XOR function, I can comment a bit more. It is true that the XOR function is mathematically simple. Unfortunately, optimizing a nonconvex loss function with gradient-based methods is hard even when the true function is simple. One common trick in training DNNs is overparameterization: use (far) more parameters than are needed to fit the function. I modified your code a little bit and it produced the desired results.

```cpp
#include <iostream>
#include <Eigen/Core>
#include <MiniDNN.h>

using namespace MiniDNN;

typedef Eigen::MatrixXd Matrix;
typedef Eigen::VectorXd Vector;

int main()
{
    std::srand(123);

    Matrix inputs(2, 4);
    inputs << 0, 0, 1, 1,
              0, 1, 0, 1;
    Matrix outputs(1, 4);
    outputs << 0, 1, 1, 0;

    std::cout << "input =\n" << inputs << std::endl;
    std::cout << "output = " << outputs << std::endl;

    // Construct a network object
    Network net;

    // Create layers: 2 inputs -> 100 hidden units -> 1 output
    Layer* layer1 = new FullyConnected<ReLU>(2, 100);
    Layer* layer2 = new FullyConnected<Sigmoid>(100, 1);

    // Add layers to the network object
    net.add_layer(layer1);
    net.add_layer(layer2);

    // Set output layer
    net.set_output(new RegressionMSE());

    // Optimizer
    RMSProp opt;
    opt.m_lrate = 0.01;

    VerboseCallback callback;
    net.set_callback(callback);

    // Initialize parameters with N(0, 0.01^2) using random seed 123
    net.init(0, 0.01, 123);

    // Fit the model with a batch size of 4, running 500 epochs with random seed 123
    net.fit(opt, inputs, outputs, 4, 500, 123);

    Matrix pred = net.predict(inputs);
    std::cout << pred << std::endl;

    std::cin.get();
    return 0;
}
```

Output:
The differences I made were:
Hey, I was trying out your library and I was having issues. I was just starting out with XOR; my main C++ code is below. Hopefully you can tell me what I'm doing wrong. I noticed that the m_Weight is a 2x2 for the first layer, which is correct, but the "prev_layer_data" in the feedforward function is a 2x4 because my input matrix is 2x4. Shouldn't it be splitting the inputs into each "vector" of inputs? It looks like it's trying to do all the inputs at the same time.

```cpp
#include <Eigen/Dense>
#include <iostream>   // the original header names were stripped by formatting;
#include <cstdlib>    // <iostream> and <cstdlib> are likely, given the code below
#include "MiniDNN.h"

using namespace MiniDNN;

typedef Eigen::MatrixXd Matrix;
typedef Eigen::VectorXd Vector;

int main()
{
    std::srand(123);
}
```