Running train.php ends up with PHP Fatal Error #3

Closed
YRSLV opened this issue Feb 2, 2020 · 8 comments

YRSLV commented Feb 2, 2020

I'm using the 0.0.18-beta version of this library with PHP 7.3.12, bundled in XAMPP 7.3.12, on Windows 10.
Whenever I try to run train.php, I get the following error:

C:\xampp\php\php.exe D:\Projects\MNIST\train.php
Loading data into memory ...
Training ...
[2020-02-02 12:01:26] MNIST.INFO: Fitted ZScaleStandardizer
PHP Fatal error:  Uncaught InvalidArgumentException: Classifiers require categorical labels, continuous given. in D:\Projects\MNIST\vendor\rubix\ml\src\Other\Specifications\LabelsAreCompatibleWithLearner.php:27
Stack trace:
#0 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(347): Rubix\ML\Other\Specifications\LabelsAreCompatibleWithLearner::check(Object(Rubix\ML\Datasets\Labeled), Object(Rubix\ML\Classifiers\MultilayerPerceptron))
#1 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(324): Rubix\ML\Classifiers\MultilayerPerceptron->partial(Object(Rubix\ML\Datasets\Labeled))
#2 D:\Projects\MNIST\vendor\rubix\ml\src\Pipeline.php(150): Rubix\ML\Classifiers\MultilayerPerceptron->train(Object(Rubix\ML\Datasets\Labeled))
#3 D:\Projects\MNIST\vendor\rubix\ml\src\PersistentModel.php(125): Rubix\ML\Pipeline->train(Object(Rubix\ML\Datasets\Labeled))
#4 D:\Projects\MNIST\train.php(61): Rubix\ML\PersistentModel->train(Object(Rubix\ML\Datasets\Labeled))
#5 {main}
  thrown in D:\Projects\MNIST\vendor\rubix\ml\src\Other\Specifications\LabelsAreCompatibleWithLearner.php on line 27

Fatal error: Uncaught InvalidArgumentException: Classifiers require categorical labels, continuous given. in D:\Projects\MNIST\vendor\rubix\ml\src\Other\Specifications\LabelsAreCompatibleWithLearner.php:27
Stack trace:
#0 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(347): Rubix\ML\Other\Specifications\LabelsAreCompatibleWithLearner::check(Object(Rubix\ML\Datasets\Labeled), Object(Rubix\ML\Classifiers\MultilayerPerceptron))
#1 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(324): Rubix\ML\Classifiers\MultilayerPerceptron->partial(Object(Rubix\ML\Datasets\Labeled))
#2 D:\Projects\MNIST\vendor\rubix\ml\src\Pipeline.php(150): Rubix\ML\Classifiers\MultilayerPerceptron->train(Object(Rubix\ML\Datasets\Labeled))
#3 D:\Projects\MNIST\vendor\rubix\ml\src\PersistentModel.php(125): Rubix\ML\Pipeline->train(Object(Rubix\ML\Datasets\Labeled))
#4 D:\Projects\MNIST\train.php(61): Rubix\ML\PersistentModel->train(Object(Rubix\ML\Datasets\Labeled))
#5 {main}
  thrown in D:\Projects\MNIST\vendor\rubix\ml\src\Other\Specifications\LabelsAreCompatibleWithLearner.php on line 27

UPD: I found out that the label type is determined by the determine() method in ...\vendor\rubix\ml\src\DataType.php, and that only values of type string are treated as categorical. So I tried converting each $label value to a string before appending it to the $labels array in train.php:

for ($label = 0; $label < 10; $label++) {
    foreach (glob("training/$label/*.png") as $file) {
        $label = strval($label);
        $samples[] = [imagecreatefrompng($file)];
        $labels[] = $label;
    }
}

This did the trick for me, and I was able to run the script from the PhpStorm 2019.3 built-in terminal. However, for some reason the training completed only 1 epoch and then finished, which seems like really strange behavior. Below is the output for that run (the prompt for saving the model does not appear on screen, but entering 'y' as the input saves the model):

C:\xampp\php\php.exe D:\Projects\MNIST\train.php
Loading data into memory ...
Training ...
[2020-02-02 17:07:40] MNIST.INFO: Fitted ZScaleStandardizer
[2020-02-02 17:07:45] MNIST.INFO: Learner init hidden=[0=Dense 1=Activation 2=Dropout 3=Dense 4=Activation 5=Dropout 6=Dense 7=Activation 8=Dropout] batch_size=200 optimizer=Adam alpha=0.0001 epochs=1000 min_change=0.0001 window=3 hold_out=0.1 cost_fn=CrossEntropy metric=FBeta
[2020-02-02 17:16:39] MNIST.INFO: Epoch 1 score=0.03078281535439 loss=0
[2020-02-02 17:16:39] MNIST.INFO: Training complete
Progress saved to progress.csv
y

Process finished with exit code 0

But when I try to run the script from Windows Command Prompt or PowerShell I get the following error:

C:\xampp\php\php.exe D:\Projects\MNIST\train.php
Loading data into memory ...
Training ...
[2020-02-02 18:04:09] MNIST.INFO: Fitted ZScaleStandardizer
PHP Fatal error:  Uncaught InvalidArgumentException: The number of input nodes must be greater than 0, 0 given. in D:\Projects\MNIST\vendor\rubix\ml\src\NeuralNet\Layers\Placeholder1D.php:34
Stack trace:
#0 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(316): Rubix\ML\NeuralNet\Layers\Placeholder1D->__construct(0)
#1 D:\Projects\MNIST\vendor\rubix\ml\src\Pipeline.php(150): Rubix\ML\Classifiers\MultilayerPerceptron->train(Object(Rubix\ML\Datasets\Labeled))
#2 D:\Projects\MNIST\vendor\rubix\ml\src\PersistentModel.php(125): Rubix\ML\Pipeline->train(Object(Rubix\ML\Datasets\Labeled))
#3 D:\Projects\MNIST\train.php(71): Rubix\ML\PersistentModel->train(Object(Rubix\ML\Datasets\Labeled))
#4 {main}
  thrown in D:\Projects\MNIST\vendor\rubix\ml\src\NeuralNet\Layers\Placeholder1D.php on line 34

Fatal error: Uncaught InvalidArgumentException: The number of input nodes must be greater than 0, 0 given. in D:\Projects\MNIST\vendor\rubix\ml\src\NeuralNet\Layers\Placeholder1D.php:34
Stack trace:
#0 D:\Projects\MNIST\vendor\rubix\ml\src\Classifiers\MultilayerPerceptron.php(316): Rubix\ML\NeuralNet\Layers\Placeholder1D->__construct(0)
#1 D:\Projects\MNIST\vendor\rubix\ml\src\Pipeline.php(150): Rubix\ML\Classifiers\MultilayerPerceptron->train(Object(Rubix\ML\Datasets\Labeled))
#2 D:\Projects\MNIST\vendor\rubix\ml\src\PersistentModel.php(125): Rubix\ML\Pipeline->train(Object(Rubix\ML\Datasets\Labeled))
#3 D:\Projects\MNIST\train.php(71): Rubix\ML\PersistentModel->train(Object(Rubix\ML\Datasets\Labeled))
#4 {main}
  thrown in D:\Projects\MNIST\vendor\rubix\ml\src\NeuralNet\Layers\Placeholder1D.php on line 34

andrewdalpino commented Feb 4, 2020

Hi @YRSLV, thank you for the bug report.

This is a known issue, and it does indeed have to do with the labels.

Since PHP silently converts integer strings (such as '1') to integers when they are used as array keys, we see this strange behavior.

We are working on a fix for 0.0.20 that compensates for this. In the meantime, 0.0.18-beta and 0.0.19-beta are affected. 0.0.17-beta should work OK, however, since it still used loose comparisons when computing the loss at the neural network output layer.
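
A minimal sketch of the key-casting behavior described above, using plain PHP only (the `$classes` array and its values are made up for illustration and are not taken from the library):

```php
<?php

// Integer-like string keys are stored as integers; other strings are kept as-is.
$classes = ['0' => 'zero', '1' => 'one', '08' => 'leading zero'];

$keys = array_keys($classes);

// '1' was written as a string key but comes back as an int.
$isInt = is_int($keys[1]);

// A strict comparison against the original string label fails ...
$strict = ($keys[1] === '1');

// ... while a loose comparison still matches.
$loose = ($keys[1] == '1');

// '08' is not a canonical decimal integer, so it stays a string.
$staysString = is_string($keys[2]);

var_dump($isInt, $strict, $loose, $staysString);
```

This is why a strict comparison between a label and a value that has passed through an array key can fail even though both started out as the same string.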

Thanks again for the bug report. We should have a fix soon.


andrewdalpino commented Feb 4, 2020

@YRSLV

OK, we reinstated loose comparison at the output layer of the neural net, so even if integer strings get converted to integers, they will still evaluate correctly.

Until 0.0.20 is released, you can use the latest dev-master.

Commit 1cab2f4 has the changes for the MNIST project, which you can pull locally right now.

Regarding the issue with PowerShell or the Windows Command Prompt: can you post the full training script? It looks like Image Vectorizer is missing from the training log:

Training ...
[2020-02-02 18:04:09] MNIST.INFO: Fitted ZScaleStandardizer
PHP Fatal error:  Uncaught InvalidArgumentException: The number of input nodes must be greater than 0, 0 given. in D:\Projects\MNIST\vendor\rubix\ml\src\NeuralNet\Layers\Placeholder1D.php:34
Stack trace:

Image vectorization should happen before Z Scale Standardizer, and its absence may explain why there are no features to train on.

Thanks again for the great bug report!

andrewdalpino self-assigned this Feb 4, 2020
andrewdalpino added the bug label Feb 4, 2020
@andrewdalpino

Also, I had no issue running the train script using Windows PowerShell:

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Try the new cross-platform PowerShell https://aka.ms/pscore6

PS C:\Users\Andrew> cd .\Workspace\
PS C:\Users\Andrew\Workspace> cd .\Rubix\
PS C:\Users\Andrew\Workspace\Rubix> cd .\MNIST\
PS C:\Users\Andrew\Workspace\Rubix\MNIST> php .\train.php
Loading data into memory ...
Training ...
[2020-02-05 04:38:22] MNIST.INFO: Fitted ImageVectorizer
[2020-02-05 04:38:35] MNIST.INFO: Fitted ZScaleStandardizer
[2020-02-05 04:38:39] MNIST.INFO: Learner init hidden_layers=[0=Dense 1=Activation 2=Dropout 3=Dense 4=Activation 5=Dropout 6=Dense 7=Activation 8=Dropout] batch_size=200 optimizer=Adam alpha=0.0001 epochs=1000 min_change=0.0001 window=3 hold_out=0.1 cost_fn=CrossEntropy metric=FBeta
[2020-02-05 04:46:11] MNIST.INFO: Epoch 1 score=0.94457591053846 loss=0.033902218707065
[2020-02-05 04:53:37] MNIST.INFO: Epoch 2 score=0.95457020041576 loss=0.016683108266148
[2020-02-05 05:01:50] MNIST.INFO: Epoch 3 score=0.95709448967882 loss=0.014113661779522
[2020-02-05 05:09:56] MNIST.INFO: Epoch 4 score=0.96373544342892 loss=0.012598210141191
[2020-02-05 05:17:59] MNIST.INFO: Epoch 5 score=0.96545474797228 loss=0.011270274660206
[2020-02-05 05:25:33] MNIST.INFO: Epoch 6 score=0.96610189454757 loss=0.0097651640048506
[2020-02-05 05:32:38] MNIST.INFO: Epoch 7 score=0.96657030608177 loss=0.0092345036268538
[2020-02-05 05:39:25] MNIST.INFO: Epoch 8 score=0.96816338766516 loss=0.0087438044840915
[2020-02-05 05:46:01] MNIST.INFO: Epoch 9 score=0.96762584618292 loss=0.0083210835965014
[2020-02-05 05:52:29] MNIST.INFO: Epoch 10 score=0.96755353382111 loss=0.0077246114853719
[2020-02-05 05:58:59] MNIST.INFO: Epoch 11 score=0.96763480484681 loss=0.007399161387465
[2020-02-05 05:58:59] MNIST.INFO: Network restored from previous snapshot
[2020-02-05 05:58:59] MNIST.INFO: Training complete
Progress saved to progress.csv
Save this model? (y|[n]): y
PS C:\Users\Andrew\Workspace\Rubix\MNIST>


YRSLV commented Feb 5, 2020

Hi @andrewdalpino

First of all, thank you for your quick reply and great support.

Here is the full training script I used before pulling the latest 3895be5 commit:

<?php

include __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Datasets\Labeled;
use Rubix\ML\PersistentModel;
use Rubix\ML\Pipeline;
use Rubix\ML\Transformers\ImageResizer;
use Rubix\ML\Transformers\ImageVectorizer;
use Rubix\ML\Transformers\ZScaleStandardizer;
use Rubix\ML\Classifiers\MultilayerPerceptron;
use Rubix\ML\NeuralNet\Layers\Dense;
use Rubix\ML\NeuralNet\Layers\Dropout;
use Rubix\ML\NeuralNet\Layers\Activation;
use Rubix\ML\NeuralNet\ActivationFunctions\LeakyReLU;
use Rubix\ML\NeuralNet\Optimizers\Adam;
use Rubix\ML\Persisters\Filesystem;
use Rubix\ML\Other\Loggers\Screen;
use League\Csv\Writer;

use function Rubix\ML\array_transpose;

ini_set('memory_limit', '-1');

echo 'Loading data into memory ...' . PHP_EOL;

$samples = $labels = [];

for ($label = 0; $label < 10; $label++) {
    foreach (glob("training/$label/*.png") as $file) {
        $label = strval($label);
        $samples[] = [imagecreatefrompng($file)];
        $labels[] = $label;
    }
}

$dataset = new Labeled($samples, $labels);

$estimator = new PersistentModel(
    new Pipeline([
        new ImageResizer(28, 28),
        new ImageVectorizer(1),
        new ZScaleStandardizer(),
    ], new MultilayerPerceptron([
        new Dense(100),
        new Activation(new LeakyReLU()),
        new Dropout(0.2),
        new Dense(100),
        new Activation(new LeakyReLU()),
        new Dropout(0.2),
        new Dense(100),
        new Activation(new LeakyReLU()),
        new Dropout(0.2),
    ], 200, new Adam(0.001))),
    new Filesystem('mnist.model', true)
);

$estimator->setLogger(new Screen('MNIST'));

echo 'Training ...' .  PHP_EOL;

$estimator->train($dataset);

$scores = $estimator->scores();
$losses = $estimator->steps();

$writer = Writer::createFromPath('progress.csv', 'w+');
$writer->insertOne(['score', 'loss']);
$writer->insertAll(array_transpose([$scores, $losses]));

echo 'Progress saved to progress.csv' . PHP_EOL;

if (strtolower(trim(readline('Save this model? (y|[n]): '))) === 'y') {
    $estimator->save();
}

It's basically the default repo version of train.php as of commit 87ea1bc, with the $label = strval($label); conversion added in the for loop.

I can also confirm that the latest version, 3895be5, which uses the dev-master version of the library, works flawlessly. I was able to run the training script from the PhpStorm built-in terminal, the new Windows Terminal, and PowerShell. Image Vectorizer is present in the training log, and the script runs just as expected.

The problem with PowerShell execution was caused by my mistake of running the script from outside the project directory, which broke the relative path to the dataset. The script therefore could not find any training data and threw various Uncaught InvalidArgumentException errors. I realized this at some point when I was looking at your PowerShell log 🙂.
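
One way to make this kind of mistake fail early is to resolve the dataset path relative to the script and stop if no images are found. The sketch below is only an illustration: `collectImagePaths()` is a hypothetical helper, not part of the MNIST example or the library, and it assumes the `training/<digit>/*.png` layout used in train.php.

```php
<?php

/**
 * Hypothetical helper: collect [path, label] pairs from $baseDir/<class>/*.png
 * and fail loudly when nothing is found.
 */
function collectImagePaths(string $baseDir, int $numClasses = 10): array
{
    $pairs = [];

    for ($label = 0; $label < $numClasses; $label++) {
        // glob() can return false on error, so fall back to an empty array.
        $files = glob("$baseDir/$label/*.png") ?: [];

        foreach ($files as $file) {
            // Cast explicitly so the label is a string (categorical).
            $pairs[] = [$file, (string) $label];
        }
    }

    if (empty($pairs)) {
        throw new RuntimeException(
            "No PNG files found under '$baseDir'; check the working directory."
        );
    }

    return $pairs;
}
```

Calling it as `collectImagePaths(__DIR__ . '/training')` makes the lookup independent of the directory the script is started from, because `__DIR__` always points at the directory containing the script.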


YRSLV commented Feb 5, 2020

I also wonder whether an explicit type cast should be applied to the $label variable when pushing it onto the $labels[] array in validate.php, considering it uses the same for loop construction as train.php.
Thanks.

@andrewdalpino

@YRSLV yes, great observation!

Would you like to submit a PR for the explicit typecast, or should I do it? (It would basically just add (string) before the label assignment, like in the training script.)
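
As a rough sketch of what such a cast could look like (this is not the actual change from the PR; `labelsFor()` and its input array are hypothetical and stand in for the glob loop in the scripts):

```php
<?php

/**
 * Hypothetical helper: build one string label per file.
 * $filesByClass maps a class name to a list of file names.
 */
function labelsFor(array $filesByClass): array
{
    $labels = [];

    foreach ($filesByClass as $class => $files) {
        foreach ($files as $file) {
            // $class may arrive as an int because PHP casts integer-like
            // array keys, so cast it back to a string at assignment time.
            $labels[] = (string) $class;
        }
    }

    return $labels;
}

$labels = labelsFor(['0' => ['a.png', 'b.png'], '1' => ['c.png']]);

var_dump($labels);
```

Casting at the point of assignment leaves the loop variable untouched, so the same pattern fits a counting for loop as well.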


YRSLV commented Feb 6, 2020

@andrewdalpino I just submitted PR #5

@andrewdalpino

Awesome, welcome to our community!

Thanks again for the bug report
