Killed upon reaching iteration limit #1075

hipwelljo · 2018-09-08T15:13:43Z

Using Turi Create 5.0, I seem to be hitting the same issue as #361, which others saw with a previous release. Training stops when it reaches the iteration limit, and then the process logs Killed: 9. It never evaluates the model or logs the accuracy.

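import turicreate as tc

# train_data and test_data are SFrames prepared earlier (not shown).
model = tc.image_classifier.create(train_data, target='label', model='squeezenet_v1.1')

predictions = model.classify(test_data)

metrics = model.evaluate(test_data)
print("Accuracy: %s" % metrics['accuracy'])  # works on Python 2 and 3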

The output is:

Logistic regression:
--------------------------------------------------------
Number of examples          : 49264
Number of classes           : 5742
Number of feature columns   : 1
Number of unpacked features : 1000
Number of coefficients      : 5746741
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training Accuracy | Validation Accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 0         | 1        | NaN       | 1410.938340  | 0.000142          | 0.000410            |
| 1         | 4        | 0.000101  | 6993.121902  | 0.000284          | 0.000000            |
| 2         | 6        | 1.000000  | 11684.099158 | 0.008038          | 0.002048            |
| 3         | 7        | 1.000000  | 14630.014436 | 0.056796          | 0.013923            |
| 4         | 8        | 1.000000  | 17736.467060 | 0.026713          | 0.005733            |
| 5         | 9        | 1.000000  | 20273.712870 | 0.072791          | 0.022113            |
| 6         | 10       | 1.000000  | 22618.025692 | 0.112049          | 0.037674            |
| 7         | 11       | 1.000000  | 25096.515562 | 0.178528          | 0.066339            |
| 8         | 12       | 1.000000  | 27513.523471 | 0.241698          | 0.088043            |
| 9         | 13       | 1.000000  | 29956.004078 | 0.290983          | 0.110156            |
| 10        | 14       | 1.000000  | 32559.100921 | 0.343476          | 0.139640            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
Completed (Iteration limit reached).
This model may not be optimal. To improve it, consider increasing `max_iterations`.
Killed: 9
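For reference, the coefficient count in the log above is consistent with a multinomial logistic regression that fits one weight per unpacked feature plus an intercept for every class except one reference class. That formula is an assumption inferred from the logged numbers, not something confirmed from the Turi Create source, and `coefficient_count` is a hypothetical helper. A minimal sketch:

```python
# Sketch: reproduce "Number of coefficients" from the training log.
# Assumption: (classes - 1) * (features + 1), i.e. one reference class
# and one intercept for each remaining class.

def coefficient_count(num_classes, num_features):
    return (num_classes - 1) * (num_features + 1)

n = coefficient_count(num_classes=5742, num_features=1000)
print(n)  # 5746741, matching the log
```

With 5742 classes, the model size is driven almost entirely by the class count, since every extra class adds another 1001 coefficients.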


srikris · 2018-09-08T16:37:02Z

Can you share the data, by any chance? Then we can investigate what's going on.

hipwelljo · 2018-09-09T00:14:50Z

I can't post it publicly. Is there a way I can send it to you directly?

srikris · 2019-01-10T15:44:20Z

@hipwelljo Can you confirm it's fixed in the latest release?

hipwelljo · 2019-01-11T04:48:02Z

It appears to be fixed. It has reached iteration 14 after 12 hours. I don't plan to let it finish. 😛

TobyRoseman · 2019-01-11T19:37:08Z

@hipwelljo - thanks for the update. I'll close this issue but let us know if you have the problem again.

TobyRoseman · 2019-01-11T22:38:28Z

Reopening since this issue only occurred at the end of training, probably when calculating validation set statistics.

hipwelljo · 2019-01-13T02:14:57Z

OK, I set max_iterations to 10 and tried it again. It logged:

Logistic regression:
--------------------------------------------------------
Number of examples          : 49029
Number of classes           : 5742
Number of feature columns   : 1
Number of unpacked features : 1000
Number of coefficients      : 5746741
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training Accuracy | Validation Accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 0         | 1        | NaN       | 1190.438703  | 0.000245          | 0.000000            |
| 1         | 4        | 0.000102  | 6172.879564  | 0.000510          | 0.000000            |
| 2         | 6        | 1.000000  | 9557.887708  | 0.006935          | 0.000745            |
| 3         | 7        | 1.000000  | 11956.632976 | 0.054111          | 0.014898            |
| 4         | 8        | 1.000000  | 14345.693528 | 0.020682          | 0.006331            |
| 5         | 9        | 1.000000  | 16971.418269 | 0.054580          | 0.020484            |
| 6         | 10       | 1.000000  | 19478.623501 | 0.089784          | 0.031657            |
| 7         | 11       | 1.000000  | 22022.317719 | 0.151339          | 0.066294            |
| 8         | 12       | 1.000000  | 24646.645556 | 0.226193          | 0.096834            |
| 9         | 13       | 1.000000  | 27294.776593 | 0.264823          | 0.104283            |
| 10        | 14       | 1.000000  | 29925.663033 | 0.317302          | 0.128119            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
Completed (Iteration limit reached).
This model may not be optimal. To improve it, consider increasing `max_iterations`.

then after a long while logged Killed: 9 :(
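Killed: 9 is what the shell prints when a process is terminated by signal 9 (SIGKILL). The process cannot catch that signal, so no Python traceback appears. One way to confirm this from a wrapper script is to run the training in a child process and inspect its return code. A minimal sketch, assuming a POSIX system; `run_and_report` is a hypothetical helper, and the child command below only simulates the kill rather than running Turi Create:

```python
import signal
import subprocess
import sys

def run_and_report(cmd):
    """Run cmd and describe how it exited (POSIX only)."""
    returncode = subprocess.call(cmd)
    if returncode < 0:
        # A negative return code means the child died from signal -returncode.
        return "terminated by signal %d" % -returncode
    return "exited with status %d" % returncode

# Stand-in for the training script: a child that sends SIGKILL to itself.
simulated = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
print(run_and_report(simulated))  # terminated by signal 9
```

This only tells you that the process was killed from outside, not why. The system logs would be the place to look for the reason.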

I verified I'm using v5.2.1.

As an aside, it would be good to log what it's working on at that point, something like "Calculating validation set statistics", so the log doesn't give the impression that everything has finished, which is how the "Completed" line reads. :)
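To illustrate the kind of progress logging suggested here, a small context manager can announce each stage before it starts and report how long it took. This is a sketch only: `log_stage` and the logger name are hypothetical, and nothing here reflects how Turi Create actually logs.

```python
import contextlib
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("training")
log.setLevel(logging.INFO)

@contextlib.contextmanager
def log_stage(name):
    """Log when a stage starts and how long it took (hypothetical helper)."""
    log.info("%s...", name)
    start = time.time()
    try:
        yield
    finally:
        # Runs even if the stage raises, so the log shows where it stopped.
        log.info("%s finished in %.1f s", name, time.time() - start)

# Usage: wrap the slow post-training step so the log does not go silent.
with log_stage("Calculating validation set statistics"):
    time.sleep(0.1)  # placeholder for the real work
```

If the process is killed inside the stage, the "finished" line never appears, and the last logged line names the step that was running.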

srikris · 2019-01-17T21:18:54Z

We've identified the root cause and we have a plan to fix this. Milestone is set for 5.4. Thanks everyone!

hoytak · 2019-02-21T22:02:09Z

We believe this was fixed with #1402.

hoytak · 2019-03-14T17:23:15Z

In our internal tests, this has been resolved -- #1402 indeed seemed to fix the issue. Please reopen if there are more issues after 5.4 is released.

srikris added the bug, p1, and image classifier labels Sep 8, 2018

znation added the toolkits label Oct 17, 2018

srikris assigned TobyRoseman Jan 10, 2019

TobyRoseman closed this as completed Jan 11, 2019

TobyRoseman reopened this Jan 11, 2019

nickjong added this to the 5.4 milestone Jan 17, 2019

nickjong assigned hoytak and unassigned TobyRoseman Feb 7, 2019

hoytak closed this as completed Mar 14, 2019
