Issue with TFImageNet Example #100

Closed
rahulbhalerao001 opened this issue Mar 2, 2016 · 13 comments

Comments

@rahulbhalerao001
Contributor

Hello,
Nice to see the integration with TensorFlow and GPUs back 👍
I set up the cluster with the new AMI and was able to run the MNIST example. It ran successfully, but the Spark web UI showed a lot of jobs as skipped.
I hope that is not a problem. Also, can the MNIST example use all the GPUs?

Further, for TFImageNetApp, I ran into the following error. My ImageNetApp (Caffe) used to work correctly with my S3 bucket.
Command
/root/spark/bin/spark-submit --class apps.TFImageNetApp /root/SparkNet/target/scala-2.10/sparknet-assembly-0.1-SNAPSHOT.jar 2 sparknetdivideo
Error
java.lang.IllegalArgumentException: The data and shape arguments are not compatible, data.length = 196608 and shape = Array(227, 256, 256).
    at libs.NDArray$.apply(NDArray.scala:55)
    at libs.ImageNetTensorFlowPreprocessor$$anonfun$convert$16.apply(Preprocessor.scala:131)
    at libs.ImageNetTensorFlowPreprocessor$$anonfun$convert$16.apply(Preprocessor.scala:122)
    at libs.TensorFlowNet$$anonfun$loadFrom$1.apply$mcVI$sp(TensorFlowNet.scala:64)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at libs.TensorFlowNet.loadFrom(TensorFlowNet.scala:63)
    at libs.TensorFlowNet.forward(TensorFlowNet.scala:74)
    at apps.TFImageNetApp$$anonfun$7$$anonfun$apply$2.apply$mcVI$sp(TFImageNetApp.scala:106)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at apps.TFImageNetApp$$anonfun$7.apply(TFImageNetApp.scala:105)
    at apps.TFImageNetApp$$anonfun$7.apply(TFImageNetApp.scala:102)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:706)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:706)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
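
The exception above reports that the number of data elements does not match the product of the shape dimensions. The following is a minimal Python sketch of that arithmetic, not SparkNet's NDArray code; the helper name `is_compatible` is hypothetical. It shows that 196608 equals 3 x 256 x 256, which would be consistent with a 3-channel 256 x 256 image, although the thread does not confirm that this is the intended shape.

```python
from math import prod

def is_compatible(data_length, shape):
    """True when data_length equals the product of the shape dimensions."""
    return data_length == prod(shape)

data_length = 196608                 # from the error message
reported_shape = (227, 256, 256)     # from the error message
candidate_shape = (3, 256, 256)      # assumption: 3 color channels

# Print the element count each shape implies and whether it matches the data.
print(prod(reported_shape), is_compatible(data_length, reported_shape))
print(prod(candidate_shape), is_compatible(data_length, candidate_shape))
```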

@robertnishihara
Member

Thanks Rahul! I suspect the skipped tasks are not a problem, but I'm not positive (I think when a task is "skipped" that means that its result has already been cached, and so the result is reused).

The MNIST example doesn't use all of the GPUs right now, though TensorFlow should be able to do this. The way to do it is probably to modify mnist_graph.py to assign different nodes in the computation graph to different devices (something like this: https://www.tensorflow.org/versions/r0.7/how_tos/using_gpu/index.html).
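
As a rough illustration of assigning parts of a graph to different devices, the sketch below only generates the device strings that TensorFlow's `tf.device` context manager accepts. The helper `device_for_tower` and the GPU count of 4 are assumptions made for the example; nothing here is taken from mnist_graph.py.

```python
def device_for_tower(tower_index, num_gpus):
    """Return a TensorFlow device string, cycling over the available GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return "/gpu:%d" % (tower_index % num_gpus)

# In graph-building code this string would be passed to tf.device, e.g.
#   with tf.device(device_for_tower(k, num_gpus)):
#       ...build the k-th copy of the model...
# Assumption: 4 GPUs per machine.
print([device_for_tower(k, 4) for k in range(6)])
```

Each copy of the model would still need its gradients combined somewhere, which this sketch does not cover.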

Also, we merged the TensorFlow version of ImageNet prematurely. I'll fix it up soon.

@rahulbhalerao001
Contributor Author

This is running correctly now, so I am closing this. But I want to remark that it is extremely slow compared to SparkNet with Caffe.

Caffe

root@ip-172-31-30-96:~/SparkNet# cat imagenet5.txt | grep accuracy
130.038, i = 0: 0.11% accuracy
522.603, i = 10: 17.44% accuracy
911.004, i = 20: 94.27% accuracy
1299.298, i = 30: 98.70% accuracy
1690.085, i = 40: 99.33% accuracy

TensorFlow

root@ip-172-31-30-96:~/SparkNet# cat tensorflowImageNet5.txt | grep accuracy
153.173, i = 0: 0.25% accuracy
519.216, i = 5: 0.04% accuracy
875.824, i = 10: 0.04% accuracy

@robertnishihara
Member

Hmm.... those numbers look too good. I can replicate it, and it seems like there's a bug with the ImageNet example.

@robertnishihara
Member

The bug is fixed by #115. GPUs are enabled again for Caffe.

@rahulbhalerao001
Contributor Author

Hello Robert,
With regard to the training time, I am running the ImageNet example with Caffe, and here is what I have been able to get:

230.758, i = 0: 0.09% accuracy
726.53, i = 10: 0.08% accuracy
1226.734, i = 20: 0.18% accuracy
1719.316, i = 30: 0.12% accuracy
2221.687, i = 40: 0.14% accuracy
2703.317, i = 50: 0.13% accuracy
3192.063, i = 60: 0.08% accuracy
3696.541, i = 70: 0.37% accuracy
4190.006, i = 80: 0.39% accuracy
4681.449, i = 90: 0.35% accuracy
5175.196, i = 100: 0.47% accuracy
5675.586, i = 110: 0.52% accuracy
6162.29, i = 120: 0.55% accuracy
6661.666, i = 130: 0.57% accuracy
7153.414, i = 140: 0.75% accuracy
7654.373, i = 150: 0.69% accuracy
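
A log in this format can be summarized with a few lines of Python. This is a minimal sketch that assumes every accuracy line looks like `<seconds>, i = <n>: <pct>% accuracy`; the function names are hypothetical, and the sample uses three lines copied from the output above.

```python
import re

# Matches lines such as "726.53, i = 10: 0.08% accuracy".
ACCURACY_LINE = re.compile(r"^\s*([\d.]+), i = (\d+): ([\d.]+)% accuracy\s*$")

def parse_accuracy_lines(lines):
    """Return (elapsed_seconds, iteration, accuracy_percent) tuples."""
    records = []
    for line in lines:
        match = ACCURACY_LINE.match(line)
        if match:
            records.append(
                (float(match.group(1)), int(match.group(2)), float(match.group(3)))
            )
    return records

def seconds_per_iteration(records):
    """Average wall-clock seconds per value of i between first and last record."""
    (t0, i0, _), (t1, i1, _) = records[0], records[-1]
    return (t1 - t0) / (i1 - i0)

sample = [
    "230.758, i = 0: 0.09% accuracy",
    "726.53, i = 10: 0.08% accuracy",
    "7654.373, i = 150: 0.69% accuracy",
]
records = parse_accuracy_lines(sample)
rate = seconds_per_iteration(records)
print(f"{rate:.1f} s per outer iteration")
```

Note that the elapsed time between accuracy lines includes testing as well as training, so this is a wall-clock figure rather than a pure training rate.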

For a cluster of 3 g2.8xlarge slaves with Caffe, do you think this is reasonable? Also, should I expect a speedup with more slaves, or do you feel the current number of training images is too small to achieve good accuracy? If it is too small, what is a reasonable number of images that will provide good results without causing memory issues in the system?

Thanks,
Rahul

@robertnishihara
Member

The default settings in the app were aimed at being simple to run but not optimal for training. I'd suggest setting syncInterval=100 to amortize the overhead of each map call. I'd also suggest using more of the data. For example, set

var trainRDD = loader.apply(sc, "ILSVRC2012_img_train/train.000", "train.txt", fullHeight, fullWidth)

that is, remove a 0 from the path for trainRDD. This increases the training data to 1/10th of the data set.
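
To see why a larger syncInterval amortizes the per-round overhead, here is a small sketch of the arithmetic. It assumes syncInterval is the number of local training steps between weight synchronizations, and the timings (12 s of fixed overhead per round, 1 s per step) are hypothetical values chosen only to show the trend.

```python
def overhead_fraction(sync_overhead_s, step_time_s, sync_interval):
    """Fraction of one synchronization round spent on fixed overhead."""
    compute_s = step_time_s * sync_interval
    return sync_overhead_s / (sync_overhead_s + compute_s)

# Hypothetical timings, not measurements from this thread.
for interval in (10, 50, 100):
    frac = overhead_fraction(sync_overhead_s=12.0, step_time_s=1.0, sync_interval=interval)
    print(f"syncInterval={interval}: {frac:.1%} of each round is overhead")
```

The trade-off is that workers train longer on stale weights between synchronizations, so a very large interval is not automatically better.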

@robertnishihara
Member

Ok, there was another oversight. We weren't caching trainDF and testDF so some of the work was being recomputed, causing some additional slowdowns. The PR #117 has the relevant fixes to speed up training.

@rahulbhalerao001
Contributor Author

Hello Robert,

Thank you for the suggestions. I have started the process with your previous suggestions, will leave it overnight, and report the findings tomorrow. I will try again tomorrow with this new caching.

Thanks,
Rahul

@robertnishihara
Member

Ok, sounds good. The caching is actually pretty important. Recomputing stuff could be more severe with more data.

@robertnishihara
Member

Here's the log for a run I did overnight with two g2.8xlarge workers. It looks like it gets to 10% accuracy at around i=100, which is after about 3.5 hours. From there it doesn't improve much. To get good accuracies, you probably want to use the full training set.

0.002: loading train data
11.219: loading test data
48.331: numTrainData = 128200
55.233: numTestData = 10000
55.233: computing mean image
133.787: coalescing
604.765: trainPartitionSizes = Array(64100, 64100)
604.789: testPartitionSizes = Array(5000, 5000)
616.656, i = 0: broadcasting weights
617.336, i = 0: setting weights on workers
619.225, i = 0: testing
755.593, i = 0: 0.09% accuracy
755.593, i = 0: training
851.906, i = 0: collecting weights
862.073, i = 0: weight = -0.005752801
862.073, i = 1: broadcasting weights
862.713, i = 1: setting weights on workers
864.526, i = 1: training
961.546, i = 1: collecting weights
972.03, i = 1: weight = -0.005740252
972.03, i = 2: broadcasting weights
972.662, i = 2: setting weights on workers
974.429, i = 2: training
1070.395, i = 2: collecting weights
1079.232, i = 2: weight = -0.0056812163
1079.232, i = 3: broadcasting weights
1080.564, i = 3: setting weights on workers
1083.286, i = 3: training
1179.605, i = 3: collecting weights
1188.421, i = 3: weight = -0.0055780634
1188.421, i = 4: broadcasting weights
1189.062, i = 4: setting weights on workers
1192.443, i = 4: training
1287.2, i = 4: collecting weights
1296.695, i = 4: weight = -0.005518262
1296.695, i = 5: broadcasting weights
1297.465, i = 5: setting weights on workers
1299.097, i = 5: training
1395.416, i = 5: collecting weights
1405.254, i = 5: weight = -0.0054991157
1405.254, i = 6: broadcasting weights
1405.877, i = 6: setting weights on workers
1408.067, i = 6: training
1504.935, i = 6: collecting weights
1514.857, i = 6: weight = -0.005503099
1514.857, i = 7: broadcasting weights
1515.479, i = 7: setting weights on workers
1517.301, i = 7: training
1614.101, i = 7: collecting weights
1623.594, i = 7: weight = -0.0055048433
1623.594, i = 8: broadcasting weights
1624.355, i = 8: setting weights on workers
1626.243, i = 8: training
1722.059, i = 8: collecting weights
1731.241, i = 8: weight = -0.0054747057
1731.241, i = 9: broadcasting weights
1731.861, i = 9: setting weights on workers
1733.691, i = 9: training
1829.121, i = 9: collecting weights
1838.499, i = 9: weight = -0.0054659587
1838.499, i = 10: broadcasting weights
1839.119, i = 10: setting weights on workers
1841.177, i = 10: testing
1977.743, i = 10: 0.05% accuracy
1977.743, i = 10: training
2074.51, i = 10: collecting weights
2083.932, i = 10: weight = -0.005457867
2083.932, i = 11: broadcasting weights
2084.686, i = 11: setting weights on workers
2086.791, i = 11: training
2181.327, i = 11: collecting weights
2190.567, i = 11: weight = -0.0054264003
2190.567, i = 12: broadcasting weights
2191.185, i = 12: setting weights on workers
2194.559, i = 12: training
2289.819, i = 12: collecting weights
2301.607, i = 12: weight = -0.0054265214
2301.608, i = 13: broadcasting weights
2302.223, i = 13: setting weights on workers
2313.57, i = 13: training
2410.282, i = 13: collecting weights
2419.549, i = 13: weight = -0.0054609855
2419.549, i = 14: broadcasting weights
2420.261, i = 14: setting weights on workers
2425.188, i = 14: training
2520.622, i = 14: collecting weights
2529.973, i = 14: weight = -0.0055139875
2529.973, i = 15: broadcasting weights
2530.667, i = 15: setting weights on workers
2534.416, i = 15: training
2630.786, i = 15: collecting weights
2639.856, i = 15: weight = -0.005658327
2639.856, i = 16: broadcasting weights
2640.471, i = 16: setting weights on workers
2642.55, i = 16: training
2738.95, i = 16: collecting weights
2747.53, i = 16: weight = -0.006201865
2747.53, i = 17: broadcasting weights
2748.147, i = 17: setting weights on workers
2751.206, i = 17: training
2847.686, i = 17: collecting weights
2856.86, i = 17: weight = -0.0074481424
2856.86, i = 18: broadcasting weights
2857.588, i = 18: setting weights on workers
2859.443, i = 18: training
2955.833, i = 18: collecting weights
2965.166, i = 18: weight = -0.008238256
2965.166, i = 19: broadcasting weights
2965.78, i = 19: setting weights on workers
2967.661, i = 19: training
3063.998, i = 19: collecting weights
3072.7, i = 19: weight = -0.008252922
3072.701, i = 20: broadcasting weights
3073.313, i = 20: setting weights on workers
3074.891, i = 20: testing
3211.342, i = 20: 0.18% accuracy
3211.342, i = 20: training
3307.593, i = 20: collecting weights
3317.251, i = 20: weight = -0.008983822
3317.251, i = 21: broadcasting weights
3318.613, i = 21: setting weights on workers
3320.426, i = 21: training
3414.982, i = 21: collecting weights
3423.738, i = 21: weight = -0.008959025
3423.738, i = 22: broadcasting weights
3424.353, i = 22: setting weights on workers
3427.036, i = 22: training
3523.649, i = 22: collecting weights
3532.327, i = 22: weight = -0.008875648
3532.328, i = 23: broadcasting weights
3532.94, i = 23: setting weights on workers
3534.711, i = 23: training
3629.044, i = 23: collecting weights
3638.406, i = 23: weight = -0.00771817
3638.406, i = 24: broadcasting weights
3639.148, i = 24: setting weights on workers
3641.101, i = 24: training
3736.98, i = 24: collecting weights
3746.186, i = 24: weight = -0.00832183
3746.186, i = 25: broadcasting weights
3746.798, i = 25: setting weights on workers
3748.418, i = 25: training
3844.017, i = 25: collecting weights
3852.86, i = 25: weight = -0.0074207233
3852.86, i = 26: broadcasting weights
3853.475, i = 26: setting weights on workers
3855.07, i = 26: training
3950.83, i = 26: collecting weights
3960.396, i = 26: weight = -0.005030203
3960.396, i = 27: broadcasting weights
3961.166, i = 27: setting weights on workers
3963.159, i = 27: training
4057.798, i = 27: collecting weights
4067.203, i = 27: weight = -0.004932753
4067.204, i = 28: broadcasting weights
4067.815, i = 28: setting weights on workers
4069.388, i = 28: training
4165.808, i = 28: collecting weights
4174.509, i = 28: weight = -0.006245138
4174.509, i = 29: broadcasting weights
4175.784, i = 29: setting weights on workers
4177.483, i = 29: training
4273.904, i = 29: collecting weights
4282.383, i = 29: weight = -0.005012705
4282.383, i = 30: broadcasting weights
4282.996, i = 30: setting weights on workers
4285.051, i = 30: testing
4421.666, i = 30: 0.27% accuracy
4421.666, i = 30: training
4516.145, i = 30: collecting weights
4524.795, i = 30: weight = -0.0052556507
4524.795, i = 31: broadcasting weights
4525.403, i = 31: setting weights on workers
4527.162, i = 31: training
4622.025, i = 31: collecting weights
4631.398, i = 31: weight = -0.004839625
4631.398, i = 32: broadcasting weights
4632.006, i = 32: setting weights on workers
4634.099, i = 32: training
4730.716, i = 32: collecting weights
4740.009, i = 32: weight = -0.0060347007
4740.009, i = 33: broadcasting weights
4740.618, i = 33: setting weights on workers
4742.623, i = 33: training
4839.267, i = 33: collecting weights
4847.859, i = 33: weight = -0.007981682
4847.859, i = 34: broadcasting weights
4848.467, i = 34: setting weights on workers
4850.89, i = 34: training
4947.604, i = 34: collecting weights
4956.903, i = 34: weight = -0.008934092
4956.903, i = 35: broadcasting weights
4957.598, i = 35: setting weights on workers
4959.349, i = 35: training
5053.715, i = 35: collecting weights
5063.042, i = 35: weight = -0.008953424
5063.042, i = 36: broadcasting weights
5063.653, i = 36: setting weights on workers
5065.39, i = 36: training
5161.595, i = 36: collecting weights
5170.414, i = 36: weight = -0.0089482255
5170.414, i = 37: broadcasting weights
5171.023, i = 37: setting weights on workers
5172.798, i = 37: training
5269.366, i = 37: collecting weights
5278.69, i = 37: weight = -0.008762666
5278.69, i = 38: broadcasting weights
5279.446, i = 38: setting weights on workers
5280.985, i = 38: training
5376.797, i = 38: collecting weights
5386.205, i = 38: weight = -0.008149167
5386.205, i = 39: broadcasting weights
5386.817, i = 39: setting weights on workers
5388.448, i = 39: training
5484.728, i = 39: collecting weights
5493.625, i = 39: weight = -0.010553602
5493.625, i = 40: broadcasting weights
5494.231, i = 40: setting weights on workers
5496.0, i = 40: testing
5632.596, i = 40: 0.89% accuracy
5632.596, i = 40: training
5729.332, i = 40: collecting weights
5737.967, i = 40: weight = -0.009048437
5737.967, i = 41: broadcasting weights
5738.673, i = 41: setting weights on workers
5741.123, i = 41: training
5836.429, i = 41: collecting weights
5845.651, i = 41: weight = -0.009061288
5845.651, i = 42: broadcasting weights
5846.284, i = 42: setting weights on workers
5848.409, i = 42: training
5944.436, i = 42: collecting weights
5953.088, i = 42: weight = -0.008766227
5953.088, i = 43: broadcasting weights
5953.707, i = 43: setting weights on workers
5955.413, i = 43: training
6051.824, i = 43: collecting weights
6061.282, i = 43: weight = -0.0077708345
6061.282, i = 44: broadcasting weights
6061.913, i = 44: setting weights on workers
6063.62, i = 44: training
6158.411, i = 44: collecting weights
6167.922, i = 44: weight = -0.0073010167
6167.922, i = 45: broadcasting weights
6168.529, i = 45: setting weights on workers
6170.774, i = 45: training
6266.51, i = 45: collecting weights
6275.246, i = 45: weight = -0.0065529794
6275.246, i = 46: broadcasting weights
6275.857, i = 46: setting weights on workers
6278.016, i = 46: training
6372.838, i = 46: collecting weights
6381.55, i = 46: weight = -0.0064114695
6381.55, i = 47: broadcasting weights
6382.221, i = 47: setting weights on workers
6384.118, i = 47: training
6479.516, i = 47: collecting weights
6488.979, i = 47: weight = -0.005639033
6488.979, i = 48: broadcasting weights
6489.59, i = 48: setting weights on workers
6491.506, i = 48: training
6586.569, i = 48: collecting weights
6595.34, i = 48: weight = -0.006604268
6595.34, i = 49: broadcasting weights
6595.952, i = 49: setting weights on workers
6597.628, i = 49: training
6692.59, i = 49: collecting weights
6702.472, i = 49: weight = -0.0062721716
6702.472, i = 50: broadcasting weights
6703.089, i = 50: setting weights on workers
6704.787, i = 50: testing
6841.427, i = 50: 2.06% accuracy
6841.427, i = 50: training
6936.477, i = 50: collecting weights
6945.906, i = 50: weight = -0.007378959
6945.906, i = 51: broadcasting weights
6946.517, i = 51: setting weights on workers
6948.611, i = 51: training
7045.095, i = 51: collecting weights
7053.761, i = 51: weight = -0.0074507995
7053.761, i = 52: broadcasting weights
7054.37, i = 52: setting weights on workers
7056.126, i = 52: training
7151.908, i = 52: collecting weights
7160.558, i = 52: weight = -0.008429487
7160.558, i = 53: broadcasting weights
7161.261, i = 53: setting weights on workers
7163.141, i = 53: training
7259.158, i = 53: collecting weights
7268.344, i = 53: weight = -0.0060025984
7268.344, i = 54: broadcasting weights
7268.954, i = 54: setting weights on workers
7271.038, i = 54: training
7367.639, i = 54: collecting weights
7376.198, i = 54: weight = -0.007133088
7376.198, i = 55: broadcasting weights
7376.808, i = 55: setting weights on workers
7378.504, i = 55: training
7473.457, i = 55: collecting weights
7482.822, i = 55: weight = -0.0059373295
7482.822, i = 56: broadcasting weights
7483.527, i = 56: setting weights on workers
7485.223, i = 56: training
7580.036, i = 56: collecting weights
7589.349, i = 56: weight = -0.0057906844
7589.349, i = 57: broadcasting weights
7590.027, i = 57: setting weights on workers
7591.764, i = 57: training
7688.369, i = 57: collecting weights
7696.913, i = 57: weight = -0.005515228
7696.913, i = 58: broadcasting weights
7697.522, i = 58: setting weights on workers
7699.662, i = 58: training
7794.187, i = 58: collecting weights
7802.804, i = 58: weight = -0.003577847
7802.804, i = 59: broadcasting weights
7803.437, i = 59: setting weights on workers
7805.37, i = 59: training
7899.861, i = 59: collecting weights
7909.292, i = 59: weight = -0.0026609926
7909.292, i = 60: broadcasting weights
7909.992, i = 60: setting weights on workers
7911.689, i = 60: testing
8047.808, i = 60: 4.23% accuracy
8047.808, i = 60: training
8142.853, i = 60: collecting weights
8152.248, i = 60: weight = -0.0044957823
8152.248, i = 61: broadcasting weights
8152.857, i = 61: setting weights on workers
8154.99, i = 61: training
8251.556, i = 61: collecting weights
8260.096, i = 61: weight = -0.0061532296
8260.096, i = 62: broadcasting weights
8260.703, i = 62: setting weights on workers
8262.669, i = 62: training
8358.425, i = 62: collecting weights
8367.146, i = 62: weight = -0.006685048
8367.146, i = 63: broadcasting weights
8367.819, i = 63: setting weights on workers
8369.595, i = 63: training
8464.914, i = 63: collecting weights
8474.122, i = 63: weight = -0.0064020986
8474.122, i = 64: broadcasting weights
8474.732, i = 64: setting weights on workers
8476.508, i = 64: training
8571.638, i = 64: collecting weights
8580.32, i = 64: weight = -0.0044420767
8580.32, i = 65: broadcasting weights
8580.928, i = 65: setting weights on workers
8582.908, i = 65: training
8679.242, i = 65: collecting weights
8687.928, i = 65: weight = -0.005657709
8687.928, i = 66: broadcasting weights
8689.335, i = 66: setting weights on workers
8692.026, i = 66: training
8788.092, i = 66: collecting weights
8796.8, i = 66: weight = -0.004765268
8796.8, i = 67: broadcasting weights
8797.411, i = 67: setting weights on workers
8799.327, i = 67: training
8896.431, i = 67: collecting weights
8905.019, i = 67: weight = -0.0063854726
8905.019, i = 68: broadcasting weights
8905.629, i = 68: setting weights on workers
8907.567, i = 68: training
9004.181, i = 68: collecting weights
9013.584, i = 68: weight = -0.0049652895
9013.584, i = 69: broadcasting weights
9014.307, i = 69: setting weights on workers
9016.382, i = 69: training
9111.849, i = 69: collecting weights
9121.283, i = 69: weight = -0.0060089887
9121.283, i = 70: broadcasting weights
9121.902, i = 70: setting weights on workers
9124.229, i = 70: testing
9261.325, i = 70: 6.98% accuracy
9261.325, i = 70: training
9357.442, i = 70: collecting weights
9366.094, i = 70: weight = -0.006755676
9366.094, i = 71: broadcasting weights
9366.697, i = 71: setting weights on workers
9368.415, i = 71: training
9462.872, i = 71: collecting weights
9471.531, i = 71: weight = -0.0063136537
9471.531, i = 72: broadcasting weights
9472.218, i = 72: setting weights on workers
9474.301, i = 72: training
9569.789, i = 72: collecting weights
9579.123, i = 72: weight = -0.008809285
9579.123, i = 73: broadcasting weights
9579.746, i = 73: setting weights on workers
9581.464, i = 73: training
9678.209, i = 73: collecting weights
9686.724, i = 73: weight = -0.0062743947
9686.724, i = 74: broadcasting weights
9687.331, i = 74: setting weights on workers
9689.229, i = 74: training
9785.652, i = 74: collecting weights
9794.355, i = 74: weight = -0.006698339
9794.355, i = 75: broadcasting weights
9795.72, i = 75: setting weights on workers
9798.006, i = 75: training
9892.991, i = 75: collecting weights
9901.374, i = 75: weight = -0.0074926927
9901.374, i = 76: broadcasting weights
9901.983, i = 76: setting weights on workers
9903.695, i = 76: training
9998.839, i = 76: collecting weights
10007.435, i = 76: weight = -0.0047774827
10007.435, i = 77: broadcasting weights
10008.042, i = 77: setting weights on workers
10010.162, i = 77: training
10105.77, i = 77: collecting weights
10114.304, i = 77: weight = -0.007904093
10114.304, i = 78: broadcasting weights
10115.788, i = 78: setting weights on workers
10117.766, i = 78: training
10212.304, i = 78: collecting weights
10221.023, i = 78: weight = -0.0061032483
10221.023, i = 79: broadcasting weights
10221.631, i = 79: setting weights on workers
10223.366, i = 79: training
10319.644, i = 79: collecting weights
10328.185, i = 79: weight = -0.0073969113
10328.185, i = 80: broadcasting weights
10328.795, i = 80: setting weights on workers
10330.679, i = 80: testing
10467.184, i = 80: 8.50% accuracy
10467.184, i = 80: training
10561.14, i = 80: collecting weights
10570.072, i = 80: weight = -0.009284534
10570.072, i = 81: broadcasting weights
10571.551, i = 81: setting weights on workers
10573.317, i = 81: training
10668.644, i = 81: collecting weights
10677.152, i = 81: weight = -0.011661126
10677.152, i = 82: broadcasting weights
10677.753, i = 82: setting weights on workers
10679.699, i = 82: training
10775.75, i = 82: collecting weights
10784.428, i = 82: weight = -0.011425588
10784.428, i = 83: broadcasting weights
10785.036, i = 83: setting weights on workers
10787.149, i = 83: training
10883.142, i = 83: collecting weights
10892.808, i = 83: weight = -0.0127496105
10892.808, i = 84: broadcasting weights
10893.512, i = 84: setting weights on workers
10895.23, i = 84: training
10991.433, i = 84: collecting weights
11000.177, i = 84: weight = -0.012408334
11000.178, i = 85: broadcasting weights
11000.804, i = 85: setting weights on workers
11002.929, i = 85: training
11098.392, i = 85: collecting weights
11107.707, i = 85: weight = -0.012370084
11107.707, i = 86: broadcasting weights
11108.312, i = 86: setting weights on workers
11110.001, i = 86: training
11204.813, i = 86: collecting weights
11214.431, i = 86: weight = -0.012631195
11214.431, i = 87: broadcasting weights
11215.035, i = 87: setting weights on workers
11216.753, i = 87: training
11312.597, i = 87: collecting weights
11321.202, i = 87: weight = -0.013900034
11321.202, i = 88: broadcasting weights
11321.809, i = 88: setting weights on workers
11323.736, i = 88: training
11419.436, i = 88: collecting weights
11428.869, i = 88: weight = -0.013252469
11428.869, i = 89: broadcasting weights
11429.477, i = 89: setting weights on workers
11431.444, i = 89: training
11525.91, i = 89: collecting weights
11534.406, i = 89: weight = -0.014525559
11534.406, i = 90: broadcasting weights
11535.094, i = 90: setting weights on workers
11536.797, i = 90: testing
11673.412, i = 90: 9.71% accuracy
11673.412, i = 90: training
11769.132, i = 90: collecting weights
11778.428, i = 90: weight = -0.012917453
11778.429, i = 91: broadcasting weights
11779.033, i = 91: setting weights on workers
11780.955, i = 91: training
11875.295, i = 91: collecting weights
11883.774, i = 91: weight = -0.011587456
11883.774, i = 92: broadcasting weights
11884.403, i = 92: setting weights on workers
11886.518, i = 92: training
11982.803, i = 92: collecting weights
11991.483, i = 92: weight = -0.010740318
11991.483, i = 93: broadcasting weights
11992.186, i = 93: setting weights on workers
11993.898, i = 93: training
12090.158, i = 93: collecting weights
12099.677, i = 93: weight = -0.008672677
12099.677, i = 94: broadcasting weights
12100.282, i = 94: setting weights on workers
12102.39, i = 94: training
12196.712, i = 94: collecting weights
12205.147, i = 94: weight = -0.00951618
12205.147, i = 95: broadcasting weights
12205.756, i = 95: setting weights on workers
12207.774, i = 95: training
12302.449, i = 95: collecting weights
12310.985, i = 95: weight = -0.012972984
12310.985, i = 96: broadcasting weights
12311.653, i = 96: setting weights on workers
12313.387, i = 96: training
12409.742, i = 96: collecting weights
12419.014, i = 96: weight = -0.012411969
12419.014, i = 97: broadcasting weights
12419.62, i = 97: setting weights on workers
12421.516, i = 97: training
12517.859, i = 97: collecting weights
12526.475, i = 97: weight = -0.014994165
12526.475, i = 98: broadcasting weights
12527.079, i = 98: setting weights on workers
12529.013, i = 98: training
12625.556, i = 98: collecting weights
12634.401, i = 98: weight = -0.016364098
12634.401, i = 99: broadcasting weights
12635.092, i = 99: setting weights on workers
12637.279, i = 99: training
12733.881, i = 99: collecting weights
12743.182, i = 99: weight = -0.015827147
12743.182, i = 100: broadcasting weights
12743.855, i = 100: setting weights on workers
12745.583, i = 100: testing
12882.516, i = 100: 9.53% accuracy
12882.517, i = 100: training
12977.153, i = 100: collecting weights
12985.903, i = 100: weight = -0.016975367
12985.903, i = 101: broadcasting weights
12986.51, i = 101: setting weights on workers
12988.413, i = 101: training
13084.71, i = 101: collecting weights
13093.294, i = 101: weight = -0.015141864
13093.294, i = 102: broadcasting weights
13093.902, i = 102: setting weights on workers
13095.573, i = 102: training
13190.791, i = 102: collecting weights
13200.348, i = 102: weight = -0.016692992
13200.348, i = 103: broadcasting weights
13200.953, i = 103: setting weights on workers
13202.751, i = 103: training
13298.787, i = 103: collecting weights
13307.585, i = 103: weight = -0.01690317
13307.585, i = 104: broadcasting weights
13308.19, i = 104: setting weights on workers
13310.076, i = 104: training
13406.073, i = 104: collecting weights
13415.339, i = 104: weight = -0.015084368
13415.339, i = 105: broadcasting weights
13415.947, i = 105: setting weights on workers
13418.266, i = 105: training
13512.685, i = 105: collecting weights
13522.184, i = 105: weight = -0.015421257
13522.184, i = 106: broadcasting weights
13522.891, i = 106: setting weights on workers
13524.795, i = 106: training
13620.847, i = 106: collecting weights
13629.409, i = 106: weight = -0.015798565
13629.409, i = 107: broadcasting weights
13630.019, i = 107: setting weights on workers
13631.722, i = 107: training
13728.304, i = 107: collecting weights
13736.842, i = 107: weight = -0.014879125
13736.842, i = 108: broadcasting weights
13737.451, i = 108: setting weights on workers
13739.195, i = 108: training
13834.09, i = 108: collecting weights
13843.383, i = 108: weight = -0.013881062
13843.383, i = 109: broadcasting weights
13844.115, i = 109: setting weights on workers
13845.824, i = 109: training
13941.886, i = 109: collecting weights
13950.49, i = 109: weight = -0.014148306
13950.49, i = 110: broadcasting weights
13951.095, i = 110: setting weights on workers
13952.789, i = 110: testing
14089.568, i = 110: 8.76% accuracy
14089.568, i = 110: training
14183.816, i = 110: collecting weights
14193.249, i = 110: weight = -0.016414229
14193.249, i = 111: broadcasting weights
14193.857, i = 111: setting weights on workers
14195.842, i = 111: training
14292.481, i = 111: collecting weights
14301.184, i = 111: weight = -0.01751368
14301.184, i = 112: broadcasting weights
14301.886, i = 112: setting weights on workers
14303.579, i = 112: training
14398.346, i = 112: collecting weights
14407.673, i = 112: weight = -0.01711673
14407.674, i = 113: broadcasting weights
14408.283, i = 113: setting weights on workers
14410.177, i = 113: training
14506.422, i = 113: collecting weights
14514.911, i = 113: weight = -0.015798416
14514.911, i = 114: broadcasting weights
14515.517, i = 114: setting weights on workers
14517.215, i = 114: training
14612.207, i = 114: collecting weights
14620.884, i = 114: weight = -0.016548436
14620.885, i = 115: broadcasting weights
14622.365, i = 115: setting weights on workers
14624.079, i = 115: training
14719.81, i = 115: collecting weights
14728.362, i = 115: weight = -0.018760566
14728.362, i = 116: broadcasting weights
14728.97, i = 116: setting weights on workers
14730.659, i = 116: training
14826.852, i = 116: collecting weights
14835.559, i = 116: weight = -0.015645875
14835.559, i = 117: broadcasting weights
14836.17, i = 117: setting weights on workers
14837.863, i = 117: training
14934.064, i = 117: collecting weights
14942.829, i = 117: weight = -0.016066536
14942.829, i = 118: broadcasting weights
14943.536, i = 118: setting weights on workers
14945.219, i = 118: training
15042.004, i = 118: collecting weights
15051.489, i = 118: weight = -0.015276831
15051.489, i = 119: broadcasting weights
15052.098, i = 119: setting weights on workers
15053.994, i = 119: training
15148.652, i = 119: collecting weights
15157.468, i = 119: weight = -0.015956223
15157.468, i = 120: broadcasting weights
15158.077, i = 120: setting weights on workers
15159.795, i = 120: testing
15296.662, i = 120: 9.98% accuracy
15296.662, i = 120: training
15391.358, i = 120: collecting weights
15400.131, i = 120: weight = -0.013546612
15400.131, i = 121: broadcasting weights
15400.839, i = 121: setting weights on workers
15402.566, i = 121: training
15498.048, i = 121: collecting weights
15507.349, i = 121: weight = -0.015350427
15507.349, i = 122: broadcasting weights
15507.957, i = 122: setting weights on workers
15509.874, i = 122: training
15606.307, i = 122: collecting weights
15614.948, i = 122: weight = -0.01734538
15614.948, i = 123: broadcasting weights
15615.556, i = 123: setting weights on workers
15617.272, i = 123: training
15713.494, i = 123: collecting weights
15722.195, i = 123: weight = -0.017827015
15722.195, i = 124: broadcasting weights
15722.869, i = 124: setting weights on workers
15724.779, i = 124: training
15821.429, i = 124: collecting weights
15830.643, i = 124: weight = -0.016200133
15830.643, i = 125: broadcasting weights
15831.257, i = 125: setting weights on workers
15833.196, i = 125: training
15929.734, i = 125: collecting weights
15938.514, i = 125: weight = -0.019617494
15938.514, i = 126: broadcasting weights
15939.125, i = 126: setting weights on workers
15940.947, i = 126: training
16035.871, i = 126: collecting weights
16044.613, i = 126: weight = -0.01907153
16044.613, i = 127: broadcasting weights
16045.302, i = 127: setting weights on workers
16047.046, i = 127: training
16141.688, i = 127: collecting weights
16150.891, i = 127: weight = -0.014817581
16150.891, i = 128: broadcasting weights
16151.5, i = 128: setting weights on workers
16153.279, i = 128: training
16248.53, i = 128: collecting weights
16257.178, i = 128: weight = -0.013819749
16257.178, i = 129: broadcasting weights
16257.786, i = 129: setting weights on workers
16259.507, i = 129: training
16355.514, i = 129: collecting weights
16364.335, i = 129: weight = -0.016180595
16364.335, i = 130: broadcasting weights
16364.943, i = 130: setting weights on workers
16366.849, i = 130: testing
16503.414, i = 130: 11.67% accuracy
16503.414, i = 130: training
16598.67, i = 130: collecting weights
16608.176, i = 130: weight = -0.01642452
16608.176, i = 131: broadcasting weights
16608.791, i = 131: setting weights on workers
16610.508, i = 131: training
16707.346, i = 131: collecting weights
16715.945, i = 131: weight = -0.017009687
16715.945, i = 132: broadcasting weights
16716.572, i = 132: setting weights on workers
16718.707, i = 132: training
16813.473, i = 132: collecting weights
16822.635, i = 132: weight = -0.014291309
16822.635, i = 133: broadcasting weights
16823.346, i = 133: setting weights on workers
16825.092, i = 133: training
16919.508, i = 133: collecting weights
16928.148, i = 133: weight = -0.017764378
16928.148, i = 134: broadcasting weights
16928.76, i = 134: setting weights on workers
16930.473, i = 134: training
17026.027, i = 134: collecting weights
17035.26, i = 134: weight = -0.016726885
17035.26, i = 135: broadcasting weights
17035.898, i = 135: setting weights on workers
17037.84, i = 135: training
17133.527, i = 135: collecting weights
17142.154, i = 135: weight = -0.016428892
17142.154, i = 136: broadcasting weights
17142.832, i = 136: setting weights on workers
17144.531, i = 136: training
17240.752, i = 136: collecting weights
17249.275, i = 136: weight = -0.01597419
17249.275, i = 137: broadcasting weights
17249.883, i = 137: setting weights on workers
17251.775, i = 137: training
17347.854, i = 137: collecting weights
17357.312, i = 137: weight = -0.016787983
17357.312, i = 138: broadcasting weights
17357.916, i = 138: setting weights on workers
17359.613, i = 138: training
17454.611, i = 138: collecting weights
17463.232, i = 138: weight = -0.016863916
17463.232, i = 139: broadcasting weights
17463.906, i = 139: setting weights on workers
17465.557, i = 139: training
17560.328, i = 139: collecting weights
17568.908, i = 139: weight = -0.01763396
17568.908, i = 140: broadcasting weights
17569.516, i = 140: setting weights on workers
17571.447, i = 140: testing
17707.988, i = 140: 10.96% accuracy
17707.988, i = 140: training
17804.074, i = 140: collecting weights
17813.615, i = 140: weight = -0.019559862
17813.615, i = 141: broadcasting weights
17814.223, i = 141: setting weights on workers
17816.152, i = 141: training
17911.816, i = 141: collecting weights
17920.42, i = 141: weight = -0.018549021
17920.42, i = 142: broadcasting weights
17921.104, i = 142: setting weights on workers
17923.219, i = 142: training
18019.604, i = 142: collecting weights
18028.814, i = 142: weight = -0.020196222
18028.814, i = 143: broadcasting weights
18029.42, i = 143: setting weights on workers
18031.148, i = 143: training
18126.484, i = 143: collecting weights
18135.195, i = 143: weight = -0.018938774
18135.195, i = 144: broadcasting weights
18135.803, i = 144: setting weights on workers
18137.729, i = 144: training
18233.22, i = 144: collecting weights
18241.924, i = 144: weight = -0.019082956
18241.924, i = 145: broadcasting weights
18242.596, i = 145: setting weights on workers
18244.469, i = 145: training
18340.852, i = 145: collecting weights
18349.424, i = 145: weight = -0.019053422
18349.424, i = 146: broadcasting weights
18350.027, i = 146: setting weights on workers
18351.771, i = 146: training
18445.951, i = 146: collecting weights
18455.717, i = 146: weight = -0.015252183
18455.717, i = 147: broadcasting weights
18456.324, i = 147: setting weights on workers
18458.46, i = 147: training
18553.176, i = 147: collecting weights
18562.107, i = 147: weight = -0.01884624
18562.107, i = 148: broadcasting weights
18562.72, i = 148: setting weights on workers
18564.617, i = 148: training
18659.82, i = 148: collecting weights
18669.027, i = 148: weight = -0.018300107
18669.027, i = 149: broadcasting weights
18669.65, i = 149: setting weights on workers
18671.371, i = 149: training
18767.828, i = 149: collecting weights
18776.43, i = 149: weight = -0.021200072
18776.43, i = 150: broadcasting weights
18777.04, i = 150: setting weights on workers
18778.797, i = 150: testing
18915.46, i = 150: 10.38% accuracy
18915.46, i = 150: training
19012.176, i = 150: collecting weights
19021.0, i = 150: weight = -0.021617308
19021.0, i = 151: broadcasting weights
19021.7, i = 151: setting weights on workers
19023.354, i = 151: training
19119.64, i = 151: collecting weights
19127.988, i = 151: weight = -0.017858937
19127.988, i = 152: broadcasting weights
19128.61, i = 152: setting weights on workers
19130.725, i = 152: training
19227.516, i = 152: collecting weights
19237.068, i = 152: weight = -0.016848356
19237.068, i = 153: broadcasting weights
19237.68, i = 153: setting weights on workers
19239.398, i = 153: training
19335.404, i = 153: collecting weights
19343.969, i = 153: weight = -0.01835807
19343.969, i = 154: broadcasting weights
19344.64, i = 154: setting weights on workers
19346.533, i = 154: training
19443.35, i = 154: collecting weights
19452.467, i = 154: weight = -0.019302974
19452.467, i = 155: broadcasting weights
19453.084, i = 155: setting weights on workers
19455.174, i = 155: training
19552.035, i = 155: collecting weights
19560.691, i = 155: weight = -0.019577678
19560.691, i = 156: broadcasting weights
19561.303, i = 156: setting weights on workers
19563.021, i = 156: training
19657.225, i = 156: collecting weights
19666.152, i = 156: weight = -0.013703593
19666.152, i = 157: broadcasting weights
19666.758, i = 157: setting weights on workers
19668.46, i = 157: training
19765.133, i = 157: collecting weights
19774.412, i = 157: weight = -0.020081561
19774.412, i = 158: broadcasting weights
19775.023, i = 158: setting weights on workers
19776.91, i = 158: training
19871.799, i = 158: collecting weights
19880.756, i = 158: weight = -0.018739281
19880.756, i = 159: broadcasting weights
19881.363, i = 159: setting weights on workers
19883.115, i = 159: training
19979.342, i = 159: collecting weights
19987.838, i = 159: weight = -0.024326136
19987.838, i = 160: broadcasting weights
19988.514, i = 160: setting weights on workers
19990.236, i = 160: testing
20126.197, i = 160: 10.32% accuracy
20126.197, i = 160: training
20221.09, i = 160: collecting weights
20229.72, i = 160: weight = -0.021167494
20229.72, i = 161: broadcasting weights
20230.34, i = 161: setting weights on workers
20232.08, i = 161: training
20327.725, i = 161: collecting weights
20336.95, i = 161: weight = -0.019555029
20336.95, i = 162: broadcasting weights
20337.56, i = 162: setting weights on workers
20339.664, i = 162: training
20435.67, i = 162: collecting weights
20444.3, i = 162: weight = -0.017718932
20444.3, i = 163: broadcasting weights
20444.99, i = 163: setting weights on workers
20447.297, i = 163: training
20542.906, i = 163: collecting weights
20552.494, i = 163: weight = -0.019657493
20552.494, i = 164: broadcasting weights
20553.1, i = 164: setting weights on workers
20555.18, i = 164: training
20651.736, i = 164: collecting weights
20660.49, i = 164: weight = -0.017428681
20660.49, i = 165: broadcasting weights
20661.096, i = 165: setting weights on workers
20662.988, i = 165: training
20759.559, i = 165: collecting weights
20768.201, i = 165: weight = -0.018227957
20768.201, i = 166: broadcasting weights
20769.805, i = 166: setting weights on workers
20771.74, i = 166: training
20868.68, i = 166: collecting weights
20877.266, i = 166: weight = -0.01901897
20877.266, i = 167: broadcasting weights
20877.871, i = 167: setting weights on workers
20879.73, i = 167: training
20976.154, i = 167: collecting weights
20984.84, i = 167: weight = -0.01819825
20984.84, i = 168: broadcasting weights
20985.443, i = 168: setting weights on workers
20987.352, i = 168: training
21082.4, i = 168: collecting weights
21090.96, i = 168: weight = -0.01564147
21090.96, i = 169: broadcasting weights
21091.633, i = 169: setting weights on workers
21093.51, i = 169: training
21188.844, i = 169: collecting weights
21198.324, i = 169: weight = -0.016515559
21198.324, i = 170: broadcasting weights
21198.928, i = 170: setting weights on workers
21200.64, i = 170: testing
21337.688, i = 170: 11.18% accuracy
21337.688, i = 170: training
21432.424, i = 170: collecting weights
21441.035, i = 170: weight = -0.019890409
21441.035, i = 171: broadcasting weights
21441.645, i = 171: setting weights on workers
21443.564, i = 171: training
21540.244, i = 171: collecting weights
21548.873, i = 171: weight = -0.020831356
21548.873, i = 172: broadcasting weights
21549.584, i = 172: setting weights on workers
21551.283, i = 172: training
21647.908, i = 172: collecting weights
21657.158, i = 172: weight = -0.023309525
21657.158, i = 173: broadcasting weights
21657.78, i = 173: setting weights on workers
21659.867, i = 173: training
21754.438, i = 173: collecting weights
21763.172, i = 173: weight = -0.021100517
21763.172, i = 174: broadcasting weights
21763.78, i = 174: setting weights on workers
21765.455, i = 174: training
21861.596, i = 174: collecting weights
21870.305, i = 174: weight = -0.018584628
21870.305, i = 175: broadcasting weights
21871.008, i = 175: setting weights on workers
21872.92, i = 175: training
21969.23, i = 175: collecting weights
21977.79, i = 175: weight = -0.021652566
21977.79, i = 176: broadcasting weights
21978.4, i = 176: setting weights on workers
21980.088, i = 176: training
22074.523, i = 176: collecting weights
22084.115, i = 176: weight = -0.020843402
22084.115, i = 177: broadcasting weights
22084.736, i = 177: setting weights on workers
22086.469, i = 177: training
22180.857, i = 177: collecting weights
22189.838, i = 177: weight = -0.020773388
22189.838, i = 178: broadcasting weights
22190.465, i = 178: setting weights on workers
22192.344, i = 178: training
22289.385, i = 178: collecting weights
22297.98, i = 178: weight = -0.024789896
22297.98, i = 179: broadcasting weights
22298.59, i = 179: setting weights on workers
22300.443, i = 179: training
22396.447, i = 179: collecting weights
22405.842, i = 179: weight = -0.021892298
22405.842, i = 180: broadcasting weights
22406.447, i = 180: setting weights on workers
22408.326, i = 180: testing
22544.729, i = 180: 11.16% accuracy
22544.729, i = 180: training
22639.164, i = 180: collecting weights
22647.936, i = 180: weight = -0.024101965
22647.936, i = 181: broadcasting weights
22648.559, i = 181: setting weights on workers
22650.867, i = 181: training
22747.281, i = 181: collecting weights
22756.652, i = 181: weight = -0.023421535
22756.652, i = 182: broadcasting weights
22757.26, i = 182: setting weights on workers
22759.352, i = 182: training
22855.928, i = 182: collecting weights
22864.457, i = 182: weight = -0.019755494
22864.457, i = 183: broadcasting weights
22865.777, i = 183: setting weights on workers
22867.533, i = 183: training
22964.26, i = 183: collecting weights
22972.951, i = 183: weight = -0.021026272
22972.951, i = 184: broadcasting weights
22973.559, i = 184: setting weights on workers
22975.229, i = 184: training
23071.809, i = 184: collecting weights
23080.578, i = 184: weight = -0.019825052
23080.578, i = 185: broadcasting weights
23081.184, i = 185: setting weights on workers
23083.107, i = 185: training
23178.316, i = 185: collecting weights
23186.969, i = 185: weight = -0.023135073
23186.969, i = 186: broadcasting weights
23187.668, i = 186: setting weights on workers
23189.38, i = 186: training
23286.06, i = 186: collecting weights
23295.236, i = 186: weight = -0.021025069
23295.236, i = 187: broadcasting weights
23295.84, i = 187: setting weights on workers
23298.133, i = 187: training
23393.287, i = 187: collecting weights
23401.746, i = 187: weight = -0.024848977
23401.746, i = 188: broadcasting weights
23402.352, i = 188: setting weights on workers
23404.025, i = 188: training
23498.156, i = 188: collecting weights
23506.877, i = 188: weight = -0.025640875
23506.877, i = 189: broadcasting weights
23507.6, i = 189: setting weights on workers
23509.916, i = 189: training
23604.781, i = 189: collecting weights
23614.186, i = 189: weight = -0.02897274
23614.186, i = 190: broadcasting weights
23614.795, i = 190: setting weights on workers
23616.93, i = 190: testing
23753.822, i = 190: 11.23% accuracy
23753.822, i = 190: training
23850.584, i = 190: collecting weights
23859.113, i = 190: weight = -0.026255783
23859.113, i = 191: broadcasting weights
23859.72, i = 191: setting weights on workers
23861.477, i = 191: training
23958.168, i = 191: collecting weights
23966.8, i = 191: weight = -0.029134702
23966.8, i = 192: broadcasting weights
23967.484, i = 192: setting weights on workers
23969.203, i = 192: training
24066.12, i = 192: collecting weights
24074.736, i = 192: weight = -0.027529169
24074.736, i = 193: broadcasting weights
24075.344, i = 193: setting weights on workers
24077.244, i = 193: training
24174.1, i = 193: collecting weights
24183.725, i = 193: weight = -0.028809853
24183.725, i = 194: broadcasting weights
24184.332, i = 194: setting weights on workers
24186.418, i = 194: training
24281.957, i = 194: collecting weights
24290.615, i = 194: weight = -0.033832252
24290.615, i = 195: broadcasting weights
24292.004, i = 195: setting weights on workers
24293.756, i = 195: training
24390.312, i = 195: collecting weights
24398.84, i = 195: weight = -0.031045282
24398.84, i = 196: broadcasting weights
24399.451, i = 196: setting weights on workers
24401.115, i = 196: training
24497.324, i = 196: collecting weights
24505.94, i = 196: weight = -0.030697525
24505.94, i = 197: broadcasting weights
24506.549, i = 197: setting weights on workers
24508.252, i = 197: training
24603.951, i = 197: collecting weights
24612.77, i = 197: weight = -0.03293054
24612.77, i = 198: broadcasting weights
24613.455, i = 198: setting weights on workers
24615.36, i = 198: training
24710.744, i = 198: collecting weights
24719.385, i = 198: weight = -0.032518342
24719.385, i = 199: broadcasting weights
24720.014, i = 199: setting weights on workers
24721.916, i = 199: training
24817.924, i = 199: collecting weights
24827.295, i = 199: weight = -0.033092063
24827.295, i = 200: broadcasting weights
24827.9, i = 200: setting weights on workers
24829.611, i = 200: testing
24966.229, i = 200: 10.35% accuracy
24966.229, i = 200: training
25062.766, i = 200: collecting weights
25071.371, i = 200: weight = -0.034023516
25071.371, i = 201: broadcasting weights
25072.072, i = 201: setting weights on workers
25074.057, i = 201: training
25169.877, i = 201: collecting weights
25179.23, i = 201: weight = -0.034427404
25179.23, i = 202: broadcasting weights
25179.855, i = 202: setting weights on workers
25181.559, i = 202: training
25277.217, i = 202: collecting weights
25285.889, i = 202: weight = -0.033765763
25285.889, i = 203: broadcasting weights
25286.496, i = 203: setting weights on workers
25288.197, i = 203: training
25383.191, i = 203: collecting weights
25391.904, i = 203: weight = -0.033460088
25391.904, i = 204: broadcasting weights
25392.621, i = 204: setting weights on workers
25394.73, i = 204: training
25491.389, i = 204: collecting weights
25499.982, i = 204: weight = -0.03540793
25499.982, i = 205: broadcasting weights
25500.592, i = 205: setting weights on workers
25502.371, i = 205: training
25598.004, i = 205: collecting weights
25606.707, i = 205: weight = -0.03572621
25606.707, i = 206: broadcasting weights
25607.336, i = 206: setting weights on workers
25609.455, i = 206: training
25705.68, i = 206: collecting weights
25714.98, i = 206: weight = -0.036047786
25714.98, i = 207: broadcasting weights
25715.684, i = 207: setting weights on workers
25717.424, i = 207: training
25813.951, i = 207: collecting weights
25822.6, i = 207: weight = -0.03332932
25822.6, i = 208: broadcasting weights
25823.207, i = 208: setting weights on workers
25825.488, i = 208: training
25922.154, i = 208: collecting weights
25931.508, i = 208: weight = -0.029649518
25931.51, i = 209: broadcasting weights
25932.115, i = 209: setting weights on workers
25934.225, i = 209: training
26031.455, i = 209: collecting weights
26040.16, i = 209: weight = -0.03143044
26040.16, i = 210: broadcasting weights
26040.852, i = 210: setting weights on workers
26042.576, i = 210: testing
26179.771, i = 210: 9.07% accuracy
26179.771, i = 210: training
26275.732, i = 210: collecting weights
26284.205, i = 210: weight = -0.028149448
26284.205, i = 211: broadcasting weights
26284.812, i = 211: setting weights on workers
26286.5, i = 211: training
26382.188, i = 211: collecting weights
26391.469, i = 211: weight = -0.030547045
26391.469, i = 212: broadcasting weights
26392.072, i = 212: setting weights on workers
26394.045, i = 212: training
26490.867, i = 212: collecting weights
26499.58, i = 212: weight = -0.030565228
26499.58, i = 213: broadcasting weights
26500.322, i = 213: setting weights on workers
26502.465, i = 213: training
26598.9, i = 213: collecting weights
26607.236, i = 213: weight = -0.027259825
26607.236, i = 214: broadcasting weights
26607.846, i = 214: setting weights on workers
26609.764, i = 214: training
26706.611, i = 214: collecting weights
26715.121, i = 214: weight = -0.031828858
26715.121, i = 215: broadcasting weights
26715.729, i = 215: setting weights on workers
26717.46, i = 215: training
26813.854, i = 215: collecting weights
26823.346, i = 215: weight = -0.033288643
26823.346, i = 216: broadcasting weights
26824.076, i = 216: setting weights on workers
26826.047, i = 216: training
26921.656, i = 216: collecting weights
26930.213, i = 216: weight = -0.03616178
26930.213, i = 217: broadcasting weights
26930.816, i = 217: setting weights on workers
26932.936, i = 217: training
27030.184, i = 217: collecting weights
27038.695, i = 217: weight = -0.036762744
27038.695, i = 218: broadcasting weights
27039.305, i = 218: setting weights on workers
27041.193, i = 218: training
27137.771, i = 218: collecting weights
27146.996, i = 218: weight = -0.03734451
27146.996, i = 219: broadcasting weights
27147.725, i = 219: setting weights on workers
27149.443, i = 219: training
27246.023, i = 219: collecting weights
27254.596, i = 219: weight = -0.037505895
27254.596, i = 220: broadcasting weights
27255.203, i = 220: setting weights on workers
27257.205, i = 220: testing
27393.605, i = 220: 9.57% accuracy
27393.605, i = 220: training
27490.0, i = 220: collecting weights
27498.793, i = 220: weight = -0.03374687
27498.793, i = 221: broadcasting weights
27499.398, i = 221: setting weights on workers
27501.514, i = 221: training
27596.607, i = 221: collecting weights
27606.168, i = 221: weight = -0.03795062
27606.168, i = 222: broadcasting weights
27606.936, i = 222: setting weights on workers
27609.05, i = 222: training
27705.584, i = 222: collecting weights
27714.035, i = 222: weight = -0.036329664
27714.035, i = 223: broadcasting weights
27714.64, i = 223: setting weights on workers
27716.531, i = 223: training
27812.104, i = 223: collecting weights
27820.73, i = 223: weight = -0.03471512
27820.73, i = 224: broadcasting weights
27821.336, i = 224: setting weights on workers
27823.28, i = 224: training
27919.383, i = 224: collecting weights
27928.672, i = 224: weight = -0.031874306
27928.672, i = 225: broadcasting weights
27929.4, i = 225: setting weights on workers
27931.701, i = 225: training
28028.092, i = 225: collecting weights
28036.752, i = 225: weight = -0.032988034
28036.752, i = 226: broadcasting weights
28037.363, i = 226: setting weights on workers
28039.084, i = 226: training
28135.748, i = 226: collecting weights
28144.408, i = 226: weight = -0.03552442
28144.408, i = 227: broadcasting weights
28145.012, i = 227: setting weights on workers
28146.732, i = 227: training
28242.797, i = 227: collecting weights
28252.473, i = 227: weight = -0.034765005
28252.473, i = 228: broadcasting weights
28253.08, i = 228: setting weights on workers
28254.99, i = 228: training
28349.576, i = 228: collecting weights
28358.168, i = 228: weight = -0.032990888
28358.168, i = 229: broadcasting weights
28358.846, i = 229: setting weights on workers
28360.951, i = 229: training
28454.943, i = 229: collecting weights
28463.498, i = 229: weight = -0.033771716
28463.498, i = 230: broadcasting weights
28464.104, i = 230: setting weights on workers
28466.3, i = 230: testing
28603.297, i = 230: 11.24% accuracy
28603.297, i = 230: training
28699.418, i = 230: collecting weights
28708.713, i = 230: weight = -0.03680542
28708.713, i = 231: broadcasting weights
28709.44, i = 231: setting weights on workers
28711.527, i = 231: training
28808.164, i = 231: collecting weights
28816.598, i = 231: weight = -0.032608204
28816.598, i = 232: broadcasting weights
28817.203, i = 232: setting weights on workers
28818.88, i = 232: training
28914.64, i = 232: collecting weights
28923.443, i = 232: weight = -0.030683802
28923.443, i = 233: broadcasting weights
28924.053, i = 233: setting weights on workers
28926.137, i = 233: training
29020.229, i = 233: collecting weights
29029.979, i = 233: weight = -0.03296407
29029.979, i = 234: broadcasting weights
29030.684, i = 234: setting weights on workers
29032.836, i = 234: training
29129.287, i = 234: collecting weights
29137.836, i = 234: weight = -0.028226208
29137.836, i = 235: broadcasting weights
29138.443, i = 235: setting weights on workers
29140.389, i = 235: training
29235.527, i = 235: collecting weights
29244.096, i = 235: weight = -0.03278337
29244.096, i = 236: broadcasting weights
29244.703, i = 236: setting weights on workers
29246.398, i = 236: training
29342.588, i = 236: collecting weights
29351.723, i = 236: weight = -0.03371924
29351.723, i = 237: broadcasting weights
29352.443, i = 237: setting weights on workers
29354.166, i = 237: training
29449.1, i = 237: collecting weights
29457.777, i = 237: weight = -0.035661712
29457.777, i = 238: broadcasting weights
29458.393, i = 238: setting weights on workers
29460.488, i = 238: training
29556.668, i = 238: collecting weights
29565.2, i = 238: weight = -0.036219575
29565.2, i = 239: broadcasting weights
29565.809, i = 239: setting weights on workers
29567.943, i = 239: training
29663.508, i = 239: collecting weights
29672.908, i = 239: weight = -0.04001714
29672.908, i = 240: broadcasting weights
29673.611, i = 240: setting weights on workers
29675.297, i = 240: testing
29811.988, i = 240: 11.13% accuracy
29811.988, i = 240: training
29907.99, i = 240: collecting weights
29916.537, i = 240: weight = -0.042155348
29916.54, i = 241: broadcasting weights
29917.148, i = 241: setting weights on workers
29919.088, i = 241: training
30014.184, i = 241: collecting weights
30022.893, i = 241: weight = -0.03994567
30022.893, i = 242: broadcasting weights
30023.5, i = 242: setting weights on workers
30025.627, i = 242: training
30121.191, i = 242: collecting weights
30131.129, i = 242: weight = -0.039086863
30131.129, i = 243: broadcasting weights
30131.863, i = 243: setting weights on workers
30133.576, i = 243: training
30227.904, i = 243: collecting weights
30236.691, i = 243: weight = -0.03774213
30236.691, i = 244: broadcasting weights
30237.3, i = 244: setting weights on workers
30239.0, i = 244: training
30335.627, i = 244: collecting weights
30344.201, i = 244: weight = -0.038008776
30344.201, i = 245: broadcasting weights
30344.807, i = 245: setting weights on workers
30346.768, i = 245: training
30443.557, i = 245: collecting weights
30453.027, i = 245: weight = -0.039052725
30453.027, i = 246: broadcasting weights
30453.729, i = 246: setting weights on workers
30455.664, i = 246: training
30551.207, i = 246: collecting weights
30559.754, i = 246: weight = -0.044177763
30559.754, i = 247: broadcasting weights
30560.36, i = 247: setting weights on workers
30562.248, i = 247: training
30658.828, i = 247: collecting weights
30667.6, i = 247: weight = -0.04363596
30667.6, i = 248: broadcasting weights
30668.207, i = 248: setting weights on workers
30670.928, i = 248: training
30767.023, i = 248: collecting weights
30776.908, i = 248: weight = -0.043794602
30776.908, i = 249: broadcasting weights
30777.512, i = 249: setting weights on workers
30779.22, i = 249: training
30874.637, i = 249: collecting weights
30883.432, i = 249: weight = -0.04157701
30883.432, i = 250: broadcasting weights
30884.033, i = 250: setting weights on workers
30886.096, i = 250: testing
31022.51, i = 250: 10.44% accuracy
31022.51, i = 250: training
31118.416, i = 250: collecting weights
31127.633, i = 250: weight = -0.040816337
31127.633, i = 251: broadcasting weights
31128.24, i = 251: setting weights on workers
31129.936, i = 251: training
31225.783, i = 251: collecting weights
31234.4, i = 251: weight = -0.039249092
31234.4, i = 252: broadcasting weights
31235.064, i = 252: setting weights on workers
31236.783, i = 252: training
31331.363, i = 252: collecting weights
31340.633, i = 252: weight = -0.040223036
31340.633, i = 253: broadcasting weights
31341.238, i = 253: setting weights on workers
31342.922, i = 253: training
31438.422, i = 253: collecting weights
31446.955, i = 253: weight = -0.040302474
31446.955, i = 254: broadcasting weights
31447.639, i = 254: setting weights on workers
31449.34, i = 254: training
31544.08, i = 254: collecting weights
31553.283, i = 254: weight = -0.039920293
31553.283, i = 255: broadcasting weights
31553.95, i = 255: setting weights on workers
31555.623, i = 255: training
31652.133, i = 255: collecting weights
31660.7, i = 255: weight = -0.03867361
31660.7, i = 256: broadcasting weights
31661.307, i = 256: setting weights on workers
31663.36, i = 256: training
31759.94, i = 256: collecting weights
31768.643, i = 256: weight = -0.03798943
31768.643, i = 257: broadcasting weights
31769.248, i = 257: setting weights on workers
31771.082, i = 257: training
31867.688, i = 257: collecting weights
31876.27, i = 257: weight = -0.037287947
31876.27, i = 258: broadcasting weights
31876.955, i = 258: setting weights on workers
31878.76, i = 258: training
31974.084, i = 258: collecting weights
31983.43, i = 258: weight = -0.039043225
31983.43, i = 259: broadcasting weights
31984.04, i = 259: setting weights on workers
31985.684, i = 259: training
32082.506, i = 259: collecting weights
32091.072, i = 259: weight = -0.03858953
32091.072, i = 260: broadcasting weights
32091.676, i = 260: setting weights on workers
32093.344, i = 260: testing
32230.111, i = 260: 9.92% accuracy

@rahulbhalerao001
Contributor Author

Hello Robert,

Thank you for sharing the logs. Given that we run into memory issues when training on the entire ImageNet dataset, it will be hard to get above this ~10% accuracy.

Also, regarding your earlier observation about how much caching matters, that is indeed what happened. With more data, not caching resulted in extremely slow training:
441.483, i = 0: 0.12% accuracy
4058.596, i = 10: 0.15% accuracy
7602.223, i = 20: 0.14% accuracy

While you were able to complete 60 iterations in around 7500s, without caching I completed only 20 iterations in about the same time.
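
To put a number on the slowdown, here is a minimal Python sketch that estimates the average wall-clock time per iteration from the accuracy lines of a log. It assumes the log format shown above (`<seconds>, i = <iteration>: <x>% accuracy`). The function name `seconds_per_iteration` and the regular expression are illustrative and are not part of SparkNet.

```python
import re

# Matches accuracy lines such as "441.483, i = 0: 0.12% accuracy".
# Other log lines ("training", "collecting weights", ...) are ignored.
LINE = re.compile(r"^\s*([\d.]+), i = (\d+): ([\d.]+)% accuracy")


def seconds_per_iteration(log_text):
    """Average seconds per iteration between the first and last accuracy lines."""
    points = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m:
            points.append((float(m.group(1)), int(m.group(2))))
    if len(points) < 2:
        raise ValueError("need at least two accuracy lines")
    (t0, i0), (t1, i1) = points[0], points[-1]
    if i1 == i0:
        raise ValueError("first and last accuracy lines have the same iteration")
    return (t1 - t0) / (i1 - i0)


# The three lines from the run without caching, copied from above.
uncached = """441.483, i = 0: 0.12% accuracy
4058.596, i = 10: 0.15% accuracy
7602.223, i = 20: 0.14% accuracy"""

print(round(seconds_per_iteration(uncached), 1))
```

For the run without caching this gives roughly 358 seconds per iteration. Applying the same calculation to the first and last accuracy lines of the log earlier in this thread (15296.662 at i = 120 and 32230.111 at i = 260) gives roughly 121 seconds per iteration. Both figures include the time spent in the test passes that fall inside the interval, so they are only a rough comparison.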

Thanks,
Rahul

@rahulbhalerao001
Contributor Author

Hello Robert,

I ran the CIFAR-10 example for a longer time, and I see that the accuracy is stuck at around 65%. Is this expected? syncInterval is 50, as per your previous suggestion, on a 3-slave GPU cluster.
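
One rough way to check whether the accuracy in a log like the one that follows has plateaued is to look at the mean and spread of the most recent test accuracies. The Python sketch below assumes the same `<seconds>, i = <iteration>: <x>% accuracy` line format. The helper name `trailing_stats` and the window size are illustrative choices, not anything defined by SparkNet.

```python
import re
from statistics import mean, pstdev

# Matches accuracy lines such as "54.481, i = 0: 11.47% accuracy".
LINE = re.compile(r"^\s*[\d.]+, i = (\d+): ([\d.]+)% accuracy")


def trailing_stats(log_text, window=20):
    """Return (mean, population std dev) of the last `window` accuracies."""
    accs = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m:
            accs.append(float(m.group(2)))
    if not accs:
        raise ValueError("no accuracy lines found")
    tail = accs[-window:]
    return mean(tail), pstdev(tail)


# Five accuracy lines (i = 780 to i = 800) copied from the log in this comment.
sample = """1435.318, i = 780: 65.03% accuracy
1444.16, i = 785: 65.13% accuracy
1453.146, i = 790: 64.63% accuracy
1461.79, i = 795: 65.45% accuracy
1470.501, i = 800: 65.86% accuracy"""

avg, spread = trailing_stats(sample, window=5)
print(f"mean = {avg:.2f}%, std dev = {spread:.2f}")
```

A trailing mean that stays near the same value while the spread is under a percentage point suggests that the fluctuations are noise around a plateau and not continued improvement.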

54.481, i = 0: 11.47% accuracy
64.393, i = 5: 34.45% accuracy
73.911, i = 10: 45.61% accuracy
83.579, i = 15: 51.94% accuracy
92.946, i = 20: 54.30% accuracy
102.002, i = 25: 56.22% accuracy
111.247, i = 30: 58.83% accuracy
120.4, i = 35: 59.68% accuracy
129.784, i = 40: 60.55% accuracy
138.658, i = 45: 61.93% accuracy
148.062, i = 50: 62.47% accuracy
156.246, i = 55: 62.02% accuracy
165.291, i = 60: 63.00% accuracy
173.856, i = 65: 63.23% accuracy
182.786, i = 70: 63.25% accuracy
191.66, i = 75: 63.47% accuracy
200.109, i = 80: 63.32% accuracy
208.925, i = 85: 62.80% accuracy
217.973, i = 90: 64.18% accuracy
227.15, i = 95: 64.66% accuracy
237.086, i = 100: 64.46% accuracy
246.321, i = 105: 64.44% accuracy
255.545, i = 110: 64.83% accuracy
264.332, i = 115: 64.40% accuracy
272.63, i = 120: 65.07% accuracy
281.84, i = 125: 63.82% accuracy
290.215, i = 130: 63.98% accuracy
299.598, i = 135: 65.57% accuracy
309.195, i = 140: 64.70% accuracy
318.301, i = 145: 65.80% accuracy
327.1, i = 150: 65.99% accuracy
336.231, i = 155: 64.39% accuracy
345.449, i = 160: 65.04% accuracy
354.733, i = 165: 65.41% accuracy
363.912, i = 170: 65.57% accuracy
373.48, i = 175: 64.78% accuracy
382.506, i = 180: 63.96% accuracy
391.606, i = 185: 63.41% accuracy
400.587, i = 190: 65.27% accuracy
410.189, i = 195: 65.24% accuracy
419.535, i = 200: 65.63% accuracy
428.082, i = 205: 65.47% accuracy
437.28, i = 210: 65.38% accuracy
446.891, i = 215: 65.34% accuracy
456.192, i = 220: 64.52% accuracy
465.607, i = 225: 64.64% accuracy
474.4, i = 230: 65.64% accuracy
482.806, i = 235: 65.15% accuracy
491.818, i = 240: 65.70% accuracy
501.172, i = 245: 64.81% accuracy
509.784, i = 250: 65.79% accuracy
518.101, i = 255: 66.03% accuracy
527.304, i = 260: 65.69% accuracy
536.021, i = 265: 65.73% accuracy
544.725, i = 270: 65.60% accuracy
554.434, i = 275: 65.32% accuracy
562.718, i = 280: 65.42% accuracy
571.668, i = 285: 65.61% accuracy
580.118, i = 290: 64.58% accuracy
589.157, i = 295: 63.53% accuracy
598.547, i = 300: 65.67% accuracy
607.188, i = 305: 65.45% accuracy
615.803, i = 310: 64.81% accuracy
624.179, i = 315: 64.89% accuracy
632.848, i = 320: 64.14% accuracy
641.916, i = 325: 64.03% accuracy
650.957, i = 330: 65.26% accuracy
659.136, i = 335: 65.24% accuracy
668.471, i = 340: 65.01% accuracy
677.009, i = 345: 64.53% accuracy
685.507, i = 350: 64.47% accuracy
694.447, i = 355: 62.68% accuracy
703.884, i = 360: 65.58% accuracy
712.784, i = 365: 64.48% accuracy
721.734, i = 370: 65.60% accuracy
730.342, i = 375: 65.27% accuracy
737.756, i = 380: 64.92% accuracy
747.206, i = 385: 63.86% accuracy
756.395, i = 390: 65.83% accuracy
765.059, i = 395: 64.59% accuracy
774.566, i = 400: 65.16% accuracy
783.197, i = 405: 64.52% accuracy
792.002, i = 410: 65.55% accuracy
800.979, i = 415: 65.30% accuracy
810.061, i = 420: 63.77% accuracy
818.978, i = 425: 64.49% accuracy
828.629, i = 430: 61.71% accuracy
837.831, i = 435: 65.42% accuracy
846.781, i = 440: 64.88% accuracy
855.511, i = 445: 64.65% accuracy
864.108, i = 450: 64.64% accuracy
873.522, i = 455: 65.34% accuracy
881.768, i = 460: 65.69% accuracy
890.736, i = 465: 63.59% accuracy
899.886, i = 470: 64.76% accuracy
908.615, i = 475: 64.76% accuracy
916.393, i = 480: 64.47% accuracy
925.322, i = 485: 65.30% accuracy
933.535, i = 490: 64.57% accuracy
942.399, i = 495: 64.40% accuracy
951.027, i = 500: 64.63% accuracy
959.545, i = 505: 66.09% accuracy
968.559, i = 510: 64.15% accuracy
977.315, i = 515: 65.60% accuracy
986.079, i = 520: 64.26% accuracy
994.85, i = 525: 65.45% accuracy
1003.591, i = 530: 65.41% accuracy
1012.597, i = 535: 65.43% accuracy
1021.315, i = 540: 65.72% accuracy
1030.048, i = 545: 65.70% accuracy
1038.249, i = 550: 65.61% accuracy
1046.566, i = 555: 64.19% accuracy
1055.629, i = 560: 64.08% accuracy
1064.447, i = 565: 65.35% accuracy
1072.934, i = 570: 64.21% accuracy
1081.518, i = 575: 64.42% accuracy
1089.692, i = 580: 65.37% accuracy
1097.897, i = 585: 65.22% accuracy
1106.968, i = 590: 64.32% accuracy
1116.06, i = 595: 65.40% accuracy
1125.093, i = 600: 64.87% accuracy
1133.831, i = 605: 64.46% accuracy
1142.713, i = 610: 64.88% accuracy
1151.275, i = 615: 65.19% accuracy
1160.018, i = 620: 65.24% accuracy
1169.043, i = 625: 66.09% accuracy
1178.193, i = 630: 64.54% accuracy
1187.333, i = 635: 64.86% accuracy
1196.057, i = 640: 65.42% accuracy
1205.03, i = 645: 65.06% accuracy
1213.856, i = 650: 66.14% accuracy
1222.021, i = 655: 65.76% accuracy
1230.9, i = 660: 65.17% accuracy
1239.384, i = 665: 65.78% accuracy
1248.562, i = 670: 65.05% accuracy
1256.989, i = 675: 65.66% accuracy
1265.969, i = 680: 64.95% accuracy
1274.597, i = 685: 65.04% accuracy
1283.156, i = 690: 66.20% accuracy
1291.256, i = 695: 65.91% accuracy
1300.118, i = 700: 64.32% accuracy
1308.55, i = 705: 65.62% accuracy
1317.195, i = 710: 64.84% accuracy
1325.625, i = 715: 65.62% accuracy
1333.702, i = 720: 65.25% accuracy
1342.528, i = 725: 64.15% accuracy
1351.105, i = 730: 64.46% accuracy
1359.563, i = 735: 65.57% accuracy
1367.939, i = 740: 64.85% accuracy
1375.643, i = 745: 65.46% accuracy
1383.907, i = 750: 64.92% accuracy
1392.414, i = 755: 65.43% accuracy
1400.969, i = 760: 66.23% accuracy
1409.69, i = 765: 65.30% accuracy
1418.294, i = 770: 65.61% accuracy
1426.445, i = 775: 65.72% accuracy
1435.318, i = 780: 65.03% accuracy
1444.16, i = 785: 65.13% accuracy
1453.146, i = 790: 64.63% accuracy
1461.79, i = 795: 65.45% accuracy
1470.501, i = 800: 65.86% accuracy
1479.29, i = 805: 64.35% accuracy
1487.954, i = 810: 65.16% accuracy
1496.955, i = 815: 65.82% accuracy
1504.908, i = 820: 65.80% accuracy
1512.725, i = 825: 66.09% accuracy
1521.558, i = 830: 65.78% accuracy
1530.101, i = 835: 65.54% accuracy
1539.108, i = 840: 65.42% accuracy
1547.872, i = 845: 66.15% accuracy
1556.15, i = 850: 66.40% accuracy
1563.828, i = 855: 65.94% accuracy
1572.304, i = 860: 65.70% accuracy
1580.587, i = 865: 65.86% accuracy
1589.47, i = 870: 65.17% accuracy
1597.988, i = 875: 65.99% accuracy
1606.729, i = 880: 65.86% accuracy
1615.596, i = 885: 65.76% accuracy
1624.33, i = 890: 65.66% accuracy
1633.389, i = 895: 65.34% accuracy
1641.133, i = 900: 65.82% accuracy
1649.762, i = 905: 65.72% accuracy
1658.663, i = 910: 64.39% accuracy
1666.81, i = 915: 66.02% accuracy
1675.528, i = 920: 66.00% accuracy
1684.604, i = 925: 66.58% accuracy
1693.037, i = 930: 66.24% accuracy
1701.473, i = 935: 65.94% accuracy
1710.364, i = 940: 66.03% accuracy
1719.51, i = 945: 65.84% accuracy
1727.769, i = 950: 65.68% accuracy
1736.771, i = 955: 65.48% accuracy
1746.02, i = 960: 65.63% accuracy
1755.224, i = 965: 65.77% accuracy
1763.91, i = 970: 66.13% accuracy
1772.34, i = 975: 64.30% accuracy
1781.138, i = 980: 65.62% accuracy
1790.013, i = 985: 65.72% accuracy
1798.838, i = 990: 65.84% accuracy
1807.408, i = 995: 65.69% accuracy
1816.45, i = 1000: 65.67% accuracy
1824.97, i = 1005: 65.89% accuracy
1832.784, i = 1010: 66.42% accuracy
1841.939, i = 1015: 66.07% accuracy
1850.52, i = 1020: 65.92% accuracy
1858.71, i = 1025: 66.19% accuracy
1867.86, i = 1030: 65.90% accuracy
1876.594, i = 1035: 65.90% accuracy
1885.059, i = 1040: 66.35% accuracy
1893.887, i = 1045: 65.81% accuracy
1902.504, i = 1050: 65.95% accuracy
1910.885, i = 1055: 66.21% accuracy
1920.078, i = 1060: 65.62% accuracy
1928.88, i = 1065: 66.33% accuracy
1937.339, i = 1070: 66.70% accuracy
1946.042, i = 1075: 66.03% accuracy
1954.822, i = 1080: 66.52% accuracy
1963.186, i = 1085: 66.23% accuracy
1972.109, i = 1090: 65.03% accuracy
1981.142, i = 1095: 66.08% accuracy
1989.84, i = 1100: 65.02% accuracy
1997.647, i = 1105: 66.28% accuracy

@pcmoritz
Collaborator

Hey Rahul,

We are currently not subtracting the mean image (doing so should get you to performance similar to the cifar_quick model described here: http://caffe.berkeleyvision.org/gathered/examples/cifar10.html), and we are also not flipping the images during training, which should give a further boost. See the ImageNet example for how to subtract the mean; for flipping, I'd recommend either augmenting the training set or implementing your own preprocessor. Let us know if you need any help!

-- Philipp.
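
For readers following along, the two preprocessing steps mentioned above (mean-image subtraction and random horizontal flipping) can be sketched roughly as follows. This is a minimal, library-free illustration in plain Python, not SparkNet's preprocessor API: the function names `mean_image`, `subtract_mean`, `flip_horizontal`, and `augment` are hypothetical, and images are assumed to be nested lists in height x width x channel order.

```python
import random


def mean_image(images):
    """Per-pixel mean over equally shaped images (H x W x C nested lists)."""
    n = len(images)
    h, w, c = len(images[0]), len(images[0][0]), len(images[0][0][0])
    mean = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for img in images:
        for y in range(h):
            for x in range(w):
                for k in range(c):
                    mean[y][x][k] += img[y][x][k] / n
    return mean


def subtract_mean(img, mean):
    """Return a new image with the mean image subtracted pixel by pixel."""
    return [
        [[p - m for p, m in zip(px, mpx)] for px, mpx in zip(row, mrow)]
        for row, mrow in zip(img, mean)
    ]


def flip_horizontal(img):
    """Mirror the image left to right by reversing each row of pixels."""
    return [row[::-1] for row in img]


def augment(img, mean, rng=random):
    """Training-time preprocessing: center the image, then flip it with probability 0.5."""
    centered = subtract_mean(img, mean)
    return flip_horizontal(centered) if rng.random() < 0.5 else centered
```

In this sketch the mean image would be computed once over the training set and reused for both training and test images, while the random flip would be applied only during training.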
