
Output the testing results of many images as text file #70

Open
tmquan opened this issue Apr 15, 2015 · 10 comments

Comments

@tmquan

tmquan commented Apr 15, 2015

Thanks to DIGITS, the new "image_classification_model_classify_many" feature works well.

I just wonder: is there any way to retrieve that result as a text file from this part, so the output can be analyzed systematically with pandas? It would be nice if DIGITS could save a dataframe of the testing images somewhere in its job's directory.

classifications = []
for image_index, index_list in enumerate(indices):
    result = []
    for i in index_list:
        # `i` is a category in labels and also an index into scores
        result.append((labels[i], round(100.0 * scores[image_index, i], 2)))
    classifications.append(result)
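For reference, a minimal sketch of what saving those results could look like, assuming `paths` holds one image path per entry in `classifications` (the function name and CSV layout here are my own, not part of DIGITS):

```python
import csv

def save_classifications(classifications, paths, output_path):
    # Flatten the per-image (label, confidence) pairs into one CSV row
    # per prediction, so the file loads cleanly with pandas.read_csv().
    # `paths` is a hypothetical list of image paths, one per image.
    with open(output_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['image', 'rank', 'label', 'confidence'])
        for path, result in zip(paths, classifications):
            for rank, (label, confidence) in enumerate(result, start=1):
                writer.writerow([path, rank, label, confidence])
```

The resulting file can then be read back with `pandas.read_csv(output_path)` for analysis.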
@VrUnRealEngine4

Did you have a look at my code in #61? That is how I directly store the results to a CSV file....

@tmquan
Author

tmquan commented Apr 15, 2015

Thanks, @michael-george-hart.

I used the same strategy to output the results to a text file or CSV, but it is really messy and I don't want to alter the beautiful structure of DIGITS.

I just wonder whether this feature could make it into the next version, because it would be very convenient for analyzing the test dataset.

On the other hand, since the list of test data is often quite large (mine has around 1 million images), it would be nice to have a status display directly in the web browser, alongside the terminal info log.

Regards,

@VrUnRealEngine4

Totally agree .... @lukeyeager has done a really good job on DIGITS in terms of clear thought and design/implementation .... it is really not that hard to add the additional features you desire if you look at #61

@lukeyeager
Member

@tmquan, so you'd like to be able to download the "Classify Many" results as a .csv? That sounds pretty doable. As I mentioned here, you should be able to copy and paste the text into Excel and get your data fairly easily. But I agree that this would be a nice enhancement.

@michael-george-hart, you sound eager to help with this. If you could embed the data on the results page in the CSV format and provide a "Download Results" button that would be awesome. That would be much more user friendly than Tran's suggestion:

save the dataframe of testing images somewhere in its job's directory
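A rough sketch of what such a download endpoint could look like (the route path, helper name, and fixed sample data below are assumptions for illustration; in DIGITS the results would come from the actual job, not a hard-coded list):

```python
import csv
import io
from flask import Flask, Response

app = Flask(__name__)

def classifications_to_csv(classifications, paths):
    # Serialize the per-image (label, confidence) results
    # into an in-memory CSV string.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(['image', 'label', 'confidence'])
    for path, result in zip(paths, classifications):
        for label, confidence in result:
            writer.writerow([path, label, confidence])
    return buf.getvalue()

@app.route('/models/images/classification/classify_many/download')
def download_results():
    # Hypothetical sample data; a real implementation would pull
    # the results from the job that ran "Classify Many".
    classifications = [[('cat', 90.0), ('dog', 10.0)]]
    csv_text = classifications_to_csv(classifications, ['img0.jpg'])
    return Response(csv_text, mimetype='text/csv',
                    headers={'Content-Disposition':
                             'attachment; filename=classify_many.csv'})
```

The `Content-Disposition: attachment` header is what makes the browser show a download dialog rather than rendering the CSV inline.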

@lukeyeager
Member

On the other hand, since the list of test data is often quite large (mine has around 1 million images), it would be nice to have a status display directly in the web browser, alongside the terminal info log.

Are you talking about returning intermediate results? Or some kind of progress bar? @Sravan2j, this sounds like the kind of thing you were talking about. I agree, that would be nice.

@VrUnRealEngine4

I will take a shot at providing the requested feature.... However, I will be very busy until this Saturday. I will get a copy of the latest master then and see what I can do over the weekend.

Since I have been doing a great deal of batch runs of 10K to 20K images, there is one thing that would be particularly useful to me and perhaps others: I often need to know what the best epoch was during training. I checked the logs to see if I could discover the best-performing epoch and didn't see anything. Most people would want to use the best epoch on their test datasets, and eyeballing the graph to find the best epoch after a few hundred epochs can be difficult. So I think it would be useful to have, somewhere on the page or as part of the epoch list box, not only each epoch but also its performance.

@lukeyeager
Member

have some place on the page or apart of the epoch list box showing not only the epoch but the performance of the particular epoch

Great idea, why don't you open a new issue and suggest a new feature there. Long threads get confusing.

@tmquan
Author

tmquan commented Apr 16, 2015

@lukeyeager: The reason I'd like to see this enhancement is that when I pulled the newest version from the master branch, it was fine to classify a million images and return the results to the terminal. The problem, however, was on the client browser side: it could not display a million lines of results, and the browser was automatically killed.

Processed 1443682/1443682 images
2015-04-16 03:41:08 [ERROR] Exception on /models/images/classification/classify_many [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1478, in full_dispatch_request
    response = self.make_response(rv)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1566, in make_response
    raise ValueError('View function did not return a response')
ValueError: View function did not return a response

@lukeyeager
Member

Oh wow, so it was a request timeout? Yuck.

I think the best way to solve this would be with the intermediate results / progress bar feature I mentioned earlier. Then the page can return quickly with a "0/1,300,000 images processed" page, and then slowly return the data as it comes through. That's a lot of work, and won't get implemented quickly. For now I'd suggest breaking up your huge textfile into manageable chunks.
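Splitting the input list can be scripted; a minimal sketch (the function name and default chunk size are my own choices, not part of DIGITS):

```python
import os

def split_image_list(list_path, out_dir, chunk_size=10000):
    # Split a "Classify Many" image list into smaller files that the
    # web UI can handle; returns the paths of the chunk files written.
    os.makedirs(out_dir, exist_ok=True)
    with open(list_path) as f:
        lines = [line for line in f if line.strip()]
    chunk_paths = []
    for i in range(0, len(lines), chunk_size):
        chunk_path = os.path.join(out_dir,
                                  'chunk_%04d.txt' % (i // chunk_size))
        with open(chunk_path, 'w') as out:
            out.writelines(lines[i:i + chunk_size])
        chunk_paths.append(chunk_path)
    return chunk_paths
```

Each chunk file can then be submitted to "Classify Many" separately and the per-chunk results concatenated afterwards.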

@lukeyeager lukeyeager added the bug label Apr 20, 2015
@joyofdata
Contributor

Or the CSV is stored in a public directory of the server and a link to it is provided. Its presence could be checked by means of a recurring AJAX call. Or, even simpler, the link is simply dead until the CSV is created; if new predictions are appended as they arrive, the link would also allow downloading intermediate results. At the end of the day, those solutions would be simple to implement and would already provide a convenient workflow.
