Image file names in classify_many JSON results #1373

dcmartin · 2017-01-05T17:38:12Z

The classify_many API call returns JSON in which the image identifier is used as the attribute name, e.g.

{
  "classifications": {
    "20160428105354-780-01.jpg": [
      [
        "foo",
        37.6
      ],
      [
        "bar",
        22.94
      ],

The code that produces this result is at lines 585-587 in DIGITS/views.py:

if request_wants_json():
    joined = dict(zip(paths, classifications))
    return flask.jsonify({'classifications': joined}), status_code

While this technique may be expeditious, processing the output is a challenge (e.g. try with 30K results). Could we introduce an "id" attribute to the JSON and set its value to the image identifier?
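To make the request concrete, here is a minimal sketch of one possible shape, in which each entry becomes an object carrying its own "id". The helper name `to_records` and the "scores" key are hypothetical and not part of DIGITS; only `paths` and `classifications` come from the snippet above.

```python
def to_records(paths, classifications):
    """Pair each image path with its classification list as one object per image."""
    # "id" and "scores" are hypothetical attribute names, not DIGITS API.
    return [
        {"id": path, "scores": scores}
        for path, scores in zip(paths, classifications)
    ]


# Sample values taken from the example response above.
paths = ["20160428105354-780-01.jpg"]
classifications = [[["foo", 37.6], ["bar", 22.94]]]
records = to_records(paths, classifications)
```

With a list like this, a consumer can filter on the "id" attribute instead of addressing each file name as a key.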

lukeyeager · 2017-01-05T18:20:52Z

processing the output is a challenge (e.g. try with 30K results)

Can you explain more? Are you just saying that you have long path names?

dcmartin · 2017-01-05T21:22:05Z

Perhaps I am uneducated, but I use the jq command-line program to extract the data from the scores returned. To address the values in the example, I would need to request '.classifications."20160428105354-780-01.jpg"[0]' to retrieve ["foo",37.6]. Typically this type of retrieval would be performed by selecting on an attribute of the classification, e.g. '.classifications|select(.id=="20160428105354-780-01.jpg")[0]'
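For comparison, the current keyed format can also be processed outside jq. This is a minimal Python sketch, assuming a response shaped like the example in the first comment. The sample payload is a shortened, made-up completion of that truncated example, and `first_predictions` is a hypothetical helper.

```python
import json

# Hypothetical sample payload, modelled on the truncated example in this issue.
response_text = """
{"classifications": {"20160428105354-780-01.jpg": [["foo", 37.6], ["bar", 22.94]]}}
"""


def first_predictions(text):
    """Map each image identifier to the first [label, score] pair listed for it."""
    data = json.loads(text)
    return {
        image_id: pairs[0]
        for image_id, pairs in data["classifications"].items()
        if pairs  # skip images with an empty classification list
    }


first = first_predictions(response_text)
```

Here `first["20160428105354-780-01.jpg"]` is the same pair that the first jq expression retrieves.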

I attach the scores emitted and the results I generated; the script I used is at: ~dcmartin/age-at-home/bin/dodigit

band-results-json.txt
band-scores-json.txt

dcmartin · 2017-01-08T18:35:11Z

I changed my batch script for calling classify_many to process the scores from each call as partial results. I limited my batch size to 100 images, but have had success with larger sets. I am pursuing identifying the "most interesting" images for end-user curation, e.g. images with low scores across all classes and/or a low std. dev., which indicate either multiple entities or a lack of clarity in the trained model and therefore require end-user curation. I am also seeing a response indicating a zip failure, which I think is due to an image in the file listing being unavailable:

Sat Jan 7 20:21:04 PST 2017 ./dodigit 42020 -- ITERATING over NO_TAGS.801.900.42020.txt and NO_TAGS.801.900.42020.json
Sat Jan 7 20:21:08 PST 2017 ./dodigit 42020 -- processing 901 to 1000
Sat Jan 7 20:21:52 PST 2017 ./dodigit 42020 -- ERROR: "zip argument #2 must support iteration"
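The "most interesting" selection described above could be sketched as follows. This is an illustration only: the thresholds and the name `interesting_images` are hypothetical, and scores are assumed to be percentages as in the example response.

```python
from statistics import pstdev


def interesting_images(classifications, max_score=50.0, max_spread=10.0):
    """Return image ids whose best score is low or whose scores are nearly flat."""
    # Both thresholds are hypothetical and would need tuning per model.
    selected = []
    for image_id, pairs in classifications.items():
        scores = [score for _label, score in pairs]
        if not scores:
            continue  # nothing to judge for this image
        if max(scores) < max_score or pstdev(scores) < max_spread:
            selected.append(image_id)
    return sorted(selected)


# Made-up sample: a.jpg is confidently classified, b.jpg is not.
sample = {
    "a.jpg": [["foo", 95.0], ["bar", 5.0]],
    "b.jpg": [["foo", 37.6], ["bar", 22.94]],
}
candidates = interesting_images(sample)
```

In this sample only b.jpg is selected, because its best score falls below the assumed 50.0 cutoff.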

erexhepa · 2017-01-12T07:54:38Z

Hi,

I also see the following error, but the listing is correct (meaning all images are there):

ERROR: "zip argument #2 must support iteration"
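The message is consistent with Python's zip() being handed a second argument that is not iterable, for example None. One possible cause, not confirmed in this thread, is that `classifications` is None when the snippet quoted in the first comment runs. A minimal sketch of a guard under that assumption follows; `join_results` is a hypothetical helper, not DIGITS code.

```python
def join_results(paths, classifications):
    """Build the path -> classifications mapping, tolerating a missing result."""
    if classifications is None:
        # Assumption: inference produced no results, so return an empty mapping
        # rather than letting zip() raise a TypeError.
        return {}
    return dict(zip(paths, classifications))


# Reproduce the failure mode: zip() rejects a non-iterable second argument.
try:
    dict(zip(["a.jpg"], None))
    raised = False
except TypeError:
    raised = True

safe = join_results(["a.jpg"], None)
```

Whether an empty mapping or an explicit error response is the better behaviour here is a design choice left open.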

dcmartin closed this as completed Jan 5, 2017

dcmartin reopened this Jan 5, 2017
