Image file names in classify_many JSON results #1373

Open
dcmartin opened this issue Jan 5, 2017 · 4 comments

dcmartin commented Jan 5, 2017

The classify_many API call returns JSON in which the image identifier is used as the attribute name, e.g.

{
  "classifications": {
    "20160428105354-780-01.jpg": [
      [
        "foo",
        37.6
      ],
      [
        "bar",
        22.94
      ],

The specific code which produces this result is lines 585-587 in DIGITS/views.py:

if request_wants_json():
    joined = dict(zip(paths, classifications))
    return flask.jsonify({'classifications': joined}), status_code

While this technique may be expeditious, processing the output is a challenge (e.g. try with 30K results). Could we introduce an "id" attribute in the JSON and set its value to the image identifier?
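A minimal sketch of what the requested shape could look like, assuming `paths` and `classifications` are parallel lists as in the snippet above. The helper name `to_records` and the `predictions` key are hypothetical and not part of DIGITS; only the "id" attribute comes from the request itself.

```python
def to_records(paths, classifications):
    """Pair each image path with its predictions as a list of records.

    Hypothetical helper: "predictions" is an illustrative key name.
    """
    return [
        {"id": path, "predictions": preds}
        for path, preds in zip(paths, classifications)
    ]


paths = ["20160428105354-780-01.jpg"]
classifications = [[["foo", 37.6], ["bar", 22.94]]]

records = to_records(paths, classifications)

# Select by attribute value instead of by key name.
match = [r for r in records if r["id"] == "20160428105354-780-01.jpg"]
top_label, top_score = match[0]["predictions"][0]
```

With this shape a consumer filters on the value of "id" rather than needing to know each file name as a key in advance.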

lukeyeager (Member) commented

processing the output is a challenge (e.g. try with 30K results)

Can you explain more? Are you just saying that you have long path names?

dcmartin commented Jan 5, 2017

Perhaps I am uneducated, but I use the jq command-line program to extract the data from the scores returned. To address the values in the example, I would need to request '.classifications."20160428105354-780-01.jpg".[0]' to retrieve ["foo",37.6]. Typically this type of retrieval would be performed by selecting on an attribute of the classification, e.g. '.classifications|select(.id=="20160428105354-780-01.jpg")[0]'

I attach the scores emitted and the results I generated; the script I used is at: ~dcmartin/age-at-home/bin/dodigit

band-results-json.txt
band-scores-json.txt

@dcmartin dcmartin closed this as completed Jan 5, 2017
@dcmartin dcmartin reopened this Jan 5, 2017

dcmartin commented Jan 8, 2017

I changed my batch script for calling classify_many to process the scores from each call as partial results. I limited my batch size to 100 images, but have had success with larger sets. I am trying to identify the "most interesting" images for end-user curation, e.g. images with low scores across all classes and/or a low standard deviation, which indicate either multiple entities or a lack of clarity in the trained model. I am also seeing a response indicating a zip failure, which I think is due to an image in the file listing being unavailable:

Sat Jan 7 20:21:04 PST 2017 ./dodigit 42020 -- ITERATING over NO_TAGS.801.900.42020.txt and NO_TAGS.801.900.42020.json
Sat Jan 7 20:21:08 PST 2017 ./dodigit 42020 -- processing 901 to 1000
Sat Jan 7 20:21:52 PST 2017 ./dodigit 42020 -- ERROR: "zip argument #2 must support iteration"
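
The "most interesting" selection described in this comment could be sketched as follows, assuming the scores have the same shape as the classify_many output shown earlier in the thread (image id mapped to a list of [label, score] pairs). The function name and both thresholds are hypothetical values chosen for illustration, not values from the author's script.

```python
import statistics


def interesting_images(classifications, max_top_score=50.0, max_stdev=10.0):
    """Return image ids whose scores are low across all classes or nearly flat.

    `classifications` maps image id -> list of [label, score] pairs.
    Both thresholds are assumed values and would need tuning.
    """
    flagged = []
    for image_id, preds in classifications.items():
        scores = [score for _label, score in preds]
        if not scores:
            continue  # nothing to judge for this image
        low_everywhere = max(scores) < max_top_score
        flat = statistics.pstdev(scores) < max_stdev
        if low_everywhere or flat:
            flagged.append(image_id)
    return sorted(flagged)


example = {
    "a.jpg": [["foo", 95.0], ["bar", 5.0]],    # confident: not flagged
    "b.jpg": [["foo", 37.6], ["bar", 22.94]],  # low top score: flagged
    "c.jpg": [["foo", 52.0], ["bar", 48.0]],   # flat distribution: flagged
}
flagged = interesting_images(example)
```

Because each batch is processed as a partial result, a function like this can run per batch and the flagged ids can be concatenated afterwards.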

erexhepa commented

Hi,

I also see the following error, but the listing is correct (meaning all images are there):

ERROR: "zip argument #2 must support iteration"
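
The message names argument #2 of zip, which in the views.py snippet quoted earlier in this thread is `classifications`. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that `classifications` is None (or otherwise not iterable) for the failing batch. A defensive sketch of the join, using a hypothetical helper `join_results`:

```python
def join_results(paths, classifications):
    """Build the path -> predictions mapping, tolerating a missing result.

    Hypothetical helper: returns an empty dict when `classifications` is
    None instead of letting zip() raise a TypeError.
    """
    if classifications is None:
        return {}
    return dict(zip(paths, classifications))


paths = ["a.jpg", "b.jpg"]

# Reproduce the failure mode: a non-iterable second argument to zip().
try:
    dict(zip(paths, None))
    raised = False
except TypeError:
    raised = True

safe = join_results(paths, None)
ok = join_results(paths, [[["foo", 37.6]], [["bar", 22.94]]])
```

Returning an empty mapping is only one option; reporting an explicit error for the batch may be preferable so that a failed inference is not mistaken for an empty result.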
