Failed to load json using detect.py #2086

boesiii · 2019-04-04T15:09:08Z

In which file did you encounter the issue?

python-docs-samples/vision/cloud-client/detect/detect.py

Did you change the file? If so, how?

No.

Describe the issue

I tried running detect.py on a PDF that is stored in Google Cloud Storage. This is the command I used:
C:\temp1\google_vision>python detect.py ocr-uri gs://my_bucket_name/file_1003.pdf gs://my_bucket_name/output/

When I run my code I get the following error:

C:\temp1\google_vision>python detect.py ocr-uri gs://matr/file_1003.pdf gs://matr/output
Waiting for the operation to finish.
Output files:
output/
output/clsoutput-1-to-2.json
output/output-1-to-2.json
outputoutput-1-to-2.json
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\google\protobuf\json_format.py", line 416, in Parse
    js = json.loads(text, object_pairs_hook=_DuplicateChecker)
  File "C:\Program Files (x86)\Python37-32\lib\json\__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "C:\Program Files (x86)\Python37-32\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files (x86)\Python37-32\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "detect.py", line 955, in <module>
    run_uri(args)
  File "detect.py", line 835, in run_uri
    async_detect_document(args.uri, args.destination_uri)
  File "detect.py", line 720, in async_detect_document
    json_string, vision.types.AnnotateFileResponse())
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\google\protobuf\json_format.py", line 418, in Parse
    raise ParseError('Failed to load JSON: {0}.'.format(str(e)))
google.protobuf.json_format.ParseError: Failed to load JSON: Expecting value: line 1 column 1 (char 0).

How can I avoid this error? There is a resulting JSON file in the output folder.


boesiii · 2019-04-09T02:59:33Z

It looks like the error is more about how it parses the JSON output file.
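For reference, the message "Expecting value: line 1 column 1 (char 0)" is what Python's json module reports when the text has no JSON value at its very first character, for example when it is handed an empty string. The sketch below only illustrates that behavior with the standard library; the helper name describe_parse_failure is hypothetical and is not part of detect.py.

```python
import json


def describe_parse_failure(text):
    """Return the JSON decode error message for text, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


# An empty string fails at the very first character.
print(describe_parse_failure(""))
# A well-formed document parses, so no message is returned.
print(describe_parse_failure('{"responses": []}'))
```

So a valid JSON file in the output folder does not rule out the parser being given something else, such as empty content from another object.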

nnegrey · 2019-04-17T22:03:37Z

Hi, from your call C:\temp1\google_vision>python detect.py ocr-uri gs://matr/file_1003.pdf gs://matr/output

It looks like you might be missing the trailing / on the gcs_destination_uri.

Should be: C:\temp1\google_vision>python detect.py ocr-uri gs://matr/file_1003.pdf gs://matr/output/

Let me know if that works.

boesiii · 2019-04-18T12:19:33Z

No, still the same error.

nnegrey · 2019-04-18T17:36:28Z

Is your target GCS bucket empty?

boesiii · 2019-04-18T18:02:59Z

I created a new folder in my bucket and targeted that folder and still received the error.

nnegrey · 2019-04-18T21:52:39Z

Does it still throw an error if you use our example pdf?
gs://python-docs-samples-tests/HodgeConj.pdf

boesiii · 2019-04-19T12:10:09Z

Yes, still same error.

nnegrey · 2019-04-22T15:31:21Z

For the example pdf (gs://python-docs-samples-tests/HodgeConj.pdf), can you share a little bit of the contents of the output file?

boesiii · 2019-04-22T15:43:04Z

Here are the first 75 lines:

{
	"inputConfig": {
		"gcsSource": {
			"uri": "gs://python-docs-samples-tests/HodgeConj.pdf"
		},
		"mimeType": "application/pdf"
	},
	"responses": [{
			"fullTextAnnotation": {
				"pages": [{
						"property": {
							"detectedLanguages": [{
									"languageCode": "en",
									"confidence": 0.97
								}, {
									"languageCode": "az",
									"confidence": 0.02
								}
							]
						},
						"width": 595,
						"height": 842,
						"blocks": [{
								"boundingBox": {
									"normalizedVertices": [{
											"x": 0.09243698,
											"y": 0.059382424
										}, {
											"x": 0.5243698,
											"y": 0.066508316
										}, {
											"x": 0.5243698,
											"y": 0.07482185
										}, {
											"x": 0.09243698,
											"y": 0.06769596
										}
									]
								},
								"paragraphs": [{
										"boundingBox": {
											"normalizedVertices": [{
													"x": 0.09243698,
													"y": 0.059382424
												}, {
													"x": 0.5243698,
													"y": 0.066508316
												}, {
													"x": 0.5243698,
													"y": 0.07482185
												}, {
													"x": 0.09243698,
													"y": 0.06769596
												}
											]
										},
										"words": [{
												"property": {
													"detectedLanguages": [{
															"languageCode": "en"
														}
													]
												},
												"boundingBox": {
													"normalizedVertices": [{
															"x": 0.09243698,
															"y": 0.059382424
														}, {
															"x": 0.13781513,
															"y": 0.060570072
														}, {
															"x": 0.13781513,
															"y": 0.06888361
														}, {

boesiii · 2019-04-22T15:44:35Z

Here are all three output files:
test2_output-1-to-2.zip
test2_output-3-to-4.zip
test2_output-5-to-5.zip

nnegrey · 2019-04-23T22:45:39Z

Alright, cool. It looks like the Vision API call is successful, but when retrieving the results from GCS there seems to be an issue.

Are you on the latest version of the storage API?
Can you run pip freeze | grep google and share the output?

boesiii · 2019-04-24T12:34:51Z

pip freeze | findstr google
google-api-core==1.8.2
google-auth==1.6.3
google-cloud-bigquery==1.10.0
google-cloud-core==0.29.1
google-cloud-storage==1.14.0
google-cloud-vision==0.36.0
google-resumable-media==0.3.2
googleapis-common-protos==1.5.9

boesiii · 2019-04-24T12:40:53Z

I updated google-cloud-storage to 1.15.0, but I still get the same error.

benbluhm · 2019-04-26T03:10:29Z

I had this issue and determined it was caused by the prefix itself being iterated as part of the blob list. I can see that "output/" is listed as a file in your output; parsing is then attempted on it, which causes the error.

Try hardcoding a prefix, something like prefix = 'output/out', and that folder won't be included in the list.

The demo code should probably be modified to handle this simple case a little better.
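A minimal sketch of one way to handle that case, as an alternative to hardcoding the prefix: filter the listing before parsing anything. This is not the actual change made to detect.py. The helper name json_output_blobs and the FakeBlob stand-in are hypothetical; the only assumption about real blobs is that each one exposes a .name attribute, as google-cloud-storage blobs do.

```python
from collections import namedtuple


def json_output_blobs(blobs):
    """Keep only blobs that look like JSON output files.

    Skips the "folder" placeholder object (its name ends with "/")
    and any object whose name does not end in ".json".
    """
    return [
        blob
        for blob in blobs
        if not blob.name.endswith("/") and blob.name.endswith(".json")
    ]


# Hypothetical stand-in for storage blobs so the sketch runs offline.
FakeBlob = namedtuple("FakeBlob", ["name"])

listing = [
    FakeBlob("output/"),
    FakeBlob("output/output-1-to-2.json"),
]

kept = json_output_blobs(listing)
print([blob.name for blob in kept])
```

With a real bucket, the same filter could be applied to the result of bucket.list_blobs(prefix=prefix) before downloading and parsing each blob, so the "output/" entry is never passed to the JSON parser.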

APerson101 · 2019-04-28T20:06:45Z

@benbluhm your suggestion solved my issue, thank you

boesiii · 2019-04-29T12:03:39Z

Yes. It worked for me also.

nnegrey · 2019-04-29T15:51:14Z

Thanks, @benbluhm!
Closing the issue.

arindam-halder · 2021-03-16T14:34:41Z

Hi guys, can someone post the updated sample code? That would be great.

Thanks

nnegrey self-assigned this Apr 17, 2019

nnegrey closed this as completed Apr 29, 2019
