Fix BLOB store/get, Update API.py GET #137

krowvin · 2025-03-12T04:11:38Z

ISSUE

When testing BLOB storage and retrieval with cwms-python version 0.6.0, the API was unable to retrieve the BLOB and failed with the following error:

ERROR:root:Error decoding CDA response as JSON: Expecting value: line 1 column 3 (char 2) on line 1

This happens because the blobs endpoint returns the MIME type application/octet-stream for a GET request, not JSON.

Seen here:

$ curl -X 'GET' 'https://cwms-data.usace.army.mil/cwms-data/blobs/GATECHANGES.XML?office=SWT'  -H 'accept: */*' -I
HTTP/1.1 200 
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Cache-Control: max-age=300
ETag: 3579315270
Content-Type: application/octet-stream
Content-Length: 863
Date: Wed, 12 Mar 2025 03:58:43 GMT
Server: webthing

Before this change, the get method ended with this line:

cwms-python/cwms/api.py

Line 238 in 2684dcf

return cast(JSON, response.json())

This attempts to parse the response as JSON even when the body is not JSON.
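The failure mode can be reproduced without the network. This is a minimal sketch, assuming the BLOB body is XML; the payload and the `try_parse_json` helper are made up for illustration. It uses the standard-library `json.loads`, which rejects non-JSON text in the same way the response's JSON parsing does.

```python
import json

# Hypothetical XML payload standing in for a BLOB body.
body = "<gate-changes><change gate='1'/></gate-changes>"


def try_parse_json(text: str):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, str(err)


parsed, error = try_parse_json(body)
print(error)  # an "Expecting value: ..." message, as in the log above
```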

FIX

To fix this, I made the get method of api.py check the content type of the response and dynamically decide which of the response methods to apply to the response data.
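The idea can be sketched with the standard library alone. The `decode_body` helper below is hypothetical and is not the code in api.py; it takes the Content-Type header value and the raw body bytes and picks a decoding strategy. Returning raw bytes for unrecognized types is an assumption made for this sketch.

```python
import base64
import json
from typing import Any


def decode_body(content_type: str, body: bytes) -> Any:
    """Pick a decoding strategy from the Content-Type header value."""
    # Drop parameters such as ";charset=utf-8" or ";version=2".
    media_type = content_type.split(";", 1)[0].strip().lower()

    # No content type set: fall back to JSON, the usual CDA response format.
    if not media_type or media_type == "application/json":
        return json.loads(body)
    # Text formats are returned as a plain string.
    if media_type.startswith("text/"):
        return body.decode("utf-8")
    # Images are binary, so hand them back as a base64 string.
    if media_type.startswith("image/"):
        return base64.b64encode(body).decode("utf-8")
    # Assumption for this sketch: anything else is returned as raw bytes.
    return body


print(decode_body("application/json", b'{"office-id": "SWT"}'))
print(decode_body("text/plain; charset=utf-8", b"A test"))
```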

MISSING CDA TYPES

Currently CDA only lets you store/retrieve application/octet-stream. I posted an issue for this on the CDA repo:

Add support for other MIME Types to the BLOB endpoint USACE/cwms-data-api#1031

SUMMARY OF CHANGES

I also took the opportunity to:

Make sure store_blobs converts the value to a base64-encoded string if that has not already been done. The API requires this in order to store the value; otherwise you get a serialization error in the logs and a 500 status.
Convert the response status check to .ok, since 3xx redirects are handled by requests and, in my opinion, anything below 400 should be acceptable. I'm not dead set on this; we could spell it out as < 400 if you prefer the verbosity (rather than the < 300 we had).
Add notes to the BLOB store/get docstrings so the user knows the id is uppercased on storing and must be uppercase on retrieval.
Switch from calling response.close() to a context manager, which properly ensures there are no resource/connection leaks.
Use response.text and response.content along with response.json() based on the content type returned, and update get_xml to use the new get method for backwards compatibility.
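As an illustration of the first item, here is a minimal sketch of a "encode only if needed" check. The names `is_base64` and `ensure_base64` are hypothetical and are not taken from the cwms-python source; only standard-library calls are used.

```python
import base64
import binascii


def is_base64(value: str) -> bool:
    """Return True if value decodes cleanly as standard base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def ensure_base64(value: str) -> str:
    """Return value unchanged if it is already base64, else encode it."""
    if value and is_base64(value):
        return value
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


encoded = ensure_base64("A test of cwms-python blob store")
print(encoded)
print(ensure_base64(encoded) == encoded)  # already encoded, left as-is
```

One caveat with any check of this kind: a short plain string such as "abcd" is also valid base64, so it would be passed through unencoded. Callers who need certainty can encode the value themselves before storing.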

TESTS

I wrote a quick get script, along with some mock tests for store_blobs and various other endpoints, to make sure the payload was built properly (base64-encoded value) and that I did not introduce any breaking changes.

Here is that script for reference:

import sys

import cwms

cwms.init_session()

# Example payload for cwms.store_blobs. The id is uppercased on storing.
data = {
    "office-id": "SWT",
    "id": "MYFILE_OR_BLOB_ID.TXT",
    "description": "Your description here",
    "media-type-id": "application/octet-stream",
    "value": "STRING of content or BASE64_ENCODED_STRING",
}
# Uncomment to store the payload and exit early.
# cwms.store_blobs(data, fail_if_exists=False)
# sys.exit()

# Retrieve a single BLOB by id and office.
changes = cwms.get_blob("GATECHANGES.XML", "SWT")
print(changes)

# List the BLOBs whose id matches the pattern.
xml_catalog = cwms.get_blobs(office_id="SWT", blob_id_like="*.XML")
print(xml_catalog.json)

# Spot-check other endpoints for regressions.
timeseries = cwms.get_timeseries(
    office_id="SWT", ts_id="KEYS.Elev.Inst.1Hour.0.Ccp-Rev"
)
print(timeseries.df)

location = cwms.get_location("KEYS", "SWT")
print(location.df)

outlet = cwms.get_outlet("SWT", "KEYS")
print(outlet.df)

krowvin · 2025-03-12T05:26:41Z

I'm uncertain if it makes more sense for get_clob to return an object or if the string should be passed through.

I tried setting a Union type for the get handler of JSON and str but got a few errors from mypy about downstream functions expecting only JSON.

2f35807

In the future, as more types are added, it could be possible to do things like use ElementTree to parse the BLOB as XML, provided the content type in the response headers is set accordingly.
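A minimal sketch of that idea, assuming CDA one day reports an XML content type for XML BLOBs (at the time of this PR it only returned application/octet-stream). The `parse_blob` helper and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Union


def parse_blob(content_type: str, text: str) -> Union[ET.Element, str]:
    """Parse the BLOB as XML when the content type says so, else pass it through."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("application/xml", "text/xml") or media_type.endswith("+xml"):
        return ET.fromstring(text)
    return text


# Made-up document for illustration only.
root = parse_blob("application/xml", "<gates><gate id='1'/><gate id='2'/></gates>")
gate_ids = [gate.get("id") for gate in root.iter("gate")]
print(gate_ids)
```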

Also, sorry for the spam. I was remembering the workflow as I went (via the contributing doc), including how much I could run/test locally before committing.

Enovotny

I like the changes. I just have a couple of comments. Thanks for making the improvements to the API calls.

cwms/location/group.py

cwms/api.py

sonarqubecloud · 2025-03-13T03:14:39Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

krowvin · 2025-03-13T03:26:25Z

Your requests

Removed the extra location groups file
Removed get_xml
Changed types to Any where required by the type checker for XML/etc.

Updated methods to return a raw string instead of dicts where needed

Few more things

Ran the spellchecker over the entire codebase
Converted the 102/2 API version to dynamically figure out the MIME type based on the format you provide (CDA has a format param but will use the MIME type/Accept header as well, NOT both)
Unlocked the versions because:

We control the version with the wrapper itself
CDA can/should handle it if a user (or we) try to specify a higher version
This adds room for API growth

I also ran my Python file above to make sure these worked against the national CDA instance, not just the mocks.
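The format-plus-version idea can be sketched as follows. This is an illustration only: `accept_header` and `FORMAT_MEDIA_TYPES` are hypothetical names, and the exact header strings are assumptions based on the "xml/v2" wording in this thread, not a copy of the cwms-python implementation.

```python
from typing import Optional

# Hypothetical mapping from a short format name to a base media type.
FORMAT_MEDIA_TYPES = {"json": "application/json", "xml": "application/xml"}


def accept_header(data_format: str = "json", version: Optional[int] = None) -> str:
    """Build an Accept header value from a format name and an API version."""
    try:
        media_type = FORMAT_MEDIA_TYPES[data_format.lower()]
    except KeyError:
        raise ValueError(f"Unsupported format: {data_format!r}") from None
    # Versions 0 and 1 (or no version) carry no version parameter.
    if version is None or version < 2:
        return media_type
    # Higher versions are passed through; the server decides if they exist.
    return f"{media_type};version={version}"


print(accept_header("xml", 2))
print(accept_header("json", 1))
```

Passing unknown higher versions through, instead of raising locally, matches the reasoning in the list above: the wrapper sets the version, and CDA is left to report an unsupported one.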

Enovotny · 2025-03-13T13:29:58Z

I like some of the changes, but we should keep the get_..._xml calls. Something I learned from Jordan when first setting this up is that each function should do one thing or provide a single data format. This is for testing purposes. That is why we created the cwms data type to provide the JSON/dataframe, which is the backbone of this package. Anything that provides a different data format in the get calls should be a separate function: get_ratings_xml, etc. I think I am fine with the change in the api.py calls; it makes things a little more readable and maintainable long term. But we should not have format as a parameter in any of the get... functions. I originally had that parameter as well, to choose between JSON and dataframe, and Jordan steered me to the current implementation.

krowvin · 2025-05-29T21:09:43Z

I tested these changes with the cwms-python code below against our CDA instance to confirm they work as intended.

import os

import cwms

cwms.init_session(
    api_key="apikey " + os.getenv("CDA_API_KEY", ""),
    api_root=os.getenv("CDA_HOST", "") + "/",
)

cwms.store_blobs(
    data={
        "office-id": "SWT",
        "id": "TEST.TXT",
        "description": "Your description here",
        "media-type-id": "text/plain",
        "value": "A test of cwms-python blob store",
    },
    fail_if_exists=False,
)

print(cwms.get_blob(blob_id="DATACHECK.HYDROPOWER.JSON", office_id="SWT"))
print(cwms.get_blob(blob_id="EUFA.PLOT.PNG", office_id="SWT"))
print(cwms.get_blob(blob_id="TEST.TXT", office_id="SWT"))
print(cwms.get_blobs(office_id="SWT", blob_id_like="TEST").json)

print("Stored!")

Saved to:
https://cwms-data.usace.army.mil/cwms-data/blobs/TEST.TXT?office=SWT

Output here:

{'startTime': 't-7d', 'updated': '2025-03-27 07:28:12.067961-05:00', 'DATA': 1, 'NAME': 0, 'endTime': 't+6h', 'groups': [['Group Name 2', {'Pool Elevation': ['FGIB.Elev.Inst.30Minutes.0.Decodes-Raw']}]]}

iVBORw0KGgoAA...really-long-base64-string...AABJRU5ErkJggg==

A test of cwms-python blob store

{'blobs': [{'office-id': 'SWT', 'id': 'TEST-TEXT', 'description': 'a test text response', 'media-type-id': 'text/plain'}, {'office-id': 'SWT', 'id': 'TEST.TXT', 'description': 'Your description here', 'media-type-id': 'text/plain'}], 'page': 'fHwwfHwxMDA=', 'page-size': 100, 'total': 0}

Stored!

I reverted the format and get_xml/etc changes as requested.

Let me know if I missed any changes you would like to see.

Side note: get normally returns JSON, but with the blob endpoint the response could be in any format, depending on the MIME type.

We could also create a get_any if you want get to keep returning only JSON?

krowvin · 2025-05-29T21:31:34Z

I realized we could handle image MIME types too. Those are stored in the BLOB as base64 strings.

cwms-python/cwms/api.py

Lines 232 to 233 in 505c192

    
if content_type.startswith("image/"):
    return base64.b64encode(response.content).decode("utf-8")

Depending on the MIME type, the content might not play nicely with Python attempting to decode it as UTF-8. I will leave any extras I missed (XML/XLSX/etc.) for the next PR.

cwms-python/cwms/api.py

Line 235 in 505c192

return response.content.decode("utf-8")
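The difference between the two return paths can be shown with the eight-byte PNG signature, used here as a stand-in for real image content. This is a standalone sketch using only the standard library; it shows how a caller could turn the base64 string back into bytes, and why binary content cannot go through the UTF-8 decode.

```python
import base64

# First eight bytes of every PNG file (the PNG signature).
png_signature = b"\x89PNG\r\n\x1a\n"

# What the image branch returns: a base64 string that is safe to print or embed.
encoded = base64.b64encode(png_signature).decode("utf-8")

# A caller that wants the image back decodes the string to bytes.
restored = base64.b64decode(encoded)

# Decoding the raw bytes as UTF-8 fails, because 0x89 is not a valid
# UTF-8 start byte.
try:
    png_signature.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(encoded, restored == png_signature, utf8_ok)
```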

sonarqubecloud · 2025-05-30T17:34:38Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

krowvin and others added 9 commits February 13, 2025 16:40

Initial setup of location group function defs

849914f

Merge branch 'main' of https://github.com/HydrologicEngineeringCenter/cwms-python into bug/get-blob-response

902c24d

Bump version, fix typo in project toml description

e2f1d01

Change response close method to context manager (with) to ensure closure of HTTP response. This will prevent resource (connection) leaks in the event the request fails.

4328f08

Switch to using response.ok as 300 are usually handled by requests library and redirects still return valid data. Simplify by accepting anything under 400 with response.ok

b9319c8

Update blob return type, make get requests dynamic on response content-type

a9555b2

Fix clob to BLOB in get, add extra notes for storing/gets to pydocs

3fc330c

Create base64 check utility for blobs

5b89757

Add blurb to users on how to format the BLOB storage, correct note, ensure VALUE stored to CDA is a base64 encoded string.

a200088

krowvin changed the title from "Fix BLOB store, Update API.py GET" to "Fix BLOB store/get, Update API.py GET" Mar 12, 2025

krowvin added the "blocking" label (A district needs this to move forward with cloud migration) Mar 12, 2025

krowvin added 5 commits March 11, 2025 23:45

Not sure how to set the headers in the mock requests response, for now if no content type is set fall back to json

7c186d8

Ensure string or JSON can return from blob

2f35807

Forgot to install precommit on this box, fix sort order of typing for iosort

e504397

Attempt casting get_blob to string and reverting type union for get

ae96f0d

Wrap all the responses for get that are not json in a dictionary so the JSON response is happy for mypy

fada38a

Enovotny requested changes Mar 12, 2025

cwms/location/group.py (outdated)

cwms/api.py

cwms/api.py

krowvin added 6 commits March 12, 2025 21:39

Remove duplicate group methods

1792120

Ensure accept mimetype is set for proper file type response;

2fe3602

Update blob to correct version; Change types to Any that can return various mimetypes (not just json); Change from 102 to xml/v2

Change all references to 102 (xml, v2) to format="xml" and version 2

9c0befd

Remove print statements

32719b9

Run spell checker through all files

841e0c2

Correct api_version_text to not set a version for 0 and 1; remove test for versions higher than two, allow CDA to say version is not supported.

d402002

krowvin added 3 commits May 29, 2025 15:36

Revert "Change all references to 102 (xml, v2) to format="xml" and version 2" This reverts commit 9c0befd.

75ac1dc

Revert "Ensure accept mimetype is set for proper file type response;"

9308abe

This reverts commit 2fe3602.

Change to Any and update other mimetypes to pure output (not JSON wrapped)

efeb86d

Bump version, handle Any for get_with_paging

fcaf173

krowvin added 2 commits May 29, 2025 16:21

Handle base64 encoded content (images)

3efade4

Correct base64 import position

73bdcad

krowvin force-pushed the bug/get-blob-response branch from 505c192 to 73bdcad Compare May 29, 2025 21:30

Enovotny approved these changes May 30, 2025

Merge branch 'main' into bug/get-blob-response

fd6d917

krowvin merged commit cd91824 into main Jun 3, 2025
8 checks passed

krowvin deleted the bug/get-blob-response branch June 3, 2025 17:02
