Problem: Storage Service File API endpoint response has changed in qa/1.x (returns "string type" for size, not integer) #1094
Comments
Original work was related to #981 |
I remember this one, sorry for not making my considerations more clear in an issue/the pr.
I partially agree. You should not break a contract unless you have a very good reason, and I would like to argue that we have a good reason here. Correctly handling and displaying the size of an AIP is, for me, stronger than an "undocumented expectation". Of course, this does not mean that we shouldn't try to make the API as intuitive and stable as possible.
As @replaceafill points out in #981, switching from MySQL 5.6 to 5.7 changes the default SQL mode, which causes an error when inserting a value that is too large. In earlier versions of MySQL the problem is still there; you are just not notified during the insert. From my understanding, the Storage Service will move to a MySQL-based database sooner rather than later. This would mean that rolling back this change would result in ingest problems for reasonable transfers.
I must admit that I have been somewhat lazy. I tried to find a way to force the type of the JSON response, but I could not find an easy one, and since I only broke one internal call I decided that it was simpler to fix that call than to spend a lot of time on it.
So to summarise: yes, we break the contract. However, given that we fix an actual bug at the cost of breaking an undocumented expectation, I would like to argue that this is worth it. But maybe someone more experienced with Django and the API implementation in Archivematica could solve the new problem? |
I just checked #1507, where I mentioned that the problem is caused by how the requests library parses the resulting JSON. IIRC the JSON was unchanged; the problem is how the requests library decides to parse it, which I think is outside our control. But I might have missed something. |
Perfect! Thank you @jorikvankemenade! All valid, and welcome points. Radda is going to take a look at this one and I'm sure the pragmatic path will be found. 🙂 |
Thanks @ross-spencer and @jorikvankemenade, great report and comments! I agree with Jorik that we should not revert the initial change, but I'm conflicted about whether to keep it as a string or force it to a number. I think we need to consider who the bigger consumer of the affected endpoints is, and that's hard to know, at least for me. As Ross said, there doesn't seem to be any limitation in the JSON number specification, but it looks like there may be some issues on the JavaScript side, for example. I also wonder whether, if the value is cast to an integer by the receiver, that may cause an overflow error in some languages. Considering that changes may be needed in the consumer either way (number or string) and that we'll need to "hack" django-tastypie's behavior to make it a number (there is even a PR for this from 2011 that was never addressed), I'd leave it as a string. Some interesting Q&A:
|
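To make the number-versus-string trade-off above concrete, here is a minimal sketch using only the Python standard library. The value `2**63 - 1` is an assumed example (the maximum of a signed 64-bit column), not a size taken from this issue, and `float()` merely stands in for a consumer that stores JSON numbers as IEEE-754 doubles, as JavaScript does.

```python
import json

# Assumed example value: the largest signed 64-bit integer.
big_size = 2**63 - 1

# The same size serialized as a JSON number and as a JSON string.
as_number = json.dumps({"size": big_size})
as_string = json.dumps({"size": str(big_size)})

# Python's json module parses JSON integers into arbitrary-precision ints,
# so the number form round-trips exactly here.
number_roundtrip = json.loads(as_number)["size"]

# The string form also survives, but the consumer has to convert it itself.
string_roundtrip = int(json.loads(as_string)["size"])

# A consumer that keeps numbers as doubles cannot represent this value
# exactly; float() is used here only to imitate that behaviour.
double_roundtrip = int(float(big_size))
lossy = double_roundtrip != big_size

# Largest integer a double represents without gaps (2**53 - 1).
MAX_SAFE_INTEGER = 2**53 - 1
```

Sizes at or below `MAX_SAFE_INTEGER` (about 9 PB when counted in bytes) are unaffected by the double conversion, so the precision concern only applies to values beyond that.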
This would do the trick:
diff --git a/storage_service/locations/api/resources.py b/storage_service/locations/api/resources.py
index 1a2da9a..cf91930 100644
--- a/storage_service/locations/api/resources.py
+++ b/storage_service/locations/api/resources.py
@@ -909,6 +909,12 @@ class PackageResource(ModelResource):
         bundle.data["encrypted"] = encrypted
         return bundle
 
+    def dehydrate_size(self, bundle):
+        try:
+            return int(bundle.data["size"])
+        except (ValueError, TypeError):
+            return -1  # Or pass if we don't want to populate.
+
     def hydrate_current_location(self, bundle):
         """Customize unserialization of current_location.
It's pretty easy and a common approach in Tastypie. Thoughts? |
@sevein that looks promising. I was unaware of this feature of Django/Tastypie, thanks for sharing :). For anyone interested in what the hydrate/dehydrate concepts are, check out the Tastypie documentation. |
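For readers who want to try the conversion logic from the proposed diff without a Storage Service checkout, the following is a rough sketch that runs on its own. `FakeBundle` is a hypothetical stand-in for Tastypie's bundle object (only its `data` dict is imitated), and the function is written at module level rather than as a `PackageResource` method.

```python
class FakeBundle:
    """Hypothetical stand-in for a Tastypie bundle; only `.data` is imitated."""

    def __init__(self, data):
        self.data = data


def dehydrate_size(bundle):
    """Mirror the diff above: return the size as an int, or -1 if unusable."""
    try:
        return int(bundle.data["size"])
    except (ValueError, TypeError):
        # ValueError: a non-numeric string; TypeError: None or another
        # unsupported type. A missing "size" key would raise KeyError,
        # which the diff does not catch either.
        return -1


from_string = dehydrate_size(FakeBundle({"size": "5059319"}))
from_none = dehydrate_size(FakeBundle({"size": None}))
from_garbage = dehydrate_size(FakeBundle({"size": "not-a-number"}))
```

Whether `-1` or an omitted field is the better fallback is left open in the comment above; the sketch simply keeps the `-1` from the diff.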
Manually dehydrate `BigIntegerField` as an integer since Tastypie can't do that for us (django-tastypie/django-tastypie#299). Connects to archivematica/Issues#1094.
Ready for QA. Follow the same steps that Ross shared to reproduce! |
Verified!
$ http -v --pretty=format GET "http://localhost:62081/api/v2/file/6149c4df-fda1-420b-956b-3a75abbe0ffe/" Authorization:"ApiKey test:test"
GET /api/v2/file/6149c4df-fda1-420b-956b-3a75abbe0ffe/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Authorization: ApiKey test:test
Connection: keep-alive
Host: localhost:62081
User-Agent: HTTPie/0.9.8
HTTP/1.1 200 OK
Cache-Control: no-cache
Connection: keep-alive
Content-Language: en
Content-Type: application/json
Date: Thu, 27 Feb 2020 22:13:15 GMT
Server: nginx/1.14.2
Transfer-Encoding: chunked
Vary: Accept, Accept-Language, Cookie
X-Frame-Options: SAMEORIGIN
{
"current_full_path": "/var/archivematica/sharedDirectory/www/AIPsStore/6149/c4df/fda1/420b/956b/3a75/abbe/0ffe/pictures-6149c4df-fda1-420b-956b-3a75abbe0ffe.7z",
"current_location": "/api/v2/location/c2dd8f5b-a095-4ba9-bf09-3aff9a3bb435/",
"current_path": "6149/c4df/fda1/420b/956b/3a75/abbe/0ffe/pictures-6149c4df-fda1-420b-956b-3a75abbe0ffe.7z",
"encrypted": false,
"misc_attributes": {},
"origin_pipeline": "/api/v2/pipeline/fa3f8d23-2fea-4434-b231-dd5289b162f0/",
"package_type": "AIP",
"related_packages": [
"/api/v2/file/544e5a38-304a-440f-bd6d-83de3e609d4e/"
],
"replicas": [],
"replicated_package": null,
"resource_uri": "/api/v2/file/6149c4df-fda1-420b-956b-3a75abbe0ffe/",
"size": 5059319,
"status": "UPLOADED",
"uuid": "6149c4df-fda1-420b-956b-3a75abbe0ffe"
} |
Expected behaviour
API contracts are not broken between releases in Archivematica. The `size` field for information about an AIP should return an integer type.
Current behaviour
The size field returns a string-type.
This was introduced via artefactual/archivematica-storage-service@d54716c, which, on the face of it, does not seem to be a controversial change. In hindsight we should have considered it in greater detail; it was signaled by the corresponding change in Archivematica here: artefactual/archivematica@473bb9d#diff-2ffb0ea23203dd32f7be41aa919e77a7.
Steps to reproduce
Construct an API call that is being used by Archivematica (this uses HTTPie):
Observe the response:
For the size field other consumers expect:
Which means currently this is broken for anyone who handles this response expecting an integer (as we previously did).
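Until the endpoint's type is settled, a consumer could tolerate both shapes. The sketch below is one possible approach, not code from Archivematica: `read_size` is a hypothetical helper, and the two payloads are abbreviated examples that contain only the `size` field.

```python
import json


def read_size(package):
    """Return the package size as an int, or None if absent or not numeric."""
    raw = package.get("size")
    if isinstance(raw, bool):
        # bool is a subclass of int in Python; never treat it as a size.
        return None
    if isinstance(raw, int):
        return raw
    if isinstance(raw, str):
        try:
            return int(raw)
        except ValueError:
            return None
    return None


# Abbreviated example payloads: integer form and string form of the field.
integer_style = json.loads('{"size": 5059319}')
string_style = json.loads('{"size": "5059319"}')

size_from_integer = read_size(integer_style)
size_from_string = read_size(string_style)
size_when_missing = read_size({})
```

Returning `None` instead of raising keeps the caller in charge of how to report a malformed response.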
Your environment (version of Archivematica, operating system, other relevant details)
Docker Compose running various commits of AM before and after the one identified.
Additional context
In terms of fixing this, then:
Attaching the discussion label. cc. @jorikvankemenade as well for opinions.
For Artefactual use:
Before you close this issue, you must check off the following: