Conversation
|
@mariacarmina should provider/ocean_provider/routes/encrypt.py Line 22 in 657b177 have @Validate(EncryptRequest) decorator? |
| if "data" not in self.request.keys(): | ||
| raise Exception("Data to encrypt is empty.") | ||
| # import pdb;pdb.set_trace() | ||
| if not isinstance(self.request["data"], list) and not isinstance( |
There was a problem hiding this comment.
while request /encrypt hit, both asset files encryption and ddo encryption will send to RBAC and let RBAC decide what to do next?
calina-c
left a comment
There was a problem hiding this comment.
I think there is still some confusion about what needs to be done here. Can you guys please share some discussions/conclusions? @soonhuat @mariacarmina because I might be misguided in what I'm reviewing exactly.
| def get_data(self): | ||
| if "data" not in self.request.keys(): | ||
| raise Exception("Data to encrypt is empty.") | ||
| if not isinstance(self.request["data"], list) and not isinstance( |
There was a problem hiding this comment.
please add an empty line before the second if, for readability. How come these can be only list and Asset? Is it possible this is a dict or a string json that needs conversion to Asset?
| if not isinstance(self.request["data"], list) and not isinstance( | ||
| self.request["data"], Asset | ||
| ): | ||
| raise Exception("Invalid type of data.") |
There was a problem hiding this comment.
I'm not big on raising an Exception here. Is it possible to not have data? Previously we had this option, so it should be still supported? i.e. if there is no data, do not send it?
I would also recommend changing the function name to get_request_data, which is clearer.
| } | ||
| req = { | ||
| "document": json.dumps(document), | ||
| "data": [document], |
There was a problem hiding this comment.
we shhould definitely still support the version with json.dumps, this should be fully compatible.
There was a problem hiding this comment.
this is still a thing, I'll add a generic code review comment explaining what I recommend.
@calina-c to extend RBAC to validate metadata data structure or values before encryption of metadata. |
|
I have understood so far that provider should be a pass-through for RBAC server. Is it necessary to validate the type for |
@mariacarmina if Asset object wouldn't compatible is request were from oceanJs or market, how about check against if the |
Ok I have understood, so RBAC should accept a list of URLs, not of file structure object, but how about the Asset object case? If provider receives an asset for encryption, in which form should be exactly or what should I expect to receive for asset case? A disctionary or a stringified JSON? |
As for scenario when provider encrypt request input are Metadata/Asset, then RBAC will receive as it is, assuming RBAC will then (proxy to another validation API) or (market operator will fork the RBAC handle the validation themselves), finally return true/false. |
What means |
@mariacarmina yup, agree (provider should check first if the asset object is serialized or is a dictionary), looks like there is similar check around 71-72 provider/ocean_provider/utils/util.py Line 72 in e89d6a0 maybe this is the checking u mean? after the check, if is NOT data files encrypt, then just like your PR : POST the request body to RBAC (the |
|
Yes, I have added the proper checks as you mentioned in your description. I will do the changes likewise and provide additional checks for |
calina-c
left a comment
There was a problem hiding this comment.
I'm sure this works, but you can do much better.
| def _check_if_asset(self): | ||
| data = self.request["data"] | ||
| if isinstance(data, dict): | ||
| if data.get("version") is not None: |
There was a problem hiding this comment.
if it is a dict you can directly return data.get("version", False)
| return self.request["data"] | ||
|
|
||
| def _check_if_asset(self): | ||
| data = self.request["data"] |
There was a problem hiding this comment.
I would add a preliminary check: if it is not dict, nor string, return False directly, to avoid if imbrication later
| if data.get("version") is not None: | ||
| return True | ||
| elif isinstance(data, str): | ||
| data_dict = json.loads(data) |
There was a problem hiding this comment.
you need to add exception handling for the json.loads, e.g. what if the string is "bla bla"? That can not be loaded into a json dict. I would also change the order like this:
- perform json loading if string. if the json conversion fails, we return False
- the resulting dict (coming from the string loading or directly on the payload) needs the same version check and that's it. No code duplication.
There was a problem hiding this comment.
What if we can verify in the strigified json if the version: "4.1.0" exists? Without converting into a dictionary.
There was a problem hiding this comment.
It should cover all the cases, but if it is not a dictionary (or convertable to one) then it is invalid anyway, so I would check both.
|
|
||
| def _check_if_respects_file_encryption_schema(self): | ||
| data = self.request["data"] | ||
| if isinstance(data, list): |
There was a problem hiding this comment.
can it be a stringified list?
I don't like the imbrication here, it is not clear to me at first glance what this does. This means you need to improve its readability.
There was a problem hiding this comment.
@soonhuat is it necessary to check a stringified list of files in provider?
There was a problem hiding this comment.
I think don't need to the stringified list check
TODAY if provider interface from market or oceanJs, it won't be a stringify list, is always either stringify of DDO object or asset file object as below structure (schema 4.1.0):
{
nftAddress,
datatokenAddress,
files: [
{
type: 'url',
index: 0,
url: "abcdefg",
method: 'GET'
}
]
}
based on calina suggestion, maybe rename it to _is_file_encryption_data would able to self explaining itself
There was a problem hiding this comment.
Thanks @soonhuat for clarification. Should provider check if the list contains dictionaries with those specific keys?
There was a problem hiding this comment.
data is not a list, is an object
and probably line 107 and 108 is enough already, since that's exactly is asset file encryption data structure
|
|
||
|
|
||
| @pytest.mark.unit | ||
| def test_wrong_encrypt_request_payload(consumer_wallet, publisher_wallet, monkeypatch): |
There was a problem hiding this comment.
please add tests for the individual functions as well, the test coverage is very poor on your added code. How do I know this? The imbrications many and hard to read. Poorly readable means poorly testable, so that's why you couldn't add tests.
| return False | ||
|
|
||
| if isinstance(data, dict): | ||
| return data.get("version", False) |
There was a problem hiding this comment.
I would convert this to a bool.
| and list(file.keys()) == ["nftAddress", "datatokenAddress", "files"] | ||
| and isinstance(file["files"], dict) | ||
| ): | ||
| return True |
There was a problem hiding this comment.
shouldn't this be higher-level? Is it a file structure if just only one of the elements in data has the correct structure?
can you also look into the list comparison possible issues? What if I add the nftAddress, datatokenAddress and files in a different order? Will comparing with == work?
can files also be stringified?
I think we should only handle the data encryption separately, and leave the rest as is, without restrictions. As far as I know, anything should be encryptable.
There was a problem hiding this comment.
And actually, people can create their own private aquariuses with different schemas, so we shouldn't do any validation on this structure.
| validator = RBACValidator(request_name="EncryptRequest", request=req) | ||
| with pytest.raises(Exception) as err: | ||
| validator.build_payload() | ||
| assert err.value.args[0] == "Data to encrypt is empty." |
There was a problem hiding this comment.
you can simplify testing using match as an argument of raises instead of err.value.args
calina-c
left a comment
There was a problem hiding this comment.
Here is my 2c on this. Since this has gone through so many iterations, I would like to scrap it altogether and copy just the data-related part in a new PR. I don't think we should impose any restrictions on the payload. No validation, nothing extra. All we need to do is detect if there is an asset structure using the _is_asset function and send it to rbac. That's it. And it will be more efficient to copy that over into a new PR, for easier tracking of changes.
After copying the _is_asset function, add tests for that function particularly. Then a couple of extra test in the rbac file: when data should be sent and when not.
|
Looks good, if provider/ocean_provider/routes/encrypt.py Line 21 in f6bc2e2 doesn't need validate decorator like @validate(EncryptRequest)
|
Fixes #446 .
Changes proposed in this PR: