Modified encrypt payload for RBAC. #516

Merged
calina-c merged 8 commits into main from enhance-rbac-encrypt
Sep 5, 2022

Conversation

@mariacarmina
Contributor

Fixes #446.

Changes proposed in this PR:

  • updated RBAC encrypt request.

@mariacarmina mariacarmina self-assigned this Jul 13, 2022
@mariacarmina mariacarmina requested a review from calina-c July 13, 2022 11:09
@soonhuat
Contributor

@mariacarmina should


have @Validate(EncryptRequest) decorator?

Comment thread ocean_provider/validation/RBAC.py Outdated
if "data" not in self.request.keys():
    raise Exception("Data to encrypt is empty.")
# import pdb;pdb.set_trace()
if not isinstance(self.request["data"], list) and not isinstance(
Contributor

When a request hits /encrypt, will both asset files encryption and DDO encryption be sent to RBAC, letting RBAC decide what to do next?

Contributor

@calina-c calina-c left a comment

I think there is still some confusion about what needs to be done here. Can you guys please share some discussions/conclusions? @soonhuat @mariacarmina because I might be misguided in what I'm reviewing exactly.

Comment thread ocean_provider/validation/RBAC.py Outdated
def get_data(self):
    if "data" not in self.request.keys():
        raise Exception("Data to encrypt is empty.")
    if not isinstance(self.request["data"], list) and not isinstance(
Contributor

please add an empty line before the second if, for readability. How come these can be only list and Asset? Is it possible this is a dict or a string json that needs conversion to Asset?

Comment thread ocean_provider/validation/RBAC.py Outdated
if not isinstance(self.request["data"], list) and not isinstance(
    self.request["data"], Asset
):
    raise Exception("Invalid type of data.")
Contributor

I'm not big on raising an Exception here. Is it possible to not have data? Previously we had this option, so it should be still supported? i.e. if there is no data, do not send it?

I would also recommend changing the function name to get_request_data, which is clearer.

Comment thread tests/test_RBAC.py Outdated
}
req = {
    "document": json.dumps(document),
    "data": [document],
Contributor

We should definitely still support the version with json.dumps; this should be fully compatible.

Contributor

this is still a thing, I'll add a generic code review comment explaining what I recommend.

@soonhuat
Contributor

soonhuat commented Aug 8, 2022

I think there is still some confusion about what needs to be done here. Can you guys please share some discussions/conclusions? @soonhuat @mariacarmina because I might be misguided in what I'm reviewing exactly.

@calina-c the goal is to extend RBAC to validate the metadata structure or values before the metadata is encrypted.
I assume these extended RBAC checks should only happen during metadata encryption, not during asset URL encryption, since the encryption endpoint is now shared by both types of encryption (metadata and asset URL).

@mariacarmina
Contributor Author

So far I have understood that the provider should be a pass-through for the RBAC server. Is it necessary to validate the type of data before sending it to RBAC, if RBAC will distinguish between the asset URL (files list) and the actual asset object? Initially I added checks to see what kind of data I send to RBAC, but for asset encryption that won't work, because I have a specific Asset object. For example, it won't work from the ocean.js side, because the objects won't be compatible with the python object Asset. From my perspective, if we want to keep the checks, we should check whether the asset object is actually a JSON or a stringified JSON. What are your thoughts here @soonhuat @calina-c?

@soonhuat
Contributor

soonhuat commented Aug 9, 2022

won't be compatible with the python object Asset

@mariacarmina if the Asset object wouldn't be compatible when the request comes from oceanJs or market, how about checking whether the data is files (asset URL) instead, then skipping RBAC if it is a files structure?

@mariacarmina
Contributor Author

mariacarmina commented Aug 10, 2022

won't be compatible with the python object Asset

@mariacarmina if the Asset object wouldn't be compatible when the request comes from oceanJs or market, how about checking whether the data is files (asset URL) instead, then skipping RBAC if it is a files structure?

Ok I have understood, so RBAC should accept a list of URLs, not of file structure object, but how about the Asset object case? If the provider receives an asset for encryption, in which form exactly should I expect to receive it? A dictionary or a stringified JSON?

@soonhuat
Contributor

soonhuat commented Aug 10, 2022

RBAC should accept a list of URLs

"so RBAC should accept a list of URLs, not of file structure object": I guess not, because the provider will skip RBAC if the data is an asset files structure, right?

As for scenario when provider encrypt request input are Metadata/Asset, then RBAC will receive as it is, assuming RBAC will then (proxy to another validation API) or (market operator will fork the RBAC handle the validation themselves), finally return true/false.

@mariacarmina
Contributor Author

As for scenario when provider encrypt request input are Metadata/Asset, then RBAC will receive as it is, assuming RBAC will then (proxy to another validation API) or (market operator will fork the RBAC handle the validation themselves), finally return true/false.

What does "then RBAC will receive as it is" mean? The Asset object from provider is not compatible with Asset object from ocean.js for example, is the same situation as it is with the file structure object, that's why I assume that provider should check first if the asset object is serialized or is a dictionary.

@soonhuat
Contributor

object from provider is not compatible with Asset object from ocean.js for example, is the same situation as it is with the file structure object

@mariacarmina yup, agree (provider should check first if the asset object is serialized or is a dictionary), looks like there is similar check around 71-72

if not isinstance(files_list, list):

maybe this is the check you mean?

After the check, if it is NOT a data files encryption, then just like in your PR: POST the request body to RBAC (that is what I meant previously by "then RBAC will receive as it is").

@mariacarmina
Contributor Author

Yes, I have added the proper checks as you mentioned in your description. I will make the changes accordingly and provide additional checks for whether the Asset is already serialized or is a dictionary. Regarding the check for the files list, I need to be more specific, because it should only contain a list of URLs, not just a simple list.

@mariacarmina mariacarmina requested a review from calina-c August 10, 2022 17:06
Contributor

@calina-c calina-c left a comment

I'm sure this works, but you can do much better.

Comment thread ocean_provider/validation/RBAC.py Outdated
def _check_if_asset(self):
    data = self.request["data"]
    if isinstance(data, dict):
        if data.get("version") is not None:
Contributor

if it is a dict you can directly return data.get("version", False)

Comment thread ocean_provider/validation/RBAC.py Outdated
    return self.request["data"]

def _check_if_asset(self):
    data = self.request["data"]
Contributor

I would add a preliminary check: if it is neither a dict nor a string, return False directly, to avoid nested ifs later.

Comment thread ocean_provider/validation/RBAC.py Outdated
        if data.get("version") is not None:
            return True
    elif isinstance(data, str):
        data_dict = json.loads(data)
Contributor

you need to add exception handling for the json.loads, e.g. what if the string is "bla bla"? That can not be loaded into a json dict. I would also change the order like this:

  • perform json loading if string. if the json conversion fails, we return False
  • the resulting dict (coming from the string loading or directly on the payload) needs the same version check and that's it. No code duplication.
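
The ordering suggested in the two points above could be sketched roughly as follows. This is only an illustration under assumptions, not the code that was merged: the standalone function name `is_asset` and the sample inputs are hypothetical, and only the `version` key check is taken from the snippet under review.

```python
import json


def is_asset(data):
    """Return True if data is a dict (or a JSON string of a dict) with a version."""
    # Anything that is neither a dict nor a string cannot be an asset.
    if not isinstance(data, (dict, str)):
        return False

    # Load strings first; a string that is not valid JSON is not an asset.
    if isinstance(data, str):
        try:
            data = json.loads(data)
        except json.JSONDecodeError:
            return False

    # json.loads can also produce a list or a number, so re-check the type.
    if not isinstance(data, dict):
        return False

    # Single version check, shared by both the dict and the string path.
    return bool(data.get("version"))
```

With this shape, a string such as "bla bla" simply yields False instead of raising, and the version check is written only once.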

Contributor Author

What if we verify that version: "4.1.0" exists in the stringified JSON, without converting it into a dictionary?

Contributor

@calina-c calina-c Aug 16, 2022

It should cover all the cases, but if it is not a dictionary (or convertible to one) then it is invalid anyway, so I would check both.

Comment thread ocean_provider/validation/RBAC.py Outdated

def _check_if_respects_file_encryption_schema(self):
    data = self.request["data"]
    if isinstance(data, list):
Contributor

can it be a stringified list?
I don't like the nesting here; it is not clear to me at first glance what this does. This means you need to improve its readability.

Contributor Author

@soonhuat is it necessary to check a stringified list of files in provider?

Contributor

@soonhuat soonhuat Aug 16, 2022

I don't think we need the stringified list check.
TODAY, if the provider is called from market or oceanJs, the data won't be a stringified list; it is always either a stringified DDO object or an asset file object with the structure below (schema 4.1.0):

{
    nftAddress,
    datatokenAddress,
    files: [
      {
        type: 'url',
        index: 0,
        url: "abcdefg",
        method: 'GET'
      }
    ]
}

Based on calina's suggestion, maybe renaming it to _is_file_encryption_data would make it self-explanatory.

Contributor Author

@mariacarmina mariacarmina Aug 16, 2022

Thanks @soonhuat for clarification. Should provider check if the list contains dictionaries with those specific keys?

Contributor

data is not a list, it is an object,
and probably lines 107 and 108 are enough already, since that is exactly the asset file encryption data structure.

Comment thread tests/test_RBAC.py


@pytest.mark.unit
def test_wrong_encrypt_request_payload(consumer_wallet, publisher_wallet, monkeypatch):
Contributor

please add tests for the individual functions as well, the test coverage is very poor on your added code. How do I know this? The nesting is deep and hard to read. Poorly readable means poorly testable, so that's why you couldn't add tests.

@mariacarmina mariacarmina requested a review from calina-c August 16, 2022 15:04
Comment thread ocean_provider/validation/RBAC.py Outdated
    return False

if isinstance(data, dict):
    return data.get("version", False)
Contributor

I would convert this to a bool.
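
A small illustration of why the conversion matters, assuming a dict payload like the ones discussed in this thread (the sample value is only an example): `data.get("version", False)` returns the version string itself when the key is present, not `True`.

```python
data = {"version": "4.1.0"}

# Without conversion the caller receives the stored value, here a string.
raw = data.get("version", False)

# Wrapping in bool() gives a real boolean for both present and missing keys.
as_flag = bool(data.get("version"))
missing_flag = bool({}.get("version"))

print(repr(raw), as_flag, missing_flag)
```

Both forms are truthy in an `if`, but only the converted one is safe to compare with `is True` or to serialize as a flag.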

Comment thread ocean_provider/validation/RBAC.py Outdated
    and list(file.keys()) == ["nftAddress", "datatokenAddress", "files"]
    and isinstance(file["files"], dict)
):
    return True
Contributor

shouldn't this be higher-level? Is it a file structure if just only one of the elements in data has the correct structure?

can you also look into the list comparison possible issues? What if I add the nftAddress, datatokenAddress and files in a different order? Will comparing with == work?

can files also be stringified?

I think we should only handle the data encryption separately, and leave the rest as is, without restrictions. As far as I know, anything should be encryptable.
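
To make the ordering concern concrete, here is a short sketch. The key names come from the hunk above; the placeholder values and variable names are assumptions for illustration only.

```python
expected_keys = ["nftAddress", "datatokenAddress", "files"]

# Same keys as expected, but supplied in a different order.
reordered = {"files": [], "nftAddress": "0x0", "datatokenAddress": "0x1"}

# Dicts preserve insertion order, so a list comparison is order-sensitive.
ordered_match = list(reordered.keys()) == expected_keys

# Comparing sets ignores the order in which the keys were supplied.
set_match = set(reordered.keys()) == set(expected_keys)

print(ordered_match, set_match)
```

So a check written with `==` on lists would reject a payload whose keys merely arrive in a different order, while a set comparison would accept it.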

Contributor

And actually, people can create their own private aquariuses with different schemas, so we shouldn't do any validation on this structure.

Comment thread tests/test_RBAC.py
validator = RBACValidator(request_name="EncryptRequest", request=req)
with pytest.raises(Exception) as err:
    validator.build_payload()
assert err.value.args[0] == "Data to encrypt is empty."
Contributor

you can simplify testing using match as an argument of raises instead of err.value.args
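
A sketch of that simplification, assuming pytest is installed. The `build_payload` function here is a hypothetical stand-in for `RBACValidator.build_payload`, written only so the example runs on its own; the error message is the one from the hunk above.

```python
import re

import pytest


def build_payload(request):
    # Hypothetical stand-in for the validator method under test.
    if "data" not in request:
        raise Exception("Data to encrypt is empty.")
    return {"data": request["data"]}


def test_missing_data_raises():
    # match is applied as a regular expression search against the message,
    # so the literal text is escaped to keep the final "." literal.
    with pytest.raises(Exception, match=re.escape("Data to encrypt is empty.")):
        build_payload({})
```

This removes the need to capture `err` and inspect `err.value.args[0]` after the `with` block.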

Contributor

@calina-c calina-c left a comment

Here is my 2c on this. Since this has gone through so many iterations, I would like to scrap it altogether and copy just the data-related part in a new PR. I don't think we should impose any restrictions on the payload. No validation, nothing extra. All we need to do is detect if there is an asset structure using the _is_asset function and send it to rbac. That's it. And it will be more efficient to copy that over into a new PR, for easier tracking of changes.

After copying the _is_asset function, add tests for that function particularly. Then a couple of extra tests in the rbac file: when data should be sent and when not.
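
The behaviour recommended here (no validation, no exception, and `data` forwarded only when it looks like an asset) might be sketched like this. Everything below is an assumption for illustration: `build_encrypt_payload` and the `request_name` key are hypothetical and do not reflect the real RBAC payload, and `_is_asset` is a compact guess at the helper mentioned in the review.

```python
import json


def _is_asset(data):
    # A dict, or a JSON string that decodes to a dict, with a truthy "version".
    if isinstance(data, str):
        try:
            data = json.loads(data)
        except json.JSONDecodeError:
            return False
    return isinstance(data, dict) and bool(data.get("version"))


def build_encrypt_payload(request):
    # Hypothetical payload shape; the real RBAC payload keys are not shown here.
    payload = {"request_name": "EncryptRequest"}

    # No restrictions on the input: missing or non-asset data is just not sent.
    data = request.get("data")
    if _is_asset(data):
        payload["data"] = data

    return payload
```

Tests for "when data should be sent and when not" would then assert that `"data"` appears in the payload for a dict or stringified dict with a `version`, and is absent for a files structure or a request without `data`.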

@soonhuat
Contributor

soonhuat commented Sep 1, 2022

Looks good, if

@services.route("/encrypt", methods=["POST"])

doesn't need validate decorator like @validate(EncryptRequest)

@calina-c calina-c merged commit 69d6444 into main Sep 5, 2022
@calina-c calina-c deleted the enhance-rbac-encrypt branch September 5, 2022 02:39

Development

Successfully merging this pull request may close these issues.

Enhance Provider Encrypt DDO mechanism to have Fine-grained permission checks

3 participants