WIP: Remove the need of manually marking roles as dirty #1038

Closed · wants to merge 10 commits

Conversation

sechkova (Contributor)

Fixes issues: #964, #958

Note: Although the best solution is undoubtedly to go for a general rework of the internal metadata representation, this pull request aims to be a short-term fix keeping the changes as close as possible to the current functionality.

I am opening it as a WIP to discuss the possible ways to handle some more complicated delegations scenarios.

Description of the changes being introduced by the pull request:

Remove the need to manually mark as dirty all roles whose metadata needs to be (re)written to the filesystem:

  • fix functions like Metadata.load_signing_key() that mark a role as dirty even though they shouldn't trigger an update of the corresponding metadata on the filesystem.
  • add a mark_role_as_dirty parameter to roledb.add_role() and the Targets class, which is needed to mark newly created delegated roles as dirty
  • add roledb._propagate_dirty_roles(), which recursively marks as dirty all roles affected by a roledb update, according to the role dependencies graph (see the sketch after the example chains below), e.g.:
   any update on targets:  'targets' -> 'snapshot' -> 'timestamp'
   update on targets' key: 'targets' -> 'root'

Note that the dependencies in the implementation are reversed in comparison with the specification: a targets key update is performed on the Targets object and then needs to be propagated to the root metadata.

  • add 'delegating_role' to the role metadata dictionary. This is needed to be able to propagate an update of the delegated role to the delegating one, e.g.:

'unclaimed' -> 'targets' -> 'snapshot' -> 'timestamp'
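
For illustration, here is a minimal sketch of how such a propagation could work. The dependency map and helper below are simplified assumptions for this write-up, not the exact code in this PR:

    # Illustrative sketch only (assumed names): a reversed dependency map
    # and a recursive helper that marks all dependent roles as dirty.
    _ROLE_DEPENDENTS = {
        'targets': ['snapshot'],
        'snapshot': ['timestamp'],
        'timestamp': [],
    }

    def _propagate_dirty_roles(rolename, dirty_roles):
      # Recursively add every role affected by an update of 'rolename'.
      for dependent in _ROLE_DEPENDENTS.get(rolename, []):
        if dependent not in dirty_roles:
          dirty_roles.add(dependent)
          _propagate_dirty_roles(dependent, dirty_roles)

    # Updating 'targets' also dirties 'snapshot' and 'timestamp':
    dirty = {'targets'}
    _propagate_dirty_roles('targets', dirty)
    # dirty == {'targets', 'snapshot', 'timestamp'}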

Open for discussion: What would be the purpose of a chain of delegations where two roles both delegate to a third?

   targets -> A -> C
   targets -> B -> C

Are both A and B delegating the same key and paths/targets to C? Is a scenario where A and B each delegate a different key to a role with the same name (C) meaningful? What should the chain of interdependencies be if C's key is updated?

While working on this I also noticed that the delegating role is not updated at all when the delegated role's key is changed (see #1037).

In addition:

Please verify that the pull request fulfills the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Add a mark_role_as_dirty parameter to (un)load_signing_key(),
False by default, since the main goal of these functions is to
load metadata from the filesystem.

Add mark_role_as_dirty parameter to remove_signature()
to match its corresponding add_signature().
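
A rough sketch of the resulting method signatures (defaults taken from this commit message; bodies elided):

    # Methods of the Metadata class; sketch only, bodies elided.
    def load_signing_key(self, key, mark_role_as_dirty=False):
      # False by default: loading a key from disk should not by itself
      # require the metadata to be re-signed and re-written.
      ...

    def remove_signature(self, signature, mark_role_as_dirty=True):
      # True by default, matching add_signature().
      ...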

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add a boolean parameter to give repository tools the flexibility
to mark newly added roles as dirty depending on whether they are
created or loaded from disk.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Update the tests accordingly for the new parameter of add_role().

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Remove unnecessary global declarations of variables which are not
assigned a value inside the functions (or not referenced at all).

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add a boolean parameter to the Targets class constructor so that
when a new delegation object is created, the new rolename is
correctly marked as dirty in roledb.

By default Targets objects are added to roledb with empty roleinfo
and are marked as dirty later on "update" but this is not the case
for delegations.
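
A sketch of what the constructor change might look like; the exact argument list and call site are assumptions based on this commit message:

    # Sketch, not the exact PR code.
    class Targets(Metadata):
      def __init__(self, targets_directory, rolename='targets', roleinfo=None,
          parent_targets_object=None, repository_name='default',
          mark_role_as_dirty=False):
        ...
        # Pass the flag through so that a newly created delegation is
        # registered as dirty, while roles loaded from disk are not.
        roledb.add_role(rolename, roleinfo, repository_name,
            mark_role_as_dirty=mark_role_as_dirty)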

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Recursively mark as dirty all roles affected by the roledb update
of a given role e.g.:
  'targets' -> 'snapshot' -> 'timestamp'
and in case of a key update:
  'targets' -> 'root'

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add a new field in internal role metadata for storing the delegating
role. This is needed since all updates on roledb are performed on
the delegated role which has no knowledge about its 'delegator'.

When a delegated role is marked as dirty the 'delegating_role'
information is used to trigger an update chain.

Update the load_repository() function to load the delegations
metadata starting from 'targets' in order to get the 'delegating_role'.
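
For illustration, a roleinfo entry could then look roughly like this (the exact field layout is an assumption):

    roleinfo = {
        'keyids': ['<keyid>'],
        'threshold': 1,
        'signatures': [],
        'paths': ['unclaimed/*'],
        # New field: the role that delegates to this one, used to
        # trigger the update chain upwards when this role is dirtied.
        'delegating_role': 'targets',
    }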

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Update the _log_status_of_top_level_roles() function to accommodate
the inter-dependencies triggered when marking a role dirty.

Do some basic clean up to avoid code repetition.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add TOP_LEVEL_ROLES as a global variable in roledb.
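
That is, roughly:

    # The four top-level roles defined by the TUF specification.
    TOP_LEVEL_ROLES = ['root', 'targets', 'snapshot', 'timestamp']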

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Update tests with the new logic for marking roles dirty

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@lukpueh (Member) left a comment

Thanks a ton for working on this, @sechkova!! ❤️

Note: Although the best solution is undoubtedly to go for a general rework of the internal metadata representation, this pull request aims to be a short-term fix keeping the changes as close as possible to the current functionality.

I don't want to spoil the fun, and I highly value your efforts, but I wonder if it doesn't make more sense to fix this after or in the course of a broader re-write. We do know how to work around the existing issues by marking metadata we want to be written to disk explicitly as dirty. Doing this automatically seems quite cumbersome and error-prone in the current state of the implementation.

Note that the dependencies in the implementation are reversed in comparison with the specification [...]

Do I understand correctly that you are pointing out the problem which @awwad has described at great length in #660, i.e. "a delegation is not a property of the delegated-to role."? If so, then yes, that is a problem of the current implementation, at least if we want to support more complex delegation graphs as you describe below.

add 'delegating_role' to role metadata dictionary.

I see how this is needed in the current design, where the signing key is stored on the delegated-to role. But ideally the delegation and its signing key are stored and modified on the delegating role, so that there is no need to propagate a key change upwards in the delegation graph but only downwards (see the first bullet point in #958 (comment)).
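
For reference, in the specification's metadata format the delegation, including the delegatee's key IDs, lives in the delegating role's targets metadata, roughly like this (simplified sketch):

    # Simplified sketch of the 'delegations' field stored on the
    # *delegating* role: rotating 'unclaimed''s key only modifies this
    # metadata, so the change only needs to propagate downwards.
    delegations = {
        'keys': {'<keyid>': {'keytype': 'ed25519',
                             'keyval': {'public': '...'}}},
        'roles': [{'name': 'unclaimed',
                   'keyids': ['<keyid>'],
                   'paths': ['unclaimed/*'],
                   'threshold': 1}],
    }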

What would be the purpose of a chain of delegations where two roles both delegate to a third? [...]

I am not sure if there is a real world use case for multiple roles delegating trust to the same role (@awwad calls this promiscuous delegation, see #660), maybe others do (cc @mnm678, @JustinCappos). At any rate, and at the risk of repeating myself, signing keys should be modified on the delegating role so there is no upwards propagation of the change. Also downwards propagation, in the case of promiscuous delegation, seems to be a bit tricky and probably depends on the actual scenario.

While working on this I also noticed that the delegating role is not updated at all when the delegated role's key is changed (see #1037).

Wow, thanks for discovering this! I will need to take a deeper look at it.

do some code cleanup and remove repetitive code from Repository.status()

Awesome! The cleanup should make it easier to re-think the desired behavior (see first two bullet points of #955).

  None.
  """

  global _dirty_roles

Since you don't re-bind _dirty_roles you don't need global here.

@@ -480,7 +584,6 @@ def mark_dirty(roles, repository_name='default'):
  securesystemslib.formats.NAMES_SCHEMA.check_match(roles)
  securesystemslib.formats.NAME_SCHEMA.check_match(repository_name)

  global _roledb_dict

Thanks for removing unnecessary global statements. Since you're at it, would you mind removing all of them here in roledb? :)

As a reminder, the global statement is not necessary for reading or mutating global mutables, such as dictionaries (it is necessary for re-binding though).

If I'm not mistaken we really only need it in clear_roledb, but please double-check.
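
A quick illustration of the distinction:

    _roledb_dict = {}

    def add_role(rolename, roleinfo):
      # Mutating the global dict: no 'global' statement needed.
      _roledb_dict[rolename] = roleinfo

    def clear_roledb():
      # Re-binding the name to a new object: 'global' IS needed here,
      # otherwise Python would create a local variable instead.
      global _roledb_dict
      _roledb_dict = {}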

  # Strip metadata extension from filename. The role database does not
  # include the metadata extension.
  if metadata_role.endswith(METADATA_EXTENSION):
    metadata_role = metadata_role[:-len(METADATA_EXTENSION)]


This is a matter of taste, but I personally like blank lines before an else clause. I wouldn't impose it as global code style, but I'd definitely leave it here, also for consistency with the else above and below.

  for metadata_role in metadata_files:
    metadata_path = os.path.join(metadata_directory, metadata_role)
    metadata_name = \
        metadata_path[len(metadata_directory):].lstrip(os.path.sep)

I might be missing something here, but isn't metadata_name == metadata_role?

@@ -991,7 +1000,7 @@ def add_signature(self, signature, mark_role_as_dirty=True):



-  def remove_signature(self, signature):
+  def remove_signature(self, signature, mark_role_as_dirty=True):

Marking roles dirty when removing (or adding) signatures seems a bit odd, since the flag is used elsewhere to signal that the actual payload was modified and that it needs to be re-signed and written to disk.

While I can see how we somehow need to indicate that the signature removal (or addition) needs to be persisted on disk, this is different and also does not need to be propagated to any other metadata.

@sechkova (Contributor, Author)

sechkova commented Jun 1, 2020

Thanks @lukpueh :))

I don't want to spoil the fun, and I highly value your efforts, but I wonder if it doesn't make more sense to fix this after or in the course of a broader re-write. We do know how to work around the existing issues by marking metadata we want to be written to disk explicitly as dirty. Doing this automatically seems quite cumbersome and error-prone in the current state of the implementation.

I do share your concerns ... while working on this PR, it seems I have rediscovered many of the already known delegation issues that you are also pointing at. I would agree that automatically marking metadata as dirty can wait.
However, it would be great to get your feedback on #1037, which is a consequence of the same design flaw in the implementation (updating the delegated role's attributes while they are kept in the delegating one).

In case we agree not to continue with this fix and to keep the workaround of manually marking roles as dirty, I think we can still benefit from this PR up to 4d86eed and the code cleanup here and there (and skip 93968f2).
This first part of the PR doesn't introduce major behaviour changes: it fixes #964 and correctly marks a new delegated role as dirty (no automatic propagation), which removes the need to manually mark delegated hashed bins as dirty (not the most user-friendly feature 😬)

I appreciate any other comments that would move this work forward in any direction, as well as references to old discussions to avoid repetition :)

@sechkova (Contributor, Author)

sechkova commented Jun 5, 2020

A potential solution to the newly described #1045 is in 3fb09c2, most likely with some additional rework to come.

@jku (Member)

jku commented Jul 19, 2021

This looks like it's not going to proceed any time soon, so I'm closing. If it re-animates, please re-open.

@jku jku closed this Jul 19, 2021