Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A simple, per-file metadata CRUD API #1060

Closed
wants to merge 42 commits into from

Conversation

trishankatdatadog
Copy link
Member

@trishankatdatadog trishankatdatadog commented Jun 26, 2020

Fixes issue #:

#1048

Description of the changes being introduced by the pull request:

A simple, per-file metadata CRUD API that should make it easy to do the most common operations, without resorting to reading and writing the entire repository all at once.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Cc @joshuagl @sechkova @MVrachev @lukpueh @woodruffw

@trishankatdatadog
Copy link
Member Author

There is so much incomplete with this draft PR, but I hope it gives the right idea. Please let me know if you vehemently disagree with the approach here!

Cc @joshuagl @sechkova and @MVrachev who have agreed to help write and polish the code

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your initial work on this @trishankatdatadog

Observation: typing support requires Python 3.5+ or the typing package and f-string support requires Python 3.6.
This needn't block the initial implementation, but assuming we continue to flesh out this low-level API and use it in the refactor it might be exposed to the client, where we want to continue to support Python 2.7 and thus we can't use f-strings and may not want to add an additional dependency on typing.

So far as I can see from a preliminary review the to-do list is:

  • keys
    • Support non ed25519 keys. Perhaps by making key type a part of: get_private_keypath(), write_keypair(), read_keypair(), etc? f2861bf
  • metadata
    • Figure out how to handle consistent snapshots
    • Figure out how to handle writing consistent targets not relevant at this low-level A simple, per-file metadata CRUD API #1060 (comment)
    • Add schema checks to read_from_json() 1fbff55
    • Implement update_signatures() 5ef60ca
    • Implement signable() for each of:
    • Implement update() for each of
    • Add method to Metadata to bump expiration by a delta, Metadata.bump_expiration(delta='1d')
      • It would be easy for the implementation if this took a timedelta object, i.e. Metadata.bump_expiration(delta=timedelta(days=1)) – I'd appreciate some thoughts on how useful/usable such an API is? 721def4
    • Add method to Metadata to bump version number; Metadata.bump_version() 721def4
  • Use StorageBackendInterface 0ca471e
  • We should raise sslib.FormatError or ValueError instead of AssertionErrors (for input validation) 76cb560
  • Add docstrings
  • Add unit tests – WIP here 57c98d4 and here 34a0680
  • Streamline metadata interaction model so that signatures and signed are only ever accessed as a property, like signable
  • make as many thing as possible private variables of the objects 76cb560

tuf/api/keys.py Outdated Show resolved Hide resolved
tuf/api/keys.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@trishankatdatadog
Copy link
Member Author

Hi Joshua,

Observation: typing support requires Python 3.5+ or the typing package and f-string support requires Python 3.6.
This needn't block the initial implementation, but assuming we continue to flesh out this low-level API and use it in the refactor it might be exposed to the client, where we want to continue to support Python 2.7 and thus we can't use f-strings and may not want to add an additional dependency on typing.

Agreed. I left the typing in there as to help communicate intent between different developers 🙂 So you can think of it as scaffolding we can remove before hitting production...

Add bump_role() API somewhere?

Aha, it's not clear, sorry, but it's supposed to be a part of Metadata.sign(). Let me know whether this is a good idea or not...

How should we divide-and-conquer going forward?

tuf/api/keys.py Outdated Show resolved Hide resolved
Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, @trishankatdatadog! I especially like the scaffolding in metadata.py.

A left a few inline comments and here are two more high-level observations:

tuf/api/keys.py Outdated Show resolved Hide resolved
tuf/api/keys.py Outdated Show resolved Hide resolved
tuf/api/keys.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/keys.py Outdated Show resolved Hide resolved
@joshuagl
Copy link
Member

joshuagl commented Jun 30, 2020

Observation: typing support requires Python 3.5+ or the typing package and f-string support requires Python 3.6.
This needn't block the initial implementation, but assuming we continue to flesh out this low-level API and use it in the refactor it might be exposed to the client, where we want to continue to support Python 2.7 and thus we can't use f-strings and may not want to add an additional dependency on typing.

Agreed. I left the typing in there as to help communicate intent between different developers 🙂 So you can think of it as scaffolding we can remove before hitting production...

I assumed it was something like that, but thought it worth explicitly noting.

Add bump_role() API somewhere?

Aha, it's not clear, sorry, but it's supposed to be a part of Metadata.sign(). Let me know whether this is a good idea or not...

Ah, understood! We'll need a way to set/pass a delta by which to increase the expiration, which doesn't feel like it fits naturally into the method signature of sign() and is even less clean in the method signature of write_to_json(), from which sign() is called.

I thought a delta by which to bump the expiration could become a property of the Metadata object (i.e. PEP 458 recommends specific expiry deltas per role and this would map nicely) however the delta isn't a part of the metadata itself and therefore couldn't be loaded into the object by reading the on-disk representation... therefore – this wouldn't work.

As this is defined as a low-level (or granular?) API, perhaps we should just be explicit and have functions for incrementing version number and increasing expiration. We can build convenience API around these in future as we flesh this out?

@joshuagl
Copy link
Member

How should we divide-and-conquer going forward?

The to-do list I sketched above has many of the items broken down by module, perhaps you could tackle keys while we work on metadata?

Then we can hammer out typing, docstrings, tests, coding style and assertions as a follow-on task?

@joshuagl
Copy link
Member

joshuagl commented Jul 1, 2020

I've got some WIP changes https://github.com/joshuagl/tuf/commits/joshuagl/simple-update-api%2BMVrachev-length-hashes-optional and have updated the to do list to indicate what I have done.

Note: I'm using a helper from #1031 so my branch includes a merge of that PR

Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
Signed-off-by: Joshua Lock <jlock@vmware.com>
@joshuagl
Copy link
Member

joshuagl commented Jul 1, 2020

all of the code I've written so far assumes that the sequence of events is:

metadata = metadata.Metadata()
metadata.read_from_json('/path/to/metadata.json')

and therefore self.signatures and self.signed exist and are meaningful. That means we can't currently instantiate an object, populate it with code and write it to disk. We can only load and modify. That seems undesirable?

Should we smooth out the interaction model so that signatures and signed are only ever accessed as a property, like signable?

Additionally, should we make read_from_json() a class method that can be used to create an instance of an object by loading a file?

snapshot = Snapshot.read_from_json('/path/to/snapshot.json')

@trishankatdatadog
Copy link
Member Author

Should we smooth out the interaction model so that signatures and signed are only ever accessed as a property, like signable?

Yes, we should. Also, signatures and signed should be objects under signable, no?

Additionally, should we make read_from_json() a class method that can be used to create an instance of an object by loading a file?

💯

trishankatdatadog and others added 6 commits July 1, 2020 18:39
Signed-off-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Skip a keyword if it is optional in the schema and the value
passed in is set to None.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
return generate_targets_metadata()

# Add or update metadata about the target.
# TODO: how to handle writing consistent targets?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the initial API design I assume that it covers only the case of using existing fileinfo. If this is true then writing consistent targets is not applicable here:
https://github.com/theupdateframework/tuf/blob/de8649e12a850fccbc745f0129d43956ef7da33b/tuf/repository_lib.py#L155

If we want to add the option to NOT use existing fileinfo and using tuf for calculating hashes, we need to extend the api (maybe on a later pass) with targets paths and storage_backend.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@trishankatdatadog trishankatdatadog force-pushed the trishankatdatadog/simple-update-api branch from 497955a to b9a9050 Compare July 21, 2020 20:49
@trishankatdatadog
Copy link
Member Author

Before we finalize, I also think we should consider the notion of unbounded vs bounded keys.

@joshuagl
Copy link
Member

The Vault integration sounds very interesting, great work @trishankatdatadog. 💯

I wonder whether a key management interface would be better suited to being integrated at the securesystemslib level, rather than TUF?

Even if TUF is the more appropriate home, I think it would be better to keep this PR focused on introducing the CRUD API. With any extra functionality built on top of that introduced in future PR's. What do you think?

On the CRUD API; @woodruffw did you have chance to review the proposed API and see how well it suits your needs for Warehouse integration?

@trishankatdatadog
Copy link
Member Author

Even if TUF is the more appropriate home, I think it would be better to keep this PR focused on introducing the CRUD API. With any extra functionality built on top of that introduced in future PR's. What do you think?

No problemo. I will let @lukpueh, @woodruffw, and yourself decide.


class VaultKey(Key):

class AuthenticationError(Exception): pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit, but perhaps don't nest this exception class or the following enums?

Having the former in a standalone exceptions module and the latter in a models (or similar) module makes it a bit easier to reuse them in other contexts.

Signed-off-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
# Classes.


class Metadata(ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any ABC features are used anymore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's remove.

def sign(self, key_ring: KeyRing) -> JsonDict:
# FIXME: Needs documentation of expected behavior
signed_bytes = self.signed_bytes
signatures = self.__signatures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any intention behind self.__signatures and self.signatures both existing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I haven't touched the sign/verify functions yet, because I wasn't sure about the original intended behavior (that's why I left above comment there).

But I think we agreed that my plans for these functions as outlined in #1060 (comment) seemed reasonable.

self.__update_signature(signatures, key.keyid, signature)

self.__signatures = signatures
return self.signable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.signable seems like a leftover?

}

@classmethod
def read_from_json(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect read_from_json and write_to_json to be part of the same Metadata class. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the only problem is that read_from_json, as opposed to write_to_json, requires knowledge about the class it should instantiate an object from for the signed field. That's why it made sense to me that this method is on the Signed class (i.e. any of Target/Timestamp/Snapshot).

But it would be easy to also have a Metadata.read_from_json that instantiates the corresponding child class of Signed, based on the "signed.type" JSON field.

That's what we do in in-toto.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI 41fe57f

@sechkova
Copy link
Contributor

AFAICT initially this API was created having in mind easily updating a single targets metadata but if we consider it as a base of a tuf rework we need an implementation of a Root(Signed) class too.

lukpueh added a commit to lukpueh/tuf that referenced this pull request Aug 18, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**Some more design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python2

**TODO:**

See doc header TODO list
lukpueh added a commit to lukpueh/tuf that referenced this pull request Aug 18, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**Additional design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python =< 3.5

**TODO: See doc header TODO list**

Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/tuf that referenced this pull request Aug 18, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**TODO: See doc header TODO list**

**Additional design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python =< 3.5

Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh lukpueh mentioned this pull request Aug 18, 2020
3 tasks
lukpueh added a commit to lukpueh/tuf that referenced this pull request Aug 18, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**TODO: See doc header TODO list**

**Additional design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python =< 3.5

Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/tuf that referenced this pull request Aug 20, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**TODO: See doc header TODO list**

**Additional design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python =< 3.5

Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
MVrachev pushed a commit to MVrachev/tuf that referenced this pull request Sep 17, 2020
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.

The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)

The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.

Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).

This commits also adds tests.

**TODO: See doc header TODO list**

**Additional design considerations**
(also in regards to prior sketches of this module)

 - Aims at simplicity, brevity and recognizability of the wireline metadata
   format.

 - All attributes that correspond to fields in TUF JSON metadata are public.
   There doesn't seem to be a good reason to protect them with leading
   underscores and use setters/getters instead, it just adds more code, and
   impedes recognizability of the wireline metadata format.

 - Although, it might be convenient to have short-cuts on the Metadata class
   that point to methods and attributes that are common to all subclasses of
   the contained Signed class (e.g. Metadata.version instead of
   Metadata.signed.version, etc.), this also conflicts with goal of
   recognizability of the wireline metadata. Thus we won't add such short-cuts
   for now. See:
   theupdateframework#1060 (comment)

 - Signing keys and a 'consistent_snapshot' boolean are not on the targets
   metadata class. They are a better fit for management code. See:
   theupdateframework#1060 (comment),
   and theupdateframework#660.

 - Does not use sslib schema checks (see TODO notes about validation in
   doc header)

 - Does not use existing tuf utils, such as make_metadata_fileinfo,
   build_dict_conforming_to_schema, if it is easy and more explicit to
   just re-implement the desired behavior on the metadata classes.

 - All datetime's are treated as UTC. Since timezone info is not captured in
   the wireline metadata format it should not be captured in the internal
   representation either.

 - Does not use 3rd-party dateutil package, in order to minimize dependency
   footprint, which is especially important for update clients which often have
   to vendor their dependencies.
   However, compatibility between the more advanced dateutil.relativedelta (e.g
   handles leap years automatically) and timedelta is tested.

 - Uses PEP8 indentation (4 space) and Google-style doc string instead of
   sslab-style. See
   secure-systems-lab/code-style-guidelines#20

 - Does not support Python =< 3.5

Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@joshuagl
Copy link
Member

joshuagl commented Nov 26, 2020

The core feature of this PR, the metadata API, has landed in #1112. That leaves the Vault functionality of this PR to resolve.

I think securesystemslib would be a good home for it, but either way. Should we close this PR and open an Issue in sslib or here to request the Vault feature with a link to this PR for a reference implementation?

@trishankatdatadog
Copy link
Member Author

The core feature of this PR, the metadata API, has landed in #1112. That leaves the Vault functionality of this PR to resolve.

I think securesystemslib would be a good home for it, but either way. Should we close this PR and open an Issue in sslib or here to request the Vault feature with a link to this PR for a reference implementation?

Yes, please, we should upstream the Hashicorp Vault functionality against securesystemslib

@joshuagl
Copy link
Member

Closing this PR as the metadata API landed in #1112.

Cross-linking to a securesystemslib Issue requesting an enhancement to support Hashicorp Vault: secure-systems-lab/securesystemslib#307

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants