The Curious Case of *.descriptor IDs #61

shuaibsiddiqui opened this issue Mar 22, 2016 · 21 comments

@shuaibsiddiqui
Contributor

Following up on @tsoenen's vnfd_id query in sonata-nfv/son-mano-framework#33, I'll try to present the issue below:

If we want to keep the PD, NSD, & VNFD untouched within the SP, then how do we handle:

i) the URIs of the images stored by the GK (after extraction, parsing, & validation)?
a) Do we only consider the URLs entered by the developer for downloading the image? That is, there would be no need for the GK to store the image anywhere. [Note: the SP may get errors in the middle of the instantiation procedure if the URL is not working.]
b) If the GK stores the image somewhere locally, do we then edit the VNFD to store that info in the virtual_deployment_units:vm_image field?

ii) the UUIDs generated by the SP Catalogues?
a) If the NSD uses internal ids (vnf_id) to reference the different VNFs it contains, then what is the purpose of generating UUIDs in the SP Catalogue? The VNFD can be uniquely identified by the naming convention anyway. In case of service instantiation, the GK will fetch the required VNFDs by their unique names as found in the package/NS descriptor.
b) Or do we edit the NSD/VNFD and replace the id field with the generated UUIDs? [Note: if a developer fetches an NSD/VNFD from the SP Catalogue, this id field should be replaced with its unique name value.]

I hope I captured all the aspects of the discussion we had on this topic. Please add anything I have missed.

@mbredel
Contributor

mbredel commented Mar 22, 2016

I just added a first version of a VNFD-Metadata schema (NSD-Metadata is still missing, but we would need something similar there) that addresses these issues. The metadata serves as glue between the VNFD, which IMHO should stay untouched by the SP, and the service platform. On the one hand, it identifies the VNFD by the group-name-version triple; on the other hand, it contains the additional information, like the UUID and references to the VNF images, for example.

Please have a look at https://github.com/sonata-nfv/son-schema/tree/master/function-descriptor and also at https://github.com/sonata-nfv/son-schema/blob/master/docs/descriptior-relations.png, which depicts my initial thoughts on the relations between the different entities (thinking a bit in standard database terms).
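
To make this concrete, a VNFD metadata record along those lines might look roughly like the following (written as a Python dict; the field names and example values are illustrative guesses on my part, not the actual son-schema definition):

```python
# Hypothetical VNFD metadata record: it references the untouched VNFD by its
# group-name-version triple and carries the SP-internal additions.
vnfd_metadata = {
    "vnfd": {                      # identifies the descriptor, which itself stays untouched
        "group": "eu.sonata-nfv",
        "name": "firewall-vnf",
        "version": "0.2",
    },
    "uuid": "d20fbafa-99d6-4bc8-a565-65e5e6784007",       # SP-generated identifier
    "vm_images": ["file:///var/images/firewall.qcow2"],   # internal references to the stored images
    "created_at": "2016-03-22T00:00:00Z",                 # example of additional information
}
```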

@jbonnet
Member

jbonnet commented Mar 23, 2016

I'm really concerned about the (higher) complexity we're adding to the SP, and the (lower) gains we are, or will be, getting from it. All these schemas are great for validating the files we're exchanging between systems, but the changes they imply are, for instance, making us change the Catalogues we brought in from T-NOVA, and thus wasting resources that could be used in other parts of the SP development.

I don't have a solution to present right now, but I'm not comfortable with this. For me, the simplest way is to provide a set of data to a system and get an ID associated with that data. That ID is then used for further interactions (queries, updates and deletes). No more, no less. All the detours we're taking (adding the meta-data concept, the group+name+version to the NSDs/VNFDs, and so on) do not, to me right now, bring a benefit tremendous enough to justify their adoption.

Could we adopt them later? Could we make them optional? I don't know... Further ideas?

@mbredel
Contributor

mbredel commented Mar 23, 2016

I still don't see the higher complexity - but maybe I am missing something? OK, for sure the T-NOVA catalogs have to be adapted, but - judging from the T-NOVA descriptors - this is something we have to do anyway. So where do you see the additional complexity? Maybe the schema approach makes it look more complex than it actually is?

For me, the schema is just the ground truth, e.g. for implementing the database. That is, if it were me developing the catalog, I would have a script that grabs the schema and initializes the database with it. In fact, I am doing this for my Java code: I create the Java POJOs from the schema - automatically.

Regarding the IDs and the metadata: I am not a NoSQL database expert, but in SQL this is a no-brainer. Instead of a single ID field you have a composite key. Thus, you initialize the DB using ... PRIMARY KEY (Column_Group, Column_Name, Column_Version) ... and when you query the DB you use SELECT * FROM nsd_table WHERE (group = X) AND (name = Y) AND (version = Z) instead of SELECT * FROM nsd_table WHERE (uuid = X) in the case of UUIDs.

The metadata just adds a new table that references the primary composite key and contains the additional information. If this additional information contains a UUID, it is possible to do a back-reference to the NSD table - in SQL using a JOIN. Again - I don't see any complexity.
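
A minimal sketch of that composite-key idea, using SQLite from Python purely for illustration (table and column names are made up; "grp" stands in for "group", which is a reserved word):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nsd (
        grp TEXT, name TEXT, version TEXT,
        descriptor TEXT,                        -- the NSD itself, e.g. as JSON
        PRIMARY KEY (grp, name, version)        -- composite key instead of a single ID column
    );
    CREATE TABLE nsd_metadata (
        uuid TEXT PRIMARY KEY,                  -- SP-generated UUID
        grp TEXT, name TEXT, version TEXT,      -- reference to the composite key
        created_at TEXT,
        FOREIGN KEY (grp, name, version) REFERENCES nsd (grp, name, version)
    );
""")

# Query by the group+name+version triple ...
conn.execute(
    "SELECT * FROM nsd WHERE grp = ? AND name = ? AND version = ?",
    ("eu.sonata-nfv", "firewall-vnf", "0.2"),
)

# ... or by UUID, via the metadata table (the back-reference / JOIN mentioned above).
conn.execute(
    "SELECT nsd.* FROM nsd JOIN nsd_metadata USING (grp, name, version)"
    " WHERE nsd_metadata.uuid = ?",
    ("d20fbafa-99d6-4bc8-a565-65e5e6784007",),
)
```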

Am I missing something here?

@shuaibsiddiqui
Contributor Author

I agree with @jbonnet on avoiding unnecessary complexity, but we need to resolve the issue somehow, preferably in an efficient manner and without duplicating data.

Before we look into the implementation aspect of the meta-data (it is not as straightforward in a non-relational DB as in a relational one, but I think it can be done), we should think it through all the way, keeping in mind all the user stories of the SP Catalogues.

I have some doubts about descriptor-relations.png:

  1. Why do we need the NSD meta-data?
    According to the figure, it will contain the UUID, creation date, and instances.
  2. Why do we want to reference the "to be" instantiated instances here in the meta-data, when that information would be available in the NS repo?
  3. The "creation date" is not an essential/vital piece of information, so we are left with the UUID. The SDK and BSS will not be using the UUID; that is, the GET that the GK will perform towards the SP Catalogues, in order to serve either a developer downloading an NSD or a service start request from the BSS, will use the unique name of the network service. The only use of the UUID (for the NSD) I can foresee now is that the GK, while sending the service start request towards the SLM, includes only the UUID of the NSD (not the unique name), and in the NSR the NSD will be referenced using its UUID. But it can be argued that this can be achieved with the unique name as well, which begs the question: DO WE REALLY NEED a UUID for the NSD?

Now, for the VNFD meta-data: apart from the UUID & creation date, we have an essential piece of information, the image reference (vm_image), which can be an external URL or an internal URI. For internal consumption (within the SP) of the image reference, we should go for the internal URI only, because an external URL is out of the SONATA SP's control and may lead to errors during service instantiations.

I have a couple of other proposals to tackle this issue, but this post is already long enough; I'll wait for feedback and post them separately.

@mbredel
Contributor

mbredel commented Mar 24, 2016

The picture, i.e. descriptor-relations.png, was just a start to have something to discuss. It is far from what is going to be implemented - I guess :-)

In fact, I don't really know what would go into the descriptor metadata entities. The examples, like the creation_time, were just - well - examples. So this is up for discussion.

Regarding the NSD-metadata: I am not sure if we really need it - for sure we can live without the UUID (well, group+name+version IS a unique ID, even if you just concatenate the strings). On the other hand, once we have solved the issue of how to design the database for the VNFD, it might be wise to use the same structure for the NSD. However, I also started with the VNFD as I see a clear value there.

@shuaibsiddiqui
Contributor Author

Instead of using meta-data to tackle the UUIDs and the variable info inside the VNFD, I propose another possible approach.

Once the GK sends the NS_package.zip file to the SP Catalogues, the Catalogues will:

i) Extract the PD, NSD, VNFDs, etc.
ii) Generate a UUID for each VNFD, the NSD, & the PD, and insert it in a new id field inside the respective descriptor file. (We are NOT going to update the vnf_id field inside the NSD.)
iii.a) If a VNFD contains an external URL, download the image.
iii.b) Store the image locally and edit the vm_image field inside the VNFD accordingly.
iv) Store the VNFDs, NSD, and PD in the SP Catalogues. (A rough sketch of these steps follows below.)
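
As a rough sketch of steps i)-iv) above, assuming the package is a plain zip of YAML descriptors and the image is fetched with a simple HTTP GET (the image store path and the catalogue.store() call are placeholders, not existing SONATA code):

```python
import uuid
import zipfile
import urllib.request
from pathlib import Path

import yaml

IMAGE_STORE = Path("/var/images")   # hypothetical local image store

def onboard_package(package_path, catalogue):
    # i) Extract the PD, NSD, and VNFDs from the package.
    with zipfile.ZipFile(package_path) as pkg:
        descriptors = [
            yaml.safe_load(pkg.read(name))
            for name in pkg.namelist() if name.endswith((".yml", ".yaml"))
        ]
    for desc in descriptors:
        # ii) Add a new, SP-generated id field; vnf_id inside the NSD stays untouched.
        desc["id"] = str(uuid.uuid4())
        for vdu in desc.get("virtual_deployment_units", []):
            image = vdu.get("vm_image", "")
            if image.startswith("http"):
                # iii.a) External URL: download the image ...
                local_copy = IMAGE_STORE / Path(image).name
                urllib.request.urlretrieve(image, str(local_copy))
                # iii.b) ... store it locally and point vm_image at the local copy.
                vdu["vm_image"] = str(local_copy)
        # iv) Store the edited descriptor in the SP Catalogues.
        catalogue.store(desc)
    return descriptors
```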

Now, in the case of a BSS service start request:

  1. The GK GETs the required NSD and VNFDs from the SP Catalogues using the unique name and forwards them to the SLM.
  2. The SLM uses the id field inside the NSD to create the reference in the NSR. The FLM does the same for creating the NFR. Furthermore, the FLM can use the vm_image field in the VNFD to access the image when forwarding the deployment request towards the Infrastructure Adaptor.

In the case of a developer download request:

  1. The GK performs the GET using the unique name towards the SP Catalogue.
  2. The SP Catalogue will strip the id field off the requested NSD & VNFDs. It will also update the vm_image field in each VNFD according to the structure of the new package to be returned to the GK, and eventually to the developer (a small sketch of this step follows below).
    In case we decide to send back the requested NSD piece by piece (that is, in multiple POSTs, assuming that the SDK will compile the package itself), we can update the vm_image field accordingly.
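
A small sketch of that strip-off step on the Catalogue side (the deep copy and the original-URL bookkeeping are my assumptions for illustration; the field names follow the descriptors discussed above):

```python
import copy

def prepare_for_developer(stored_vnfd, original_image_urls):
    """Return the VNFD as the author provided it: drop the SP-internal id
    and restore the external vm_image references."""
    vnfd = copy.deepcopy(stored_vnfd)
    vnfd.pop("id", None)   # SP-internal UUID, not part of the authored descriptor
    for vdu in vnfd.get("virtual_deployment_units", []):
        internal_uri = vdu.get("vm_image")
        if internal_uri in original_image_urls:
            # Replace the internal URI with the URL originally given by the developer.
            vdu["vm_image"] = original_image_urls[internal_uri]
    return vnfd
```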

In this manner, for anything external to the SONATA SP, the PD, NSD, & VNFD remain untouched (the same as provided by the author), but for its internal workings the SP adapts them accordingly, without the need for meta-data management. [See also #63]

What do you think? Pros and cons? :)

@mbredel
Contributor

mbredel commented Mar 24, 2016

I like this approach - but I also think it is not really different from the meta-data approach :-) We are adding additional information to the VNFD. To me it looks a bit like the NoSQL way of doing the meta-data thing, whereas my original post was more tailored to relational thinking. But this might be the right approach.

Now, thinking about the SCHEMA files again: we could have a VNFD-Extended schema that inherits the VNFD schema and adds the additional fields. This VNFD-Extended schema would then be the ground truth for the catalogues. Would that be reasonable?
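
A minimal sketch of what such a VNFD-Extended schema could look like in JSON Schema terms (written here as a Python dict; the $ref target and the added field names are assumptions, not the actual son-schema layout):

```python
# Hypothetical VNFD-Extended schema: everything the base VNFD schema requires,
# plus the SP-internal additions, composed via JSON Schema's "allOf".
vnfd_extended_schema = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "allOf": [
        {"$ref": "vnfd-schema.yml"},   # the base VNFD schema (file name assumed)
        {
            "type": "object",
            "properties": {
                "uuid": {"type": "string"},        # SP-generated identifier
                "created_at": {"type": "string"},  # illustrative only
            },
            "required": ["uuid"],
        },
    ],
}
```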

Another idea: If we think about a pure stand-alone package catalogue that can exist outside the service platform, we could just store the package (and its content) and leave it untouched there. The SP can then modify the VNFD internally ... (We would need to think about how to address the sub-package content, but that's a different story for year-2 :-))

@shuaibsiddiqui
Contributor Author

@mbredel Thanks for the feedback! We'll move forward with the implementation of the SP Catalogue accordingly.

In terms of the schema, the only additional field we require would be the id field, in both the NSD and the VNFD.

In my earlier post I mentioned I had a couple of ideas; what you proposed, a stand-alone package catalogue, is exactly what I had in mind :)

@jbonnet
Member

jbonnet commented Mar 24, 2016

@shuaibsiddiqui I was thinking of extracting the relevant documents from the Package in the Gatekeeper and submitting them to the Catalogues in JSON. In this way the Catalogues might stay very much like T-NOVA's...

What do you think?

@shuaibsiddiqui
Contributor Author

Right now, the SP Catalogues can handle both JSON and YAML descriptors. If they receive them in YAML, they translate them into JSON and store them; if in JSON, they store them directly.

Another important question that I have asked before is how the GK POSTs the package and its content towards the SP Catalogue: does it send the whole zip file, or its content one by one? I would like to dig into the pros & cons of the two approaches.

I already elaborated in an earlier post how the SP Catalogue will handle it if it receives the complete .zip file.

But if it receives the content piece by piece, then we need to confirm a couple of things:

  1. Who stores the .zip file, i.e. the complete package? The GK or the SP Catalogue? I guess it would be easier for the GK, because the SP Catalogue would need to put all the pieces back together. Or do we not store the complete package .zip file at all, assuming that the SDK will put it back together itself upon receiving all the pieces?
  2. Who stores the image and edits the vm_image field inside the VNFD? Does the GK do it before POSTing to the SP Catalogue, or does the SP Catalogue do it?

Either we do all the preprocessing (storing images, generating UUIDs, editing descriptors, etc.) in the GK and POST the results to the SP Catalogue, or we send the .zip to the SP Catalogue and it does the preprocessing before storing them. Or is there a nice middle way?

@jbonnet
Member

jbonnet commented Mar 29, 2016

@shuaibsiddiqui:

  1. I was thinking of handling the .son file (only temporarily) in the GK, since it is not a relevant concept within the SP (I'd prefer not to handle it at all, since a package is really only relevant within the SDK, but the work is done now). Therefore, only the {P|NS|VNF}Ds in JSON would be submitted to the Catalogues. Possible problem: if one or more of the {P|NS|VNF}Ds fails to be stored, should all fail? The Catalogues generate the UUIDs and return them to the GK (which returns them to the SDK).
    Whenever the SDK requests a package, the GK retrieves all the relevant {P|NS|VNF}Ds from the Catalogues, builds the package file, and sends it to the SDK. I'm assuming we can link all the NSDs/VNFDs/* from a PD.

  2. Storing images should be outside of the SP's scope, at least for the first version. The developer should include its link (say, from Dropbox or from Amazon S3) in the VNFD, and that cannot be changed.

@jbonnet
Member

jbonnet commented Mar 29, 2016

@mbredel:

It's not the implementation I'm concerned about (and it's not about SQL vs. NoSQL: every database, with very different implementations, has some sort of unique index over multiple fields), it's how this complexity leaks out of the SP.

As I've written before, if we're open-sourcing our code, it should have the lowest barrier to entry possible. And that, at the time of writing, means having REST over HTTP for the interfaces. If we're using REST, we need a unique way to reach a resource that has been created. The REST way is an ID that is created with a POST and returned to the requester, so that it can be used in the other operations (GET, PUT, PATCH, DELETE). This is how we built the Catalogues in T-NOVA.

Using group+name+version means 3 fields, not one, which poses problems in composing URIs. We might merge them into one field (e.g., separated by "||"), thus fulfilling REST's need for a unique ID field, but then again, we're making things more complex than they need to be: after all, we're dealing with a Service Platform instance, not the whole world of possible unique names, right?

And if we extend this unique way of referring to Packages, Services and Functions by group+name+version, we're making it even more complex.

So, my proposal is to use group, name and version as properties of the descriptor (actually, name and version are already properties of NSDs and VNFDs). We can enforce uniqueness on this trio. But the way to identify a resource is a (UU)ID.
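
A minimal sketch of that proposal, assuming a MongoDB-backed catalogue (collection, field and endpoint names are only illustrative): uniqueness is enforced on the trio, while a single (UU)ID remains the identifier used in the REST URIs.

```python
import uuid
from pymongo import MongoClient, ASCENDING

db = MongoClient()["catalogue"]

# Enforce uniqueness on the group+name+version trio ...
db.nsd.create_index(
    [("group", ASCENDING), ("name", ASCENDING), ("version", ASCENDING)],
    unique=True,
)

def store_nsd(descriptor):
    """Store a descriptor and return the (UU)ID used to address it over REST."""
    record = dict(descriptor)
    record["_id"] = str(uuid.uuid4())   # ... but address the resource by a single ID
    db.nsd.insert_one(record)           # raises DuplicateKeyError if the trio already exists
    return record["_id"]                # e.g. used in GET /network-services/{id}
```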

Do we have an agreement on this?

@shuaibsiddiqui
Contributor Author

@jbonnet A few comments:

  1. Submitting only JSON to the Catalogues works fine.
  2. If storing one of the *Ds fails, we can try the following:
    i) roll back (like in banking transactions) and ask the developer to try again, OR
    ii) if the GK has validated the package, it may retry the POST for the failing one before notifying the developer to try again.
  3. UUIDs are only for SP consumption. There is no need to send them to the SDK.
  4. About storing images: I think storing images locally is within the SP's scope; otherwise the SP is dependent on external links. We don't know how long it will take to download the image (if the link is available at all), and this will delay the service instantiation process. The internally stored image would only be used for SP consumption, with Dropbox or other external links serving SDK requests.
  5. I'm sorry, I got lost on the NSD properties and enforcing uniqueness via the trio while using a UUID for identification. Can you please elaborate a bit further?

@dang03

dang03 commented Mar 29, 2016

@jbonnet
Since the beginning of the Catalogue implementation I have been following the T-NOVA Catalogues' way. Many changes have been made since, but the concept behind it is still the same. Our current Catalogues' API implementation is able to receive descriptors from the GK in YAML format (though it could support JSON); however, any response from the Catalogues is currently in YAML.
To address the problem of a descriptor failing to be stored, one idea is: the Catalogues send a response to the GK with the UUID or name trio as identifier whenever a descriptor is stored. If the GK sends a burst of descriptors (an NSD and some VNFDs) to the Catalogues and one fails, the GK can decide to remove the stored ones using delete operations for each descriptor identifier... or simply leave the stored ones in the Catalogues if the store operation is going to be retried.
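
A minimal sketch of that compensation logic on the GK side, assuming a plain REST interface (the base URL and endpoint paths are made up for illustration):

```python
import requests

CATALOGUE = "http://sp-catalogue:4011"   # hypothetical base URL

def onboard(descriptors):
    """POST a burst of descriptors; on any failure, delete the ones already stored."""
    stored = []   # identifiers returned by the Catalogues for successful POSTs
    for kind, body in descriptors:            # e.g. [("network-services", nsd), ("vnfs", vnfd1), ...]
        resp = requests.post(f"{CATALOGUE}/{kind}", json=body)
        if resp.ok:
            stored.append((kind, resp.json()["uuid"]))
        else:
            # Roll back: remove everything stored so far, then report the failure.
            for stored_kind, identifier in stored:
                requests.delete(f"{CATALOGUE}/{stored_kind}/{identifier}")
            raise RuntimeError(f"storing the {kind} descriptor failed: {resp.status_code}")
    return stored
```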

On the identifier topic I agree with you, José: right now, our SP Catalogues implementation can deal with either a UUID ("d20fbafa-99d6-4bc8-a565-65e5e6784007") or the name trio convention ("eu.sonata-nfv.firewall-vnf.0.2") as the identifier of a unique descriptor. For me it's not a problem, but we need to decide which will be used and in which field this value will be stored. Our database indexes are internally working with UUIDs anyway, which could be used in the Service Platform too.

@mbredel
Contributor

mbredel commented Mar 31, 2016

@jbonnet Regarding the IDs:
Actually, I said from the beginning that we can have (UU)IDs alongside the name-trio convention :-)

I guess I know where the confusion is coming from: when I think about descriptors, I think of a file that really (only) describes a function (or service). This file is generated at the SDK and shipped to the SP - possibly to multiple independent SPs. That also means that the descriptor - to me - is NOT a database schema. Thus, if you store the descriptor in a DB, you can of course add additional data, like the (UU)ID. This view, however, seems different from the T-NOVA approach, for example, where the descriptor is a database schema, isn't it?

@dang03 Regarding the database:
How do you document the database schema? I kind of expected the schemas in this repo to be the ground truth also for the catalog databases, that is, that the databases are generated from the schemas. No?

@dang03

dang03 commented Mar 31, 2016

The database (MongoDB) is a non-relational DB. However, it uses model classes to define the schemas of the database objects (which can be found in /models/catalogue_models.rb). We use your schemas as the base to define the data models, but right now these data models are hardcoded and only implement the critical or required fields from your schema, e.g. name, group and version. My idea is to implement the whole data model for the databases once your schemas reach a final status. Another planned feature is to soon implement syntax validation against the schemas before storing or updating objects in the database.
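
For the planned validation step, a minimal sketch of validate-then-store (shown in Python with the jsonschema package purely for illustration; the schema file name and collection name are assumptions):

```python
import yaml
import jsonschema
from pymongo import MongoClient

# Load the ground-truth schema (assumed to be a local copy of the son-schema NSD schema).
with open("nsd-schema.yml") as f:
    nsd_schema = yaml.safe_load(f)

nsd_collection = MongoClient()["catalogue"]["nsd"]

def store_if_valid(nsd):
    """Only descriptors that validate against the schema go into the database."""
    jsonschema.validate(instance=nsd, schema=nsd_schema)   # raises ValidationError otherwise
    nsd_collection.insert_one(nsd)
```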

@mbredel
Contributor

mbredel commented Mar 31, 2016

@dang03 Thanks for the update. Would it be possible to generate the code automatically from the schema? I am doing something similar in Java (http://www.jsonschema2pojo.org), which works reasonably well :-)

My fear is that the schemas will never be stable - just judging from what I see in the T-NOVA repos :-) Thus, it might be beneficial to have a (semi-)automated approach.

@jbonnet
Member

jbonnet commented Mar 31, 2016

@mbredel, @dang03
MongoDB is just one of the NoSQL databases that can be schema-less: we took advantage of that feature given the extremely high dynamics of the NSDs/VNFDs we had. The idea is that we validate the NSDs/VNFDs against a schema and then drop the valid ones into MongoDB (before this approach we had a painful process of mapping the schema to a set of relational PostgreSQL tables). In the end we wasted some energy and work on integrating everyone's xDs, but I don't think the problem was MongoDB (I think it was a cultural problem, in which everyone just felt he/she could change the xD without letting others know).

My 2 cents,

@jbonnet
Member

jbonnet commented Mar 31, 2016

@mbredel
It wouldn't hurt to put the IDs in the schema. Anyway, and again: the Gatekeeper is expecting the SDK to use the UUID of the package when it wants to update or delete it, and the UUID of the NS instance when it wants to gather monitoring params (e.g.), and the BSS will use the UUID of an NS when asking for an instantiation.
Is this expectation right?

@mbredel
Contributor

mbredel commented Apr 1, 2016

Are we talking about the catalog API (SP-internal) or the Gatekeeper API (external) now? I think it was the catalog API before, wasn't it?

Anyway, I am happy that we are discussing APIs now. This is a lot more hands-on I think :-) And I will answer in two separate comments, as - at least to me - your comment mixes two aspects.

First, the "package handling" aspect (I initially thought this is a catalog feature, not necessarily a GK feature, e.g. in the case of a stand-alone catalog): for handling packages, the API should be something like this (IMHO):

GET /packages/ -> Returns an array of all packages
GET /packages/GROUP -> Returns an array of all packages of that group
GET /packages/GROUP/NAME -> Returns an array of all packages of that group and name
GET /packages/GROUP/NAME/VERSION -> Returns one (or zero) packages

These GETs are valid REST calls and can be implemented even without a database: just a web server and a file system with the GROUP/NAME directory structure.

POST /packages/ -> A bit more work as it needs to parse the package descriptor to get GROUP, NAME, and VERSION.

DELETE /packages/GROUP/NAME/VERSION

Not sure about a PUT. Actually, the SDK should not modify an existing package, but upload a new version. If we want to have a PUT, then like this:

PUT /packages/GROUP/NAME/VERSION
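
A minimal sketch of those routes, using Flask purely for illustration and backed by the GROUP/NAME/VERSION directory layout mentioned above (the store path and the package file name are assumptions):

```python
from pathlib import Path
from flask import Flask, abort, jsonify

app = Flask(__name__)
STORE = Path("/var/lib/catalogue/packages")   # hypothetical GROUP/NAME/VERSION layout

@app.route("/packages/")
@app.route("/packages/<group>")
@app.route("/packages/<group>/<name>")
def list_packages(group=None, name=None):
    # Walk the directory tree; each GROUP/NAME/VERSION directory holds one package file.
    base = STORE.joinpath(*[part for part in (group, name) if part])
    if not base.exists():
        abort(404)
    found = sorted(str(p.parent.relative_to(STORE)) for p in base.rglob("package.son"))
    return jsonify(found)

@app.route("/packages/<group>/<name>/<version>")
def get_package(group, name, version):
    pkg = STORE / group / name / version / "package.son"   # file name assumed
    if not pkg.exists():
        abort(404)
    return pkg.read_bytes()
```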

I am also not sure if the SDK should be able to access the NSDs and VNFDs directly. We once said that the package is the artifact that is exchanged between the SDK and the SP. However, we could have a similar API also for NSDs and VNFDs.

@mbredel
Contributor

mbredel commented Apr 1, 2016

Second aspect - instantiation and retrieving monitoring data:

These operations are tightly coupled to the SP, and the user has to be logged in, etc. So in this case, yes, operations using the UUIDs of available or running services should be used.
