clarify GDC population at startup #9
Comments
As a starting variant, option 1 is OK from my point of view; later it depends on the total number of metadata records, as they could then not be overwritten again after 24 hours like the other cached data.
I would prefer Option 1. Republishing metadata regularly does not sound good to me. For the endpoint: this could be added to the GC service metadata. This way GCs could change the URL by updating their own metadata. A GDC could then query GC metadata from another GDC and use this information to populate its own metadata store.
@kaiwirt sure. GDCs can publish WCMP2 service records of themselves and provide a link to an archive like:

```json
{
  "rel": "archives",
  "type": "application/zip",
  "title": "Archive of all WCMP2 records",
  "href": "https://example.org/path/to/gdc-wcmp2-latest.zip"
}
```
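To illustrate the idea of a GDC harvesting another GDC's service record and following such a link, here is a minimal sketch. The helper name is hypothetical; the `links` layout mirrors the JSON example above.

```python
# Hypothetical helper: locate the metadata-archive link inside a WCMP2
# service record. The "links" structure is assumed to follow the
# example shown above; this is a sketch, not a defined API.
def find_archive_link(record):
    """Return the href of the first 'archives' zip link, or None."""
    for link in record.get("links", []):
        if link.get("rel") == "archives" and link.get("type") == "application/zip":
            return link.get("href")
    return None
```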
Option 1 was that GCs have the metadata at a known endpoint. What I meant was that this endpoint need not be known in advance: it can be part of the GC's metadata.
Yes, we are saying the same thing. The "known URL" is a function of the link with the `archives` relation.
Real-world example: we (Canada) had to re-initialize our GDC this week and found that we could not find an "archive" from which to perform a cold start. We initially loaded all known wis2box discovery metadata, because we know how wis2box makes WCMP2 available (supported by the GDC reference implementation). Of course this is not enough, as we need all WCMP2 records for all WIS2 Nodes on cold start. Ideally, all records should be in the GC for a GDC to pull from.
For (almost) static data like metadata records or climate datasets, I would suggest publishing a notification message (e.g.) once a day. I wouldn't create a special case for metadata such as a zip file.
Moving this Issue to "urgent" - we need some discussion on this ahead of the face-to-face meeting in November |
Discussed at ET-W2AT, 16-Oct-2023. Summary of key points and decision below.

Objectives:

Proposal: Move the requirement to manage a discovery metadata archive from the Global Cache to the Global Discovery Catalogue. The GDC already has to implement logic dealing with inserts, updates and deletions to the catalogue, so publishing a resource which contains the full set of currently valid metadata records seems reasonably straightforward for the GDC. This means that the GC no longer has to implement special application logic just to deal with metadata records. As with other data and metadata, the GC would subscribe to updates about these metadata archive resources and download a copy for re-publication, caching it for 24 hours.

This adds the need for the GDC to operate a Message Broker. However, @golfvert noted that all Global Services will need to operate a Message Broker to publish WIS2 alert/monitoring messages (i.e., "reports"). This led to a wider discussion about "report" messages in WIS2.

We also noted that a WIS Centre may share its technical capabilities (e.g., a Message Broker) with another WIS Centre. For example, DWD may allow MSC to publish notifications on its broker. This is a bilateral agreement and doesn't need to be covered in the Tech Regs. In this example, MSC would still be accountable for operating a broker; they do so by delegating responsibility to DWD. The Guide needs updating to describe this arrangement, which is likely to happen where larger Centres support smaller ones (e.g., NZ supporting Pacific Island States).

Actions:
- Define the details of the metadata archive resource (e.g., zip file?).
- Define the details of the notification message to be used, especially where it is published in the topic hierarchy.
- Create / update an Issue about the [different types of] report messages in WIS2. << @tomkralidis
- Update the Technical Regulations (Manual on WIS Vol. II):
- Update the Guide to reflect this proposal and add a section about bilateral agreements to share technical components.
- Amend GDC implementations.
Added in #44
If we use the
Maybe the easiest way to direct the GC behaviour is:
Sure, so this means a GDC needs to explicitly specify that a metadata archive is to be cached.
Why wouldn't the metadata zip file be like any other data that we exchange in the normal data tree? So in summary:
If we do treat the metadata archive like data, then maybe it should have a discovery metadata record too, discoverable via the GDC.
@tomkralidis, @golfvert - can we conclude this discussion? I think the only outstanding decision is whether we treat the metadata archive as a "normal" data resource, implying it sits in the normal data tree.
Given the context of this issue (i.e., the purpose of such an archive would be primarily to support a cold start), I suggest (for this phase):
If we want to provide metadata as data, along with a WCMP2 record, I think we need to have more discussion around the lifecycle:

In other words, the GDC becomes a pseudo data provider and needs to consider the data management lifecycle.
Decision

Each GDC will publish every day a zipfile containing all metadata records, and will provide a landing page with a link to the zipfile, announced on the topic `origin/a/wis2/centre-id/metadata`. GCs will cache all the metadata (all metadata records are treated as core) in the `/metadata` topic with a persistence of at least 24 hours to allow the GDC to access them.
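A minimal sketch of the daily publication step described in the decision. Only the topic pattern comes from the decision itself; the per-record file naming inside the zip and the function names are illustrative assumptions.

```python
import io
import json
import zipfile


def build_daily_archive(records):
    """Zip all currently valid metadata records.

    One JSON file per record, named by record id (naming is an
    assumption, not a specified layout).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for rec in records:
            zf.writestr(f"{rec['id']}.json", json.dumps(rec))
    return buf.getvalue()


def notification_topic(centre_id):
    """Topic on which the daily archive is announced, per the decision."""
    return f"origin/a/wis2/{centre_id}/metadata"
```

A GDC would then publish a notification message on that topic pointing at the zipfile's URL, like any other data announcement.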
PR in #48. Note that related GC provisions are already in the Guide.
The WIS2 Guide states in section 8.4.1:

> This ensures that a GDC can initiate itself from already published discovery metadata in the event of a catastrophe/re-deploy.
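For illustration, re-initializing a GDC from such an archive could look like the sketch below. The function name, the archive layout (one JSON file per record), and the `ingest` callback are assumptions for the sake of the example.

```python
import io
import json
import urllib.request
import zipfile


def cold_start(archive_url, ingest):
    """Download a metadata archive and pass each WCMP2 record to `ingest`.

    Returns the number of records ingested. Assumes the archive
    contains one JSON file per record (layout is an assumption).
    """
    with urllib.request.urlopen(archive_url) as resp:
        archive = resp.read()
    count = 0
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for name in zf.namelist():
            if name.endswith(".json"):
                ingest(json.loads(zf.read(name)))
                count += 1
    return count
```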
Options:
cc @golfvert @6a6d74 @efucile