Define and add resources needed for free5GC SMF (to nephio-project/free5gc-packages) #88

Closed
s3wong opened this issue Apr 3, 2023 · 16 comments
Labels
area/workload-cluster SIG Automation Workload Cluster Subproject sig/automation

Comments

@s3wong

s3wong commented Apr 3, 2023

free5gc SMF packages now have the following definitions: here

More information is definitely needed (list of UPFs, sNssai info, etc.). Add it to the package.

@gvbalaji gvbalaji added area/workload-cluster SIG Automation Workload Cluster Subproject sig/automation labels Apr 4, 2023
@gvbalaji gvbalaji modified the milestones: sprint1, sprint2 Apr 4, 2023
@gvbalaji gvbalaji modified the milestones: sprint2, sprint3 Apr 25, 2023
@tliron tliron changed the title Define and add resources needed for free5gc SMF (to nephio-project/free5gc-packages) Define and add resources needed for free5GC SMF (to nephio-project/free5gc-packages) Apr 25, 2023
@tliron tliron assigned henderiw and unassigned n2vo and vireshnavalli Apr 25, 2023
@gvbalaji gvbalaji modified the milestones: sprint3, sprint4 May 9, 2023
@henderiw
Contributor

henderiw commented May 9, 2023

This depends on nephio-project/api#17

@gvbalaji gvbalaji modified the milestones: sprint4, sprint5 May 23, 2023
@johnbelamaric
Member

@henderiw to discuss with @s3wong

@johnbelamaric
Member

@s3wong @tliron @henderiw @n2vo @denysaleksandrov

The issue we discussed in the meeting yesterday is that the SMF configuration needs many details from the config of each connected UPF. Here are the options we discussed yesterday, please correct as needed:

  • Implement this in the topology controller
    • How: topology controller waits for UPF packages, and then inserts the details in the SMF packages
    • Pros:
      • Expediency
      • Demonstrates the idea of a controller with awareness of multiple functions
    • Cons:
      • Creates a free5gc-specific function in the topology controller
      • Creates a dependency on topology controller, which is a stretch goal and may not be met
  • Implement via service discovery
    • How:
      • Init container in the UPF registers with service discovery (new component, or can we use mongodb in free5gc-cp?)
      • SMF operator reads from there and rewrites the SMF config as needed
    • Pros:
      • Possibly more accurately reflects a real deployment
      • Demonstrates swimlane 3 functionality
    • Cons:
      • Requires exposing a central service discovery component, and distributing secrets, etc., to workload clusters
      • Requires runtime connectivity to service discovery cluster from edge clusters (mgmt cluster or "regional" cluster)
      • Time to implement?
  • Implement a specializer
    • How:
      • New specializer in nephio-controller-manager watching package revisions
      • Sees a sentinel resource (i.e., a specially annotated resource) in the SMF package indicating that a cross-package injection is needed
      • Reads label, kind, maybe name from sentinel resource, and uses those to search through package revisions
      • Injects UPFDeployment resources from those other package revisions
    • Pros:
      • Demonstrates resolution of a cross-package dependency
      • Expediency
    • Cons:
      • Another nephio-specific reconciler; may not be the long-term solution
  • Implement as part of PackageVariant
    • How: Like the specializer option, but do it in Porch PackageVariant controller
    • Pros:
      • Demonstrates resolution of a cross-package dependency
      • Putting it in Porch makes it clear it is not nephio-specific
      • Can leverage readiness and other infrastructure of PackageVariant
      • May be where it ends up eventually anyway
    • Cons:
      • May require us to ship with a private build of Porch
      • Constrains use of this feature to only when a PackageVariant is used (not when using manual deployment, for example).
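
A minimal sketch of the matching step that the specializer and PackageVariant options share, in Go with only the standard library. It is not the Nephio or Porch implementation: `Resource`, `PackageRevision`, `Sentinel`, `resolve` and the `topology` label key are all hypothetical names made up for illustration. It only shows the idea of reading a label and kind from a sentinel and collecting matching resources from other package revisions.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified stand-in for a KRM resource.
type Resource struct {
	Kind string
	Name string
}

// PackageRevision is a hypothetical, simplified view of a package revision:
// its labels plus the resources it contains.
type PackageRevision struct {
	Name      string
	Labels    map[string]string
	Resources []Resource
}

// Sentinel holds what the sentinel resource in the SMF package asks for.
// The field names are assumptions made for this sketch.
type Sentinel struct {
	LabelKey   string
	LabelValue string
	Kind       string
}

// resolve collects resources of the requested kind from every package
// revision whose label matches the sentinel.
func resolve(s Sentinel, revs []PackageRevision) []Resource {
	var out []Resource
	for _, pr := range revs {
		if v, ok := pr.Labels[s.LabelKey]; !ok || v != s.LabelValue {
			continue
		}
		for _, r := range pr.Resources {
			if r.Kind == s.Kind {
				out = append(out, r)
			}
		}
	}
	return out
}

func main() {
	revs := []PackageRevision{
		{Name: "upf-edge01", Labels: map[string]string{"topology": "demo"},
			Resources: []Resource{{Kind: "UPFDeployment", Name: "upf-edge01"}}},
		{Name: "amf-regional", Labels: map[string]string{"topology": "demo"},
			Resources: []Resource{{Kind: "AMFDeployment", Name: "amf-regional"}}},
		{Name: "upf-other", Labels: map[string]string{"topology": "other"},
			Resources: []Resource{{Kind: "UPFDeployment", Name: "upf-other"}}},
	}
	s := Sentinel{LabelKey: "topology", LabelValue: "demo", Kind: "UPFDeployment"}
	for _, r := range resolve(s, revs) {
		fmt.Println("inject", r.Kind, r.Name)
	}
}
```

Where this logic would live (nephio-controller-manager or the Porch PackageVariant controller) is exactly the open choice between the last two options; the sketch is neutral on that.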

@gvbalaji

gvbalaji commented Jun 6, 2023

Implementing a specializer may be the easiest option for R1.

@johnbelamaric
Member

Can we agree what the final state of the data in the SMF package looks like, and then how that is reflected in the SMF package so the operator changes can be unblocked? @s3wong @n2vo @henderiw

@henderiw
Contributor

henderiw commented Jun 8, 2023

Service discovery creates an additional dependency on a registry that needs to be accessible from the various locations. In my experience many people in telco are not keen on this kind of dependency; they even try to avoid basic things like DNS/DHCP. So that is something we have to keep in mind. Also, there are very good solutions on the market for this, but our main dependency here is free5gc. AFAIK we cannot do dynamic updates to it, so even service discovery will not work in this context.

Here are the things that need to happen in my view:

  1. Identify the dependency -> this can be a resource in the package

  2. We need to know where the resource exists -> right now this seems to be in a package, which would be identified by the dependency (so it would be good for the package name to be known, which makes it easier to search). Going forward this could also be in the mgmt cluster or in a central service discovery. The resource backend does this today, e.g. for IP/VLAN and TOPOLOGY.
    It is also important to know the completeness and validate whether we have all the dependent resources; this could be based on package name.

  3. Eventually an NFDeployment resource needs to be created with the references to these resources. We have a configuration item in NFDeployment for this.

     ConfigRefs []corev1.ObjectReference `json:"configRefs,omitempty" yaml:"configRefs,omitempty"`
    

By referencing the dependent resources here we can tell the SMF operator exactly what to look for. One challenge arises if both UPF and SMF are deployed on the same cluster: the resource then comes from 2 packages. We could potentially wrap it in another resource to avoid dual actuation by configsync.

Alternatively, we could annotate the dependent resources, but the SMF operator would not know whether they are all there. So I believe the explicit reference is the better approach.

Given the above we need to bring this together in the specialization flow.
Here is the proposal to do this.

  1. The original package contains the dependent resource, and the specialiser owner reference is set to the NF deployment.

Example: this is how we do it for an interface. We have a reference to the final SMFDeployment, which is the meta object we need to apply to the cluster.

apiVersion: req.nephio.org/v1alpha1
kind: Interface
metadata:
  name: n4
  annotations:
    config.kubernetes.io/local-config: "true"
    specializer.nephio.org/owner: workload.nephio.org/v1alpha1.SMFDeployment.upf-empty-empty

The cond SDK ensures the conditions are set, by having the smf-deploy fn run first in the pipeline.

The dependency specialiser acts as described above and, once ready, completes its job.
It acts the same way as the VLAN/IPAM specialiser, but here it injects new resources from outside.
Once it completes, the cond SDK sets the condition to True.

If the smf-deploy fn conditions are all True, it aggregates them in the SMFDeployment CR, which will be actuated on the cluster.

@henderiw
Contributor

henderiw commented Jun 8, 2023

I would say the approach I describe above is basically service discovery.

  • identify what you need (the dependency)
  • go find it
  • provide the consolidated info

I believe the difference is distributed or central and this is a choice you have

@s3wong
Author

s3wong commented Jun 8, 2023

Ran some tests on a "more common" scenario:

  1. nftopology has an instance of SMF and an instance of UPF
  2. there are two clusters matching the UPF NFInstance clusterlabels, thus the expectation is there will be one instance of UPF for each cluster (i.e., total of two for the SMF)
  3. when the first cluster showed up, the first UPF instance is deployed (I am still just simulating these)
  4. from nftopology reconciliation loop, the task (for this instance of nftopology) is done (i.e., I only have one instance of UPF for this Topology, at this moment)
  5. sometime later, another cluster matches this label, and Nephio deploys another UPF onto the new cluster
  6. nftopology controller doesn't detect this change, and therefore info isn't passed onto specializer on SMF package update

so it seems like this is more than holding off SMF until all UPFs are deployed; a new packagerevision matching the same NFTopology name in label and also connecting to the same NF (SMF in this case) will need to update the dependent packages
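
To make that last point concrete, here is a minimal Go sketch (standard library only) of recomputing the SMF package's UPF references every time the set of matching UPF packages changes. `syncRefs` and the package names are hypothetical; how the matching UPF package revisions are actually discovered (label lookup on NFTopology name, etc.) is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// syncRefs takes the UPF names the SMF package currently references and the
// UPF names discovered for the same topology. It returns the desired refs
// (the discovered set, sorted) and whether the SMF package needs an update.
// Names are assumed to be unique within each slice.
func syncRefs(current, discovered []string) ([]string, bool) {
	want := append([]string(nil), discovered...)
	sort.Strings(want)
	have := append([]string(nil), current...)
	sort.Strings(have)

	changed := len(want) != len(have)
	for i := 0; !changed && i < len(want); i++ {
		changed = want[i] != have[i]
	}
	return want, changed
}

func main() {
	// First pass: one cluster matched, so one UPF package exists.
	refs, changed := syncRefs(nil, []string{"upf-cluster01"})
	fmt.Println(refs, changed)

	// Later a second cluster matches the label and a second UPF appears.
	refs, changed = syncRefs(refs, []string{"upf-cluster02", "upf-cluster01"})
	fmt.Println(refs, changed)
}
```

The point is only that the comparison has to run on every UPF package event for the topology, not once when the SMF package is first created.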

@s3wong
Author

s3wong commented Jun 9, 2023

@henderiw

I really wonder how this works:

currently, in e2e tests (https://github.com/nephio-project/test-infra/pull/63/files#diff-11333ec2ac174948e748483206666f9217ba389984073b4d467bdc66664cce45), UPF is set up via PVS where the objectSelector is keyed off of the WorkloadCluster label, so the package is only cloned when a cluster matches that scenario. That means it is possible (however remotely) that the SMF would be deployed before we even have a single UPF package created. In that case nothing in the system would know whether this SMF package has any dependency, so the "identify what you need" part may not be known at the time of the SMF package deployment.

My take (as I wrote above) is that the only logical way to deal with this is to do:

  1. if some UPF packages are already deployed (or even just created), the package for the SMF that is connected to these UPFs can include references to them
  2. for UPF instances (packages) that come AFTER the SMF package is deployed, the SMF package will be updated as each of them is created / deployed, and the SMF instance will then be reloaded with a new configmap

@henderiw
Contributor

henderiw commented Jun 9, 2023

Good point. The line of thinking was like this: someone schedules this deployment in harmony. Let's call this the 'UBER' package, which contains the PVS for UPF and SMF. The person deploying this would apply this UBER package to the cluster.

So this is somehow the link. As you said both of these PVS will result in PVC, etc. So they will specialise.

My assumption is that none of this gets deployed unless a human approves it. Now, with the auto-approval controller, what we could do is tie this back to the original package and only approve once all the conditions of each individual package are True.

So I see this as a bundled approval.
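
A minimal Go sketch of that bundled check, standard library only. `Condition`, `Package` and `bundleReady` are hypothetical simplifications, not the actual auto-approval controller types; the condition type names in `main` are made up.

```go
package main

import "fmt"

// Condition is a simplified package condition.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// Package is a simplified view of one package belonging to the bundle.
type Package struct {
	Name       string
	Conditions []Condition
}

// bundleReady reports whether every condition of every package is True.
// It also returns the conditions still blocking approval as "package/type".
// Note that an empty bundle counts as ready in this sketch.
func bundleReady(pkgs []Package) (bool, []string) {
	var blocking []string
	for _, p := range pkgs {
		for _, c := range p.Conditions {
			if c.Status != "True" {
				blocking = append(blocking, p.Name+"/"+c.Type)
			}
		}
	}
	return len(blocking) == 0, blocking
}

func main() {
	bundle := []Package{
		{Name: "smf-regional", Conditions: []Condition{{Type: "dependency", Status: "False"}}},
		{Name: "upf-edge01", Conditions: []Condition{{Type: "ipam", Status: "True"}}},
	}
	ok, blocking := bundleReady(bundle)
	fmt.Println(ok, blocking)
}
```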

@henderiw
Contributor

henderiw commented Jun 9, 2023

@s3wong the other thing you should be aware of is that specialisation is not one-shot; it runs continuously.

@henderiw
Contributor

henderiw commented Jun 9, 2023

here is the proposal for the reference structure.

apiVersion: ref.nephio.org/v1alpha1
kind: Config
metadata:
  name: upf-cluster01
  namespace: default
spec:
  gvk:
    apiVersion: workload.nephio.org
    kind: UPFDeployment
  config: ""

apiVersion: ref.nephio.org/v1alpha1
kind: Config
metadata:
  name: upf-cluster02
  namespace: default
spec:
  gvk:
    apiVersion: workload.nephio.org
    kind: UPFDeployment
  config: ""

apiVersion: workload.nephio.org/v1alpha1
kind: SMFDeployment
metadata:
  name: smf-region
spec:
  configRefs:
  - apiVersion: ref.nephio.org/v1alpha1
    kind: Config
    name: upf-cluster01
    namespace: default
  - apiVersion: ref.nephio.org/v1alpha1
    kind: Config
    name: upf-cluster02
    namespace: default

Here is a proposal for how to add the references.
The SMF deployment has a list of configuration references that
will be applied to the workload cluster.

The SMF operator should get these references at the beginning of
the reconcile cycle. When not all refs are there, the reconciliation
should retry.
Once all references are found, we need to parse the refs.
The CRD is a generic CRD where the gvk specifies the type:

  • apiVersion and kind
  • alternatively we can use the unstructured object to get the GVK
    and resolve it to the type in a 2nd step

Another alternative is using a configmap instead of the Config.ref.nephio.org/v1alpha1
object
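
A minimal Go sketch of that reconcile step, standard library only. It is not the SMF operator code: `ObjectRef`, `Config`, `resolveRefs`, `shouldRetry` and the in-memory `cluster` map (standing in for a lookup against the workload cluster API) are all hypothetical. It shows the two behaviours described above: retry while any referenced Config is missing, then group the found Configs by the kind named in their gvk.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectRef mirrors the fields used in the configRefs entries.
type ObjectRef struct {
	APIVersion, Kind, Name, Namespace string
}

// Config is a simplified stand-in for the proposed ref.nephio.org Config.
type Config struct {
	TargetKind string // taken from spec.gvk.kind
	Payload    string // taken from spec.config
}

// errRefsIncomplete signals that reconciliation should be retried later.
var errRefsIncomplete = errors.New("configRefs incomplete")

// resolveRefs looks up every reference. If any is missing it returns an error
// wrapping errRefsIncomplete; otherwise it groups the Configs by target kind.
func resolveRefs(refs []ObjectRef, cluster map[string]Config) (map[string][]Config, error) {
	byKind := make(map[string][]Config)
	for _, r := range refs {
		c, ok := cluster[r.Namespace+"/"+r.Name]
		if !ok {
			return nil, fmt.Errorf("%w: %s/%s not found", errRefsIncomplete, r.Namespace, r.Name)
		}
		byKind[c.TargetKind] = append(byKind[c.TargetKind], c)
	}
	return byKind, nil
}

// shouldRetry reports whether err means "requeue and try again".
func shouldRetry(err error) bool {
	return errors.Is(err, errRefsIncomplete)
}

func main() {
	refs := []ObjectRef{
		{Kind: "Config", Name: "upf-cluster01", Namespace: "default"},
		{Kind: "Config", Name: "upf-cluster02", Namespace: "default"},
	}
	// Only the first Config has reached the cluster so far.
	cluster := map[string]Config{
		"default/upf-cluster01": {TargetKind: "UPFDeployment"},
	}
	if _, err := resolveRefs(refs, cluster); shouldRetry(err) {
		fmt.Println("requeue:", err)
	}
}
```

The same shape would apply if a configmap were used instead of the Config object; only the lookup and the way the target kind is recorded would change.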

@johnbelamaric
Member

@s3wong the other thing you should be aware of is that specialisation is not one-shot; it runs continuously.

Yes, as an "eventually consistent" system we should expect continuous change and reconciliation. So I think it's ok if it reconfigures a few times.

For the "first" time, we auto-approve. After that it requires human approval. We can delay the first approval for an arbitrary amount of time - say 5 or 10 minutes.

If we want to have more sophisticated auto-approval for future updates, we can consider that in later releases.
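
A minimal Go sketch of that approval policy, standard library only. `Decision`, `decide` and `defaultDelay` are hypothetical names, and the 5 minute value is just one of the two figures floated above, not a decided setting.

```go
package main

import (
	"fmt"
	"time"
)

// Decision is what the approval logic chooses for a proposed package revision.
type Decision string

const (
	Wait        Decision = "wait"
	AutoApprove Decision = "auto-approve"
	NeedsHuman  Decision = "needs-human-approval"
)

// defaultDelay is an assumed value for the initial hold-off period.
const defaultDelay = 5 * time.Minute

// decide auto-approves only the first revision of a package, and only after
// it has been proposed for at least delay. Later revisions go to a human.
func decide(hasPublished bool, age, delay time.Duration) Decision {
	if hasPublished {
		return NeedsHuman
	}
	if age < delay {
		return Wait
	}
	return AutoApprove
}

func main() {
	fmt.Println(decide(false, 2*time.Minute, defaultDelay))
	fmt.Println(decide(false, 7*time.Minute, defaultDelay))
	fmt.Println(decide(true, 7*time.Minute, defaultDelay))
}
```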

@s3wong
Author

s3wong commented Jun 13, 2023

@henderiw

config: ""

I am assuming this is a json string from the actual UPFDeployment.Spec? The SMF controller would simply json.Unmarshal this string?
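
If that assumption holds, the decode on the SMF side could look roughly like the Go sketch below. `upfSpec`, its single `n4Endpoint` field and `decodeConfig` are hypothetical and only illustrate the shape; they are not the real UPFDeployment.Spec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upfSpec is a hypothetical subset of a UPF spec. The real
// UPFDeployment.Spec fields are not reproduced here.
type upfSpec struct {
	N4Endpoint string `json:"n4Endpoint"`
}

// decodeConfig parses the config string of a Config object, assuming it
// carries the UPF spec serialized as JSON.
func decodeConfig(config string) (upfSpec, error) {
	var spec upfSpec
	if config == "" {
		return spec, fmt.Errorf("config is empty")
	}
	if err := json.Unmarshal([]byte(config), &spec); err != nil {
		return spec, fmt.Errorf("decoding config: %w", err)
	}
	return spec, nil
}

func main() {
	spec, err := decodeConfig(`{"n4Endpoint":"10.0.0.1"}`)
	fmt.Println(spec.N4Endpoint, err)
}
```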

@gvbalaji

Stephen will do a PR today and is planning for integration tomorrow.

@johnbelamaric
Member

This is done and working

Projects
Status: Done
Development

No branches or pull requests

6 participants