Fix for SGWC and SMF round robin selection #556

Merged
acetcom merged 1 commit into open5gs:master on Sep 9, 2020

Conversation

kbarlee
Contributor

@kbarlee kbarlee commented Sep 8, 2020

Previously, when the search for an SGWU/UPF serving a particular TAC/APN/cell_ID found no suitable node, SGWC and SMF would run this line to select a node in round-robin fashion:

return next ? next : ogs_list_first(&ogs_pfcp_self()->peer_list);

However, this did not check that the node next was PFCP associated (#555).

The code now runs as follows:

  • search for matching SGWU/UPF nodes as before
  • if none are found, enter RR mode
  • in RR mode, find next PFCP associated node in list and use it
  • if there are no PFCP associated nodes, print error message (and select first from list)
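
The RR-mode steps above can be sketched roughly as follows. This is a minimal illustration, not the merged Open5GS code: `peer_t` and `select_rr_associated` are hypothetical names, the peer list is modelled as a plain array, and the `associated` flag stands in for the PFCP state machine check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PFCP peer; not the Open5GS node type. */
typedef struct {
    const char *name;
    bool associated;   /* true once PFCP association is established */
} peer_t;

/*
 * Return the index of the next associated peer after `last`, wrapping
 * around the list once. If no peer is associated, fall back to index 0
 * and report that through *found (the PR prints an error at this point).
 */
static size_t select_rr_associated(const peer_t *peers, size_t count,
                                   size_t last, bool *found)
{
    assert(peers != NULL && found != NULL && count > 0 && last < count);

    for (size_t step = 1; step <= count; step++) {
        size_t i = (last + step) % count;
        if (peers[i].associated) {
            *found = true;
            return i;
        }
    }

    *found = false;
    return 0; /* default: first peer in the list */
}
```

The loop visits every peer exactly once, ending on `last` itself, so a single associated peer is still selected repeatedly rather than triggering the fallback.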

@acetcom
Member

acetcom commented Sep 9, 2020

Hi @kbarlee

When I first designed it, I didn't make a distinction between RR mode and the rest.

For example, let's say you have 10 UPFs. At the time, I wanted to be able to select among 3 of those UPFs in round-robin manner when the APN of those 3 UPFs is internet.

Stabilization is in progress now. After that is done, let me look at this part again. Of course, you can also request a Pull Request again. If it fits my original design, I will merge it.

Thank you so much!
Sukchan

@kbarlee
Contributor Author

kbarlee commented Sep 9, 2020

Hey @acetcom

The issue does not occur in the scenario you described. The code is executed along the following lines:

  1. Search through node list to see if any are dedicated to support the TAC
    -- round robin between nodes that support the TAC && are PFCP associated
    -- if there are none, move to 2

  2. Search through node list to see if any are dedicated to support the e_cell_ID
    -- round robin between nodes that support the e_cell_ID && are PFCP associated
    -- if there are none, move to 3

  3. Search through node list to see if any are dedicated to support the nr_cell_ID
    -- round robin between nodes that support the nr_cell_ID && are PFCP associated
    -- if there are none, move to 4

  4. Search through node list to see if any are dedicated to support the APN
    -- round robin between nodes that support the APN && are PFCP associated
    -- if there are none, move to 5

  5. Round robin full list. As described above, this line does not check that the node selected in RR is PFCP associated

return next ? next : ogs_list_first(&ogs_pfcp_self()->peer_list);

So for the situation you describe where there are 3 UPFs that support APN 'internet', it will happily round robin between them. If one of the 3 is offline, it will round robin between the other two. This fix is for the situations where

  • no TAC/e_cell_ID/nr_cell_ID/APN are specified in the list, and it defaults to full list RR
  • none of the nodes support the TAC/e_cell_ID/nr_cell_ID/APN that the UE is connecting with, and it defaults to full list RR
  • or no nodes that do support the TAC/e_cell_ID/nr_cell_ID/APN are currently PFCP associated, and it defaults to full list RR
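
The tiered search and its fallback can be sketched as below. This is an assumption-laden illustration rather than the Open5GS implementation: `node_t`, `request_t`, `next_match` and `select_node` are hypothetical names, the two cell ID tiers are omitted for brevity, and each node is dedicated to at most one TAC and one APN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node model; not the Open5GS data structures. */
typedef struct {
    const char *apn;   /* NULL when the node is not dedicated to an APN */
    int tac;           /* -1 when the node is not dedicated to a TAC */
    bool associated;   /* true once PFCP association is established */
} node_t;

typedef struct {
    int tac;
    const char *apn;
} request_t;

typedef bool (*match_fn)(const node_t *, const request_t *);

static bool match_tac(const node_t *n, const request_t *r)
{
    return n->tac >= 0 && n->tac == r->tac;
}

static bool match_apn(const node_t *n, const request_t *r)
{
    return n->apn && r->apn && strcmp(n->apn, r->apn) == 0;
}

static bool match_any(const node_t *n, const request_t *r)
{
    (void)n; (void)r;
    return true; /* full-list round robin */
}

/* Next associated node after `last` that satisfies `match`, or -1. */
static int next_match(const node_t *nodes, int count, int last,
                      const request_t *req, match_fn match)
{
    for (int step = 1; step <= count; step++) {
        int i = (last + step) % count;
        if (nodes[i].associated && match(&nodes[i], req))
            return i;
    }
    return -1;
}

/* Try each tier in order; the final tier is the full-list fallback. */
static int select_node(const node_t *nodes, int count, int last,
                       const request_t *req)
{
    static const match_fn tiers[] = { match_tac, match_apn, match_any };

    assert(nodes != NULL && req != NULL);
    assert(count > 0 && last >= 0 && last < count);

    for (size_t t = 0; t < sizeof(tiers) / sizeof(tiers[0]); t++) {
        int i = next_match(nodes, count, last, req, tiers[t]);
        if (i >= 0)
            return i;
    }
    return 0; /* no node is associated at all: default to the first */
}
```

Because every tier, including the last, requires `associated`, a request whose APN matches no node still lands on an associated node, which is the case the fix targets.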

(I had already implemented a fix for this here, but it didn't make it into the current master)

open5gs/src/smf/context.c

Lines 687 to 743 in bcd02b1

if (ogs_pfcp_self()->upf_selection_mode == UPF_SELECT_RR) {
    /* Select UPF (PFCP) with round-robin manner */
    ogs_debug("Select UPF by RR");
    /*
    - starting from list position of last used UPF, search down list for next PFCP associated UPF
    - if PFCP associated UPF found
    - use it
    - if no associated UPF found, keep searching
    - if bottom of list reached, reset search to top of list
    - if completed full cyclic search of list and still not found a UPF, use default (first)
    */
    int connect=0, numreset=0;
    char startUPFIP[OGS_ADDRSTRLEN];
    OGS_ADDR(&ogs_pfcp_self()->node->addr, startUPFIP);
    //search UPF list, find next UPF that is associated
    while (!connect)
    {
        // (when end of UPF (PFCP) list reached, reset search to top of UPF list)
        if(ogs_pfcp_self()->node == NULL){
            ogs_pfcp_self()->node = ogs_list_first(&ogs_pfcp_self()->n4_list);
            numreset = 1;
        }
        // search UPF (PFCP) list, find next UPF that is associated
        while(ogs_pfcp_self()->node && !connect) {
            // if cyclic list check complete and still not found a UPF, break
            OGS_ADDR(&ogs_pfcp_self()->node->addr, buf);
            if (numreset == 1 && !strcmp(buf,startUPFIP)) {
                break;
            }
            // has UPF <x> associated over PFCP?
            if (OGS_FSM_CHECK( &ogs_pfcp_self()->node->sm, smf_pfcp_state_associated) ){
                // then use it
                connect = 1;
            } else {
                // else check next UPF in list
                OGS_ADDR(&ogs_pfcp_self()->node->addr, buf);
                ogs_debug("UPF on IP[%s] is not PFCP associated",
                        buf);
                ogs_pfcp_self()->node = ogs_list_next(ogs_pfcp_self()->node);
            }
        }
        // after checking from the top of the list and not finding a suitable UPF
        if (!connect && numreset == 1 ){
            // default to first in list
            ogs_error("No UPF (PFCP) is currently PFCP associated");
            ogs_error("Defaulting to first UPF (PFCP) in smf.yaml list");
            ogs_pfcp_self()->node = ogs_list_first(&ogs_pfcp_self()->n4_list);
            break;
        }
    }
    // list UPF used
    OGS_ADDR(&ogs_pfcp_self()->node->addr, buf);
    ogs_debug("UE using UPF on IP[%s]",
            buf);

Hope that makes sense!

Kenny

@acetcom acetcom merged commit 56a866c into open5gs:master Sep 9, 2020
@acetcom
Member

acetcom commented Sep 9, 2020

@kbarlee I misunderstood your original code. I have now merged it. The code indentation was also slightly modified. 228dd34

That's really nice work!

Thank you so much for your effort!
Sukchan
