Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize stale session state in Orc8r #8621

Closed
themarwhal opened this issue Aug 12, 2021 · 4 comments
Closed

Sanitize stale session state in Orc8r #8621

themarwhal opened this issue Aug 12, 2021 · 4 comments
Assignees
Labels
component: orc8r Orchestrator-related issue fbcpp v1.8

Comments

@themarwhal
Copy link
Member

themarwhal commented Aug 12, 2021

Original discussion: #7681

Problem

As part of AGW state sent to Orc8r periodically, SessionD sends a snapshot of all existing sessions keyed by its IMSI.
As SessionD can only keep track of active sessions in the system, there is no way for SessionD to remove subscriber state in the Orc8r once a subscriber leaves the AGW. As Orc8r does not clean up stale state and NMS also does not check for stale state before displaying it, we end up with a growing number of stale subscribers that show active sessions.

Possible solutions

  1. SessionD sends a per-AGW state as a single object. This way, every state reported has the most updated snapshot of all subscribers. We will need to do indexing on the Orc8r side to enable per-subscriber queries from this state.
  2. Orc8r periodically goes through all reported states to remove stale entries.

One caveat here is that with approach 1, we will still have stale state if an AGW goes offline. However, this may be OK if we also display the last checkin time to let the user know the state may be stale.

Actions

  1. @themarwhal to add an additional SessionD state that encapsulates all subscribers in one object.
  2. @hcgatewood to add any Orc8r modifications needed to support the same subscriber state API endpoint.
  3. Once we have migrated to the new state, @themarwhal can remove per-subscriber state entries from SessionD's side.
@ekowtaylor
Copy link
Contributor

Phase out implementation and land design in this sprint. @themarwhal to help break this down for tracking in this sprint

@bhuvaneshne
Copy link
Contributor

Hello @themarwhal , @hcgatewood ,
I was searching for existing bugs to report this exact issue when I found this thread.
Wanted to check how this is coming along.
Regards,
Bhuvanesh

@sebathomas
Copy link
Contributor

sebathomas commented Apr 28, 2022

A few more thoughts about solution 1:

  • On the AGW side, this would mean adapting sessiond's OperationalStatesHandler.cpp so that it returns a single state object with a new type containing all subscriber states, instead of a list of subscriber states as it's done currently. I assume the ID of the full state object would be the gateway's hardware ID.
  • The state object would be forwarded by magmad's state_reporter.py without modification, same as before. The main difference would be that the new object would always be sent to the cloud, even if there are no subscribers (and this is part of the reason why it solves this issue). Is this problematic because of the traffic it would generate? Or is there a way to send it only if there are changes?
  • I am not sure how the indexing on the orc8r side would work. At the moment, the orc8r ReportStates servicer is completely generic and simply dumps all states it receives into the table states. It would be possible to insert some special logic here that separates the new object into different states per subscriber (to write the same data structure as before, and clean up subscribers that are no longer reported). But I think that would be the first time we add special logic while writing a particular type of state, and break the existing generic design. Are there other examples in orc8r where we re-index state data? I did not know about reindexers when I originally wrote this text. So we will use a reindexer to store the data indexed by IMSI and gateway ID in a new table. The next time the same gateway sends an update, subscribers that are no longer there can be cleaned up.

@sebathomas sebathomas self-assigned this Jun 1, 2022
@sebathomas
Copy link
Contributor

sebathomas commented Jun 2, 2022

Concrete plan for solution 1:

  1. Adapt sessiond's OperationalStatesHandler to produce a new kind of state with type gateway_subscriber_state. This will get the hardware ID as a key and then contain a map of all IMSIs to the corresponding subscriber state. This will ensure that there will also be a state update if the list of subscribers is empty. For the moment, sessiond can send both types of state until the changes on the orc8r side are finished.
    chore(agw): integrate new gateway subscriber state #12952
  2. Add a new table gateway_subscriber_states to subscriberdb that will contain the reindexed state. The table will have the following structure:
    fix(orc8r): Add new table for reindexed subscriber state #12949
Column Type
network_id text
gateway_id text (hardware UUID)
imsi text
last_updated_at int8 (unix timestamp)
reported_state bytea (same as the reported subscriber state before)
  1. Extend subscriberdb's reindexer to also handle the new gateway_subscriber_state, split it up into different states per subscriber. Then update all states in gateway_subscriber_states belonging to that gateway, i.e. upsert them into the new table and remove all IMSIs that no longer exist.
    fix(orc8r): add reindexing of gateway subscriber state to subscriberdb indexer #13041
    fix(orc8r): add k8s annotations to trigger subscriberdb for gateway_subscriber_state #13087
  2. Change the existing GET subscribers and subscriber_state calls to get subscriber state data from the new table instead of from the state service.
    fix(orc8r): Read from subscriber storage by IMSI #13151
  3. Remove the old logic in sessiond that produces individual subscriber states.
    chore(sessiond): Remove legacy subscriber_state #13553

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: orc8r Orchestrator-related issue fbcpp v1.8
Projects
None yet
Development

No branches or pull requests

6 participants