
fix: typos in k8s discovery code #12288


Open
wants to merge 1 commit into base: master

Conversation


@ChuanFF ChuanFF commented Jun 4, 2025

There is a spelling error in the k8s discovery module: post_List should be post_list.

Description

There is a spelling error in the k8s service discovery module. As shown below, the condition in apisix/discovery/kubernetes/informer_factory.lua checks informer.post_List when it should check informer.post_list. Because of this typo, the post_list function is never executed.

    informer.fetch_state = "list finished"
    if informer.post_List then
        informer:post_list()
    end

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

Spelling error. post_List should be post_list
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. bug Something isn't working labels Jun 4, 2025
@ChuanFF ChuanFF changed the title fix: typos in k8s service discovery code fix: typos in k8s discovery code Jun 4, 2025
@Baoyuantop
Contributor

Hi @ChuanFF, we need to add a test for this change.

@ChuanFF
Author

ChuanFF commented Jun 5, 2025

@Baoyuantop it's just a typo, is a test really necessary?

@Baoyuantop
Contributor

The current bug causes the post_list function to not be executed.

We need to verify this.

@ChuanFF
Author

ChuanFF commented Jun 6, 2025

There are two typos in the following code:
informer.pre_List should be informer.pre_list,
informer.post_List should be informer.post_list.

    if informer.pre_List then
        informer:pre_list()
    end

    ok, reason, message = list(httpc, apiserver, informer)
    if not ok then
        informer.fetch_state = "list failed"
        core.log.error("list failed, kind: ", informer.kind,
                       ", reason: ", reason, ", message : ", message)
        return false
    end

    informer.fetch_state = "list finished"
    if informer.post_List then
        informer:post_list()
    end
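The effect of the misspelled guard can be reproduced in a few lines of plain Lua. This is a hypothetical informer table for illustration, not the APISIX code: the guard checks a key that was never defined, so the hook body is silently skipped.

```lua
-- Hypothetical informer with only the lowercase hook defined,
-- mirroring the real module's naming.
local informer = {
    post_list_called = false,
}

function informer:post_list()
    self.post_list_called = true
end

-- Buggy guard: informer.post_List is nil (wrong case), so the
-- branch never runs even though informer.post_list exists.
if informer.post_List then
    informer:post_list()
end
assert(informer.post_list_called == false)

-- Fixed guard: the key matches the defined method, so the hook runs.
if informer.post_list then
    informer:post_list()
end
assert(informer.post_list_called == true)
```

Because Lua indexes a missing table field as nil rather than raising an error, the typo produces no warning of any kind, which is why it went unnoticed.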

Function Behavior:

  • pre_list calls shared_dict:flush_all() (docs), which expires all keys in the shared dictionary but doesn't physically delete them.
  • post_list calls shared_dict:flush_expired() (docs), which physically deletes expired keys from the shared dictionary.
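The difference between the two calls can be modelled in pure Lua. This is a rough hypothetical mock of a shared dictionary, not ngx.shared.DICT itself: flush_all only marks entries expired (readers immediately see them as missing), and flush_expired then physically reclaims the slots.

```lua
-- Hypothetical in-memory stand-in for an ngx.shared.DICT,
-- modelling only the expiry behaviour discussed above.
local MockDict = {}
MockDict.__index = MockDict

function MockDict.new()
    return setmetatable({ store = {} }, MockDict)
end

function MockDict:set(key, value)
    self.store[key] = { value = value, expired = false }
end

function MockDict:get(key)
    local item = self.store[key]
    if item and not item.expired then
        return item.value
    end
    return nil -- expired entries behave as missing
end

-- flush_all: expire every key, but keep the slots allocated
function MockDict:flush_all()
    for _, item in pairs(self.store) do
        item.expired = true
    end
end

-- flush_expired: physically delete the expired slots
function MockDict:flush_expired()
    for key, item in pairs(self.store) do
        if item.expired then
            self.store[key] = nil
        end
    end
end

local dict = MockDict.new()
dict:set("endpoint-a", "10.0.0.1:80")
dict:flush_all()                      -- pre_list: everything expired
assert(dict:get("endpoint-a") == nil) -- readers already see nothing
dict:flush_expired()                  -- post_list: storage reclaimed
assert(next(dict.store) == nil)
```

Note that in this model the data is already unreadable right after flush_all, before flush_expired runs, which is exactly the window described below.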

Issue After Fixing Typos:

When the typos are corrected, a new problem emerges:
the flush_all call seems to remove existing key-value data from the shared_dict (code reference). This creates a critical window between the pre_list call and the point when the new endpoints are populated. During this interval:

  1. some endpoint data in the shared dictionary will be missing
  2. requests may fail due to unavailable endpoint information

Therefore, this typo bug does not seem fixable directly. I'm also not sure whether the behavior of shared_dict:flush_all() effectively deleting some data is itself a bug.

@Baoyuantop
Contributor

Hi @ChuanFF, I think we can optimize this by modifying the pre_list and post_list methods:

  • pre_list only records the existing data keys and does not clean up any data.
  • post_list compares the old and new data and removes old data that no longer exists.

@ChuanFF
Author

ChuanFF commented Jun 23, 2025

@Baoyuantop if we want to remove old data that no longer exists, the problem is how to record the existing data keys. ngx.shared.DICT.get_keys does not seem suitable (the official documentation carries a caution: "Avoid calling this method on dictionaries with a very large number of keys as it may lock the dictionary for significant amount of time and block Nginx worker processes trying to access the dictionary.")

I think we can prepare two tables, data_history and data_existing: record data into data_existing during list and watch, assign data_existing to data_history in pre_list and reset data_existing to {}, then compare data_history against data_existing in post_list and remove the old data that no longer exists.
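A minimal pure-Lua sketch of this two-table idea, under the assumptions above (per-worker tables; record_key and the delete callback are hypothetical stand-ins, not APISIX functions):

```lua
-- Hypothetical per-worker bookkeeping for one informer.
local data_history = {}   -- keys seen before the current list
local data_existing = {}  -- keys seen during the current list/watch

-- Called whenever list/watch stores an endpoint key.
local function record_key(key)
    data_existing[key] = true
end

-- pre_list: snapshot the old key set and start a fresh one,
-- without flushing any shared_dict data yet.
local function pre_list()
    data_history = data_existing
    data_existing = {}
end

-- post_list: delete only keys that existed before but were not
-- re-listed; `delete` stands in for shared_dict:delete(key).
local function post_list(delete)
    for key in pairs(data_history) do
        if not data_existing[key] then
            delete(key)
        end
    end
    data_history = {}
end

-- Usage sketch: svc-b disappears between two list cycles.
record_key("svc-a"); record_key("svc-b")  -- first list
pre_list()
record_key("svc-a")                       -- second list: svc-b gone
local deleted = {}
post_list(function(key) deleted[#deleted + 1] = key end)
assert(#deleted == 1 and deleted[1] == "svc-b")
```

Since nothing is flushed up front, live endpoint data stays readable for the whole list cycle, which avoids the critical window described earlier.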

Test cases seem difficult to write: dirty data is generated only after the watch ends and before the next list call, and I don't know how to verify old data that no longer exists. Do you have any suggestions?

Or can we ignore the dirty data? After all, the conditions for generating it are rare, and it seems to have little practical effect.

@Baoyuantop
Contributor

A Lua table cannot share data between multiple workers.

Labels
bug Something isn't working size:XS This PR changes 0-9 lines, ignoring generated files.
Projects
Status: 👀 In review