
[WIP] Only keep 1000 acked items #131

Closed
wants to merge 1 commit
Conversation

peter-wangxu
Owner

@peter-wangxu peter-wangxu commented Apr 11, 2020

The previous implementation only deletes 1000 items on each
clear_acked_data call, which causes confusion. Since this
commit may also surprise other people, I am holding the
merge for the time being.

fixes #126

@peter-wangxu peter-wangxu force-pushed the only_keep_1000 branch 4 times, most recently from b1009fa to d889105 Compare April 12, 2020 14:53
@codecov

codecov bot commented Apr 12, 2020

Codecov Report

Merging #131 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #131   +/-   ##
=======================================
  Coverage   95.62%   95.62%           
=======================================
  Files          15       15           
  Lines        1510     1510           
  Branches      162      162           
=======================================
  Hits         1444     1444           
  Misses         46       46           
  Partials       20       20           
Flag       Coverage Δ
#python    95.62% <ø> (ø)

Impacted Files                  Coverage Δ
persistqueue/sqlackqueue.py     96.36% <ø> (ø)

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54bf1c2...d889105. Read the comment docs.

@imidoriya
Collaborator

imidoriya commented Nov 13, 2020

Since you're making a major version change, it might be time to consider removing the default 1000 limit on clear_acked_data if it still exists in 5.x, and make _MAX_ACKED_LENGTH a constructor option instead. This seems similar to the request in #82.

@imidoriya
Collaborator

imidoriya commented Nov 13, 2020

For some reason, I thought you had gone from 4.x to 5.x. Must have been looking at another project. lol.
But adding it as an option when building it, even if defaulted to 1000 to maintain compatibility, would be nice.
Right now I'm doing this:

    jobs = persistqueue.SQLiteAckQueue(
        "queue", multithreading=True, auto_commit=True
    )
    jobs._MAX_ACKED_LENGTH = 0

So I was thinking something like this:

    jobs = persistqueue.SQLiteAckQueue(
        "queue", multithreading=True, auto_commit=True, max_acked_length=0
    )

@imidoriya
Collaborator

imidoriya commented Dec 7, 2020

I think there are two variables to look at with this request.

  1. LIMIT 1000
  2. {max_acked_length}

The description of the function is:

  • clear_acked_data: performs a SQL delete against sqlite; it removes the latest 1000 items whose status is AckStatus.acked (note: this does not shrink the file size on disk)

However, that's not what the function actually does. It removes up to 1000 items but keeps the latest 1000 items, i.e. it deletes rows 1000–2000 of the DESC-ordered result. For the readme description to reflect the behavior, _MAX_ACKED_LENGTH would have to be 0, not 1000.

In the PR, you've set it to delete everything below the returned value, which is an efficient way to do it, but as you mentioned it changes the function's default behavior: the PR keeps the latest 1000 and deletes everything else. That is the exact opposite of the readme description, though from a feature perspective I think it makes more sense.

Since you're concerned about the commit causing confusion, it might be important to decide whether you want the feature to reflect the readme description, the current behavior, or something new. In each case, the values of limit and max_acked_length define the behavior. While less efficient than the PR, you could preserve your desired default by exposing both as optional parameters the user can specify. This won't remove the LIMIT from the query, but they can set it high enough to be irrelevant.

    from typing import Optional

    def clear_acked_data(self, max_delete: Optional[int] = 1000,
                         keep_latest: Optional[int] = 1000):
        sql = """DELETE FROM {table_name}
            WHERE {key_column} IN (
                SELECT _id FROM {table_name} WHERE status = ?
                ORDER BY {key_column} DESC
                LIMIT {acked_to_delete} OFFSET {acked_to_keep}
            )""".format(table_name=self._table_name,
                        key_column=self._key_column,
                        acked_to_delete=max_delete,
                        acked_to_keep=keep_latest)
        return sql, AckStatus.acked
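As a quick sanity check of the LIMIT/OFFSET pattern above, here is a minimal standalone sketch. The table name, column layout, and small numbers are hypothetical for illustration (not persist-queue's actual schema); it only demonstrates that `IN (SELECT ... ORDER BY ... DESC LIMIT n OFFSET k)` keeps the latest k rows and deletes up to n older ones:

```python
import sqlite3

# In-memory table standing in for the acked-items table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acked (_id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO acked (status) VALUES (?)", [("acked",)] * 10)

# Keep the latest 3 acked rows (OFFSET 3 skips them in DESC order),
# delete up to 100 older ones (LIMIT 100).
conn.execute("""DELETE FROM acked WHERE _id IN (
    SELECT _id FROM acked WHERE status = ?
    ORDER BY _id DESC
    LIMIT 100 OFFSET 3)""", ("acked",))

remaining = [row[0] for row in conn.execute("SELECT _id FROM acked ORDER BY _id")]
print(remaining)  # [8, 9, 10]
```

Only the three highest `_id` values survive, matching the "keep latest, delete the rest" behavior discussed above.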

imidoriya added a commit to imidoriya/persist-queue that referenced this pull request Mar 31, 2021
Development

Successfully merging this pull request may close these issues.

Clean DB