
Fix redis lua script errors #259

Closed · wants to merge 2 commits
Conversation

@benekastah (Contributor) commented Jan 7, 2020

Calling unpack on a list with 7999 or more items causes Lua to throw an error (the reason is explained in the code comments). This fixes errors of that kind in the Lua script (primarily encountered in the do_maintenance branch).
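For illustration, here is a minimal sketch of the chunked-unpack workaround this refers to, assuming a hypothetical helper call_in_chunks and an arbitrary chunk size; it is not the exact code in this PR's patch:

```lua
-- Instead of unpack-ing one huge table into a single redis.call (which fails
-- once the number of values approaches LUAI_MAXCSTACK), split the arguments
-- into chunks that stay well below the limit reported above.
local UNPACK_CHUNK_SIZE = 7000  -- assumed safe margin; the real limit is build-dependent

local function call_in_chunks(command, key, args)
    for i = 1, #args, UNPACK_CHUNK_SIZE do
        local j = math.min(i + UNPACK_CHUNK_SIZE - 1, #args)
        redis.call(command, key, unpack(args, i, j))
    end
end

-- e.g. call_in_chunks("RPUSH", queue_key, message_ids)
-- instead of redis.call("RPUSH", queue_key, unpack(message_ids))
```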

At least partially fixes #258 (maybe fixes the whole thing; time will tell).

@Bogdanp (Owner) commented Jan 7, 2020

Thanks! I'll dig into this on Saturday.

@benekastah (Contributor, Author) commented Jan 7, 2020

@Bogdanp I may have misread the test failure output, but it looks like the Redis tests are failing because Redis can't persist to disk. Maybe the disk is full?

@@ -75,6 +75,42 @@ for i=8,#ARGV do
     ARGS[i - 7] = ARGV[i]
 end

+-- In my testing I found the max size that we can unpack to be 7999 (which
+-- is LUAI_MAXCSTACK - 1). Lua can be built with a different LUAI_MAXCSTACK

@Bogdanp (Owner) commented Jan 12, 2020

Is there any way we can get this exact value at runtime?

@benekastah (Contributor, Author) commented Jan 12, 2020

Unfortunately it's only exposed in Lua's C API. AFAICT, there isn't a way to get that value from Lua code.
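To illustrate the point, a hedged sketch (not part of this PR): LUAI_MAXCSTACK is a compile-time constant in Lua's C headers, so a script cannot read it directly; the best it can do is observe the effect by catching the error that an oversized unpack raises.

```lua
-- Hypothetical probe: there is no Lua-level API for LUAI_MAXCSTACK itself,
-- but the "too many results to unpack" error it causes can be caught.
local function exceeds_unpack_limit(n)
    -- unpack asks Lua for stack space for n results, even on an empty table
    local ok = pcall(unpack, {}, 1, n)
    return not ok
end

-- exceeds_unpack_limit(100000) --> true on a default Lua 5.1 build
-- exceeds_unpack_limit(1000)   --> false
```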

@Bogdanp (Owner) commented Jan 12, 2020

> Maybe the disk is full?

Most likely, the process is not allowed to write to disk and the test changes that you've made push it past the number of commands (10k by default?) it can run before persisting. I'll see what I can do about it next weekend.

Have you started running these changes in production? If so, do they seem to have fixed the problems you ran into?

This looks good, but I didn't get a chance to dive in as deeply as I would've liked this weekend, so I'm going to hold off on merging until next weekend.

@benekastah (Contributor, Author) commented Jan 12, 2020

I haven't run this in production yet. Instead we temporarily set the maintenance chance to 0 since it was consistently failing for us. We may try running this in production next week, as well as the other PR I put up (which I think has a better chance of fixing our "lazy worker" issue).

I did run the new tests I wrote against the old code and verified that they failed, so the fact that they pass for me convinces me this will get the maintenance code working again.

ETA: the other PR I mentioned is #262

@Bogdanp (Owner) commented Jan 20, 2020

Thanks! I made a couple of changes on top of this, fixed the CI, and the changes are now in master. They'll be released with 1.8.

@Bogdanp Bogdanp closed this Jan 20, 2020
@Bogdanp Bogdanp added this to the v1.8.0 milestone Jan 20, 2020