
[feature] Add bulk_operations invalidation limit #31

Merged
merged 1 commit into from
Feb 11, 2021

Conversation

donaldong
Contributor

@donaldong donaldong commented Feb 11, 2021

Summary

This adds a limit to the number of records we would select from the database. Previously I thought we were selecting by the uniq_by columns -- I was confused and wrong. We're actually selecting by columns_to_update, so the number of records can be very large.

This PR:

  • Sets such a limit. If we would load too many records, simply invalidate everything
  • We only care about columns_to_update if they overlap with the columns we're memoizing
  • Adds monitoring
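The decision described by the bullets above can be sketched in plain Ruby. This is an illustrative sketch only -- the constant and method names below are hypothetical, not the gem's actual API:

```ruby
# Illustrative sketch of the invalidation strategy (names are hypothetical).
BULK_OPERATIONS_INVALIDATION_LIMIT = 10_000 # example value, not the gem's default

def invalidation_plan(record_count, columns_to_update, memoized_columns)
  # Only columns that are both being updated and memoized matter.
  relevant = columns_to_update & memoized_columns
  return :noop if relevant.empty?

  # If selecting the affected records would be too expensive,
  # fall back to invalidating everything for the table.
  if record_count > BULK_OPERATIONS_INVALIDATION_LIMIT
    :invalidate_all
  else
    :invalidate_records
  end
end

invalidation_plan(50, [:a, :b], [:a])  # => :invalidate_records
invalidation_plan(50_000, [:a], [:a])  # => :invalidate_all
invalidation_plan(50, [:b], [:a])      # => :noop
```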

Test Plan

  • CI

@donaldong donaldong changed the base branch from main to base/donaldong/feature_add_bulk_o76d8ee5 February 11, 2021 00:40
donaldong added a commit that referenced this pull request Feb 11, 2021
Pull Request Branch: donaldong/feature_add_bulk_operat97f3f74
Pull Request Link: #31
@donaldong donaldong force-pushed the donaldong/feature_add_bulk_operat97f3f74 branch from a8598a2 to 5525c7a Compare February 11, 2021 00:40
@donaldong
Contributor Author

update

@donaldong donaldong force-pushed the donaldong/feature_add_bulk_operat97f3f74 branch from 5525c7a to df61e80 Compare February 11, 2021 00:46
@codecov-io

codecov-io commented Feb 11, 2021

Codecov Report

Merging #31 (df61e80) into base/donaldong/feature_add_bulk_o76d8ee5 (2b8173f) will decrease coverage by 0.08%.
The diff coverage is 94.87%.

Impacted file tree graph

@@                             Coverage Diff                              @@
##           base/donaldong/feature_add_bulk_o76d8ee5      #31      +/-   ##
============================================================================
- Coverage                                     97.05%   96.97%   -0.09%     
============================================================================
  Files                                            32       32              
  Lines                                          1901     1916      +15     
============================================================================
+ Hits                                           1845     1858      +13     
- Misses                                           56       58       +2     
Impacted Files Coverage Δ
lib/redis_memo/memoize_query/invalidation.rb 93.82% <93.54%> (-1.77%) ⬇️
lib/redis_memo/memoize_query.rb 98.52% <100.00%> (+0.02%) ⬆️
lib/redis_memo/options.rb 85.71% <100.00%> (+0.29%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b8173f...df61e80. Read the comment docs.

@donaldong donaldong force-pushed the base/donaldong/feature_add_bulk_o76d8ee5 branch from 2b8173f to d1f5d54 Compare February 11, 2021 01:12
@donaldong
Contributor Author

add a test case

@donaldong donaldong force-pushed the donaldong/feature_add_bulk_operat97f3f74 branch from df61e80 to 7772396 Compare February 11, 2021 01:12
or_chain = or_chain.or(model_class.where(conditions))

record_count = RedisMemo.without_memo { or_chain.count }
if record_count > bulk_operations_invalidation_limit
Collaborator

I wonder if it'd be better to just look at the total # of records being imported, to avoid this extra query / extra computation to iterate through all the records.

Contributor Author

extra computation to iterate through all the records.

Well this will send a SELECT count(*) query.

By sending this query we can avoid the expensive payload transfer and ActiveRecord record instantiation.

I wonder if it'd be better to just look at the total # of records being imported

I don't think this would work. For example, if we use on_duplicate_key_update: [:visibility], and we are also memoizing visibility, we could still get a lot of records back from the database.

Contributor Author

Actually, now that I think about it, we could query by uniq_by instead of the columns to update! That should be more efficient, and we wouldn't need to worry about the size limit, since the uniq_by set is smaller than or equal to the set of records we're importing.

Contributor Author

nvm -- we can't really query by uniq_by only, because the uniq_by value may or may not exist on the record -- it could be a value filled in by the database. So yeah, I think querying by the columns_to_update is still the best we can do.
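A tiny sketch of why uniq_by alone isn't reliable (hypothetical import payload; :id stands in for a database-assigned uniq_by column):

```ruby
# Hypothetical import payload: :id is the uniq_by column, but new rows
# don't have it yet -- the database assigns it during the import.
records = [
  { id: nil, visibility: "public"  }, # new row, id filled in by the DB
  { id: 42,  visibility: "private" }, # existing row
]

# Querying by uniq_by alone would silently miss the new row:
queryable = records.select { |r| r[:id] }
queryable.size # => 1, out of 2 imported records
```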

Contributor Author

the best we can do

not really the best we can do, but I think it's good enough for now. Let's save further optimizations for later

conditions = {}
unique_by.each do |column|
conditions[column] = record.send(column)
columns_to_select = columns_to_update & RedisMemo::MemoizeQuery
Collaborator

Can we add unit tests / modify existing ones to check the logic we originally missed here?

E.g. if we're importing Site.import(records, on_duplicate_key_update: [:a, :b]), but we only memoize the column :a, we shouldn't be querying / invalidating :b
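That check could look something like this plain-Ruby sketch (the helper name is hypothetical; the real test would exercise Site.import through the gem's memoization setup):

```ruby
# Hypothetical helper mirroring the overlap logic under test.
def columns_to_invalidate(columns_to_update, memoized_columns)
  columns_to_update & memoized_columns
end

# Importing with on_duplicate_key_update: [:a, :b] while only :a is
# memoized should not query or invalidate :b.
columns_to_invalidate([:a, :b], [:a]) # => [:a]
```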

Contributor Author

Good idea! Will add this test case.

@donaldong
Contributor Author

modify test case

@donaldong donaldong force-pushed the donaldong/feature_add_bulk_operat97f3f74 branch from 7772396 to 7687c3a Compare February 11, 2021 18:57
@donaldong donaldong changed the base branch from base/donaldong/feature_add_bulk_o76d8ee5 to main February 11, 2021 19:02
@donaldong donaldong force-pushed the donaldong/feature_add_bulk_operat97f3f74 branch from 7687c3a to 2c80ca1 Compare February 11, 2021 19:02
@donaldong donaldong merged commit 27da298 into main Feb 11, 2021
@donaldong donaldong deleted the donaldong/feature_add_bulk_operat97f3f74 branch February 11, 2021 19:06
@donaldong donaldong mentioned this pull request Feb 11, 2021
donaldong added a commit that referenced this pull request Feb 11, 2021
### Features
- Support Rails 6 versions
- Support memoizing a method conditionally (#28)
- Add Redis connection pool (#29)
- Add bulk_operations invalidation limit (#31)
- Add an env var to skip memoize_table_column (#30)

### Bug Fixes
- Avoid fetching too many records for bulk_operations invalidation (#31)