fsrefs: Optimize IO (take 2) #340

Merged
merged 3 commits into from Mar 29, 2021

navytux
Contributor

@navytux navytux commented Mar 24, 2021

Access objects in the order of their position in the file instead of in the
order of their OID. This should give a dramatic speedup when data are on HDD.

For example, @perrinjerome reports that on a 73GB database it takes
almost 8h to run fsrefs (whereas on the same database fstest takes 15
minutes) [1,2]. After the patch fsrefs took ~80 minutes to run on the same
database. In other words, this is a ~6x improvement.

Fsrefs has no tests. I tested it only lightly by generating a slightly
corrupted database with a deleted referenced object(*); the patched fsrefs
gives the same output as the unmodified one:

oid 0x0 __main__.Object
last updated: 1979-01-03 21:00:42.900001, tid=0x285cbacb70a3db3
refers to invalid objects:
        oid 0x07 missing: '<unknown>'
        oid 0x07 object creation was undone: '<unknown>'

This "take 2" version is derived from #338
and only iterates objects in the order of their in-file position, without
building the complete reference graph in RAM, because that in-RAM graph would
consume ~12GB of memory.
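
To illustrate why visiting objects by file position helps on a seek-bound disk, here is a minimal sketch. It is not the code from this patch: the `index` dict is a hypothetical toy stand-in for `fs._index` (oid -> file offset), the offsets are made up, and `total_seek_distance` is a helper invented for this illustration.

```python
# Hypothetical toy index: 8-byte oid -> byte offset of the object's data
# record in the storage file. The offsets are made up for illustration.
index = {
    b"\x00" * 7 + b"\x01": 4096,
    b"\x00" * 7 + b"\x02": 128,
    b"\x00" * 7 + b"\x03": 70000,
    b"\x00" * 7 + b"\x04": 512,
}


def oids_by_oid(index):
    """Visit objects in oid order: file offsets jump back and forth."""
    return sorted(index)


def oids_by_pos(index):
    """Visit objects in file order: offsets only grow."""
    return sorted(index, key=index.__getitem__)


def total_seek_distance(index, order):
    """Sum of absolute offset jumps when visiting oids in the given order."""
    positions = [index[oid] for oid in order]
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


print("by oid:", total_seek_distance(index, oids_by_oid(index)))
print("by pos:", total_seek_distance(index, oids_by_pos(index)))
```

In this toy case the position-ordered walk covers 69872 bytes of offset jumps against 143328 for the oid-ordered walk. Note that `sorted()` on a plain list is only used here for brevity; at the scale of the database in question a plain sorted list is costly in memory.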

The added pos2oid in-RAM index also consumes memory: for the 73GB database in
question fs._index takes ~700MB, while pos2oid takes ~2GB. In theory it could be less,
because we only need an array of oids sorted by key(oid)=fs._index[oid]. However,
array.array does not support sorting, and if we use a plain list to keep just
[]oid, the memory consumption for that list alone is ~5GB. Also, because
list.sort(key=...) internally allocates memory for the key array (and
list.sort(cmp=...) was removed from Python3), total memory consumption just to
produce the list of []oid ordered by pos is ~10GB.
So without delving into C/Cython and/or manually sorting the array in Python (=
slow), using QQBTree seems to be the best out-of-the-box option for the oid-by-pos index.
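
The order of magnitude of the plain-list figure can be sanity-checked with a back-of-the-envelope estimate. This is only a rough sketch: it assumes 64-bit CPython (8-byte list slots), takes the object count from the debug output for this database, and ignores allocator overhead, so it will not reproduce the measured numbers exactly.

```python
import sys

N_OBJECTS = 85814607   # object count shown in the debug output for this database
OID = b"\x00" * 8      # ZODB oids are 8-byte strings


def list_of_oids_bytes(n, oid=OID):
    """Rough size of a list holding n distinct oid strings.

    Assumption: each element costs one string object plus one 8-byte
    pointer slot in the list (64-bit build); allocator overhead is ignored.
    """
    per_item = sys.getsizeof(oid) + 8
    return n * per_item


print("list of oid strings: ~%.1f GB" % (list_of_oids_bytes(N_OBJECTS) / 1e9))
print("oids packed at 8 bytes each: ~%.1f GB" % (N_OBJECTS * 8 / 1e9))
```

The per-element cost of a string object is several times the 8 bytes of payload, which is why a plain list ends up in the multi-gigabyte range while the same oids packed at 8 bytes each would need roughly 0.7 GB.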

[1] https://lab.nexedi.com/nexedi/zodbtools/merge_requests/19#note_129480
[2] https://lab.nexedi.com/nexedi/zodbtools/merge_requests/19#note_129551

(*) test database generated via a slightly modified gen_testdata.py from
zodbtools:

https://lab.nexedi.com/nexedi/zodbtools/blob/v0.0.0.dev8-28-g129afa6/zodbtools/test/gen_testdata.py

--- a/zodbtools/test/gen_testdata.py
+++ b/zodbtools/test/gen_testdata.py
@@ -229,7 +229,7 @@ def ext(subj): return {}
         # delete an object
         name = random.choice(list(root.keys()))
         obj = root[name]
-        root[name] = Object("%s%i*" % (name, i))
+#       root[name] = Object("%s%i*" % (name, i))
         # NOTE user/ext are kept empty on purpose - to also test this case
         commit(u"", u"predelete %s" % unpack64(obj._p_oid), {})

/cc @tim-one, @jeremyhylton, @jamadden

Closes #338.

@navytux navytux mentioned this pull request Mar 24, 2021
@navytux
Contributor Author

navytux commented Mar 24, 2021

Below is the debug output for RAM consumption, produced using a debug patch:

fsrefs: timings + RAM usage for 73GB database
[0.0s]  start:  VIRT: 160 MB    RSS: 20MB
[0.4s]  open:   VIRT: 816 MB    RSS: 675MB
# building pos2oid index ...
[86.5s] pos2oid:        VIRT: 3074 MB   RSS: 2933MB
# pass 1 ...
[108.1s]        1000000 / 85814607 (1.2 %):     VIRT: 3074 MB   RSS: 2933MB
[129.4s]        2000000 / 85814607 (2.3 %):     VIRT: 3078 MB   RSS: 2938MB
[151.9s]        3000000 / 85814607 (3.5 %):     VIRT: 3081 MB   RSS: 2940MB
[173.9s]        4000000 / 85814607 (4.7 %):     VIRT: 3081 MB   RSS: 2940MB
[196.6s]        5000000 / 85814607 (5.8 %):     VIRT: 3082 MB   RSS: 2941MB
[217.2s]        6000000 / 85814607 (7.0 %):     VIRT: 3082 MB   RSS: 2941MB
[239.5s]        7000000 / 85814607 (8.2 %):     VIRT: 3084 MB   RSS: 2943MB
[262.4s]        8000000 / 85814607 (9.3 %):     VIRT: 3084 MB   RSS: 2943MB
[284.9s]        9000000 / 85814607 (10.5 %):    VIRT: 3094 MB   RSS: 2953MB
[307.5s]        10000000 / 85814607 (11.7 %):   VIRT: 3094 MB   RSS: 2953MB
[328.8s]        11000000 / 85814607 (12.8 %):   VIRT: 3094 MB   RSS: 2953MB
[350.5s]        12000000 / 85814607 (14.0 %):   VIRT: 3094 MB   RSS: 2953MB
[371.8s]        13000000 / 85814607 (15.1 %):   VIRT: 3094 MB   RSS: 2953MB
[393.4s]        14000000 / 85814607 (16.3 %):   VIRT: 3094 MB   RSS: 2953MB
[415.0s]        15000000 / 85814607 (17.5 %):   VIRT: 3094 MB   RSS: 2953MB
[436.0s]        16000000 / 85814607 (18.6 %):   VIRT: 3094 MB   RSS: 2953MB
[457.3s]        17000000 / 85814607 (19.8 %):   VIRT: 3094 MB   RSS: 2953MB
[477.9s]        18000000 / 85814607 (21.0 %):   VIRT: 3094 MB   RSS: 2953MB
[498.6s]        19000000 / 85814607 (22.1 %):   VIRT: 3094 MB   RSS: 2953MB
[519.5s]        20000000 / 85814607 (23.3 %):   VIRT: 3094 MB   RSS: 2953MB
[540.6s]        21000000 / 85814607 (24.5 %):   VIRT: 3094 MB   RSS: 2953MB
[562.0s]        22000000 / 85814607 (25.6 %):   VIRT: 3094 MB   RSS: 2953MB
[582.9s]        23000000 / 85814607 (26.8 %):   VIRT: 3094 MB   RSS: 2953MB
[603.6s]        24000000 / 85814607 (28.0 %):   VIRT: 3094 MB   RSS: 2953MB
[624.5s]        25000000 / 85814607 (29.1 %):   VIRT: 3094 MB   RSS: 2953MB
[645.5s]        26000000 / 85814607 (30.3 %):   VIRT: 3094 MB   RSS: 2953MB
[666.4s]        27000000 / 85814607 (31.5 %):   VIRT: 3094 MB   RSS: 2953MB
[687.4s]        28000000 / 85814607 (32.6 %):   VIRT: 3094 MB   RSS: 2953MB
[708.6s]        29000000 / 85814607 (33.8 %):   VIRT: 3094 MB   RSS: 2953MB
[729.4s]        30000000 / 85814607 (35.0 %):   VIRT: 3094 MB   RSS: 2953MB
[750.3s]        31000000 / 85814607 (36.1 %):   VIRT: 3094 MB   RSS: 2953MB
[771.1s]        32000000 / 85814607 (37.3 %):   VIRT: 3094 MB   RSS: 2953MB
[792.5s]        33000000 / 85814607 (38.5 %):   VIRT: 3094 MB   RSS: 2953MB
[813.3s]        34000000 / 85814607 (39.6 %):   VIRT: 3094 MB   RSS: 2953MB
[834.2s]        35000000 / 85814607 (40.8 %):   VIRT: 3094 MB   RSS: 2953MB
[855.0s]        36000000 / 85814607 (42.0 %):   VIRT: 3094 MB   RSS: 2953MB
[875.8s]        37000000 / 85814607 (43.1 %):   VIRT: 3094 MB   RSS: 2953MB
[896.8s]        38000000 / 85814607 (44.3 %):   VIRT: 3094 MB   RSS: 2953MB
[917.7s]        39000000 / 85814607 (45.4 %):   VIRT: 3094 MB   RSS: 2953MB
[938.7s]        40000000 / 85814607 (46.6 %):   VIRT: 3094 MB   RSS: 2953MB
[959.7s]        41000000 / 85814607 (47.8 %):   VIRT: 3094 MB   RSS: 2953MB
[992.6s]        42000000 / 85814607 (48.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1035.1s]       43000000 / 85814607 (50.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1059.3s]       44000000 / 85814607 (51.3 %):   VIRT: 3094 MB   RSS: 2953MB
[1081.5s]       45000000 / 85814607 (52.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1104.7s]       46000000 / 85814607 (53.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1126.3s]       47000000 / 85814607 (54.8 %):   VIRT: 3094 MB   RSS: 2953MB
[1147.4s]       48000000 / 85814607 (55.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1169.0s]       49000000 / 85814607 (57.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1192.0s]       50000000 / 85814607 (58.3 %):   VIRT: 3094 MB   RSS: 2953MB
[1213.5s]       51000000 / 85814607 (59.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1235.3s]       52000000 / 85814607 (60.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1257.3s]       53000000 / 85814607 (61.8 %):   VIRT: 3094 MB   RSS: 2953MB
[1279.1s]       54000000 / 85814607 (62.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1301.7s]       55000000 / 85814607 (64.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1324.2s]       56000000 / 85814607 (65.3 %):   VIRT: 3094 MB   RSS: 2953MB
[1347.9s]       57000000 / 85814607 (66.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1371.2s]       58000000 / 85814607 (67.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1394.1s]       59000000 / 85814607 (68.8 %):   VIRT: 3094 MB   RSS: 2953MB
[1416.7s]       60000000 / 85814607 (69.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1439.4s]       61000000 / 85814607 (71.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1463.5s]       62000000 / 85814607 (72.2 %):   VIRT: 3094 MB   RSS: 2953MB
[1486.4s]       63000000 / 85814607 (73.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1510.4s]       64000000 / 85814607 (74.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1532.2s]       65000000 / 85814607 (75.7 %):   VIRT: 3094 MB   RSS: 2953MB
[1553.4s]       66000000 / 85814607 (76.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1577.6s]       67000000 / 85814607 (78.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1600.4s]       68000000 / 85814607 (79.2 %):   VIRT: 3094 MB   RSS: 2953MB
[1622.3s]       69000000 / 85814607 (80.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1644.7s]       70000000 / 85814607 (81.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1667.0s]       71000000 / 85814607 (82.7 %):   VIRT: 3094 MB   RSS: 2953MB
[1688.5s]       72000000 / 85814607 (83.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1712.7s]       73000000 / 85814607 (85.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1735.1s]       74000000 / 85814607 (86.2 %):   VIRT: 3094 MB   RSS: 2953MB
[1757.6s]       75000000 / 85814607 (87.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1780.2s]       76000000 / 85814607 (88.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1803.9s]       77000000 / 85814607 (89.7 %):   VIRT: 3094 MB   RSS: 2953MB
[1826.9s]       78000000 / 85814607 (90.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1849.1s]       79000000 / 85814607 (92.1 %):   VIRT: 3094 MB   RSS: 2953MB
[1871.0s]       80000000 / 85814607 (93.2 %):   VIRT: 3094 MB   RSS: 2953MB
[1891.9s]       81000000 / 85814607 (94.4 %):   VIRT: 3094 MB   RSS: 2953MB
[1912.6s]       82000000 / 85814607 (95.6 %):   VIRT: 3094 MB   RSS: 2953MB
[1934.4s]       83000000 / 85814607 (96.7 %):   VIRT: 3094 MB   RSS: 2953MB
[1955.7s]       84000000 / 85814607 (97.9 %):   VIRT: 3094 MB   RSS: 2953MB
[1976.9s]       85000000 / 85814607 (99.1 %):   VIRT: 3094 MB   RSS: 2953MB

#noload: 0
#undone: 0

# pass 2 ...
[2029.7s]       1000000 / 85814607 (1.2 %):     VIRT: 3100 MB   RSS: 2960MB
[2062.2s]       2000000 / 85814607 (2.3 %):     VIRT: 3100 MB   RSS: 2960MB
[2096.8s]       3000000 / 85814607 (3.5 %):     VIRT: 3100 MB   RSS: 2960MB
[2132.6s]       4000000 / 85814607 (4.7 %):     VIRT: 3100 MB   RSS: 2960MB
[2166.7s]       5000000 / 85814607 (5.8 %):     VIRT: 3100 MB   RSS: 2960MB
[2199.4s]       6000000 / 85814607 (7.0 %):     VIRT: 3100 MB   RSS: 2960MB
[2234.5s]       7000000 / 85814607 (8.2 %):     VIRT: 3100 MB   RSS: 2960MB
[2272.3s]       8000000 / 85814607 (9.3 %):     VIRT: 3100 MB   RSS: 2960MB
[2306.8s]       9000000 / 85814607 (10.5 %):    VIRT: 3113 MB   RSS: 2972MB
[2345.0s]       10000000 / 85814607 (11.7 %):   VIRT: 3113 MB   RSS: 2972MB
[2378.4s]       11000000 / 85814607 (12.8 %):   VIRT: 3113 MB   RSS: 2972MB
[2411.8s]       12000000 / 85814607 (14.0 %):   VIRT: 3113 MB   RSS: 2972MB
[2447.6s]       13000000 / 85814607 (15.1 %):   VIRT: 3113 MB   RSS: 2972MB
[2481.0s]       14000000 / 85814607 (16.3 %):   VIRT: 3113 MB   RSS: 2972MB
[2515.6s]       15000000 / 85814607 (17.5 %):   VIRT: 3113 MB   RSS: 2972MB
[2548.6s]       16000000 / 85814607 (18.6 %):   VIRT: 3113 MB   RSS: 2972MB
[2582.3s]       17000000 / 85814607 (19.8 %):   VIRT: 3113 MB   RSS: 2972MB
[2615.0s]       18000000 / 85814607 (21.0 %):   VIRT: 3113 MB   RSS: 2972MB
[2647.6s]       19000000 / 85814607 (22.1 %):   VIRT: 3113 MB   RSS: 2972MB
[2680.2s]       20000000 / 85814607 (23.3 %):   VIRT: 3113 MB   RSS: 2972MB
[2713.1s]       21000000 / 85814607 (24.5 %):   VIRT: 3113 MB   RSS: 2972MB
[2750.9s]       22000000 / 85814607 (25.6 %):   VIRT: 3113 MB   RSS: 2972MB
[2783.4s]       23000000 / 85814607 (26.8 %):   VIRT: 3113 MB   RSS: 2972MB
[2816.4s]       24000000 / 85814607 (28.0 %):   VIRT: 3113 MB   RSS: 2972MB
[2849.2s]       25000000 / 85814607 (29.1 %):   VIRT: 3113 MB   RSS: 2972MB
[2882.4s]       26000000 / 85814607 (30.3 %):   VIRT: 3113 MB   RSS: 2972MB
[2915.8s]       27000000 / 85814607 (31.5 %):   VIRT: 3113 MB   RSS: 2972MB
[2949.3s]       28000000 / 85814607 (32.6 %):   VIRT: 3113 MB   RSS: 2972MB
[2983.6s]       29000000 / 85814607 (33.8 %):   VIRT: 3113 MB   RSS: 2972MB
[3016.4s]       30000000 / 85814607 (35.0 %):   VIRT: 3113 MB   RSS: 2972MB
[3049.5s]       31000000 / 85814607 (36.1 %):   VIRT: 3113 MB   RSS: 2972MB
[3082.7s]       32000000 / 85814607 (37.3 %):   VIRT: 3113 MB   RSS: 2972MB
[3116.8s]       33000000 / 85814607 (38.5 %):   VIRT: 3113 MB   RSS: 2972MB
[3150.0s]       34000000 / 85814607 (39.6 %):   VIRT: 3113 MB   RSS: 2972MB
[3183.5s]       35000000 / 85814607 (40.8 %):   VIRT: 3113 MB   RSS: 2972MB
[3216.8s]       36000000 / 85814607 (42.0 %):   VIRT: 3113 MB   RSS: 2972MB
[3250.0s]       37000000 / 85814607 (43.1 %):   VIRT: 3113 MB   RSS: 2972MB
[3283.6s]       38000000 / 85814607 (44.3 %):   VIRT: 3113 MB   RSS: 2972MB
[3316.8s]       39000000 / 85814607 (45.4 %):   VIRT: 3113 MB   RSS: 2972MB
[3350.8s]       40000000 / 85814607 (46.6 %):   VIRT: 3113 MB   RSS: 2972MB
[3384.3s]       41000000 / 85814607 (47.8 %):   VIRT: 3113 MB   RSS: 2972MB
[3427.4s]       42000000 / 85814607 (48.9 %):   VIRT: 3113 MB   RSS: 2972MB
[3474.3s]       43000000 / 85814607 (50.1 %):   VIRT: 3113 MB   RSS: 2972MB
[3512.2s]       44000000 / 85814607 (51.3 %):   VIRT: 3113 MB   RSS: 2972MB
[3546.9s]       45000000 / 85814607 (52.4 %):   VIRT: 3113 MB   RSS: 2972MB
[3584.1s]       46000000 / 85814607 (53.6 %):   VIRT: 3113 MB   RSS: 2972MB
[3617.8s]       47000000 / 85814607 (54.8 %):   VIRT: 3113 MB   RSS: 2972MB
[3651.5s]       48000000 / 85814607 (55.9 %):   VIRT: 3113 MB   RSS: 2972MB
[3685.9s]       49000000 / 85814607 (57.1 %):   VIRT: 3113 MB   RSS: 2972MB
[3721.7s]       50000000 / 85814607 (58.3 %):   VIRT: 3113 MB   RSS: 2972MB
[3759.5s]       51000000 / 85814607 (59.4 %):   VIRT: 3113 MB   RSS: 2972MB
[3793.3s]       52000000 / 85814607 (60.6 %):   VIRT: 3113 MB   RSS: 2972MB
[3827.1s]       53000000 / 85814607 (61.8 %):   VIRT: 3113 MB   RSS: 2972MB
[3861.2s]       54000000 / 85814607 (62.9 %):   VIRT: 3113 MB   RSS: 2972MB
[3895.7s]       55000000 / 85814607 (64.1 %):   VIRT: 3113 MB   RSS: 2972MB
[3930.1s]       56000000 / 85814607 (65.3 %):   VIRT: 3113 MB   RSS: 2972MB
[3965.3s]       57000000 / 85814607 (66.4 %):   VIRT: 3113 MB   RSS: 2972MB
[4000.1s]       58000000 / 85814607 (67.6 %):   VIRT: 3113 MB   RSS: 2972MB
[4034.4s]       59000000 / 85814607 (68.8 %):   VIRT: 3113 MB   RSS: 2972MB
[4069.0s]       60000000 / 85814607 (69.9 %):   VIRT: 3113 MB   RSS: 2972MB
[4103.6s]       61000000 / 85814607 (71.1 %):   VIRT: 3113 MB   RSS: 2972MB
[4140.1s]       62000000 / 85814607 (72.2 %):   VIRT: 3113 MB   RSS: 2972MB
[4175.9s]       63000000 / 85814607 (73.4 %):   VIRT: 3114 MB   RSS: 2973MB
[4213.6s]       64000000 / 85814607 (74.6 %):   VIRT: 3114 MB   RSS: 2973MB
[4249.0s]       65000000 / 85814607 (75.7 %):   VIRT: 3114 MB   RSS: 2973MB
[4282.8s]       66000000 / 85814607 (76.9 %):   VIRT: 3114 MB   RSS: 2973MB
[4318.7s]       67000000 / 85814607 (78.1 %):   VIRT: 3114 MB   RSS: 2973MB
[4353.0s]       68000000 / 85814607 (79.2 %):   VIRT: 3114 MB   RSS: 2973MB
[4387.0s]       69000000 / 85814607 (80.4 %):   VIRT: 3114 MB   RSS: 2973MB
[4421.7s]       70000000 / 85814607 (81.6 %):   VIRT: 3114 MB   RSS: 2973MB
[4455.9s]       71000000 / 85814607 (82.7 %):   VIRT: 3114 MB   RSS: 2973MB
[4490.0s]       72000000 / 85814607 (83.9 %):   VIRT: 3114 MB   RSS: 2973MB
[4525.2s]       73000000 / 85814607 (85.1 %):   VIRT: 3114 MB   RSS: 2973MB
[4559.4s]       74000000 / 85814607 (86.2 %):   VIRT: 3114 MB   RSS: 2973MB
[4594.5s]       75000000 / 85814607 (87.4 %):   VIRT: 3114 MB   RSS: 2973MB
[4629.7s]       76000000 / 85814607 (88.6 %):   VIRT: 3114 MB   RSS: 2973MB
[4666.4s]       77000000 / 85814607 (89.7 %):   VIRT: 3114 MB   RSS: 2973MB
[4701.9s]       78000000 / 85814607 (90.9 %):   VIRT: 3114 MB   RSS: 2973MB
[4736.6s]       79000000 / 85814607 (92.1 %):   VIRT: 3114 MB   RSS: 2973MB
[4771.4s]       80000000 / 85814607 (93.2 %):   VIRT: 3114 MB   RSS: 2973MB
[4804.5s]       81000000 / 85814607 (94.4 %):   VIRT: 3114 MB   RSS: 2973MB
[4836.7s]       82000000 / 85814607 (95.6 %):   VIRT: 3114 MB   RSS: 2973MB
[4871.1s]       83000000 / 85814607 (96.7 %):   VIRT: 3114 MB   RSS: 2973MB
[4905.1s]       84000000 / 85814607 (97.9 %):   VIRT: 3114 MB   RSS: 2973MB
[4940.0s]       85000000 / 85814607 (99.1 %):   VIRT: 3114 MB   RSS: 2973MB
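
As a rough reading of the log above, the per-pass throughput can be derived from the first and last progress lines of each pass. The sketch below is only an illustration: the regular expression and the `rate` helper are written for this particular log layout and are not part of fsrefs or of the debug patch.

```python
import re

# Matches the leading "[<seconds>s]   <done> / <total>" part of a progress line.
LINE = re.compile(r"\[(?P<t>[\d.]+)s\]\s+(?P<done>\d+) / (?P<total>\d+)")


def rate(first, last):
    """Objects processed per second between two progress lines."""
    a, b = LINE.match(first), LINE.match(last)
    dt = float(b.group("t")) - float(a.group("t"))
    dn = int(b.group("done")) - int(a.group("done"))
    return dn / dt


pass1 = rate("[108.1s]        1000000 / 85814607 (1.2 %):",
             "[1976.9s]       85000000 / 85814607 (99.1 %):")
pass2 = rate("[2029.7s]       1000000 / 85814607 (1.2 %):",
             "[4940.0s]       85000000 / 85814607 (99.1 %):")

print("pass 1: %.0f objects/s" % pass1)
print("pass 2: %.0f objects/s" % pass2)
```

This works out to roughly 45 thousand objects per second in pass 1 and roughly 29 thousand in pass 2, with RSS staying close to 3GB throughout.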

@navytux
Contributor Author

navytux commented Mar 24, 2021

(+changelog)

@navytux
Contributor Author

navytux commented Mar 26, 2021

@jamadden, I believe this patch should be ok to go in. Would you please have a look?

@navytux navytux requested a review from jamadden March 26, 2021 15:22
@perrinjerome
Contributor

I am a bit lost in the figures from the commit message:

the memory consumption just for that list is ~5MB ... total memory consumption just to produce list of []oid ordered by pos is ~10GB.

Isn't it 5GB ?

Provide changelog entry.
@navytux
Contributor Author

navytux commented Mar 29, 2021

@perrinjerome, right, that should be 5GB. I've corrected the description. Thanks for spotting this.

@navytux
Contributor Author

navytux commented Mar 29, 2021

( force-pushed to correct commit message as well )

Member

@jamadden jamadden left a comment

LGTM. It's a very simple change just to order disk seeking.

I had some very minor comments about the change note and some leftover commented code.

CHANGES.rst Outdated
Comment on lines 11 to 12
- Rework `fsrefs` script to work significantly faster by optimizing how it does
IO. See `PR 340 <https://github.com/zopefoundation/ZODB/pull/340>`.
Member

Suggested change
-- Rework `fsrefs` script to work significantly faster by optimizing how it does
-  IO. See `PR 340 <https://github.com/zopefoundation/ZODB/pull/340>`.
+- Rework ``fsrefs`` script to work significantly faster by optimizing how it does
+  IO. See `PR 340 <https://github.com/zopefoundation/ZODB/pull/340>`_.

Double backticks are literal code; single backticks are the default object reference.

Links need a trailing underscore to be valid.

Contributor Author

Thanks for catching this, and sorry I did not verify that the rst -> html conversion works correctly after my edits. I've integrated your suggestions into this pull request.

By the way, now, after verifying rst, I see this:

(neo) (z4-dev) (g.env) kirr@deco:~/src/wendelin/z/ZODB$ rst2html --link-stylesheet CHANGES.rst x.html
CHANGES.rst:3: (WARNING/2) Duplicate explicit target name: "issue 268".

The warning it emits is about the following text, where the link labelled issue 268 seems to be incorrect:

- Fix UnboundLocalError when running fsoids.py script.
  See `issue 268 <https://github.com/zopefoundation/ZODB/issues/285>`_.

It should be changed to

--- a/CHANGES.rst
+++ b/CHANGES.rst
@@ -6,7 +6,7 @@
 ==================
 
 - Fix UnboundLocalError when running fsoids.py script.
-  See `issue 268 <https://github.com/zopefoundation/ZODB/issues/285>`_.
+  See `issue 285 <https://github.com/zopefoundation/ZODB/issues/285>`_.
 
 - Rework ``fsrefs`` script to work significantly faster by optimizing how it does
   IO. See `PR 340 <https://github.com/zopefoundation/ZODB/pull/340>`_.

If it is ok, I can make this fix and push it to master after merging this PR.
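
For illustration, the kind of conflict behind a "Duplicate explicit target name" warning (the same link name pointing at two different URLs) can be spotted with a small standalone check. This is a sketch using a simplified regular expression, not how docutils itself works; `conflicting_targets` and the sample text with its `example.invalid` URLs are hypothetical.

```python
import re
from collections import defaultdict

# Simplified pattern for a named hyperlink with an embedded URL,
# e.g. `issue 285 <https://...>`_ ; anonymous `...`__ links are skipped.
NAMED_LINK = re.compile(r"`([^`<]+?)\s*<([^>]+)>`_(?!_)")


def conflicting_targets(text):
    """Return {link name: sorted URLs} for names used with more than one URL."""
    targets = defaultdict(set)
    for name, url in NAMED_LINK.findall(text):
        # Compare names case-insensitively with whitespace collapsed.
        targets[" ".join(name.lower().split())].add(url)
    return {name: sorted(urls) for name, urls in targets.items() if len(urls) > 1}


# Hypothetical changelog fragment, made up for this example.
sample = (
    "- Entry A. See `issue 268 <https://example.invalid/issues/268>`_.\n"
    "- Entry B. See `issue 268 <https://example.invalid/issues/285>`_.\n"
    "- Entry C. See `PR 340 <https://example.invalid/pull/340>`_.\n"
)
print(conflicting_targets(sample))
```

Running rst2html or restview on the real file remains the authoritative check; the sketch only shows what the warning is complaining about.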

Member

Sure, though it'd be fine to just include the fix here.

FWIW, I usually use @mgedmin 's restview to verify rendering because it has an option that makes it do what PyPI does, and because it checks the contents of the markup as they will be seen by PyPI — many zopefoundation projects, including ZODB, create this output by concatenating README.rst and CHANGES.rst.

$ restview --long --pypi 
Listening on http://localhost:62676/
127.0.0.1 - - [29/Mar/2021 07:26:22] "GET / HTTP/1.1" 200 -

Contributor Author

Thanks for the restview link. For verification I used to do something like

$ python setup.py --long-description |rst2html - x.html

but restview indeed might be more convenient.

Let's have the changelog fixup come as a separate patch (outside this PR) so as not to mix things.

Contributor Author

I've fixed "issue 268" in 2798502e. Hope it is ok.

- remove debug prints;
- fix link and literal markup in the changelog.

Addresses review feedback provided by @jamadden.
@navytux
Contributor Author

navytux commented Mar 29, 2021

@jamadden, thanks. I've amended the PR with the changes you suggested.

@navytux
Contributor Author

navytux commented Mar 29, 2021

Tests are green. @jamadden, should I wait for you to merge this work, or should I merge/apply it myself?

@navytux
Contributor Author

navytux commented Mar 29, 2021

(if by myself - is it ok to "squash and merge" - i.e. to produce only one commit coming out of this PR ?)

@jamadden
Member

The convention in zopefoundation repos is that the author of the PR commits after getting approval and green CI.

Squashing the commits is fine if there is no useful data in the intermediate steps; here, I would have kept the revision that uses the in-memory dicts as that is a useful record of something that was tried and didn't work.

@navytux
Contributor Author

navytux commented Mar 29, 2021

Thanks, I see. This PR does not have the in-RAM dict as an intermediate step, because I bootstrapped it from scratch (i.e. from master) without taking the #338 commit history, so as not to create noise in the changes. On the other hand, the commit message links to #338 and explicitly describes what was tried there and not taken here. This way, if one investigates the history of edits through git blame, there is a back-pointer to the in-RAM graph version as well.

@navytux navytux merged commit 7907804 into zopefoundation:master Mar 29, 2021
@navytux
Contributor Author

navytux commented Mar 29, 2021

Patch applied as 7907804. Hope it is ok. Thanks for taking the time to review it and #338.

@navytux navytux deleted the y/fsrefs-opt-2 branch March 29, 2021 13:10
navytux added a commit to navytux/ZODB that referenced this pull request Oct 29, 2021
to resolve trivial conflict on CHANGES.rst

* origin/master: (22 commits)
  Fix TypeError for fsoids (zopefoundation#351)
  Fix deprecation warnings occurring on Python 3.10.
  fix more PY3 incompatibilities in `fsstats`
  fix Python 3 incompatibility for `fsstats`
  add `fsdump/fsstats` test
  fsdump/fsstats improvements
  - add coverage combine step
  - first cut moving tests from Travis CI to GitHub Actions
  - ignore virtualenv artifacts [ci skip]
  tests: Run race-related tests with high frequency of switches between threads
  tests: Add test for load vs external invalidation race
  tests: Add test for open vs invalidation race
  fixup! doc/requirements: Require pygments < 2.6 on py2
  doc/requirements: Require pygments < 2.6 on py2
  fixup! buildout: Fix Sphinx install on Python2
  buildout: Fix Sphinx install on Python2
  Update README.rst
  Security fix documentation dependencies (zopefoundation#342)
  changes: Correct link to UnboundLocalError fsoids.py fix
  fsrefs: Optimize IO  (take 2) (zopefoundation#340)
  ...