Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gossip store optimizations II: the gossipd parts #2545

Merged

Commits on Apr 9, 2019

  1. gossipd: use u32 for timestamp.

    We used an s64 so we could use -1 and save a check, but that's just
    silly as we have adjacent non-u64 fields: wastes 7 bytes per node
    and 16 per channel.
    
    Interestingly, this seemed to make us a little slower for some reason.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:35569-38776(37169.8+/-1.2e+03)
    	vsz_kb:2621808
    	store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
    	listnodes_sec:0.740000-0.800000(0.768+/-0.023)
    	listchannels_sec:29.820000-32.730000(30.972+/-0.99)
    	routing_sec:30.110000-30.590000(30.346+/-0.18)
    	peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)
    
    MCP notable changes from previous patch (>1 stddev):
    	-store_load_msec:32825-36365(34615.6+/-1.1e+03)
    	+store_load_msec:35569-38776(37169.8+/-1.2e+03)
    	-vsz_kb:2637488
    	+vsz_kb:2621808
    	-store_rewrite_sec:35.150000-36.200000(35.59+/-0.4)
    	+store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
    	-listnodes_sec:0.590000-0.710000(0.682+/-0.046)
    	+listnodes_sec:0.740000-0.800000(0.768+/-0.023)
    	-peer_write_all_sec:49.020000-52.890000(50.376+/-1.5)
    	+peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 9, 2019
    Copy the full SHA
    486ce62 View commit details
    Browse the repository at this point in the history

Commits on Apr 10, 2019

  1. gossip_store: avoid gratuitous copy on load.

    Doesn't make measurable difference, but an obvious optimization.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    46323ea View commit details
    Browse the repository at this point in the history
  2. gossipd: embed broadcast information into each structure.

    This is more compact, but also required once we replace the arbitrary
    "index" with an actual offset into the gossip store.  That will let us
    remove the in-memory variants entirely.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:35685-38538(37090.4+/-9.1e+02)
    	vsz_kb:2288768
    	store_rewrite_sec:35.530000-41.230000(37.904+/-2.3)
    	listnodes_sec:0.720000-0.810000(0.762+/-0.041)
    	listchannels_sec:30.750000-35.990000(32.704+/-2)
    	routing_sec:29.570000-34.010000(31.374+/-1.8)
    	peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:2621808
    	+vsz_kb:2288768
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    267eef7 View commit details
    Browse the repository at this point in the history
  3. gossipd: make gossip loading stats accurate.

    They didn't count the header sizes when reporting bytes, which is
    misleading.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    5867d45 View commit details
    Browse the repository at this point in the history
  4. gossipd: use file offset within store as broadcast index.

    Instead of an arbitrary counter, we can use the file offset for our
    partial ordering, removing a field.  It takes some care when we compact
    the store, however, as this field changes.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:34271-35283(34789.6+/-3.3e+02)
    	vsz_kb:2288784
    	store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
    	listnodes_sec:0.750000-0.850000(0.794+/-0.042)
    	listchannels_sec:30.740000-31.760000(31.096+/-0.35)
    	routing_sec:29.600000-33.560000(30.472+/-1.5)
    	peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)
    
    MCP notable changes from previous patch (>1 stddev):
    	-store_load_msec:35685-38538(37090.4+/-9.1e+02)
    	+store_load_msec:34271-35283(34789.6+/-3.3e+02)
    	-vsz_kb:2288768
    	+vsz_kb:2288784
    	-peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)
    	+peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    d656746 View commit details
    Browse the repository at this point in the history
  5. gossipd: gossip_store_get() to load a single store entry.

    This will allow us to load on demand, and not keep all messages in
    memory.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    65ae60b View commit details
    Browse the repository at this point in the history
  6. broadcast: don't keep payload pointer.

    If we need the payload, pull it from the gossip store.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:30189-52561(39416.4+/-8.8e+03)
    	vsz_kb:1812904
    	store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
    	listnodes_sec:1.120000-1.230000(1.176+/-0.044)
    	listchannels_sec:38.900000-50.580000(44.716+/-3.9)
    	routing_sec:45.080000-48.160000(46.814+/-1.1)
    	peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:2288784
    	+vsz_kb:1812904
    	-store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
    	+store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
    	-listnodes_sec:0.750000-0.850000(0.794+/-0.042)
    	+listnodes_sec:1.120000-1.230000(1.176+/-0.044)
    	-listchannels_sec:30.740000-31.760000(31.096+/-0.35)
    	+listchannels_sec:38.900000-50.580000(44.716+/-3.9)
    	-routing_sec:29.600000-33.560000(30.472+/-1.5)
    	+routing_sec:45.080000-48.160000(46.814+/-1.1)
    	-peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)
    	+peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    db2c21c View commit details
    Browse the repository at this point in the history
  7. gossipd: make routing_add_channel_update discard old timestamps.

    This is currently done higher up, in handle_channel_update(), but
    that's one reason why handle_channel_update() has to do a channel
    lookup.  Moving the check down means handle_channel_update() can do a
    minimal "get node id for this channel" so it can check the signature.
    
    This helps, because the chan lookup semantics are changing in the next
    few patches.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    3248de1 View commit details
    Browse the repository at this point in the history
  8. gossipd: handle duplicate nodes from unverified channel_announces pro…

    …perly.
    
    If we have a channel_announcement, we catch any node_announcement for
    either end while we validate the channel_announcement.  But if we have
    multiple channel_announcements and the first one failed to verify, it
    would remove this catch, meaning we'd discard following node_announcements
    even though there was a pending channel_announcement.
    
    The answer is to use a simple reference count, and as a further
    optimization, only place the `pending_node_announce` if there's no
    node already.
    
    We also move the process_pending_node_announcement() calls lower down,
    so *any* new channel creation checks it.  This is more robust, and
    will prove useful for the next patch, where we can use the same
    mechanism to handle node_announcements on channel_announcements which
    are verified, but don't yet have a channel_update.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 10, 2019
    Copy the full SHA
    47ce6e9 View commit details
    Browse the repository at this point in the history

Commits on Apr 11, 2019

  1. Copy the full SHA
    8229aea View commit details
    Browse the repository at this point in the history
  2. gossipd: don't create struct chan for yet-to-be-updated channels.

    We currently create a struct chan when we receive a `channel_announcement`,
    but we can only broadcast once we have a `channel_update` (since that
    provides the timestamp).
    
    This means a `struct chan` can be in a weird state where it exists,
    but is unusable (can't use without an update), and also means we need to
    keep the channel_announcement message around until an update arrives, so
    we can put it in the gossip_store.
    
    Instead, keep track of these "unupdated" channels separately, and check
    for them in all the places we search for a specific channel to update.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:30640-33236(32202+/-8.7e+02)
    	vsz_kb:1812956
    	store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
    	listnodes_sec:0.590000-0.660000(0.62+/-0.033)
    	listchannels_sec:28.140000-29.560000(28.816+/-0.56)
    	routing_sec:29.530000-32.590000(30.352+/-1.1)
    	peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:1812904
    	+vsz_kb:1812956
    	-store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
    	+store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
    	-listnodes_sec:1.120000-1.230000(1.176+/-0.044)
    	+listnodes_sec:0.590000-0.660000(0.62+/-0.033)
    	-listchannels_sec:38.900000-50.580000(44.716+/-3.9)
    	+listchannels_sec:28.140000-29.560000(28.816+/-0.56)
    	-routing_sec:45.080000-48.160000(46.814+/-1.1)
    	+routing_sec:29.530000-32.590000(30.352+/-1.1)
    	-peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)
    	+peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    865433f View commit details
    Browse the repository at this point in the history
  3. gossipd: don't use cached node_announcement for redundancy checking

    Re-parse the existing message, since we'e going to get rid of those
    fields.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    dbd903b View commit details
    Browse the repository at this point in the history
  4. gossipd: free tmpctx children in gossip_store_load loop.

    We're accumulating children, and we'll get more in the successive
    patches.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    f425e62 View commit details
    Browse the repository at this point in the history
  5. gossipd: don't keep node_announcement messages in memory.

    MCP results from 5 runs, min-max(mean +/- stddev):
    store_load_msec:34779-38628(36903.4+/-1.4e+03)
    vsz_kb:1792996
    store_rewrite_sec:14.440000-15.040000(14.672+/-0.24)
    listnodes_sec:1.030000-1.120000(1.068+/-0.032)
    listchannels_sec:27.860000-32.850000(30.05+/-1.7)
    routing_sec:30.020000-31.700000(31.044+/-0.56)
    peer_write_all_sec:65.100000-70.600000(68.422+/-2)
    
    -vsz_kb:1780516
    +vsz_kb:1792996
    -listnodes_sec:1.280000-1.530000(1.382+/-0.096)
    +listnodes_sec:1.030000-1.120000(1.068+/-0.032)
    
    MCP notable changes from previous patch (>1 stddev):
    	-store_load_msec:30640-33236(32202+/-8.7e+02)
    	+store_load_msec:34779-38628(36903.4+/-1.4e+03)
    	-vsz_kb:1812956
    	+vsz_kb:1792996
    	-listnodes_sec:0.590000-0.660000(0.62+/-0.033)
    	+listnodes_sec:1.030000-1.120000(1.068+/-0.032)
    	-peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)
    	+peer_write_all_sec:65.100000-70.600000(68.422+/-2)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    3769f66 View commit details
    Browse the repository at this point in the history
  6. gossipd: remove info fields from struct node.

    Reload them from disk if they do listnodes.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:35390-38659(37336.4+/-1.3e+03)
    	vsz_kb:1780516
    	store_rewrite_sec:13.800000-16.800000(15.02+/-0.98)
    	listnodes_sec:1.280000-1.530000(1.382+/-0.096)
    	listchannels_sec:28.700000-30.440000(29.34+/-0.68)
    	routing_sec:30.120000-31.080000(30.526+/-0.35)
    	peer_write_all_sec:65.910000-76.850000(69.462+/-4.1)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:1792996
    	+vsz_kb:1780516
    	-listnodes_sec:1.030000-1.120000(1.068+/-0.032)
    	+listnodes_sec:1.280000-1.530000(1.382+/-0.096)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    252f70a View commit details
    Browse the repository at this point in the history
  7. gossipd: unify is_chan_public / is_chan_announced.

    We used to have a `struct chan` while we're waiting for an update; now we
    keep that internally.  So a `struct chan` without a channel_announcement
    in the store is private, and other is public.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    b4eef30 View commit details
    Browse the repository at this point in the history
  8. gossipd: don't keep channel_announcement messages in memory.

    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:35107-37944(36686+/-1e+03)
    	vsz_kb:1218036
    	store_rewrite_sec:14.060000-17.970000(15.966+/-1.6)
    	listnodes_sec:1.270000-1.350000(1.314+/-0.034)
    	listchannels_sec:28.510000-30.270000(29.6+/-0.6)
    	routing_sec:30.230000-31.510000(30.83+/-0.44)
    	peer_write_all_sec:67.390000-70.710000(68.568+/-1.2)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:1780516
    	+vsz_kb:1218036
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    2f0f859 View commit details
    Browse the repository at this point in the history
  9. gossipd: remove some fields from struct chan.

    The txout_script field is unused; the local_disable only applies to
    the handful of local channels, so move that into a hash table.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:39207-45089(41374.6+/-2.2e+03)
    	vsz_kb:1202316
    	store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
    	listnodes_sec:1.290000-3.790000(1.938+/-0.93)
    	listchannels_sec:30.190000-32.120000(31.31+/-0.69)
    	routing_sec:28.220000-31.340000(29.314+/-1.2)
    	peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)
    
    MCP notable changes from previous patch (>1 stddev):
    	-store_load_msec:35107-37944(36686+/-1e+03)
    	+store_load_msec:39207-45089(41374.6+/-2.2e+03)
    	-vsz_kb:1218036
    	+vsz_kb:1202316
    	-listchannels_sec:28.510000-30.270000(29.6+/-0.6)
    	+listchannels_sec:30.190000-32.120000(31.31+/-0.69)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    2368255 View commit details
    Browse the repository at this point in the history
  10. gossipd: allow reading from the store during load.

    When we no longer keep channel_updates in memory, there's a path where
    we access them on load: when we promote a local channel to an
    announced channel.
    
    This breaks at the moment, since gs->fd == -1; change it to a writable
    flag instead.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    843c54b View commit details
    Browse the repository at this point in the history
  11. gossipd: use pread in the store.

    The next patch causes us to access the store while loading (we read
    channel_updates for local peers), which messes up loading due to the
    lseek involved.
    
    Using pread() is atomic with seek & read, and also a bit more
    efficient.  Make the header contiguous too, while we're here.
    
    We don't need pwrite: we always open with O_APPEND which means the
    seek-to-end is implicit.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:36771-38289(37529.6+/-5.3e+02)
    	vsz_kb:1202316
    	store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
    	listnodes_sec:1.240000-1.410000(1.34+/-0.058)
    	listchannels_sec:29.850000-31.840000(30.908+/-0.69)
    	routing_sec:27.800000-31.790000(28.822+/-1.5)
    	peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)
    
    MCP notable changes from previous patch (>1 stddev):
    	-store_load_msec:39207-45089(41374.6+/-2.2e+03)
    	+store_load_msec:36771-38289(37529.6+/-5.3e+02)
    	-store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
    	+store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
    	-peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)
    	+peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    d2a8545 View commit details
    Browse the repository at this point in the history
  12. gossipd: don't keep channel_updates in memory.

    This requires some trickiness when we want to re-add unannounced channels
    to the store after compaction, so we extract a common "copy_message" to
    transfer from old store to new.
    
    MCP results from 5 runs, min-max(mean +/- stddev):
    	store_load_msec:36034-37853(37109.8+/-5.9e+02)
    	vsz_kb:577456
    	store_rewrite_sec:12.490000-13.250000(12.862+/-0.27)
    	listnodes_sec:1.250000-1.480000(1.364+/-0.09)
    	listchannels_sec:30.820000-31.480000(31.068+/-0.24)
    	routing_sec:26.940000-27.990000(27.616+/-0.39)
    	peer_write_all_sec:65.690000-68.600000(66.698+/-0.99)
    
    MCP notable changes from previous patch (>1 stddev):
    	-vsz_kb:1202316
    	+vsz_kb:577456
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    118ab4a View commit details
    Browse the repository at this point in the history
  13. gossipd: adjust peers' broadcast_offset when compacting store.

    When we compact the store, we need to adjust the broadast index for
    peers so they know where they're up to.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Apr 11, 2019
    Copy the full SHA
    169207a View commit details
    Browse the repository at this point in the history