Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swarm connect will use any available address not just the given multiaddress #9895

Open
3 tasks done
jbenet opened this issue May 25, 2023 · 1 comment
Open
3 tasks done
Labels
kind/bug A bug in existing code (including security flaws) status/blocked Unable to be worked further until needs are met

Comments

@jbenet
Copy link
Member

jbenet commented May 25, 2023

Checklist

Installation method

ipfs-update or dist.ipfs.tech

Version

Kubo version: 0.20.0
Repo version: 13
System version: arm64/darwin
Golang version: go1.19.8

Config

{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/127.0.0.1/tcp/5001",
    "Announce": [],
    "AppendAnnounce": [],
    "Gateway": "/ip4/127.0.0.1/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic",
      "/ip4/0.0.0.0/udp/4001/quic-v1",
      "/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
      "/ip6/::/udp/4001/quic",
      "/ip6/::/udp/4001/quic-v1",
      "/ip6/::/udp/4001/quic-v1/webtransport"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa"
  ],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "OptimisticProvide": false,
    "OptimisticProvideJobsPoolSize": 0,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": ""
  },
  "Identity": {
    "PeerID": "12D3KooWNLgBDmPgwLvZKh6RVTXtXrzRdm26oWxiuSYwoqifqXxr"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {},
  "Routing": {
    "Methods": null,
    "Routers": null
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {},
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {
      "Limits": {}
    },
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}

Description

Trying to debug a speed issue in a transfer between two nodes, locally connected in wifi (one 0.20.0, one 0.19.0). I tried disconnecting from quic to isolate, and reconnect w/ tcp. But no, kubo disregards the user commands and insists on using quic. Disregarding clear user commands and intent is not a feature -- it is a bug.

# find the relevant peer id
> ipfs swarm peers | grep 192.168
/ip4/192.168.1.179/udp/4001/quic-v1/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj

# get the peer version
> ipfs id 12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj | grep Version
	"AgentVersion": "kubo/0.19.0/desktop",
	"ProtocolVersion": "ipfs/0.1.0",

# get the peer tcp addr
> ipfs id 12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj | grep tcp
		"/ip4/127.0.0.1/tcp/4001/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj",
		"/ip4/192.168.1.179/tcp/4001/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj",
		"/ip4/85.94.113.87/tcp/15448/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj",
		"/ip6/::1/tcp/4001/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj",

# disconnect from quic
> ipfs swarm disconnect /ip4/192.168.1.179/udp/4001/quic-v1/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj
disconnect 12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj success

# verify there's no connection now.
> ipfs swarm peers | grep 192.168
> ipfs swarm peers | grep 192.168

# ok, now connect to tcp addr
> ipfs swarm connect /ip4/192.168.1.179/tcp/4001/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj
connect 12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj success

# check the addr i'm connected to
> ipfs swarm peers | grep 192.168
/ip4/192.168.1.179/udp/4001/quic-v1/p2p/12D3KooWBzbvHxkV3kBt8TPuMjGwknwsmCyMuy5uBAVPT8NarRXj

@jbenet jbenet added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels May 25, 2023
@aschmahmann
Copy link
Contributor

I've renamed the issue to describe what's going on, but high level:

  1. It's not that kubo is forcing a quic connection, it's the ipfs swarm connect only guarantees that it will give you a connection to a given peer not what that connection will look like. This comes from the go-libp2p abstractions.
  2. AFAICT it's been this way since as long as it's been possible to have two addresses for a given peer (e.g. probably about as long as the project has been around)
  3. There are other, more reliable ways to get the same debugging information including from kubo

How to better get the same debugging information from kubo

Trying to debug a speed issue in a transfer between two nodes, locally connected in wifi

You could take one of the nodes (e.g. the dialing node), disable QUIC on the node https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmtransportsnetwork and then reboot it. If the nodes do not find each other over mDNS you could then do ipfs swarm connect to establish a connection.

Note: there are a variety of tools that use similar components as kubo under the hood that can help you exercise your kubo node. https://github.com/ipfs-shipyard/vole is one of these that I tend to use for poking at individual nodes for testing purposes which while not as end-to-end as using kubo as described above is easier and doesn't require the debugging tools to all get bundled into kubo.

What's going on under the hood

  1. ipfs swarm connect is a low level command for forming a connection to a peer.
  2. Internally it uses go-lib2p's host.Connect functionality https://github.com/libp2p/go-libp2p/blob/fa153c58dd24ffc6f97e4dfe1f40a7ce05e4a6af/core/host/host.go#L40-L45
  3. As described there the contract on Connect (which is the only exposed go-libp2p connection function which takes a multiaddr instead of a peerID) is Connect ensures there is a connection between this host and the peer with given peer.ID it does not guarantee that a new connection will be formed to the given peer or what addresses it will use.
  4. Therefore, the standard go-libp2p host implementation is this short function https://github.com/libp2p/go-libp2p/blob/fa153c58dd24ffc6f97e4dfe1f40a7ce05e4a6af/p2p/host/basic/basic_host.go#L718-L735 which absorbs any new addresses into the peerstore and then asks the dialer to dial the given peer with the addresses available and return the best connection.
  5. Because you previously were connected with the peer you learned about its other addresses and will use them in the new connection attempt from ipfs swarm connect for some period of time where that information is deemed "fresh enough"
  6. QUIC will beat TCP basically every time because the number of roundtrips is significantly less (4 for TCP vs 1 for QUIC, see https://connectivity.libp2p.io/ for more details).
  7. This makes it appear as though kubo is forcing QUIC when instead its just doing its job of getting you a connection quickly

Given this the more appropriate CLI UX here would probably have been ipfs swarm connect <peerID> --multiaddr-hints=<addr1>,<addr2>,.... However, it doesn't seem worth a breaking change for that. It does seem worth clarifying the text on ipfs swarm connect though.

Unless there becomes a way within go-libp2p to dial a specific multiaddr there's nothing to be done here. However, it's not obvious what the correct user contract for this would be in go-libp2p or in kubo. I will open an issue there linking here shortly and if you have any UX recommendations for some of the issue below it would be great to hear them.

Notes on why changing this behavior is probably not a good idea

If the idea was for ipfs swarm connect <multiaddr> to only work with the given multiaddr to enable testing as you've done you could also end up in any number of ways in which this could backfire due to misunderstanding the abstractions involved.

Some examples:

  • If Alice connects to Bob via a limited circuit-relay address that connection will need to be upgraded to a direct one before a protocol like Bitswap will run over it which means that even though Alice used TCP to connect to the relay Alice might end up with a QUIC, WebSockets, etc. direct connection to Bob
  • If Alice connects to Bob via TCP, Bob might connect to Alice via QUIC. Once libp2p has two connections open to a given peer it's up to the implementation which connection they want to open new streams on. If libp2p ends up resolving Stream Migration Protocol libp2p/specs#328 then even the existing TCP streams could move over to the QUIC connection
  • If Alice runs ipfs swarm connect <tcp-multiaddr> to connect to Bob, but already has a QUIC connection to Bob, what's supposed to happen? Does this kill the QUIC connection, what if it's in use? It could add the connection to the QUIC one but then the libp2p implementation gets to choose which connection to send streams on, so we're back to the above problem.
    • Assuming this isn't true by having users do ipfs swarm disconnect followed by connect exposes a race condition.

Historical notes

@aschmahmann aschmahmann changed the title kubo forces quic connection? swarm connect will use any available address not just the given multiaddress Jul 7, 2023
@aschmahmann aschmahmann added need/author-input Needs input from the original author status/blocked Unable to be worked further until needs are met and removed need/triage Needs initial labeling and prioritization labels Jul 7, 2023
@lidel lidel removed the need/author-input Needs input from the original author label Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) status/blocked Unable to be worked further until needs are met
Projects
None yet
Development

No branches or pull requests

3 participants