Bug? device discovery not working without manual pairing #69

Closed
luckman212 opened this issue Mar 8, 2023 · 19 comments · Fixed by #110
Labels
help wanted Extra attention is needed

Comments

@luckman212
Contributor

N.B. In the details below, mydomain.com is a placeholder for the actual domain I'm using, which I prefer to keep private.

Sorry to raise this as a bug, since it's more likely my ignorance of some part of the setup. I'm trying to set up a self-hosted PairDrop instance. I have read host-your-own.md and combed through the Issues, but I can't figure this out.

  • I'm using the LSCR docker image (version at the time of this writing is: v1.4.4-ls13)
  • The docker host is running Ubuntu 22.04.1 LTS
  • I'm using a Cloudflare (Argo) tunnel to handle the HTTPS proxying (routing to localhost:8081)
  • I'm running an instance of coturn for my own STUN server
  • The host is behind a NAT but has the STUN TCP/UDP ports 3478,5349,49152-49200 forwarded to the internal IP. I have confirmed that's all working via Trickle ICE as well as from the CLI: stunclient stun.mydomain.com 3478

All of that seems to work fine!

I then prepared the PairDrop docker image with the following:

docker create \
--name=pairdrop \
-e PUID=1000 \
-e PGID=1000 \
-e TZ=America/New_York \
-e RATE_LIMIT=true \
-e RTC_CONFIG="rtc_config.json" \
-e WS_FALLBACK=true \
-p 127.0.0.1:8081:3000 \
-v pairdrop:/app \
--restart unless-stopped \
lscr.io/linuxserver/pairdrop:latest

I created my custom rtc_config.json via:

cd /var/snap/docker/common/var-lib-docker/volumes/pairdrop/_data/pairdrop
cat <<EOF >rtc_config.json
{
  "sdpSemantics": "unified-plan",
  "iceServers": [
    {
      "urls": "stun:stun.mydomain.com:3478"
    }
  ]
}
EOF

And finally, start the PairDrop container and tail the logs:

docker start pairdrop && docker logs -f pairdrop

It all opens and runs fine (with the exception of a node error about not being able to write to /root/.npm/_logs, which I believe is totally unrelated, but the output is below anyway).

BUT, upon visiting the page (drop.mydomain.com), no devices are seeing each other. It doesn't matter whether I'm accessing from a desktop browser (Chrome, Safari) or a mobile device, over a 4G LTE network or WiFi/Ethernet.

I've also tried it with WS_FALLBACK=false, as well as with and without the custom RTC_CONFIG. No differences noted.

I opened the Chrome devtools console and don't see any errors there.

If I manually pair the devices with a pairing code, everything works fine. It's just the autodiscovery that's broken for me. Any help would be wonderfully appreciated 🙏


Output of docker logs -f pairdrop below:

pairdrop
[migrations] started
[migrations] no migrations found
───────────────────────────────────────

      ██╗     ███████╗██╗ ██████╗
      ██║     ██╔════╝██║██╔═══██╗
      ██║     ███████╗██║██║   ██║
      ██║     ╚════██║██║██║   ██║
      ███████╗███████║██║╚██████╔╝
      ╚══════╝╚══════╝╚═╝ ╚═════╝

   Brought to you by linuxserver.io
───────────────────────────────────────

To support the app dev(s) visit:
PairDrop: https://www.buymeacoffee.com/pairdrop

To support LSIO projects visit:
https://www.linuxserver.io/donate/

───────────────────────────────────────
GID/UID
───────────────────────────────────────

User UID:    1000
User GID:    1000
───────────────────────────────────────

[custom-init] No custom files found, skipping...
npm WARN logfile Error: EACCES: permission denied, scandir '/root/.npm/_logs'
npm WARN logfile  error cleaning log files [Error: EACCES: permission denied, scandir '/root/.npm/_logs'] {
npm WARN logfile   errno: -13,
npm WARN logfile   code: 'EACCES',
npm WARN logfile   syscall: 'scandir',
npm WARN logfile   path: '/root/.npm/_logs'
npm WARN logfile }

> pairdrop@1.4.4 start
> node index.js

PairDrop is running on port 3000
[ls.io-init] done.
@schlagmichdoch schlagmichdoch added the help wanted Extra attention is needed label Mar 8, 2023
@luckman212
Contributor Author

Thank you kindly for the reply!

I definitely do have IPv6 on both the docker host and the connecting devices, so maybe that's it! I'm at work now but will give it a shot later and see what I can find out.

Nothing special on the CF tunnel side; I don't see anywhere that I could enable or disable those forwarded-for headers, so that's a bit of a black box. If that turns out to be the problem, I'll have to switch to nginx + Let's Encrypt, etc.

@luckman212
Contributor Author

I did some testing. The docker container doesn't have ss or tcpdump installed, so I had to use netstat in a tight loop (watch -n0.5 netstat -an), but I was able to detect that, as you suspected, the proxied connections coming in via the Cloudflare tunnel all appeared to share the same RFC1918 IP, not their true source IP. Example:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 :::3000                 :::*                    LISTEN
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47884 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:52150 ESTABLISHED
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47906 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47990 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47900 TIME_WAIT
[...]
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47952 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47868 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47918 TIME_WAIT
tcp        0      0 ::ffff:192.19.16.3:3000 ::ffff:192.19.16.1:47944 TIME_WAIT

Not sure there's any way I can "fix" this from the CF side; I poked around in their dashboard for a while but didn't find much (except for this "No Happy Eyeballs" switch, which I tried toggling; it made no difference).

[screenshot: the "No Happy Eyeballs" toggle in the Cloudflare dashboard]

I found this hacky method for disabling IPv6 on a per-container basis (adding --sysctl net.ipv6.conf.all.disable_ipv6=1 to the docker run command, which does work, by the way), but that didn't solve this either.

Here's the output of ip addr from inside the container (no more IPv6):

root@6f94135e58b5:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
176: eth0@if177: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:c0:13:10:03 brd ff:ff:ff:ff:ff:ff
    inet 192.19.16.3/24 brd 192.19.16.255 scope global eth0
       valid_lft forever preferred_lft forever

What DID work was removing the localhost-only bind (changing -p 127.0.0.1:8081:3000 to just -p 8081:3000) and then connecting over HTTP directly to the private IP of my docker host: http://192.168.138.48:8081

So it seems the end result is that, for now, this isn't compatible with Cloudflare tunnels, and I will need to switch to a slightly more complicated setup, e.g. Nginx + Let's Encrypt.

@schlagmichdoch
Owner

Thanks for testing so thoroughly!

the proxied connections coming in via the Cloudflare tunnel all appeared to share the same RFC1918 IP

It is expected that the connecting IP is not the real IP of the client but that of the proxy. The same happens when you use Traefik or another reverse proxy in between that relays the requests.

Not sure that there's any way I can "fix" this from the CF side

To get around this, there is an x-forwarded-for header that contains all IP addresses of the request, including the real IP of the client. Additionally, there is the cf-connecting-ip header that is sometimes set by Cloudflare.
Currently, this is the implementation; the IP addresses in the headers are used whenever they are present:

        if (request.headers['cf-connecting-ip']) {
            this.ip = request.headers['cf-connecting-ip'].split(/\s*,\s*/)[0];
        } else if (request.headers['x-forwarded-for']) {
            this.ip = request.headers['x-forwarded-for'].split(/\s*,\s*/)[0];
        } else {
            this.ip = request.connection.remoteAddress;
        }
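
To illustrate the header logic with a made-up value (a hypothetical example, not output from this setup): behind multiple proxies, x-forwarded-for accumulates one address per hop with the original client first, so the split above resolves to the client:

// Hypothetical header value: client address first, then one entry per proxy hop.
const headerValue = '2001:db8::17, 172.70.0.1, 192.19.16.1';

// Same logic as the snippet above: take the leftmost entry.
const clientIp = headerValue.split(/\s*,\s*/)[0];
console.log(clientIp); // '2001:db8::17'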

I will create a debugging flag to log the headers and the remoteAddress of the connecting peers to make it easier for you to debug. I would put it on a separate branch first. Could you then clone the repo, check out the branch, and build the docker image yourself for testing purposes?

What DID work was removing the localhost-only bind (changing -p 127.0.0.1:8081:3000 to just -p 8081:3000) and then connecting over HTTP directly to the private IP of my docker host: http://192.168.138.48:8081

Does this mean that the problem indeed lies with cloudflare-one?

@luckman212
Contributor Author

Sure, I am happy to test a separate branch. I will do a bit more testing today; I am trying to set up a new docker host and a fresh nginx setup.

I should have included this detail: when I got it to work at the end of the above post, I was connecting directly to the docker host's IP over a site-to-site VPN, so there was no HTTP proxy involved, and PairDrop would have seen the actual real IPs of the clients.

@luckman212
Contributor Author

🚀 Nice! Yes, sure, I will test now. The new build, btw, is a separate VM entirely; I won't touch the original setup, so we can definitely do any testing you want on there.

@schlagmichdoch
Owner

Any news? :)

@luckman212
Copy link
Contributor Author

Sorry, I got totally derailed by something else and had to rebuild my VMware infra this past week: set up new Docker hosts, the ROUTE-48 shutdown caused some issues, I had to move some things to HE.net, etc... 😵‍💫

I'm planning to revisit this weekend, will post soon!

@schlagmichdoch
Owner

No worries! I'm just trying to keep the issue board somewhat tidy. Post whenever you're ready! Thanks for the quick reply and good luck with your rebuild! :)

@schlagmichdoch
Owner

@luckman212 Have you tried again?

The debugging flag is now also available on the master branch:

docker run --rm --name=pairdrop -e DEBUG_MODE=true -p 127.0.0.1:8081:3000 -it pairdrop npm run start

See https://github.com/schlagmichdoch/PairDrop/blob/master/docs/host-your-own.md#debug-mode for more info

@luckman212
Contributor Author

oops - posted this earlier on #71 by mistake!

Hi @schlagmichdoch

Thanks for your patience. It's been a really busy time and I didn't have time to look at this until now. Spent the last 2 days working on it, debugging, etc., and am here to report my findings, as well as a pull request.

Findings

  1. The original problem appears to be caused mostly by the Cloudflare Argo tunnel/proxy combined with IPv6.
  2. If the Docker host has IPv6 enabled, it's fairly difficult to disable IPv6 for a single container. There are ways, such as passing --sysctl net.ipv6.conf.all.disable_ipv6=1 or creating a separate Docker network that has IPv6 disabled (docker network create --ipv6=false ...). I opted for the latter, as it seemed like the better choice.
  3. Cloudflare absolutely does not allow disabling IPv6 resolution (AAAA records) on Argo tunnels (anymore) unless you're on an Enterprise plan. They are trying to push IPv6 forward (and I do appreciate that), but in this case it would have been great if I could have selectively disabled the AAAA record from being returned for my PairDrop instance.

All that added up to me not being able to "see" local devices even when they were on the same LAN, due to fully working IPv6 across the board: client <-> Cloudflare <-> tunnel <-> Docker host <-> PairDrop container. I did come up with a working solution, though.

PR

I made a patch that adds an IPV6_LOCALIZE flag. This flag accepts a parameter between 1 and 8 to truncate the client's IPv6 address (proxied or otherwise) to a specific number of segments; a simplified sketch follows below. In most cases, for a standard /64 subnet, the correct parameter would be IPV6_LOCALIZE=4, but it will accept other sizes to be more flexible. This makes it possible for devices on the same /64 to automatically communicate (tested with a sample size of n=1).
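
As a simplified sketch of the truncation idea (illustrative only, not the exact code in the PR; it assumes a fully expanded IPv6 address with no "::" compression):

// Group peers by the first N hextets of their IPv6 address.
// N=4 corresponds to the 64-bit prefix of a standard /64 subnet.
function localizeIpv6(ip, segments) {
  return ip.split(':').slice(0, segments).join(':');
}

// With IPV6_LOCALIZE=4, two devices on the same /64 collapse to one key:
console.log(localizeIpv6('2001:db8:abcd:12:aaaa:bbbb:cccc:1', 4)); // 2001:db8:abcd:12
console.log(localizeIpv6('2001:db8:abcd:12:dddd:eeee:ffff:2', 4)); // 2001:db8:abcd:12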

I also encountered another bug/problem while testing which doesn't seem related to any of this: sending a TEXT message to another peer using the Submit button seems broken in recent builds (tested with master branch as well as the live PairDrop.net instance). The message is never sent; instead, a JS error is logged to the console:

[screenshot: JS error in the Chrome devtools console]

If you instead send using CTRL+ENTER, it works. I made a couple of small edits to the scripts and HTML which fix that as well, plus a bug about disabling/enabling the submit button due to innerText not always equalling \n (a rough illustration follows below).
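
For context, a rough illustration of the innerText quirk (a hypothetical check, not the exact fix that was committed): an apparently empty contenteditable element can report innerText of '\n' rather than '', so comparing against the empty string alone misjudges it:

// An "empty" contenteditable area may report innerText === '\n' rather than ''.
// Stripping newlines before testing avoids a false "non-empty" state.
function isEffectivelyEmpty(innerText) {
  return innerText.replace(/\n/g, '').length === 0;
}

console.log(isEffectivelyEmpty(''));        // true
console.log(isEffectivelyEmpty('\n'));      // true (the tricky case)
console.log(isEffectivelyEmpty('hello\n')); // false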

After building a new image from that branch and running it with, e.g.:

docker run --rm \
--name=pairdrop-test \
--hostname=pairdrop \
--network=no_v6 \
-e TZ=America/New_York \
-e RATE_LIMIT=true \
-e WS_FALLBACK=true \
-e IPV6_LOCALIZE=4 \
-e DEBUG_MODE=true \
-v pairdrop-test:/home/node/app \
-it pairdrop-test npm run start

I now have a fully working instance of PairDrop, with working HTTPS behind the Cloudflare proxy, and device discovery working! 🚀

Hope these are OK; awaiting your feedback.

@schlagmichdoch
Owner

I now have a fully working instance of PairDrop, with working HTTPS behind the Cloudflare proxy, and device discovery working! 🚀

Awesome, good work!

PR looks good so far, but I need to do some testing myself and look into localizing IPv6 addresses. Before we merge this, we must add this env var to the docs as well: here and here

2. If the Docker host has IPv6 enabled, it's fairly difficult to disable IPv6 for a single container. There are ways, such as passing --sysctl net.ipv6.conf.all.disable_ipv6=1 or creating a separate Docker network that has IPv6 disabled (docker network create --ipv6=false ...). I opted for the latter, as it seemed like the better choice.

Not sure if I understand that correctly: Is disabling IPv6 on the docker network level sufficient to solve this issue or not?

I also encountered another bug/problem while testing which doesn't seem related to any of this: sending a TEXT message to another peer using the Submit button seems broken in recent builds (tested with master branch as well as the live PairDrop.net instance).

bug about disabling/enabling the submit button due to innerText not always equalling \n.

Thanks for reporting these issues! As they are independent of the rest and urgent, I committed fixes for them, adding you as co-author: 6e4bda0 & 8a17b82

Would be great if you could fix up your PR, keeping only your changes to index.js.

@luckman212
Contributor Author

Updated PR is coming shortly.

To answer the other question, "is disabling IPv6 on the docker network level sufficient to solve this issue or not?": the answer is NO. In my case, even when IPv6 was fully disabled at the network level (Docker host and/or container), the proxy headers x-forwarded-for and cf-connecting-ip still contain full IPv6 addresses of the client and thus trigger the discovery issue anyway.

@luckman212
Contributor Author

luckman212 commented May 16, 2023

@schlagmichdoch I am new to the process of reverting individual files from a PR, so I hope I did that correctly! Hope the index.js is OK now. I also updated the docs.

Made a bit of a mess by accidentally hitting the Sync Fork button in GitHub 😡 but I believe I was able to revert that cleanly and squash everything down to a single commit.

LMK.

luckman212 added a commit to luckman212/PairDrop that referenced this issue May 16, 2023
luckman212 added a commit to luckman212/PairDrop that referenced this issue May 16, 2023
(squashed, docs updated)
@schlagmichdoch
Owner

the answer is NO. In my case even when IPv6 was fully disabled at the network level (Docker host and/or container) the proxy headers x-forwarded-for and cf-connecting-ip still contain full IPv6 addresses of the client and thus trigger the discovery issue anyway.

Thanks for clarifying!

I am new to the process of reverting individual files from a PR, so I hope I did that correctly! Hope the index.js is OK now.

Everything looks good! Thanks for rebasing and tidying up so quickly.

I did some digging, and if I understand it correctly, global IPv6 addresses are always composed of a 48-bit routing prefix plus a 16-bit subnet identifier:
http://www.tcpipguide.com/free/t_IPv6GlobalUnicastAddressFormat-2.htm
https://www.ibm.com/docs/en/ts3500-tape-library?topic=formats-subnet-masks-ipv4-prefixes-ipv6
All my devices on the same network (that are not behind some kind of VPN) confirm this behaviour.

So even if you define your local network as, e.g., a /96, you could still identify devices from this (sub)network by using the first 64 bits. In that regard, your approach of using the first 4 hextets of the IPv6 address would be analogous to using the complete IPv4 address regardless of the subnet mask in use (e.g. 255.255.0.0); a worked example follows below. With that in mind, I'm wondering whether it would maybe even make sense to use IPV6_LOCALIZE=4 as the default and document the possibility of disabling this behaviour with IPV6_LOCALIZE=false, for users who want to use PairDrop in the edge case of a complete intranet with subnets that do not use 64-bit prefixes.
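
To make that concrete (the addresses below use the 2001:db8::/32 documentation prefix and are purely illustrative): the first 4 hextets carry the 48-bit routing prefix plus the 16-bit subnet identifier, so two hosts on the same /64 share them even though their interface identifiers differ:

// 48-bit routing prefix (2001:0db8:abcd) + 16-bit subnet id (0012)
// => the 64-bit network prefix is the first 4 hextets: 2001:0db8:abcd:0012
const hostA = '2001:0db8:abcd:0012:1111:2222:3333:4444';
const hostB = '2001:0db8:abcd:0012:5555:6666:7777:8888';

const prefixOf = (ip) => ip.split(':').slice(0, 4).join(':');
console.log(prefixOf(hostA) === prefixOf(hostB)); // true: same /64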

What do you think?

Apart from that I think this is ready to be merged. Thanks a lot for contributing!

@luckman212
Contributor Author

luckman212 commented May 16, 2023

Thanks for looking it over. After 10 years or so of monkeying around with IPv6, about the only thing I can say with certainty is: I've seen some crazy stuff. Some ISPs do weird or broken things, so I wouldn't be surprised if there are edge cases we haven't thought of, where peers that aren't on the same logical network end up sharing part of a /64 somehow and unexpectedly become visible to each other.

So, I'm not sure about making IPV6_LOCALIZE=4 the default right out of the gate. It's working well for me so far, but maybe tag a release and allow a bit more testing in the wild first?

I pushed one more change to explicitly handle IPV6_LOCALIZE=false so that it's declarative.

luckman212 added a commit to luckman212/PairDrop that referenced this issue May 17, 2023
(squashed, docs updated, handle IPV6_LOCALIZE=false)
luckman212 added a commit to luckman212/PairDrop that referenced this issue May 19, 2023
(squashed, docs updated, IPV6_LOCALIZE input validation)
@luckman212
Contributor Author

Hey @schlagmichdoch 👋

Just wanted to drop in another data point: I've been using this patched version for a week or so now and it's working great for me among a handful of devices, both on- and off-net. Macs using Safari, iPhones, Windows laptops using Edge browser, IPv4 & v6. All working well (for me at least).

Let me know what you think of the PR in its current form.

@schlagmichdoch
Owner

Great! I've been busy the last few weeks, but I will merge this later today. Thanks again! :)

@luckman212
Contributor Author

Just a final comment to say I nuked my custom docker image and re-pulled this morning from 1.7.3.

All worked smoothly, zero problems.

Repository owner deleted a comment from schlagmichdoch Feb 17, 2024
Repository owner deleted a comment from schlagmichdoch Feb 17, 2024
Repository owner deleted a comment from schlagmichdoch Feb 17, 2024
@schlagmichdoch
Owner

The following comments were deleted by GitHub (via hubot) as part of mistakenly marking this account as spam on 17th February 2024. The correct thread order and the creation dates are unclear. I decided to manually restore them anyway in order to complete the information this issue holds, even though the restored information might be outdated:

Comment by @schlagmichdoch:

Glad to hear this is mostly working for you!

As everything else is working fine, this is probably where the issue lies. For auto discovery, PairDrop groups all devices with the same IP address together. Therefore, the header cf-connecting-ip or x-forwarded-for must be set properly along every step of the routing process so the client's real IP gets passed along. Then, PairDrop groups all clients together that are behind the same NAT:
https://github.com/schlagmichdoch/PairDrop/blob/master/docs/host-your-own.md#http-server

https://support.cloudflare.com/hc/en-us/articles/200170786-Restoring-original-visitor-IPs

Is there any additional HTTP server between Cloudflare and the Docker container? The docs have examples for apache2 and nginx. Not sure how to do it with cloudflare-one, but if you could share your current config, I could have a look and add a working version to the docs later.

What's weird, though, is that when there is a problem with the Cloudflare config, normally all devices are mutually visible no matter what their original IP is. In your case, every device seems to present a different IP.

It could also be that devices connect to your server with their IPv6 address, which is different for every device. You would then have to find a way to prevent users from connecting via IPv6 for auto discovery to work.

Comment by @schlagmichdoch:

Looking forward to your findings! If it does not work, I should probably add a flag to log the incoming IP addresses for debugging. That way you could easily find out whether PairDrop sees the correct IP address or whether one of the proxy IP addresses is used instead.

Comment by @schlagmichdoch:

Top! I just pushed the debugging bit to the add_ip_debugging_flag branch.
You need to:

  1. Check out the branch
  2. Build the image: docker build --pull . -f Dockerfile -t pairdrop
  3. Run it: docker run --rm --name=pairdrop -e DEBUG_MODE=true -p 127.0.0.1:8081:3000 -it pairdrop npm run start

Then it will log the following on every peer connect:

----DEBUGGING-PEER-IP-START----
remoteAddress: ::ffff:172.17.0.1
x-forwarded-for: 127.0.0.1
cf-connecting-ip: undefined
PairDrop uses: 127.0.0.1
IP is private: false
if IP is private, '127.0.0.1' is used instead
----DEBUGGING-PEER-IP-END----

Would be nice if you could test this before setting up your new nginx setup, to make it work with cloudflare-one :)

I was connecting directly to the docker host's IP over a site to site VPN, so there was no HTTP proxy involved

Thanks for the clarification!
