Option to unwrap tracking URLs #5591

Closed
3 tasks done
chazmcgarvey opened this issue Dec 10, 2021 · 7 comments


@chazmcgarvey

Checklist

  • I have used the search function to see if someone else has already submitted the same feature request.
  • I will describe the problem with as much detail as possible.
  • This request contains only one single feature, not a list of multiple (related) features.

App version

2.4.0

Where did you get the app from

F-Droid

Problem you may be having, or feature you want

The EasyPrivacy list matches some common trackers used by podcasts. If a system or network gateway is subscribed to this list to block traffic, downloading podcasts in AntennaPod fails with a generic "IO Error".

Suggested solution

AntennaPod could have an optional feature to unwrap tracking URLs. Not only would this allow downloading podcasts on privacy-concerned networks to work at all, but also any user who is interested in not being tracked could turn it on and get some extra privacy.

Personally I'd like to see this as a default-on feature, but opt-in would be better than nothing.

Screenshots / Drawings / Technical details

This is how the URL unwrapping could work. Note: sometimes the tracking URLs are nested, so the URL unwrapper should be recursive. Example:

https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/chtbl.com/track/XXX/traffic.megaphone.fm/CRMD83478344359?updated=1639156001

After unwrapping podtrac.com the URL becomes:

https://pdst.fm/e/chtbl.com/track/XXX/traffic.megaphone.fm/CRMD83478344359?updated=1639156001

After unwrapping pdst.fm the URL becomes (note: XXX is a placeholder for a tracking identifier):

https://chtbl.com/track/XXX/traffic.megaphone.fm/CRMD83478344359?updated=1639156001

Finally, after unwrapping chtbl.com the URL becomes:

https://traffic.megaphone.fm/CRMD83478344359?updated=1639156001

URLs could be nested in any order, so the unwrapper could just keep unwrapping until the URL fails to match something that the unwrapper knows how to unwrap.
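A minimal sketch of that "keep unwrapping until nothing matches" idea, in Python purely for illustration (AntennaPod itself would need an equivalent in its own codebase). The function name `unwrap_tracking_url` and the three prefix patterns are assumptions derived from the example URLs above, not an existing API. The sketch also assumes the inner URL can be fetched with the same scheme as the outer one.

```python
import re

# Hypothetical prefix patterns, one per tracker named in this issue.
# Each one matches a tracker's redirect prefix at the start of "host/path".
TRACKER_PREFIXES = [
    re.compile(r"^(?:www\.)?podtrac\.com/pts/redirect\.mp3/", re.IGNORECASE),
    re.compile(r"^pdst\.fm/e/", re.IGNORECASE),
    re.compile(r"^chtbl\.com/track/[^/]+/", re.IGNORECASE),
]


def unwrap_tracking_url(url: str) -> str:
    """Strip known tracker prefixes, in any order, until none matches."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        return url  # not an absolute URL, leave it untouched
    changed = True
    while changed:
        changed = False
        for pattern in TRACKER_PREFIXES:
            stripped = pattern.sub("", rest, count=1)
            if stripped != rest:
                rest = stripped
                changed = True
    # Assumption: the wrapped URL is reachable with the outer scheme.
    return f"{scheme}://{rest}"


wrapped = (
    "https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/chtbl.com/track/XXX/"
    "traffic.megaphone.fm/CRMD83478344359?updated=1639156001"
)
print(unwrap_tracking_url(wrapped))
```

Because the outer loop repeats until a full pass changes nothing, the nesting order of the trackers does not matter, and a URL with no known tracker prefix is returned unchanged.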

As of today, only pdst.fm and chtbl.com are in the EasyPrivacy list, but www.podtrac.com could be added in the future. It would be a good idea to add support for unwrapping all three of these trackers (plus any others that anyone else is aware of).

@ByteHamster
Member

I think attempting something like this will end up being a cat-and-mouse game. They will change the format, AntennaPod will break, we push an update, they change the format again. I would rather not try to start modifying the download URLs.

By the way, AntennaPod automatically clears the cookies every time you launch the app, so there is not too much information that could be collected.

@chazmcgarvey
Author

Yes, ensuring your own and your users' privacy is always a cat-and-mouse game. Trackers are relentless. But it might be worth it. Fortunately, if a tracker does change its URL pattern (which doesn't generally happen in practice), AntennaPod wouldn't really "break"; the anti-tracking feature would simply let that tracker through. There should be no stated guarantee that the feature blocks all trackers anyway, since that would be unreasonable.

The feature seems pretty simple and could be built flexibly to make it easy to add new trackers. That said, I find it reasonable to consider something like this out of scope for the project. Feel free to close as wontfix.

@krushia

krushia commented Dec 13, 2021

Allowing the user to specify custom cookies could help as well. For example, to opt out of Nielsen tracking per https://www.megaphone.fm/adchoices

@keunes
Member

keunes commented Dec 14, 2021

> Allowing the user to specify custom cookies could help as well

That sounds like a lot of (implementation & maintenance) work for little gain (given that cookies get wiped on every app launch). That is time taken away from adding/improving core features, so I wouldn't be in favour of implementing that :)

@Mrnofish

Mrnofish commented Mar 17, 2022

> AntennaPod will break, we push an update, they change the format again.

It doesn't need to go that way:

  1. When the option is turned on, AntennaPod can download a filter file from either this or another repo, a file that the devs need not update themselves
  2. If the replacements cannot be applied successfully, AntennaPod can still fail gracefully
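A rough sketch of what those two points could look like, in Python for illustration only. It assumes the filter file has already been downloaded to a local path and is a JSON list of `{"match": ..., "repl": ...}` objects; that file format and the names `load_rules` and `apply_rules` are hypothetical, nothing like them exists in AntennaPod today.

```python
import json
import re


def load_rules(path):
    """Read rules from a JSON file shaped like [{"match": "...", "repl": "..."}]."""
    try:
        with open(path, encoding="utf-8") as fh:
            raw = json.load(fh)
        return [(re.compile(r["match"]), str(r["repl"])) for r in raw]
    except (OSError, ValueError, KeyError, TypeError, re.error):
        # Missing or malformed file: behave as if the option were turned off.
        return []


def apply_rules(url, rules):
    """Return the rewritten URL, or the original one if anything looks wrong."""
    try:
        candidate = url
        for pattern, repl in rules:
            candidate = pattern.sub(repl, candidate, count=1)
    except (re.error, TypeError, IndexError):
        return url  # a broken rule must never break the download
    # Sanity check: the result must still look like an absolute http(s) URL.
    if not re.match(r"^https?://[^/\s]+/", candidate):
        return url
    return candidate
```

With this shape, a bad or outdated filter file means the tracker is not removed, but the download is attempted with the original URL exactly as it is today.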

There is a third pretty basic option that could be considered either in conjunction with the above, or in isolation, and that would be to let the user edit the podcast URL from inside AntennaPod -- this way the tracking can be removed manually.

BTW:

  1. Tracking does not need cookies, so clearing cookies does not stop trackers from collecting identifiable information that can be used to build a profile, such as IP and other data AntennaPod is providing
  2. AntennaPod is ALREADY broken whenever such filters are in place

PS:

a very rough draft of how the filtering could work:

podcast_url = 'https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/chtbl.com/track/524GE/traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013'

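repl_table = [
  { match: '(www[.])?podtrac[.]com/pts/redirect[.]mp3/', repl: '' },
  { match: 'pdst[.]fm/e/', repl: '' },
  { match: 'chtbl[.]com/track/[^/]+/', repl: '' }
]

Of course the values for populating repl_table would be loaded from a file, as mentioned above. Each match value is a regular expression: the optional www. prefix is part of the podtrac pattern so that no stray "www." is left in front of the next host, and the chtbl.com pattern uses [^/]+ so that it stops at the end of the tracking identifier instead of greedily swallowing the next host as well.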

You'd have a for loop iterating through repl_table, building each regexp and applying it against podcast_url, checking the return values.

iteration 0: 'https://pdst.fm/e/chtbl.com/track/524GE/traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013'
iteration 1: 'https://chtbl.com/track/524GE/traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013'
iteration 2: 'https://traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013'

Probably not the most elegant approach but it could work and retain a modicum of flexibility.
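For what it's worth, the loop described above can be sketched in a few lines of Python (illustration only; the helper name `apply_table` is made up). The patterns here include the optional `www.` and use `[^/]+` for the chtbl.com identifier. That is a deliberate deviation from a plain `.+`, which is greedy and would also remove `traffic.megaphone.fm/`.

```python
import re

podcast_url = (
    "https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/chtbl.com/track/524GE/"
    "traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013"
)

repl_table = [
    {"match": r"(www\.)?podtrac\.com/pts/redirect\.mp3/", "repl": ""},
    {"match": r"pdst\.fm/e/", "repl": ""},
    {"match": r"chtbl\.com/track/[^/]+/", "repl": ""},
]


def apply_table(url, table):
    """Apply each rule once, in table order, recording every intermediate URL."""
    steps = []
    for rule in table:
        # subn returns the new string plus the number of replacements made,
        # which is the "return value" to check for each rule.
        url, hits = re.subn(rule["match"], rule["repl"], url, count=1)
        steps.append((url, hits))
    return steps


for i, (url, hits) in enumerate(apply_table(podcast_url, repl_table)):
    print(f"iteration {i}: {url!r} (replacements: {hits})")
```

Note that a single pass in table order only handles trackers nested in that same order. Repeating the pass until no rule matches would cover other orderings.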

@ByteHamster
Member

> Tracking does not need cookies, so clearing cookies does not stop trackers from collecting identifiable information that can be used to build a profile, such as IP and other data AntennaPod is providing

The IP changes regularly and AntennaPod does not provide other data. The user agent string is simply "Antennapod/2.5.0" - not as verbose as the ones on desktop browsers.

@ThreeDeeJay

@ByteHamster How about just some simple regex matching for whitelist sanitization instead of a blacklist?
That way, the burden would be on the user to find/update a matching pattern.
The NPR RSS feed started stacking trackers, but finding a regex that extracts the direct file URL was rather easy, though it could be improved a lot further to cover more cases 👀👌
RegEx could also be used to sanitize any external URLs, like images, too.
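To make the whitelist idea concrete, here is a small Python sketch (illustration only). It assumes the user supplies one regex per feed whose first capture group is the direct `host/path` of the file, and that the direct host is reachable over https. The function name `extract_direct_url` and the example pattern are hypothetical, and the pattern is not the one from the screenshot.

```python
import re


def extract_direct_url(url, user_pattern):
    """Return https://<capture group 1> if the user's pattern matches,
    otherwise return the URL unchanged."""
    try:
        m = re.search(user_pattern, url)
    except re.error:
        return url  # invalid user regex: fall back to the original URL
    if m is None or not m.groups() or m.group(1) is None:
        return url
    # Assumption: the direct host is served over https.
    return "https://" + m.group(1)


# Hypothetical pattern a user might enter for a feed hosted on megaphone.fm.
pattern = r"(traffic\.megaphone\.fm/.+)$"
wrapped = (
    "https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/chtbl.com/track/524GE/"
    "traffic.megaphone.fm/VMP7301319640.mp3?updated=1647452013"
)
print(extract_direct_url(wrapped, pattern))
```

Unlike a blacklist of tracker prefixes, this keeps working when a new tracker is stacked in front, as long as the final host stays the same. The trade-off is that each feed needs its own pattern.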


6 participants