Discord embed bot invalidates one-time pastes #8

Open
Rycoh99 opened this issue Apr 2, 2024 · 10 comments

Comments


Rycoh99 commented Apr 2, 2024

Hi! Whenever you send a one-time link for a paste in a Discord chat, Discord's "embed bot" sends a GET request to the paste URL, which invalidates the paste if it is a one-time view paste.

A manual workaround is to enclose the URL in "<>", which tells Discord not to embed it, but this is inconvenient most of the time.

I assume this could be fixed by adding a blacklist for the Discord embed bot's user-agent, so that the bot cannot fetch the contents of the paste.
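A minimal sketch of the blacklist idea, written in Python purely for illustration (the project's real code and language may differ). It assumes the embed bot's User-Agent contains the marker `Discordbot`; `PasteStore` and `handle_get` are hypothetical names, not part of the project.

```python
from dataclasses import dataclass, field

# Assumption: the Discord embed bot identifies itself with "Discordbot"
# somewhere in its User-Agent header.
BLOCKED_AGENT_MARKERS = ("discordbot",)


def is_blocked_agent(user_agent: str) -> bool:
    """Return True if the User-Agent contains a blacklisted marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BLOCKED_AGENT_MARKERS)


@dataclass
class PasteStore:
    """Hypothetical in-memory store of one-time pastes."""
    pastes: dict = field(default_factory=dict)

    def handle_get(self, paste_id: str, user_agent: str) -> tuple[int, str]:
        """Return (status, body) for a GET on a one-time paste."""
        if paste_id not in self.pastes:
            return 404, "not found"
        if is_blocked_agent(user_agent):
            # Refuse the bot without consuming the paste.
            return 403, "embedding is not allowed for one-time pastes"
        # A real viewer: return the content and delete it (one-time view).
        return 200, self.pastes.pop(paste_id)
```

The important detail is the order of the checks: the user-agent test runs before the paste is removed, so a blocked request leaves the one-time link intact for the human recipient.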

Also thanks for this amazing project, I've been using it for years and it's the easiest one to use.

Thanks!

Author

Rycoh99 commented Apr 2, 2024

An example of what I am talking about:
[image: mBSG2uhFLe]

Owner

starius commented Apr 2, 2024

Hi! Thank you for the feedback!

Many messengers do this. As a generic workaround, I send the Pasta ID instead of the link. Clicking the ID copies it to the clipboard; that feature exists for exactly this scenario.

Blacklisting user agents of messengers is a good idea! I'll explore it.

A related issue is file download. The DuckDuckGo browser on Android sends the GET request twice: once initially and again when the user confirms saving the file. The second request fails because the record is removed on the first request. I don't know how to fix this. Google Chrome on mobile does not have this problem.

Author

Rycoh99 commented Apr 3, 2024

Hi, thanks for your reply and attention! Props for looking into the messenger/embed bot crawler issue!

For the DuckDuckGo browser problem: I was wondering whether the first request was sent with a different body from the second one, so I sniffed the browser app's traffic. I found that the second request, which actually downloads the file, has a shorter request body than the first. So my first thought was denying the response from the first request but allowing it for the second request which has the different request body. For that, you'd have to implement a server-side check of whether the request was sent from a DuckDuckGo user-agent (provided below in the pastes) and whether the paste is a file (and if so, deny the first request but allow the second one).

In my tests, this two-request issue only affects file pastes, so that's good at least.
Sorry if my explanation was confusing. I'm happy to try to explain again, or maybe we can chat somewhere.

The network requests I intercepted with Burp proxy are here:
First request: https://pastacity.nl/sight-grass
And the second request which DuckDuckGo actually sends to download the file: https://pastacity.nl/check-price

Thank you again!

Owner

starius commented Apr 3, 2024

Hi! Thank you very much for looking into the DuckDuckGo issue!

> so my first thought was denying the response from the first request but allowing it for the second request which has the different request body

If the server fails the first request, then the browser won't understand that it is a file download, and the whole thing won't work, I think.

I think the server can instead show a download page when it detects a file download from the DuckDuckGo browser. The download page would tell the user why they are seeing it (the issue with the DuckDuckGo browser), provide a link to the download URL (the same URL plus a GET parameter so the server can tell the requests apart), and give instructions to long-press the link and select the "Download file" option. Hopefully DuckDuckGo will then download it in a single request.
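A rough sketch of that decision logic, in Python for illustration only. Everything named here is an assumption: the `download` parameter name, the `DuckDuckGo` marker in the User-Agent, and the functions `plan_response` and `download_link` are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical name for the GET parameter that marks the confirmed download.
DOWNLOAD_PARAM = "download"


def plan_response(url: str, user_agent: str, is_file: bool) -> str:
    """Decide how to answer a GET for a one-time paste.

    Returns "download_page" (the record is kept) or "serve" (the record
    is served and then removed).
    """
    query = parse_qs(urlsplit(url).query)
    confirmed = DOWNLOAD_PARAM in query
    # Assumption: the DuckDuckGo browser's User-Agent contains "DuckDuckGo".
    from_ddg = "duckduckgo" in user_agent.lower()
    if is_file and from_ddg and not confirmed:
        return "download_page"
    return "serve"


def download_link(url: str) -> str:
    """Build the link shown on the download page: same URL plus the parameter."""
    separator = "&" if urlsplit(url).query else "?"
    return f"{url}{separator}{DOWNLOAD_PARAM}=1"
```

Other browsers and non-file pastes fall through to the normal one-time behavior, so the extra page only appears in the affected case.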

Author

Rycoh99 commented Apr 3, 2024

Hi! That sounds like a good idea! Reading this gave me another, perhaps convenient, idea: what if the first request returns a dummy file (maybe with the same number of bytes as the actual file, if DuckDuckGo cares about that) and the second request returns the actual file, so the one-time URL rule is preserved? This might work unless DDG has some integrity checks in place. I am not sure it would really work, so I might host a local server and experiment with this tomorrow.

Owner

starius commented Apr 3, 2024

That is an interesting idea, but I am worried about breaking the correct scenario (in case DDG fixes itself) in favor of the incorrect one: we could end up serving the dummy file. Also, I think we should push DDG to fix it on their end rather than cementing their bugs.

Author

Rycoh99 commented Apr 3, 2024

Definitely the cleanest approach! Worth a shot.

Owner

starius commented Apr 4, 2024

I added crawler detection, and requests from crawlers to one-time links are now blocked. It covers all crawlers from https://github.com/monperrus/crawler-user-agents, not only Discord.
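For readers curious how detection against such a list can work, here is a small Python sketch. It assumes the list is used as entries that each carry a regular expression under a `pattern` key; the two sample entries are written for this example, and `compile_patterns` and `is_crawler` are hypothetical names. How the project actually integrates the list is not shown in this thread.

```python
import re

# Illustrative entries only; each one holds a regular expression under
# "pattern". They are written for this example, not copied from the list.
SAMPLE_ENTRIES = [
    {"pattern": r"Discordbot"},
    {"pattern": r"Googlebot/"},
]


def compile_patterns(entries):
    """Compile every entry's pattern once, up front."""
    return [re.compile(entry["pattern"]) for entry in entries]


def is_crawler(user_agent, patterns):
    """Return True if any compiled pattern matches the User-Agent."""
    return any(p.search(user_agent) for p in patterns)
```

Matching against a maintained list rather than a single hard-coded name means newly listed crawlers are covered by updating the data, without changing the check itself.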

I'll keep the issue open until the DDG problem is resolved.

Author

Rycoh99 commented Apr 4, 2024

Hi again! I tested it again, and at least the Discord embed bot no longer seems to cause any issues. Thank you for implementing this! I will also keep an eye on whether the DDG issue ever gets fixed on their end.

Thank you for your attention!

Owner

starius commented Apr 4, 2024

You are welcome!
