New module or option: Urls unshortener #160

Closed
Ilithy opened this issue Dec 21, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@Ilithy
Collaborator

Ilithy commented Dec 21, 2022

Please describe your feature request.
The request is to add to URLCheck the ability to unshorten URLs.

This would be valuable both for practicality (time saving, no advertising problems or excessive redirections)
and for the safety of the user (shortened URLs make it possible to hide a dangerous URL).

  • It would be great if this could be done locally (i.e. without using an online service to unshorten the links)

  • It would be useful to have an option to automate the unshortening of URLs

I don't know whether they are usable here, but these projects aim at the same objective (unshortening URL links):

  1. https://github.com/aliazhar-id/unshorter
  2. https://github.com/nodeca/url-unshort

Additional context

  • If it could be managed like redirects (via a rules editor), it would offer maximum customization and flexibility.

Thanks

@Ilithy Ilithy added the enhancement New feature or request label Dec 21, 2022
@TrianguloY
Owner

Thanks for creating a new issue. This is a question that I have already received by email and will probably receive again, so I wanted an issue that is easy to link to.


A URL shortener works by keeping a database of all the short URL --> long URL relations. The short URLs are just keys that you need to look up in that database to recover the original full URL, and the database belongs to the shortener service. I'm afraid there is no way to 'unshorten' a URL locally, without fetching it from the service, because the URL itself carries no information [1]. Both projects you linked make a request and return the redirect URI sent back by the server. This is what the status code module does (which, interestingly, was originally called the redirection module).
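To make the "request and read the redirect" step concrete, here is a minimal Python sketch. It is not the code of URLCheck or of either linked project; the function names are hypothetical, and it assumes the shortener answers a HEAD request with a 3xx status and a `Location` header.

```python
import http.client
from urllib.parse import urlsplit, urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}

def redirect_target(url, status, location):
    """Return the absolute redirect target, or None if this is not a redirect."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the requested URL.
        return urljoin(url, location)
    return None

def fetch_one_hop(url, timeout=10):
    """Send one HEAD request; http.client never follows redirects by itself."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def unshorten_once(url, fetch=fetch_one_hop):
    """Ask the service where the short URL points; `fetch` is injectable for testing."""
    status, location = fetch(url)
    return redirect_target(url, status, location)
```

The network call is unavoidable here: the short key only has meaning inside the service's own database.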

A way to configure the status module to perform an automatic fetch for specific services could be a valid new feature that I will consider (not for all URLs, since fetching any random URL can have bad consequences, but in any case it would be configurable).
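One possible shape for "automatic fetch only for specific services" is a user-editable host allowlist. The sketch below is only an illustration: the host list and the function name are assumptions, not an existing URLCheck setting.

```python
from urllib.parse import urlsplit

# Hypothetical, user-configurable list of shortener hosts.
AUTO_FETCH_HOSTS = {"bit.ly", "goo.gl", "t.co"}

def should_auto_fetch(url, hosts=AUTO_FETCH_HOSTS):
    """Return True only when the URL's host is on the configured allowlist."""
    host = urlsplit(url).hostname or ""  # hostname is already lowercased
    if host.startswith("www."):
        host = host[4:]
    return host in hosts
```

Every other URL is left untouched, so nothing is fetched unless the user opted in for that service.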

Footnotes

  1. Note: maybe there is a URL service that makes URLs shorter by compacting them or by using an algorithm to encode the string. I'm not aware of such a service, which would probably be called a 'URL encoder' or similar. In that case it would be possible to 'decode' the URL with a new module.
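For contrast with the database model, here is a sketch of what such a purely algorithmic encoder could look like. It is entirely hypothetical (no real service is implied) and uses zlib plus URL-safe base64 only as a stand-in for "some reversible encoding".

```python
import base64
import zlib

def encode_url(url):
    """Reversibly pack a URL into a URL-safe token (hypothetical scheme)."""
    raw = zlib.compress(url.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")

def decode_url(token):
    """Recover the original URL locally, with no network request."""
    padded = token + "=" * (-len(token) % 4)  # restore stripped base64 padding
    return zlib.decompress(base64.urlsafe_b64decode(padded)).decode("utf-8")
```

Because the token contains all the information, decoding works offline. For short inputs the token can end up longer than the original URL, which is one reason real shorteners use a database key instead.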

@PabloOQ
Collaborator

PabloOQ commented Dec 22, 2022

This module/feature would be great. I think there are 2 ways to approach this:

  • Privacy
  • Convenience

For convenience, it is as simple as finding a database of URL redirection patterns to match against and then fetching the status, as Triangulo says. These patterns are usually just domains, but I have seen URLs in emails where the final destination can't be extracted because it works on a per-user basis, e.g. site.com/redirect?dest_and_trackid=9D5...AF7 . Here two people will have different values of dest_and_trackid, but both will be redirected to the same page.
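A pattern check of this kind could look like the sketch below. The pattern list is a made-up example (the second entry mirrors the site.com case above), not a real database.

```python
import re

# Hypothetical pattern database: one regex per known redirector.
REDIRECT_PATTERNS = [
    re.compile(r"^https?://(www\.)?bit\.ly/[\w-]+$"),
    re.compile(r"^https?://site\.com/redirect\?dest_and_trackid=[0-9A-Fa-f]+$"),
]

def is_redirect_url(url, patterns=REDIRECT_PATTERNS):
    """Return True when the URL matches any known redirection pattern."""
    return any(p.match(url) for p in patterns)
```

Note that a match only tells us the URL is a redirector. For the per-user case the destination is still opaque, so the status fetch is needed to learn it.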

For privacy, there needs to be a database with the exact origin URL and the exact destination URL, or else a proxy which does the fetching for the user.

I believe, but I'm not sure, that this extension has a database with origin and destination, https://fastforward.team/, and here is the documentation on how to get this information, https://fastforwardteam.github.io/serverdocs/#crowd-query (I don't know if it works; I tried it myself some time ago to no avail). There is also a way to contribute a bypassed URL to the database. One of the projects mentioned by Ilithy (url-unshort) talks about https://archive.org/details/301works?tab=about which seems to be a database too.

For the proxy solution, here are some sites. I did not check their legitimacy. I was trying to find a service I used in the past, which worked and offered an easy way to request URL destinations through GET requests. Sadly I can't find it anymore, but while searching for it I found a lot of these services.
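The two privacy options can be combined: look in a local origin-to-destination table first and only fall back to a proxy on a miss. This is a rough sketch; the function name, the table, and the `proxy` callable are all hypothetical.

```python
def resolve_privately(url, database, proxy=None):
    """Resolve a short URL without the user contacting the shortener directly.

    `database` maps exact origin URLs to exact destination URLs.
    `proxy` is an optional callable that fetches the destination on the
    user's behalf and returns it, or None when it cannot resolve the URL.
    """
    key = url.strip()
    if key in database:
        return database[key], "database"
    if proxy is not None:
        destination = proxy(key)
        if destination:
            return destination, "proxy"
    return None, "unresolved"
```

The database path needs no network at all, while the proxy path hides the user from the shortener but reveals the URL to the proxy operator.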

@TrianguloY
Owner

Oh, using an external service to do the fetching and possibly caching? That's...a very good alternative that I hadn't considered! (And it seems so obvious now)

In fact I have thought about a new module to "modify" any URL, a module where you send the URL to an external service and it returns a new one. If configurable this could be very powerful indeed, even if you need to make a request.

I'll investigate those services, thanks!

@Efreak

Efreak commented Jan 29, 2023

On checking status, check if the status also includes a redirect, too.

@TrianguloY
Owner

> On checking status, check if the status also includes a redirect, too.

If you mean the check status module, it does check for redirection. And if there is one, it allows you to 'follow' it.
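As an illustration of following redirections one hop at a time, here is a small sketch with a hop limit and loop detection. It is not URLCheck's actual implementation (there the user chooses to follow each redirect); `fetch` is a hypothetical callable returning a `(status, location)` pair.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}

def follow_redirects(url, fetch, max_hops=10):
    """Return the list of URLs visited, starting with `url`.

    Stops at the first non-redirect response, when a URL repeats
    (redirect loop), or after `max_hops` redirects.
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        next_url = urljoin(chain[-1], location)  # Location may be relative
        if next_url in seen:
            break
        chain.append(next_url)
        seen.add(next_url)
    return chain
```

The last element of the returned chain is the final destination, and the intermediate entries show every hop a shortened link passes through.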

@PabloOQ
Collaborator

PabloOQ commented Jan 29, 2023

I looked this up again and found out that DuckDuckGo uses unshorten.me

More about it here: https://unshorten.me/api

The API is limited to 10 requests per hour for new short URLs.

If this service were to be used, this limit would need to be taken into consideration: we would need to check the domain or match a pattern in the URL to be sure that the URL is a redirection URL before sending a request to the service.
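On the client side, a budget like "10 requests per hour" could be respected with a small sliding-window limiter, consulted only after the URL has passed the domain or pattern check. This is a sketch under that assumption; the class name is hypothetical and the service's exact counting rules are not documented in this thread.

```python
import time
from collections import deque

class HourlyLimiter:
    """Allow at most `limit` calls in any sliding window of `window` seconds."""

    def __init__(self, limit=10, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of the calls still inside the window

    def try_acquire(self):
        """Record a call and return True, or return False if the budget is spent."""
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False
```

When `try_acquire()` returns False the app could skip the remote lookup and show the URL unchanged, rather than burn a request that the service would reject.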

@Ilithy
Collaborator Author

Ilithy commented Feb 13, 2023

@TrianguloY
Thank you very much for this new module, for the work and the follow-up of the requests 🙏❤️

@im-not-food

Hi, @TrianguloY
I suggest not using the unshorten.me API directly. Use the DuckDuckGo proxy instead; its limit is 1000 requests per hour.

Example: https://duckduckgo.com/js/spice/expand_url/goo.gl%2FIGL1lE

Server-side code used by DuckDuckGo:
https://github.com/duckduckgo/zeroclickinfo-spice/blob/master/lib/DDG/Spice/ExpandURL.pm

Client-side code:
https://github.com/duckduckgo/zeroclickinfo-spice/blob/master/share/spice/expand_url/expand_url.js
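Based only on the example URL above, the request path appears to be the short URL percent-encoded as a single path segment. The sketch below builds such a URL; the function name is hypothetical, the response format is not covered, and whether a scheme prefix such as `https://` is accepted is not stated in this thread, so the input is passed through unchanged.

```python
from urllib.parse import quote

# Endpoint prefix taken from the example URL in this comment.
DDG_EXPAND_PREFIX = "https://duckduckgo.com/js/spice/expand_url/"

def ddg_expand_request(short_url):
    """Build the proxy request URL for a short URL such as 'goo.gl/IGL1lE'."""
    # safe="" makes quote() encode '/' too, so the input stays one path segment.
    return DDG_EXPAND_PREFIX + quote(short_url, safe="")
```

Calling `ddg_expand_request("goo.gl/IGL1lE")` reproduces the example URL given in this comment.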

@Murilogs1910
Contributor

I'd love to have that as an option! But I still think that, by default, the API should be accessed directly.
