Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More annoying URL parameters cleanup + loop cleanup #133

Closed
alexisju opened this issue Mar 3, 2015 · 5 comments
Closed

More annoying URL parameters cleanup + loop cleanup #133

alexisju opened this issue Mar 3, 2015 · 5 comments
Labels
cleanup code cleanup and refactoring easy good place to start contributing enhancement

Comments

@alexisju
Copy link

alexisju commented Mar 3, 2015

Dans index.php, j'ai ajouté ceci pour nettoyer diverses url (ajouts de facebook, scoopit, et divers utm...) : ça fonctionne bien même si cette liste n'est pas exhaustive...

$i=strpos($url,'?fb_'); if ($i!==false) $url=substr($url,0,$i);
$i=strpos($url,'?__scoop'); if ($i!==false) $url=substr($url,0,$i);
$i=strpos($url,'#tk.rss_all?'); if ($i!==false) $url=substr($url,0,$i);
$i=strpos($url,'?utm_campaign='); if ($i!==false) $url=substr($url,0,$i);
$i=strpos($url,'?utm_medium='); if ($i!==false) $url=substr($url,0,$i);

@nodiscc nodiscc changed the title Clean url More annoying URLs cleanup Mar 3, 2015
@nodiscc
Copy link
Member

nodiscc commented Mar 3, 2015

Copying comments from sebsauvage#168:

nodiscc commented 3 hours ago
@alexisju J'aime l'idée de nettoyer les URLs, par contre pense que l'intégration à Shaarli n'est pas la meilleure solution; je serais même d'avis de retirer le nettoyage d'URL complètement (déjà pas mal de cleanup de fait ou en cours sur https://github.com/shaarli/Shaarli - fork communautaire) mais de fournir un userscript qui fasse le nettoyage directement au niveau du navigateur.
On pourrait facilement ajouter cet userscript à la doc et le publier sur https://monkeyguts.com/, https://openuserjs.org/, https://greasyfork.org/en avec un petit lien vers Greasemonkey
Peux tu ouvrir une issue sur https://github.com/shaarli/Shaarli/issues ?

alexisju commented 15 minutes ago
Oui, je comprends logique, mais l'externaliser complètement de shaarli risque d'être moins pratique pour l'utilisateur lambda. Or selon moi, c'est un fonctionnalité importante (personnellement, je déteste archiver des url avec ce genre de traqueurs)... Par ailleurs, même si le système actuel n'est pas parfait, il fonctionne pour une majorité des traqueurs que je rencontre; et ces traqueurs sont relativement stable dans le temps...
Bref, j'ouvre une nouvelle issue sur shaarli/shaarli :)

TL;DR I was proposing to remove URL cleanup features from Shaarli, and moving this to a userscript. It is true that this is less useful for the average user. @alexisju has more (tested) URL filters to propose.

@nodiscc nodiscc changed the title More annoying URLs cleanup More annoying URL parameters cleanup Mar 3, 2015
@e2jk
Copy link

e2jk commented Mar 4, 2015

It is true that this is less useful for the average user.

Not sure I understand exactly what you mean by this, @nodiscc.
Shaarli's primary role is to store URLs for later retrieval. As such, I believe it should store as clean URLs as possible, and that if there are well known/well identifiable patterns that can be removed (as is the case with those that are currently in the code, and the new ones suggested by @alexisju) I do believe removing them has a place in Shaarli itself.

To make that code more manageable, it would make sense to have the strings in a list or array, and then loop over that array to call $i=strpos($url,$pattern); if ($i!==false) $url=substr($url,0,$i);

Note that the current code drops everything that would come after the pattern, which seems to be fine in most cases (the pattern is added at the end) but is definitely not the ideal solution. So as long as we're discussing these patterns, we should look into stripping only the keyword and it's associated value (something like searching for the pattern between (?, & or #) and =, and remove the value between the = and the next & (or end of string, if no & is found).

But this could be done in two steps/PRs: first loop over array and include the new strings, then later refactor code to only strip relevant additons while leaving potential arguments that are part of that URL itself.

All this (I should not have turned the --verbose option on ;) ) to say I don't agree with moving this out into a userscript, which almost nobody would have installed anyway.

@nodiscc
Copy link
Member

nodiscc commented Mar 4, 2015

I don't agree with moving this out into a userscript, which almost nobody would have installed anyway.r

Ok this is what I came to think too (hence "this is less useful for the average user"). Let's refactor the URL cleaning code to use looping over an array, and add @alexisju's suggestions.

Cleaning only the relevant part of the URL can be done in a later PR.

@e2jk
Copy link

e2jk commented Mar 4, 2015

Created separate issue #136 for the selective clean-up (after describing this further, I'm not 100% sure we need to do that, please discuss there).
@alexisju would you like to take a pass on refactoring the current code, and add your new parameters to the list? That give you a good opportunity to get more familiar with the development process of Shaarli, and we'll be able to discuss the changes in your Pull Request. Let us know if you need help/pointers on how to get started, should you not be familiar yet with such a workflow.

@nodiscc nodiscc changed the title More annoying URL parameters cleanup More annoying URL parameters cleanup + loop cleanup Mar 4, 2015
@nodiscc nodiscc added the cleanup code cleanup and refactoring label Mar 4, 2015
@mro
Copy link

mro commented Mar 4, 2015

https://github.com/mro/Shaarli/commit/0e0771ff9d9ab2cbe0939bb20c0104cee8f49839 that's how I do it in my installation.

@nodiscc nodiscc closed this as completed in ad2a397 Mar 5, 2015
virtualtam added a commit that referenced this issue Aug 15, 2015
Relates to #141
Relates to #133

Modifications
 - move URL cleanup to `application/Url.php`
 - rework the cleanup function
   - fragments: `#stuff`
   - GET parameters: `?var1=val1&var2=val2`
 - add documentation (APIs the params belong to)
 - add test coverage

Reference
 - http://php.net/parse_url
 - http://php.net/manual/en/language.oop5.magic.php#language.oop5.magic.tostring

Signed-off-by: VirtualTam <virtualtam@flibidi.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup code cleanup and refactoring easy good place to start contributing enhancement
Projects
None yet
Development

No branches or pull requests

4 participants