Replace hardcoded regexes for namespaces(e.g.: (?:[Ii]mage|[Ff]ile) ) by others generated from wgNamespaceIds #103

Closed
he7d3r opened this issue May 19, 2012 · 9 comments
Labels
Module: morebits The morebits.js library Other wikis non-EnWiki issues and language/i18n/l10n stuff

Comments

@he7d3r

he7d3r commented May 19, 2012

Currently, there are some instances of the regex "(?:[Ii]mage|[Ff]ile)" and "(?:[Tt]emplate:)" in the code.

It would be better to use all possible aliases of the "File:" namespace ("Template:", respectively) which are in use in a given wiki.

E.g.: On Portuguese Wikipedia, the regex should be "(?:[Ff]icheiro|[Ii]magem|[Aa]rquivo|[Ii]mage|[Ff]ile)". The code below should work on any other wiki as well:

// Collect every alias of the File namespace (id 6) known to this wiki.
var ids = mw.config.get('wgNamespaceIds'),
    aliases = [],
    nsName,
    first;
for ( nsName in ids ) {
    if ( ids[nsName] === 6 ) {
        // Accept the first letter in either case, e.g. "[Ff]ile".
        first = nsName.charAt(0);
        aliases.push( '[' + first.toUpperCase() + first.toLowerCase() + ']' + nsName.slice(1) );
    }
}
alert( '(?:' + aliases.join('|') + ')' );
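
To see what the generated pattern looks like outside a wiki page, here is a minimal sketch of the same idea. It assumes a hand-written object in place of mw.config.get('wgNamespaceIds') (the aliases are the Portuguese Wikipedia ones quoted above), and buildAliasPattern is a hypothetical helper name, not something that exists in Twinkle or morebits.

```javascript
// Hand-written stand-in for mw.config.get('wgNamespaceIds'); on a real wiki
// this object is supplied by MediaWiki and contains many more entries.
var sampleIds = { ficheiro: 6, imagem: 6, arquivo: 6, image: 6, file: 6, template: 10 };

// Hypothetical helper: build a non-capturing group matching every alias
// of the given namespace number, with a case-insensitive first letter.
function buildAliasPattern(ids, nsNumber) {
    var aliases = [];
    Object.keys(ids).forEach(function (nsName) {
        if (ids[nsName] === nsNumber) {
            var first = nsName.charAt(0);
            aliases.push('[' + first.toUpperCase() + first.toLowerCase() + ']' + nsName.slice(1));
        }
    });
    return '(?:' + aliases.join('|') + ')';
}

var filePattern = buildAliasPattern(sampleIds, 6);
// filePattern is '(?:[Ff]icheiro|[Ii]magem|[Aa]rquivo|[Ii]mage|[Ff]ile)'

// Example use: detect a wikilink that points into the File namespace.
var fileLink = new RegExp('\\[\\[' + filePattern + ':');
```

With the sample data, fileLink matches both "[[Ficheiro:Foo.png]]" and "[[arquivo:Foo.png]]" but not "[[Template:Foo]]".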
@atlight
Collaborator

atlight commented May 20, 2012

Since Twinkle is English Wikipedia-specific, I don't think it would be useful to fix this, particularly since adding on-the-fly regex creation logic would unnecessarily slow down the code. Anyone localising Twinkle should fix these regexes as they go.

@atlight atlight closed this as completed May 20, 2012
@atlight
Collaborator

atlight commented May 20, 2012

Hmm; I didn't see that this was in morebits. I am not sure what to do with morebits - whether to make it fully localisable (with string table, etc) or whether just to leave it for translators to modify. For the moment it will obviously be the latter!

@atlight atlight reopened this May 20, 2012
@he7d3r
Author

he7d3r commented May 20, 2012

I believe that once the new version of the Gadgets extension is available, the localisation will move to MediaWiki:messages and the gadget could be moved to a central gadget repository (e.g. mediawiki.org). In that sense, the more language-independent we keep the whole code, the better. Otherwise, chances are we will end up with the usual proliferation of outdated hacks being copied from one wiki to another...

This was my motivation to report some bugs/request some enhancements.

@atlight
Collaborator

atlight commented May 20, 2012

Yes, it would be nice. However, that would only work for core parts of Twinkle, since most modules depend on some kind of local structure (e.g. CSD criteria/tagging templates/notification templates, Welcome templates; ARV page format; Tag templates). Siddhartha Ghai is working on TWG, which is a project similar to Twinkle that is designed with localisation in mind: see [User:Siddhartha Ghai/TWG.js](http://en.wikipedia.org/wiki/User:Siddhartha_Ghai/TWG.js).

It is my long-term goal to make Twinkle as localisable as possible (still, modules like XFD will surely require code modification across wikis, but modules such as CSD, Tag, Welcome should be able to work on different wikis). However, it would take a lot of work.

Where is the information about the new version of Gadgets? It's not obvious to me.

@atlight atlight closed this as completed May 20, 2012
@atlight atlight reopened this May 20, 2012
@he7d3r
Author

he7d3r commented May 20, 2012

There is some info here:
https://www.mediawiki.org/wiki/ResourceLoader/V2_testing#RL2_in_a_nutshell
https://www.mediawiki.org/wiki/ResourceLoader/Version_2_Design_Specification#Messages
but I've seen comments about Gadgets 3.0 as well,
https://www.mediawiki.org/wiki/Roadmap#MediaWiki_infrastructure
and I'm not sure what exactly will come in that version...
The improvements from last GSoC would be great for customizations:
https://www.mediawiki.org/wiki/User:Salvatore_Ingala/Notes

@Amorymeltzer
Collaborator

This is necroposting, but, uh, basically none of the above has yet come to pass.

Regardless, a good first step would be removing the hardcoding of Morebits.wikipedia. MediaWiki 1.16 (released in 2010) added wgNamespaceIds and wgFormattedNamespaces, which should take care of most of the uses of those objects. Morebits.wikipedia.namespaces is basically a carbon copy of wgFormattedNamespaces (with the added advantage that the project's own name is used for the project namespace).

There are only three uses of Morebits.wikipedia.namespaces and Morebits.wikipedia.namespacesFriendly in our codebase. It shouldn't be too difficult to remove them, but I don't know who or what else relies on the objects: the only uses I can find are old, unused copies of old Twinkle code, nothing relying on current gadgets. Morebits.wikipedia.namespacesFriendly seems particularly unlikely to be widely used outside en.wiki; it removes -1 and -2 and uses Wikipedia for 4 and (Article) for 0, which we can deal with (using (Main) should be fine across projects).

Dunno about you @atlight @MusikAnimal but I don't think we need them? We could keep Morebits.wikipedia.namespaces for the sake of backward compatibility(?), I suppose, and just copy wgFormattedNamespaces into it; I certainly think we'd be fine just removing namespacesFriendly altogether.
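
As a rough sketch of what that could look like: the block below assumes a hand-written object in place of mw.config.get('wgFormattedNamespaces') (which maps namespace ids to display names, with an empty string for the main namespace). Both helper names are hypothetical and only illustrate the two options discussed above: a plain copy for backward compatibility, and a "friendly" variant that drops -1 and -2 and labels namespace 0 as (Main).

```javascript
// Hand-written stand-in for mw.config.get('wgFormattedNamespaces');
// the values are illustrative and differ from wiki to wiki.
var sampleFormatted = {
    '-2': 'Media', '-1': 'Special', '0': '', '1': 'Talk', '4': 'Wikipedia', '6': 'File'
};

// Hypothetical helper: shallow copy, e.g. to keep a legacy
// Morebits.wikipedia.namespaces object alive for old callers.
function copyNamespaces(formatted) {
    var copy = {};
    Object.keys(formatted).forEach(function (id) {
        copy[id] = formatted[id];
    });
    return copy;
}

// Hypothetical helper: "friendly" names for display. Virtual namespaces
// (negative ids) are dropped and the main namespace is labelled "(Main)".
function friendlyNamespaces(formatted) {
    var friendly = {};
    Object.keys(formatted).forEach(function (id) {
        if (Number(id) < 0) {
            return;
        }
        friendly[id] = id === '0' ? '(Main)' : formatted[id];
    });
    return friendly;
}

var legacyNamespaces = copyNamespaces(sampleFormatted);
var friendly = friendlyNamespaces(sampleFormatted);
```

Because both helpers read from the configuration map rather than a hardcoded table, the same code would pick up localised names on any wiki.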

It's a small step, but should lay the groundwork for the above request. I think there's a little bit in #485 that does this sort of thing, actually.

@siddharthvp
Member

siddharthvp commented Mar 28, 2019 via email

@siddharthvp
Member

Regarding the issue at hand here: based on the code given here (which also appears in @Siddhartha-Ghai's TWG), it is easy to derive a generalised function for giving namespace name regexes for any namespace:

function namespaceRegex(namespaceNumber) {
	// Look the map up once instead of on every loop iteration.
	var ids = mw.config.get('wgNamespaceIds');
	var patterns = [];
	for ( var alias in ids ) {
		if ( ids[alias] === namespaceNumber ) {
			if (alias[0].toUpperCase() === alias[0].toLowerCase()) {
				// Caseless first character: use the alias as-is.
				patterns.push(alias);
			} else {
				patterns.push('[' + alias[0].toUpperCase() + alias[0].toLowerCase() + ']' + alias.slice(1));
			}
		}
	}
	// Namespace names may be written with either a space or an underscore.
	return patterns.join('|').replace(/_/g, '[ _]');
}

(The if (alias[0].toUpperCase() === alias[0].toLowerCase()) check is added in the interest of non-Latin-script languages, which don't have upper/lower case characters.)
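
A small sketch of just that case check, isolated so it can be run on its own. firstCharPattern is a hypothetical helper name, and the Devanagari alias is only an illustrative example of a caseless script.

```javascript
// Hypothetical helper: pattern fragment for the first character of an alias.
function firstCharPattern(alias) {
    var first = alias.charAt(0);
    var upper = first.toUpperCase();
    var lower = first.toLowerCase();
    // In caseless scripts (and for digits) both conversions return the same
    // character, so a two-letter character class would be redundant.
    return upper === lower ? first : '[' + upper + lower + ']';
}

var latinHead = firstCharPattern('file');      // cased script
var caselessHead = firstCharPattern('चित्र');  // Devanagari has no letter case
```

Here latinHead is '[Ff]', while caselessHead is just the first character of the alias, unchanged.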

Regarding @atlight's concern that on-the-fly regex creation logic slows the code: the solution is to pre-compute these regexes (for the required namespaces) in the twinkle.js file and store them in variables for ready access, e.g.:

Twinkle.file_ns_rgx = namespaceRegex(6);
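
One way to generalise the "compute once, reuse" idea is a small cache keyed by namespace number. This is only a sketch under stated assumptions: makeNamespaceRegexCache is a hypothetical name, the id map passed in is a hand-written stand-in for mw.config.get('wgNamespaceIds'), and the build counter exists purely to show that each pattern is built a single time.

```javascript
// Hypothetical factory: returns a lookup that builds each namespace pattern
// on first request and serves the stored string afterwards.
function makeNamespaceRegexCache(ids) {
    var cache = {};
    var builds = 0; // number of real builds, for demonstration only

    function build(nsNumber) {
        builds++;
        return Object.keys(ids).filter(function (alias) {
            return ids[alias] === nsNumber;
        }).map(function (alias) {
            var first = alias.charAt(0);
            var head = first.toUpperCase() === first.toLowerCase() ?
                first :
                '[' + first.toUpperCase() + first.toLowerCase() + ']';
            // Allow a space or an underscore inside multi-word names.
            return (head + alias.slice(1)).replace(/_/g, '[ _]');
        }).join('|');
    }

    return {
        get: function (nsNumber) {
            if (!Object.prototype.hasOwnProperty.call(cache, nsNumber)) {
                cache[nsNumber] = build(nsNumber);
            }
            return cache[nsNumber];
        },
        buildCount: function () {
            return builds;
        }
    };
}

// Stand-in data; a real caller would pass mw.config.get('wgNamespaceIds').
var demoCache = makeNamespaceRegexCache({ image: 6, file: 6, template: 10, user_talk: 3 });
var filePatternOnce = demoCache.get(6);   // built now
var filePatternAgain = demoCache.get(6);  // served from the cache
```

Compared with one variable per namespace, a cache like this avoids deciding up front which namespaces are "required", at the cost of a little indirection.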

Whether these changes should be made here is questionable, but any attempted internationalized version of Twinkle would need them.

siddharthvp added a commit to siddharthvp/twinkle that referenced this issue Apr 24, 2019
Based on the discussion at wikimedia-gadgets#103, specifically wikimedia-gadgets#103 (comment).

On enwiki, Morebits.wikipedia.namespaces is just a copy of mw.config.get('wgFormattedNamespaces').
On other wikis, it is useless. mw.config.get('wgFormattedNamespaces') is better as it shows localised
namespaces names across different wikis.

Morebits.wikipedia.namespacesFriendly (also enwiki-only) is used just once in the codebase. This usage
has been changed to use wgFormattedNamespaces.

Also removed the 8-year-old comment noting the removal of Twinkle blacklist by someone, because it has
nothing to do with morebits.
Amorymeltzer added a commit that referenced this issue May 17, 2019
* morebits: remove Morebits.wikipedia

Based on the discussion at #103, specifically #103 (comment).

On enwiki, Morebits.wikipedia.namespaces is just a copy of mw.config.get('wgFormattedNamespaces'). On other wikis, it is useless. mw.config.get('wgFormattedNamespaces') is better as it shows localised namespaces names across different wikis.

Morebits.wikipedia.namespacesFriendly (also enwiki-only) is used just once in the codebase. This usage has been changed to use wgFormattedNamespaces.

Co-Authored-By: siddharthvp <siddharthvp@gmail.com>
Co-Authored-By: Amory Meltzer <Amorymeltzer@gmail.com>
@siddharthvp
Member

Resolved in #1262 by @Amorymeltzer. On-the-fly regex creation is being used; don't think the 2012-era concerns of it slowing down the code are applicable any longer as today's JS engines are just so fast.

4 participants