Support for fetching namespaces #10

Closed
moy opened this issue Jul 9, 2012 · 31 comments

Comments

@moy
Collaborator

moy commented Jul 9, 2012

We currently fetch the main namespace by default, and the File: namespace if mediaimport is true, but we don't fetch e.g. the Template: or discussion namespaces.

@freephile

I'd love to see support for fetching templates and talk pages... two big features of mediawiki that I use extensively. For me, just fetching content from the default namespace (regular articles) is not a good clone of my mediawiki.

@moy
Collaborator Author

moy commented Sep 3, 2013

This should be rather straightforward to add, as the code internally works with namespaces already. I won't have time to implement it myself any time soon though.

@MarSoft

MarSoft commented Apr 8, 2014

It would be very convenient to be able to work with SemanticForms' namespaces (Template, Form, Property...), because it is a pain to work with them through the web interface.
I would implement it myself, but I don't know Perl well enough.

@MarSoft

MarSoft commented Apr 8, 2014

Well, I managed to make a patch which seems to work. It allows the user to explicitly add some categories and force them to load during repository initialization.
How can I submit the patch?

@moy
Collaborator Author

moy commented Apr 8, 2014

> How can I submit the patch?

Git-Mediawiki is part of Git, so you should submit patches to the Git
mailing-list, Cc-ing me.

Read this before:
https://github.com/git/git/blob/master/Documentation/SubmittingPatches

@kyv
Contributor

kyv commented Dec 8, 2014

@MarSoft, can you share the patch? In a gist, or a link to the submission to git? I need this too, and I can help test and fix some things if necessary.

@akhuettel

@MarSoft I would be really interested in the patch too, and could help clean it up for integration!

@akhuettel

Just as a note, it's important that different namespaces can contain identically named pages... so probably subdirs of the checkout would be appropriate...

@MarSoft

MarSoft commented Feb 17, 2015

Hello. Just found that patch on my computer. Posted it here: https://gist.github.com/MarSoft/ca00cecbd9d426d9e614

@lyubomyr-shaydariv

Hi guys. Just want to ask: was it implemented?

@kyv
Contributor

kyv commented Aug 22, 2015

Here's a patch. My git may or may not be in sync with upstream, so the patch may or may not apply directly, but I think you will get the drift of it.

https://gist.github.com/kyv/9e3f4a1b447bf5e8f150

@lyubomyr-shaydariv

@kyv, thank you, it's working.

@moy
Collaborator Author

moy commented Aug 23, 2015

@kyv: to integrate the patch, you need to follow the normal procedure to contribute to Git. See here: https://github.com/git/git/blob/master/Documentation/SubmittingPatches

Please, Cc: me when you send your patch.

Thanks,

@kyv
Contributor

kyv commented Aug 25, 2015

@moy, I no longer use git-mediawiki, so I do not have much interest in going through the procedure. I just put it there to be helpful.

@moy
Collaborator Author

moy commented Aug 25, 2015

Submitting code to Git is fun, you should do it ;-).

More seriously, if you agree with Git's Developer's Certificate of Origin 1.1, can you add your Signed-off-by: to your patch (see https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L234)? This way, someone else (possibly me when I get time) can submit your code.

Thanks,

@kyv
Contributor

kyv commented Aug 25, 2015

Ok, I'll do that later then.

@kyv
Contributor

kyv commented Aug 28, 2015

@moy, I created a new patch and signed off on this one. I also generated it against current master and squashed what previously appeared as two commits into one.

https://gist.github.com/kyv/9e3f4a1b447bf5e8f150

@anarcat
Contributor

anarcat commented Oct 17, 2015

@kyv that's great! can you send the patch to the mailing list? i believe you need to send it to git@vger.kernel.org

@anarcat
Contributor

anarcat commented Oct 17, 2015

oh, and it seems that "all pages" doesn't actually fetch from all namespaces with that fix; that's an improvement that could be made to the patch. indentation also seems to be a little off.

@anarcat
Contributor

anarcat commented Oct 17, 2015

finally, trying the modified version, i get this when trying to specify a namespace:

$ git  -c remote.origin.namespaces=Talk clone mediawiki::http://...
Cloning into '...'...
[...]
3: apunknown_apnamespace: Unrecognized value for parameter 'apnamespace': Talk

@Grumbel
Contributor

Grumbel commented Jan 12, 2016

There seems to be a problem with the patch, running it vanilla gave me:

git clone -c remote.origin.namespaces=Talk mediawiki::http://supertux.lethargik.org/wiki/
Cloning into 'wiki'...
Searching revisions...
No previous mediawiki revision found, fetching from beginning.
Fetching & writing export data by pages...
Listing pages on remote wiki...
3: apunknown_apnamespace: Unrecognized value for parameter 'apnamespace': Talk
Checking connectivity... fatal: bad object 0000000000000000000000000000000000000000
fatal: remote did not send all necessary objects

apnamespace seems to expect an id not a string, changing the line:

 apnamespace => $local_namespace,

to:

 apnamespace => get_mw_namespace_id($local_namespace),

seems to make things work.
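
To illustrate why the name has to be turned into an id first, here is a minimal sketch in Python (not the Perl used by Git-Mediawiki). The `NAMESPACE_IDS` table and the `allpages_query` helper are hypothetical names introduced for this example; the ids shown are the usual MediaWiki defaults, but a real wiki reports its own through siteinfo.

```python
from urllib.parse import urlencode

# Hypothetical name-to-id table using the usual MediaWiki default ids.
# A real client would fetch this from the wiki instead of hard-coding it.
NAMESPACE_IDS = {"Talk": 1, "File": 6, "Template": 10}


def allpages_query(namespace_name, namespace_ids=NAMESPACE_IDS):
    """Build a list=allpages query string with a numeric apnamespace."""
    if namespace_name not in namespace_ids:
        raise KeyError(f"unknown namespace: {namespace_name!r}")
    params = {
        "action": "query",
        "list": "allpages",
        # The API rejects "Talk" here; it wants the numeric id.
        "apnamespace": namespace_ids[namespace_name],
        "aplimit": "max",
        "format": "json",
    }
    return urlencode(params)


print(allpages_query("Talk"))
```

Passing the raw string "Talk" as `apnamespace` is what produces the `apunknown_apnamespace` error shown above.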

With that change in place, the way namespaces are split up would also need adapting, as namespaces frequently contain spaces and splitting is currently done by space or newline (e.g. it fails with "File talk"):

my @tracked_namespaces = split(/[ \n]/, run_git("config --get-all remote.${remotename}.namespaces"));

@anarcat
Contributor

anarcat commented Jan 13, 2016

@Grumbel i updated the patch in https://gist.github.com/anarcat/f821fa285c6b8b6b16a5

but i am not sure i covered all the changes you described, could you clarify how the last change is done?

then we do need someone to carry this to the git mailing list...

@Grumbel
Contributor

Grumbel commented Jan 13, 2016

I just did it the quick and dirty way and replaced the space in the regex with a comma:

my @tracked_namespaces = split(/[ \n]/, run_git("config --get-all remote.${remotename}.namespaces"));

to:

my @tracked_namespaces = split(/[,\n]/, run_git("config --get-all remote.${remotename}.namespaces"));

That was enough to make it work for my uses, but I don't know what the valid characters for namespaces are and comma might be one of them, so there might be a better way to handle the splitting.

@anarcat
Contributor

anarcat commented Jan 13, 2016

hmm... the documentation in the file there says:

# Accept both space-separated and multiple keys in config file.
# Spaces should be written as _ anyway because we'll use chomp.

so it seems to me that the space-separated idea should stay... besides it would break every other config out there...

@anarcat
Contributor

anarcat commented Jan 13, 2016

also, re-reading https://github.com/git/git/blob/master/Documentation/SubmittingPatches - we will need unit tests before this gets merged in, unfortunately.

@Grumbel
Contributor

Grumbel commented Jan 13, 2016

The issue is that with the current code you can't check out namespaces that have spaces in their names:

git clone -c "remote.origin.namespaces=File talk" mediawiki::http://supertux.lethargik.org/wiki/ 

will make it look for File and talk namespaces, not the namespace File talk. Using "_" instead of space doesn't help with namespaces, as:

git clone -c "remote.origin.namespaces=File_talk" mediawiki::http://supertux.lethargik.org/wiki/ 

will complain about File_talk not being found. The reason for that is that get_mw_namespace_id() checks for the canonical name and the canonical name contains a space, not a _ (not sure if that is guaranteed for all namespaces or just the case with the default ones).

curl "http://supertux.lethargik.org/wiki/api.php?action=query&format=json&meta=siteinfo&siprop=namespaces" |  python -m json.tool

        "9": {
            "*": "MediaWiki talk",
            "canonical": "MediaWiki talk",
            "case": "first-letter",
            "id": 9,
            "subpages": ""
        }
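
As a rough sketch of how such a response could be turned into a name-to-id lookup, the Python below parses a trimmed sample. The `SAMPLE` string and the `namespace_ids` function are made up for this example, and the surrounding `query.namespaces` wrapper is assumed from the usual shape of a siteinfo JSON response rather than taken from the output quoted above.

```python
import json

# Trimmed, hand-written sample in the shape of a siteinfo response.
# Only the "9" entry comes from the output quoted above; the rest is assumed.
SAMPLE = """
{"query": {"namespaces": {
  "0": {"id": 0, "case": "first-letter", "*": ""},
  "9": {"id": 9, "case": "first-letter", "*": "MediaWiki talk",
        "canonical": "MediaWiki talk", "subpages": ""}
}}}
"""


def namespace_ids(siteinfo_json):
    """Map both the local ("*") and canonical names to the numeric id."""
    data = json.loads(siteinfo_json)
    ids = {}
    for entry in data["query"]["namespaces"].values():
        for key in ("*", "canonical"):
            name = entry.get(key)
            if name:  # the main namespace has an empty name, so skip it
                ids[name] = entry["id"]
    return ids


print(namespace_ids(SAMPLE))
```

Note that the keys keep the space ("MediaWiki talk"), which is why a lookup for an underscore spelling fails unless it is translated first.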

A fix for this would be to take the namespaces in the File_talk notation and then translate them to their canonical representation by replacing all _ with spaces:

my @tracked_namespaces = split(/[ \n]/, run_git("config --get-all remote.${remotename}.namespaces"));
for (@tracked_namespaces) { s/_/ /g; }
chomp(@tracked_namespaces);
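
The same idea, sketched in Python for illustration only: split the config output on whitespace, then map underscores back to spaces. `parse_tracked_namespaces` is a hypothetical name, and Python's `str.split()` differs slightly from the Perl `split(/[ \n]/, ...)` above in that it also drops empty fields.

```python
def parse_tracked_namespaces(config_output):
    """Split space- or newline-separated names and restore spaces from '_'."""
    return [name.replace("_", " ") for name in config_output.split()]


# Simulated output of `git config --get-all remote.origin.namespaces`
# with one key holding two names and a second key holding one.
print(parse_tracked_namespaces("File_talk Template\nTalk\n"))
```

The trade-off of this convention is that a namespace whose real name contains an underscore could not be distinguished from one containing a space.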

@akhuettel

Patch added to Gentoo git patchset (even if it's not perfect yet :) )

@johannesloetzsch

Thank you for implementing this feature :)

What is the recommended way to clone some user namespaces + the main namespace?
Since I was not able to do it with the patch of @anarcat, I added some minor changes:
https://gist.github.com/johannesloetzsch/910155f3ba70b6582906

@anarcat
Contributor

anarcat commented Oct 29, 2017

hi all

started looking into this again, and got tired of the gisting... i published a branch on my fork here:

https://github.com/anarcat/git/tree/mediawiki-namespaces

which tries to merge in the patches from @kyv, @Grumbel and my own, along with a way to fetch the "main" namespace, an idea suggested by @johannesloetzsch but i used a slightly different approach: anarcat/git@17e1d97

with my approach, you specify "(Main)" as normal in the list of namespaces and it's simply treated differently in the namespace processor (because, dumbly, the MW API doesn't know how to translate that name). i used "(Main)" instead of "MAIN" because that is the name used in the documentation.
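
A minimal Python sketch of that special case, for illustration only and not the code in the branch: `resolve_namespace` and `MAIN_LABEL` are hypothetical names, and the lookup table is assumed to have been built elsewhere. The only fact relied on is that the main namespace has id 0 in MediaWiki.

```python
MAIN_LABEL = "(Main)"  # label used in the namespace list for the main namespace


def resolve_namespace(name, namespace_ids):
    """Return the numeric id for a namespace name, special-casing "(Main)"."""
    if name == MAIN_LABEL:
        # The main namespace has an empty name, so it cannot be looked up
        # by name; its id is always 0.
        return 0
    return namespace_ids[name]


ids = {"Talk": 1, "Template": 10}
print([resolve_namespace(n, ids) for n in ["(Main)", "Talk", "Template"]])
```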

i seem able to fetch a full wiki with all namespaces with that approach.

and honestly, i think that should just be the default already - but that's another patch... there's a hint of how that could be done in anarcat/git@a624e45#diff-d1ae99a08192b4b3e5ad8570fdb59aa0R1337 - as soon as we fetched the namespace/id mapping, we know all the namespaces and we could just use that as a default. but meh. at this point, it's easier to just copy-paste the list...

@anarcat
Contributor

anarcat commented Oct 29, 2017

i have also sent a modified patch series to the mailing list, in the hope of getting more traction on this:

https://public-inbox.org/git/20171029160857.29460-1-anarcat@debian.org/T/#m4c55498911654e05a3a84ab0754a34737a2d72ce

hopefully, we'll finally get this somewhere!

@anarcat
Contributor

anarcat commented Nov 18, 2017

i believe this was merged to git master, so this can be closed.

@anarcat anarcat closed this as completed Nov 18, 2017
hexmode added a commit to hexmode/Git-Mediawiki that referenced this issue Feb 19, 2022
On a wiki where MediaWiki:Sidebar does not start with rev 1:

    $ git clone -c remote.origin.pages=MediaWiki:Sidebar mediawiki::https://www.wiki.org/w/
    …
    page 1/1: MediaWiki:Sidebar
      Found 0 revision(s).
    You appear to have cloned an empty MediaWiki.
    fatal: could not read ref refs/mediawiki/origin/master

After this patch

    $ git clone -c remote.origin.pages=MediaWiki:Sidebar mediawiki::https://www.wiki.org/w/
    …
    page 1/1: MediaWiki:Sidebar
      Found 115 revision(s).
    Namespace MediaWiki not found in cache, querying the wiki ...
    1/115: Revision Git-Mediawiki#7 of MediaWiki:Sidebar
    2/115: Revision Git-Mediawiki#8 of MediaWiki:Sidebar
    3/115: Revision Git-Mediawiki#9 of MediaWiki:Sidebar
    4/115: Revision Git-Mediawiki#10 of MediaWiki:Sidebar
    5/115: Revision Git-Mediawiki#11 of MediaWiki:Sidebar
    6/115: Revision Git-Mediawiki#12 of MediaWiki:Sidebar

Fixes Git-Mediawiki#70
moy pushed a commit that referenced this issue Feb 22, 2022
On a wiki where MediaWiki:Sidebar does not start with rev 1:

    $ git clone -c remote.origin.pages=MediaWiki:Sidebar mediawiki::https://www.wiki.org/w/
    …
    page 1/1: MediaWiki:Sidebar
      Found 0 revision(s).
    You appear to have cloned an empty MediaWiki.
    fatal: could not read ref refs/mediawiki/origin/master

After this patch

    $ git clone -c remote.origin.pages=MediaWiki:Sidebar mediawiki::https://www.wiki.org/w/
    …
    page 1/1: MediaWiki:Sidebar
      Found 115 revision(s).
    Namespace MediaWiki not found in cache, querying the wiki ...
    1/115: Revision #7 of MediaWiki:Sidebar
    2/115: Revision #8 of MediaWiki:Sidebar
    3/115: Revision #9 of MediaWiki:Sidebar
    4/115: Revision #10 of MediaWiki:Sidebar
    5/115: Revision #11 of MediaWiki:Sidebar
    6/115: Revision #12 of MediaWiki:Sidebar

Fixes #70
9 participants