Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge back contrib module from git.git #46

Merged
merged 127 commits into from
Nov 20, 2017
Merged

Conversation

anarcat
Copy link
Contributor

@anarcat anarcat commented Nov 18, 2017

This merges the MediaWiki extension from the contrib/mw-to-git/
directory from core git back into its own repository. We keep the
history, but only for that directory.

The following procedure was used:

    git clone https://github.com/git/git/ git-import
    cd git-import
    git filter-branch --subdirectory-filter contrib/mw-to-git/ -- master
    git remote rm origin
    git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
    git for-each-ref --format='delete %(refname)' refs | grep tags | git update-ref --stdin
    git reflog expire --expire=now --all
    git gc --prune=now
    git clone https://github.com/Git-Mediawiki/Git-Mediawiki/
    cd ../Git-Mediawiki
    git remote add import ../git-import/
    git fetch import
    git merge --allow-unrelated-histories import/master

This also cleans up the README.md consequently, by reusing the existing .txt file and merging it with the existing README.md file.

Close #34

Jeremie Nikaes and others added 30 commits September 1, 2011 15:52
Implement a gate between git and mediawiki, allowing git users to push
and pull objects from mediawiki just as one would do with a classic git
repository thanks to remote-helpers.

The following packages need to be installed (available on common
repositories):

  libmediawiki-api-perl
  libdatetime-format-iso8601-perl

Use remote helpers in order to be as transparent as possible to the git
user.

Download Mediawiki revisions through the Mediawiki API and then
fast-import into git.

Mediawiki revision number and git commits are linked thanks to notes
bound to commits.

The import part is done on a refs/mediawiki/<remote> branch before
coming to refs/remote/origin/master (Huge thanks to Jonathan Nieder
for his help)

We use UTF-8 everywhere: use encoding 'utf8'; does most of the job, but
we also read the output of Git commands in UTF-8 with the small helper
run_git, and write to the console (STDERR) in UTF-8. This allows a
seamless use of non-ascii characters in page titles, but hasn't been
tested on non-UTF-8 systems. In particular, UTF-8 encoding for filenames
could raise problems if different file systems handle UTF-8 filenames
differently. A uri_escape of mediawiki filenames could be imaginable, and
is still to be discussed further.

Partial cloning is supported using one of:

git clone -c remote.origin.pages='A_Page  Another_Page' mediawiki::http://wikiurl

git clone -c remote.origin.categories='Some_Category' mediawiki::http://wikiurl

git clone -c remote.origin.shallow='True' mediawiki::http://wikiurl

Thanks to notes metadata, it is possible to compare remote and local last
mediawiki revision to warn non-fast forward pushes and "everything
up-to-date" case.

When allowed, push looks for each commit between remotes/origin/master
and HEAD, catches every blob related to these commit and push them in
chronological order. To do so, it uses git rev-list --children HEAD and
travels the tree from remotes/origin/master to HEAD through children. In
other words:

* Shortest path from remotes/origin/master to HEAD
* For each commit encountered, push blobs related to this commit

Signed-off-by: Jérémie Nikaes <jeremie.nikaes@ensimag.imag.fr>
Signed-off-by: Arnaud Lacurie <arnaud.lacurie@ensimag.imag.fr>
Signed-off-by: Claire Fousse <claire.fousse@ensimag.imag.fr>
Signed-off-by: David Amouyal <david.amouyal@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Sylvain Boulmé <sylvain.boulme@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Push can not set the commit note "mediawiki_revision:" and update the
remote reference. This avoids having to "git pull --rebase" after each
push, and is probably more natural. Make it the default, but let it be
configurable with mediawiki.dumbPush or remote.<remotename>.dumbPush.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a whitespace issue (no space before :) and remove unused %status in
mw_push.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…licts

We already have a check that no new revisions are on the wiki at the
beginning of the push, but this didn't handle concurrent accesses to the
wiki.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the wiki uses e.g. LDAP for authentication, the web interface shows
a popup to allow the user to chose an authentication domain, and we need
to use lgdomain in the API at login time.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On the MediaWiki side, the author information is just the MediaWiki login
of the contributor. The import turns it into login@$wiki_name to create
the author's email address on the wiki side. But we don't want this to
include the HTTP password if it's present in the URL ...

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous version implemented the possibility to log in a wiki, but
the username and password had to be provided as configuration variables.
We add the possibility to use the Git credential system to prompt
the password.

The support if implemented with generic functions that mimic the C API,
designed to be usable from other contexts in the future (i.e. they may
migrate to Git.pm if someone is interested).

While we're there, do a bit of refactoring in mw_connect_maybe.

Based on patch by: Javier Roucher Iglesias <Javier.Roucher-Iglesias@ensimag.imag.fr>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we're there, simplify the code a bit: since log --format=%s anyway
shows the subject line as a single line, no need to split to take the
first line.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of this statement is generally discouraged, and is too intrusive
for us: it forces the HTTP requests made by the API to contain only valid
UTF-8 characters. This would break the upload of binary files.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current version of the git-remote-mediawiki supports only import and
export of plain wiki pages. This patch adds the functionality to export
file attachments (i.e. the content of the File: MediaWiki namespace),
which are also exposed by MediaWiki API.

This requires a recent version of MediaWiki::API (Version 0.37 works.
Version 0.34 doesn't).

Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the symmetrical feature to the "File:" export support in the previous
patch. Download files from the wiki as needed, and feed them into the
fast-import stream. Import both the file itself, and the corresponding
description page.

Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mediafiles can live in namespaces with names different from Image
and File. While at it, rework the code to make it simpler and easier
to read.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
install_wiki.sh allows the user to install a MediaWiki instance in a
single shell command. Like "git instaweb", it configures and launches
lighttpd without requiring root priviledges. To simplify database
management, it uses SQLite, which doesn't require a running daemon, and
allows reseting the database by simply replacing a single file. This
allows install_wiki to also defines a function wiki_reset which clear all
content of the previously created wiki, which will be very useful to run
several indepenant tests on the same wiki.

Note those functionnalities are made to be used from the user command
line in the directory git/contrib/mw-to-git/t/

Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to test git-remote-mediawiki, a set of functions is needed to
manage a MediaWiki: edit a page, remove a page, fetch a page, fetch all
pages on a given wiki.

A few helper function are also provided to check the content of
directories.

In addition, this patch provides Makefiles to execute tests.
See the README file for more details.

Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch provides a set of tests for the pull and push fonctionnality
of git-remote-mediawiki. The actual tests are kept in a separate function
to allow further tests to re-run the same set of commands with different
push and pull strategies.

Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…racters

Non-ascii encoding create many particular cases when used in page
content, name, and edit/commit message. Test these cases.

Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will be used for testing git-remote-mediawiki's import feature on a
wiki containing media files.

Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous version was returning the list of pages to be fetched, but
we are going to need an efficient membership test (i.e. is the page
$title tracked), hence exposing a hash will be more convenient.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without changing the behavior, we turn the foreach loop on an array of
revisions into a loop on an array of integer. It will be easier to
implement other strategies as they will only need to produce an array of
integer instead of a more complex data-structure.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only way to fetch new revisions from a wiki before this patch was to
query each page for new revisions. This is good when tracking a small set
of pages on a large wiki, but very inefficient when tracking many pages
on a wiki with little activity.

Implement a new strategy that queries the wiki for its last global
revision, queries each new revision, and filter out pages that are not
tracked.

Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"mediawiki" remote helper (in contrib/) learned to handle file
attachments.

* mm/mediawiki-file-attachments:
  git-remote-mediawiki: improve support for non-English Wikis
  git-remote-mediawiki: import "File:" attachments
  git-remote-mediawiki: split get_mw_pages into smaller functions
  git-remote-mediawiki: send "File:" attachments to a remote wiki
  git-remote-mediawiki: don't "use encoding 'utf8';"
  git-remote-mediawiki: don't compute the diff when getting commit message
* mm/mediawiki-tests:
  git-remote-mediawiki: be more defensive when requests fail
  git-remote-mediawiki: more efficient 'pull' in the best case
  git-remote-mediawiki: extract revision-importing loop to a function
  git-remote-mediawiki: refactor loop over revision ids
  git-remote-mediawiki: change return type of get_mw_pages
  git-remote-mediawiki (t9363): test 'File:' import and export
  git-remote-mediawiki: support for uploading file in test environment
  git-remote-mediawiki (t9362): test git-remote-mediawiki with UTF8 characters
  git-remote-mediawiki (t9361): test git-remote-mediawiki pull and push
  git-remote-mediawiki (t9360): test git-remote-mediawiki clone
  git-remote-mediawiki: test environment of git-remote-mediawiki
  git-remote-mediawiki: scripts to install, delete and clear a MediaWiki
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
jrn and others added 27 commits September 24, 2013 23:19
* maint:
  git-remote-mediawiki: bugfix for pages w/ >500 revisions
Running "make clean" after a successful "make install" should not
result in a broken mediawiki remote helper.

Reported-by: Thorsten Glaser <t.glaser@tarent.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So now you can run

	DESTDIR=$(pwd)/tmp make -Ccontrib/mw-to-git install

to install the mediawiki remote helper, git-mw tool, and Git::Mediawiki
perl module under tmp/ as preparation for zipping it up and extracting
on another machine.

While at it, make sure the directory that should contain Git::Mediawiki
exists before putting a file there.  Without this patch, the makefile
uses DESTDIR when installing git-mw and git-remote-mediawiki but not
the perl module, resulting in errors from "make install" if the
$(INSTLIBDIR)/Git directory does not exist:

 install: cannot create regular file \
 '/usr/share/perl/5.18.1/Git/Mediawiki.pm': No such file or directory

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some machines, the most usable 'install' tool is named
'ginstall'.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Quote DESTDIR and INSTLIBDIR for the shell in the same way as is done in
the toplevel Makefile to avoid confusion in case they contain shell
metacharacters.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Build and installation procedure clean-up.

* jn/mediawiki-makefile-updates:
  git-remote-mediawiki build: handle DESTDIR/INSTLIBDIR with whitespace
  git-remote-mediawiki build: make 'install' command configurable
  git-remote-mediawiki: honor DESTDIR in "make install"
  git-remote-mediawiki: do not remove installed files in "clean" target
…titution

The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
   sed -i 's@`\(.*\)`@$(\1)@g' ${_f}
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…itution

The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
   sed -i 's@`\(.*\)`@$(\1)@g' ${_f}
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…ubstitution

The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
   sed -i 's@`\(.*\)`@$(\1)@g' ${_f}
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
   sed -i 's@`\(.*\)`@$(\1)@g' ${_f}
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the user had to launch a complete re-install after a lighttpd
stop (e.g. a reboot).

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a media file contains valid UTF-8, git-remote-mediawiki tried to be
too clever about the encoding, and the call to utf8::downgrade() on the
downloaded content was failing with

  Wide character in subroutine entry at git-remote-mediawiki line 583.

Instead, use $response->decode() to apply decoding linked to the
Content-Encoding: header, and return the content without attempting any
charset decoding.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust shell scripts to use $(cmd) instead of `cmd`.

* ep/shell-command-substitution: (41 commits)
  t5000-tar-tree.sh: use the $( ... ) construct for command substitution
  t4204-patch-id.sh: use the $( ... ) construct for command substitution
  t4119-apply-config.sh: use the $( ... ) construct for command substitution
  t4116-apply-reverse.sh: use the $( ... ) construct for command substitution
  t4057-diff-combined-paths.sh: use the $( ... ) construct for command substitution
  t4038-diff-combined.sh: use the $( ... ) construct for command substitution
  t4036-format-patch-signer-mime.sh: use the $( ... ) construct for command substitution
  t4014-format-patch.sh: use the $( ... ) construct for command substitution
  t4013-diff-various.sh: use the $( ... ) construct for command substitution
  t4012-diff-binary.sh: use the $( ... ) construct for command substitution
  t4010-diff-pathspec.sh: use the $( ... ) construct for command substitution
  t4006-diff-mode.sh: use the $( ... ) construct for command substitution
  t3910-mac-os-precompose.sh: use the $( ... ) construct for command substitution
  t3905-stash-include-untracked.sh: use the $( ... ) construct for command substitution
  t1050-large.sh: use the $( ... ) construct for command substitution
  t1020-subdirectory.sh: use the $( ... ) construct for command substitution
  t1004-read-tree-m-u-wf.sh: use the $( ... ) construct for command substitution
  t1003-read-tree-prefix.sh: use the $( ... ) construct for command substitution
  t1002-read-tree-m-u-2way.sh: use the $( ... ) construct for command substitution
  t1001-read-tree-m-2way.sh: use the $( ... ) construct for command substitution
  ...
    <BAD>                     <CORRECTED>
    accidently                accidentally
    commited                  committed
    dependancy                dependency
    emtpy                     empty
    existance                 existence
    explicitely               explicitly
    git-upload-achive         git-upload-archive
    hierachy                  hierarchy
    indegee                   indegree
    intial                    initial
    mulitple                  multiple
    non-existant              non-existent
    precendence.              precedence.
    priviledged               privileged
    programatically           programmatically
    psuedo-binary             pseudo-binary
    soemwhere                 somewhere
    successfull               successful
    transfering               transferring
    uncommited                uncommitted
    unkown                    unknown
    usefull                   useful
    writting                  writing

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mediawiki pages can have names longer than NAME_MAX (generally 255)
characters, which will fail on checkout. we simply strip out extra
characters, which may mean one page's content will overwrite another
(the last editing winning).

ideally, we would do a more clever system to find unique names, but
that would be more difficult and error prone for a situation that
should rarely happen in the first place.

Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces a new remote.origin.namespaces argument that is a
space-separated list of namespaces. The list of pages extract is then
taken from all the specified namespaces.

Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
we still want to use spaces as separators in the config, but we should
allow the user to specify namespaces with spaces, so we use underscore
for this.

Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we fail to find a requested namespace, we should tell the user
which ones we know about, since those were already fetched. This
allows users to fetch all namespaces by specifying a dummy namespace,
failing, then copying the list of namespaces in the config.

Eventually, we should have a flag that allows fetching all namespaces
automatically.

Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Virtual namespaces do not correspond to pages in the database and are
automatically generated by MediaWiki. It makes little sense,
therefore, to fetch pages from those namespaces and the MW API doesn't
support listing those pages.

According to the documentation, those virtual namespaces are currently
"Special" (-1) and "Media" (-2) but we treat all negative namespaces
as "virtual" as a future-proofing mechanism.

Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we specify a list of namespaces to fetch from, by default the MW
API will not fetch from the default namespace, refered to as "(Main)"
in the documentation:

https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces

I haven't found a way to address that "(Main)" namespace when getting
the namespace ids: indeed, when listing namespaces, there is no
"canonical" field for the main namespace, although there is a "*"
field that is set to "" (empty). So in theory, we could specify the
empty namespace to get the main namespace, but that would make
specifying namespaces harder for the user: we would need to teach
users about the "empty" default namespace. It would also make the code
more complicated: we'd need to parse quotes in the configuration.

So we simply override the query here and allow the user to specify
"(Main)" since that is the publicly documented name.

Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ideally, we'd process them in numeric order since that is more
logical, but we can't do that yet since this is where we find the
numeric identifiers in the first place. Lexicographic order is a good
compromise.

Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this, the fetch process seems hanged while we fetch page
listings across the namespaces. Obviously, it should be possible to
silence this with -q, but that's an issue already present everywhere
in the code and should be fixed separately:

Git-Mediawiki#30

Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The remote-helper for talking to MediaWiki has been updated to
work with mediawiki namespaces.

* ab/mediawiki-namespace:
  remote-mediawiki: show progress while fetching namespaces
  remote-mediawiki: process namespaces in order
  remote-mediawiki: support fetching from (Main) namespace
  remote-mediawiki: skip virtual namespaces
  remote-mediawiki: show known namespace choices on failure
  remote-mediawiki: allow fetching namespaces with spaces
  remote-mediawiki: add namespace support
This merges the MediaWiki extension from the contrib/mw-to-git/
directory from core git back into its own repository. We keep the
history, but only for that directory.

The following procedure was used:

    git clone https://github.com/git/git/ git-import
    cd git-import
    git filter-branch --subdirectory-filter contrib/mw-to-git/ -- master
    git remote rm origin
    git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
    git for-each-ref --format='delete %(refname)' refs | grep tags | git update-ref --stdin
    git reflog expire --expire=now --all
    git gc --prune=now
    git clone https://github.com/Git-Mediawiki/Git-Mediawiki/
    cd ../Git-Mediawiki
    git remote add import ../git-import/
    git fetch import
    git merge --allow-unrelated-histories import/master

See Git-Mediawiki#34
@anarcat anarcat merged commit 9e9182a into Git-Mediawiki:master Nov 20, 2017
@anarcat anarcat deleted the import branch November 20, 2017 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.