Adjusted the configs and several functionality, add git importer and exporter #12

Closed
wants to merge 55 commits into from

danny0838
Contributor

I made a rather large change to this plugin. Here is a short list of what it includes:

  • Config changes: added autoCommit, importMetaMsg, backupSuffix, gitPath, repoBase, dataBase, and gitBranch; removed repoWorkDir.
  • Changed the default paths of the git repo and the git working directory. All git operations now happen under an independent directory, so they never touch the real DokuWiki data.
  • Fixed an issue where non-ASCII characters in the commit message were stripped.
  • Added a git importer and exporter (command-line interface), with support for history hiding and external commits.
  • Replaced lib/Git.php with a more customized helper, helper/git.php.

I ran most of my tests on my private server and everything worked fine, though further investigation and experiments never hurt. :)


Below is some brief documentation for the importer and the exporter:

The importer imports all page and media revisions into the git repo.

  • Importing must be atomic. Lock the wiki and the git repo against writes during the import to avoid data inconsistency.
  • Only changes later than the top commit (of the specified branch) in the git repo are imported, unless you specify --full-history. You can reset the top of your repo to an earlier point and re-import the DokuWiki changes made after it.
  • Hidden pages, media, and meta (history entries) are supported. Move the item to the corresponding folder with a suffix (e.g. attic/blah.123456789.txt.gz => attic.bak/blah.123456789.txt.gz) and it will be recognized during import.
  • Meta files other than *.changes, *.indexed, and *.meta can be imported for backup (if --no-meta is not set), but their timestamps are reset in the repo and will change on export.
  • The import is chronological: history entries in *.changes that are out of time order are re-sorted.
  • Data that plugins record in extra non-meta files is not imported. You have to back it up separately.
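
The "only newer changes" and "chronological" rules above can be illustrated with a minimal Python sketch. It is not the importer's actual code: the function names are hypothetical, and it assumes the tab-separated *.changes layout in which the first field is a Unix timestamp and the fourth is the page or media id.

```python
from typing import NamedTuple


class Change(NamedTuple):
    timestamp: int  # Unix time of the revision
    page_id: str    # page or media id
    line: str       # the raw changelog line


def parse_changes(text):
    """Parse the contents of one *.changes file (assumed layout:
    timestamp, ip, type, id, user, summary, extra; tab-separated)."""
    changes = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0].strip().isdigit():
            continue  # skip blank or malformed lines
        changes.append(Change(int(fields[0]), fields[3], line))
    return changes


def pending_changes(changelogs, last_import_time=0):
    """Merge several changelogs, keep only entries newer than the last
    imported commit, and return them in chronological order."""
    merged = [c for text in changelogs for c in parse_changes(text)]
    newer = [c for c in merged if c.timestamp > last_import_time]
    # sorted() is stable, so entries with equal timestamps keep their order
    return sorted(newer, key=lambda c: c.timestamp)
```

Passing last_import_time=0 corresponds loosely to a --full-history run; passing the timestamp of the top commit skips everything already imported.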

The exporter exports history records in the git repo into the DokuWiki data.

  • Exporting must be atomic. Lock the wiki and the git repo against writes during the export to avoid data inconsistency.
  • All wiki pages, media, and meta files (for meta files, only *.changes, *.indexed, and *.meta if --no-meta is specified) are removed before exporting, so making a backup is recommended in case something goes wrong.
  • After exporting, the index (and metadata) must be rebuilt for everything to work properly.
  • The export follows git revision order. History that is not in time order can be exported, but it will get messed up if you re-import it into git.
  • External commits (commits other than wiki imports) and history rewriting are supported. You can import data into git, rewrite the git history, and then export it back to DokuWiki. For example, to rename a page, a media file, or a directory (namespace), simply git mv it and then commit.
  • To hide the content of a history entry, add "cmd: hide data" to the last line below dokuwiki in the commit message. Commits with this command will have their attic files saved to the corresponding backup folder (attic.bak or media_attic.bak).
  • To hide an entire history entry, add "cmd: hide change" to the last line below dokuwiki in the commit message. Commits with this command will have their entry in *.changes saved to the corresponding backup folder (meta.bak or media_meta.bak).
  • Do not hide an entry that is the current revision, or you will run into problems after the next import. Instead, delete the page in the wiki and then hide the old history entries.
  • Only files in pages/, meta/, attic/, media/, media_meta/, and media_attic/ (may change with workDir) are recognized and exported to DokuWiki. Other files are git-only.
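
As a rough illustration of the two hide commands, here is a Python sketch of how an exporter might detect them and pick the backup folder. The function names are hypothetical, and the sketch assumes the command sits on the last non-empty line of the commit message and that backupSuffix is ".bak" (as in the attic.bak example); the real exporter may parse the message differently.

```python
# Maps the command text to the kind of item that gets hidden.
HIDE_COMMANDS = {
    "cmd: hide data": "data",      # hide the content of a history entry
    "cmd: hide change": "change",  # hide the entire history entry
}


def hide_command(commit_message):
    """Return "data", "change", or None, based on the last non-empty
    line of the commit message (an assumption of this sketch)."""
    lines = [line.strip() for line in commit_message.splitlines() if line.strip()]
    if not lines:
        return None
    return HIDE_COMMANDS.get(lines[-1])


def backup_folder(kind, is_media=False, suffix=".bak"):
    """Folder that receives the hidden item: attic or meta, optionally
    prefixed with media_, plus the backup suffix."""
    base = {"data": "attic", "change": "meta"}[kind]
    if is_media:
        base = "media_" + base
    return base + suffix
```

For example, a media commit whose message ends in "cmd: hide change" would map to media_meta.bak under these assumptions.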

1. Periodic pull and edit commits now use independent timestamped directories as the working dir and lock the git repo while processing.
2. Use repoBase and dataBase instead of repoWorkDir, which improves the flexibility of the directory structure.
3. Meta import now uses the current time as the author date.
4. The exporter now prints the current operation.
This was referenced Oct 28, 2013
@woolfg
Owner

woolfg commented Oct 28, 2013

Wow, thanks, great work. Are your changes compatible with the current version?

@danny0838
Contributor Author

There may be problems due to the changed default configs.

To carry over the configs from the previous version (these are the steps that came to mind; the list may be incomplete):

  1. Turn autoCommit on.
  2. Set all DokuWiki data paths (those under data/) to the default path.
  3. If you used the default repoPath, move the git repo to the new default folder (./data.git).
  4. To import pages and media to the same path as the old default, set repoBase to blank and dataBase to 'data'.

However, instead of steps 3 and 4, I recommend using the new default configs, discarding the old repo, and importing the data into the new default repo data.git (importer.php --run is enough), provided the wiki still has complete data.

@woolfg
Owner

woolfg commented Nov 26, 2013

At the moment we have two major pull requests, #12 and #14, which both adapt the git library. Did you see #14, and what is your opinion about it?

@danny0838
Contributor Author

lib/Git.php was discarded entirely by this work, though the other improvements could be merged in.

@danny0838 danny0838 mentioned this pull request Nov 3, 2014
@danny0838 danny0838 closed this Nov 9, 2014
@danny0838
Contributor Author

This has become messy and needs too much rework, so I closed it. The importing (to git) feature is now better done via dokuwiki2git (original project and my fork with lots of improvements). The exporting feature is pending since it is not urgent, though I currently think it would be better done by extending dokuwiki2git than by extending gitbacked.

There are still other features that could go into gitbacked. Here are some I'm thinking about:

  1. Get rid of lib/Git.php and develop our own framework (by my estimate it should not be too complicated). That would let us optimize the code considerably.
  2. Switch to a new technique that writes files directly to the git index, so that no working directory is needed, importing into bare repos is supported, and the git repo is fully independent of the DokuWiki data directory, which avoids any potential interference.
    Accordingly, the repoWorkDir config would be dropped, and pages and media would always be saved in the 'pages' and 'media' directories in the git repo, no matter where the .git is or what the DokuWiki configs savedir, mediadir, etc. are set to. This should be more flexible and portable for users.
    This technique probably requires a framework redesign, i.e. point 1.
  3. Data becomes inconsistent whenever a data change is not monitored by gitbacked, such as a revert made via the revert plugin in the admin backend, or changes made by other admin plugins.
    It seems we can hardly solve this. Should we add a notice and maintain a list of incompatible plugins?
  4. Add customization of the commit author and email, and support for more placeholders, such as %user%, %ip%, %userip% (returns the ip if the user name is empty), %email%, %item% (full name of a page or media file), %itemns% (namespace), %itembase% (basename), and %nl% (newline), some of which have been introduced in "Improve message templates & Git environment support (easier pushing)" #14.
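
To make point 4 concrete, here is a small Python sketch of how such placeholders could be expanded. It is only an illustration: expand_template and its parameters are hypothetical names, and it assumes item ids use ":" as the namespace separator, as DokuWiki ids do.

```python
import re


def expand_template(template, user="", ip="", email="", item=""):
    """Replace %name% placeholders in a commit message template.
    Unknown placeholders are left untouched."""
    # "wiki:syntax" -> namespace "wiki", basename "syntax"
    namespace, _, base = item.rpartition(":")
    values = {
        "user": user,
        "ip": ip,
        "userip": user or ip,  # fall back to the ip if the user name is empty
        "email": email,
        "item": item,
        "itemns": namespace,
        "itembase": base,
        "nl": "\n",
    }
    return re.sub(
        r"%(\w+)%",
        lambda match: values.get(match.group(1), match.group(0)),
        template,
    )
```

With this sketch, an anonymous edit would be attributed to the ip address through %userip%, while a logged-in edit would show the user name.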

What do you think about them?

@danny0838 danny0838 deleted the devel branch November 11, 2014 13:49
@woolfg
Owner

woolfg commented Dec 29, 2014

sorry for my late response: great ideas, contributions are always welcome

  1. low priority I would say, but it makes sense
  2. so you mean to call "git add" on every file/change?
  3. if you know such plugins, a list would be great. feel free to change the wiki or send a pull request
  4. would be great!

@danny0838
Contributor Author

  1. No. Say we have a DokuWiki data directory at /home/www/dokuwiki/data and the git repo at /home/www/dokuwiki/repo with a /home/www/dokuwiki/repo/.git. Traditionally we have to copy /home/www/dokuwiki/data/* to /home/www/dokuwiki/repo/data/* and then git add and git commit, or we have to make sure the parent directory of .git (.git/..) contains /home/www/dokuwiki/data.

We can now use a new technique that writes the /home/www/dokuwiki/data/* files directly into the git object database, where they reside under data/* as the git-internal path.

This technique is described at http://stackoverflow.com/questions/19616758/commit-to-git-with-a-different-path and elsewhere. The main idea is to combine git hash-object and git update-index for adding, and to use git rm --cached for removing.
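
A minimal Python sketch of that idea follows. It only builds the git command lines rather than running them, and the function names, the 100644 file mode, and the --no-filters flag are choices of this sketch, not something gitbacked does. The commands would be run with GIT_DIR pointing at the repo. git_blob_id shows the id that git hash-object computes for a blob in a SHA-1 repo.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Id of a blob in a SHA-1 repo: sha1 over the header
    b"blob <size>\\0" followed by the raw content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def add_commands(source_file, repo_path, data):
    """Commands that store source_file as a blob and register it in the
    index under repo_path, without copying it into a working directory."""
    blob = git_blob_id(data)
    return [
        # write the blob into the object database, content taken as-is
        ["git", "hash-object", "-w", "--no-filters", "--", source_file],
        # point the index entry for repo_path at that blob (mode assumed)
        ["git", "update-index", "--add", "--cacheinfo",
         f"100644,{blob},{repo_path}"],
    ]


def remove_command(repo_path):
    """Command that drops repo_path from the index only."""
    return ["git", "rm", "--cached", "--", repo_path]
```

For instance, add_commands("/home/www/dokuwiki/data/pages/start.txt", "data/pages/start.txt", content) stages the page under data/pages/ even though the file itself lives outside the repo directory; a git commit afterwards records it.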

@woolfg
Owner

woolfg commented Dec 29, 2014

is there any advantage over links?

@danny0838
Contributor Author

What do you mean by 'links'?

  • If symbolic links: git does not support adding files beyond a symbolic link.
  • If hard links: hard links do not support folders, and creating the folders and all the mapped files probably does not perform well. Besides, hard links cannot cross devices.
  • If mount: it requires sudo rights on the system.

Additionally, a bare repo cannot hold any link.

@woolfg
Owner

woolfg commented Dec 29, 2014

sounds very convincing
