Discussion of data recovery methods #22
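First-round recovery scripts
The first round handles board posts only. Digest-area folders, personal mail, and some metadata (for example registration-success notices and reminder files) are left unprocessed for now.
Logging script (called as pglog by the task script below):
#!/usr/bin/env nodejs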
var pg = require('pg');
var conString = "postgres://bmy:bmybbs@202.117.3.62/bmy-fix-log";
var client = new pg.Client(conString);
client.connect(function(err) {
  if(err) {
    return console.error('cannot connect', err);
  }
  // Parameterized query: a path containing a quote can no longer break the SQL.
  client.query('INSERT INTO log(path, status, result) VALUES ($1, $2, $3)', [process.argv[2], process.argv[3], process.argv[4]], function(err, result) {
    // Close the connection whether or not the insert succeeded.
    client.end();
    if(err) {
      return console.error('insert error', err);
    }
    // console.log('insert successfully');
  });
});
Time-conversion script (called as pts by the task script below):
#!/usr/bin/php
<?php
echo(strtotime($argv[1]));
?>
Task script (it calls itself recursively as bmyfix):
#!/bin/bash
BASE=/home/ironblood/boards
# A directory containing count.person is logged as "announce" (digest area) and skipped in this round.
if [ -f "$1/count.person" ] ; then
  pglog "$1" panding announce
  exit 1
fi
for bmyfile in "$1"/*
do
  # An empty directory leaves the glob unexpanded; skip that case.
  [ -e "$bmyfile" ] || continue
  if [ -d "$bmyfile" ] ; then
    # Recurse into subdirectories.
    bmyfix "$bmyfile"
  else
    # Files are GBK-encoded; convert the first line to UTF-8 before matching.
    FIRSTLINE=`sed '1q;d' "$bmyfile" | iconv -f gbk -t utf8`
    if [[ $FIRSTLINE =~ "寄信人" ]] ; then
      # handle as mail
      pglog "$bmyfile" panding PersonalMail
    elif [[ $FIRSTLINE =~ "信区" ]] ; then
      # handle as board post
      BOARDNAME=`sed '1q;d' "$bmyfile" | iconv -f gbk -t utf8 | awk 'BEGIN{FS="信区: "} {print $2}'`
      # The post time is the text between the parentheses on line 3.
      TIME=`sed '3q;d' "$bmyfile" | iconv -f gbk -t utf8 | cut -d"(" -f2 | cut -d")" -f1`
      TIMESTAMP=`pts "$TIME"`
      mkdir -p "$BASE/$BOARDNAME"
      NEWFILENAME=$BASE/$BOARDNAME/M.$TIMESTAMP.A
      cp "$bmyfile" "$NEWFILENAME"
      pglog "$bmyfile" successful "$NEWFILENAME"
    else
      # handle as normal file
      pglog "$bmyfile" panding unknow
    fi
  fi
done
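The task script above names every recovered post M.$TIMESTAMP.A, so two posts in the same board with the same timestamp would overwrite each other, and an empty timestamp (when pts cannot parse the date) would produce M..A. A minimal sketch of a guard, assuming it is acceptable to bump the timestamp by one second until a free name is found; the helper name unique_dest is hypothetical and not part of the original scripts:

```bash
#!/bin/bash
# Hypothetical helper (not in the original scripts).
# unique_dest <dir> <timestamp>
# Prints a path of the form <dir>/M.<timestamp>.A that does not exist yet.
# Returns 1 if the timestamp is not a plain non-negative integer.
unique_dest() {
  local dir=$1 ts=$2
  # Reject empty or non-numeric timestamps (e.g. when date parsing failed).
  [[ $ts =~ ^[0-9]+$ ]] || return 1
  # Assumption: shifting the timestamp by a few seconds is acceptable.
  while [ -e "$dir/M.$ts.A" ]; do
    ts=$((ts + 1))
  done
  printf '%s\n' "$dir/M.$ts.A"
}
```

In the task script this could replace the NEWFILENAME assignment, for example `NEWFILENAME=$(unique_dest "$BASE/$BOARDNAME" "$TIMESTAMP") || pglog "$bmyfile" panding badtime` (the badtime value is made up for illustration). The check and the copy are not atomic, so this only holds while a single process writes into a given board directory.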
The first round of analysis is finished. Current status:
Analysis:
Scripts planned to be added:
Additionally:
Method for fixing user folder names
The name is taken from the register or webregister file found in each directory:
#!/bin/bash
# Log one row to userhomelog: log_row <path> <status> <result>
log_row() {
  psql -U bmy -h 127.0.0.1 -p 5444 bmy-fix-log -c "INSERT INTO userhomelog(path, status, result) VALUES ('$1', '$2', '$3');" > /dev/null 2>&1
}
for i in {A..Z}; do
  FOLDERLIST=`ls "$i" | grep obj`
  for userhome in $FOLDERLIST
  do
    # Prefer register; fall back to webregister.
    if [ -f "$i/$userhome/register" ] ; then
      REGFILE=register
    elif [ -f "$i/$userhome/webregister" ] ; then
      REGFILE=webregister
    else
      log_row "$i/$userhome" failed 'No register files'
      continue
    fi
    # The user name is the second space-separated field on line 2.
    USERNAME=`sed '2q;d' "$i/$userhome/$REGFILE" | cut -d" " -f2`
    if [ ${#USERNAME} -eq 0 ] ; then
      log_row "$i/$userhome" failed "invalid $REGFILE file"
    else
      mv "$i/$userhome" "$i/$USERNAME"
      log_row "$i/$userhome" successful "$i/$USERNAME"
    fi
  done
done
For …, the data lost by this behavior includes:
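One way data can go missing in the rename step: if the target name already exists as a directory, mv moves the obj folder inside it instead of renaming it, and a user name read from a damaged register file may contain characters such as "/". A minimal sketch of a guard, assuming valid user names consist only of ASCII letters and digits (an assumption, not confirmed by the thread); safe_rename is a hypothetical helper name:

```bash
#!/bin/bash
# Hypothetical guard (not in the original script).
# safe_rename <parent> <old-name> <new-name>
# Returns 0 after renaming, 1 if the new name looks invalid,
# 2 if something with the new name already exists.
safe_rename() {
  local parent=$1 old=$2 new=$3
  # Assumption: user names are ASCII letters and digits only.
  if ! [[ $new =~ ^[A-Za-z0-9]+$ ]]; then
    return 1
  fi
  # Refuse to touch an existing target, so nothing is nested or overwritten.
  if [ -e "$parent/$new" ]; then
    return 2
  fi
  mv -- "$parent/$old" "$parent/$new"
}
```

The two non-zero return codes could be logged to userhomelog as distinct failure reasons, which would make conflicting names easy to list afterwards.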
Results of processing the user home directories:
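The data recovered from the old machine is now in hand.
My guess is that most of the file index information was corrupted, so the vast majority of directory and file names consist of group, node, or obj followed by a number. For the complete file tree, see data.to.be.fixed.tar.gz. A rough look at part of the data has not yet revealed any pattern in how the files are distributed. Taking the unknown directory as an example:
By a rough count, about 10 million text files need to be renamed (and moved to the correct path). I am considering the following workflow to classify and process them; please review.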
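A minimal sketch of the kind of first-line test such a workflow could rely on, assuming the first line has already been converted from GBK to UTF-8 (for example with iconv) and that mail headers contain "寄信人" while board posts contain "信区: " followed by the board name. The function names and the sample header lines are made up for illustration:

```bash
#!/bin/bash
# Hypothetical helpers for classifying a recovered file by its first line.

# classify_first_line <utf8-first-line>
# Prints mail, post, or unknown. Mail is checked first.
classify_first_line() {
  case $1 in
    *寄信人*) echo mail ;;
    *信区*)   echo post ;;
    *)        echo unknown ;;
  esac
}

# board_of <utf8-first-line>
# Prints the text after "信区: ", without a trailing carriage return.
# Returns 1 if the marker is absent.
board_of() {
  local line=$1 rest
  case $line in
    *'信区: '*) ;;
    *) return 1 ;;
  esac
  rest=${line#*'信区: '}
  rest=${rest%$'\r'}
  printf '%s\n' "$rest"
}
```

For example, `classify_first_line "$(sed '1q;d' "$f" | iconv -f gbk -t utf8)"` yields one word per file, which makes it cheap to do a counting pass over the tree before any file is moved.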