Restore modifications and deletions of sentences #1484
When we used the CSV files to restore data that was lost between February and June, only additions of content were restored (i.e. new sentences, new comments, new tags, etc.).
Modifications and deletions were not restored. We cannot restore them for all content (such as comments), but we can do it for sentences, since we have logs for sentences.
For example: https://tatoeba.org/eng/sentences/show/76450
Related Wall thread: https://tatoeba.org/eng/wall/show_message/28429#message_28429
I had a look at the file, and the content seems fine. I didn't look at the script in detail though.
I guess we could go ahead and replace the text/owner of the sentences in the database with the text/owner of the sentences in the file. I should be able to take care of this next weekend. It would be nice if by then other people could verify the script and/or the content of the generated file.
Modifications of sentences should be restored now.
I noticed that some sentences had a modification that was not logged. I'm not sure if that modification was simply never logged, or if we lost the logs for it...
For instance https://tatoeba.org/eng/sentences/show/330
The text is now:
But in the logs we only see that it was created as:
Anyway, @ckjpn, if possible, please check whether the replaced sentences are now okay. I'll close this issue and open new ones for potentially missing logs and for restoring deletions of sentences.
Re-opening. I've reverted the modifications made by the script.
There is a problem with sentences that have been modified recently.
For instance sentence 3333 has been modified lately by Aiji.
If I replace the current text with the text from the generated file, I will erase the good version of the sentence.
I'll have to check the script. Maybe checking a real date against \N is the problem.
I could possibly do a quick fix and ignore those comparisons, so you could at least revert many of the sentences. Would you like me to do that?
Of course, that might not be the problem, but I suspect that's what happened.
Maybe I can just convert all the \N data to 0000-00-00 00:00:00 before running the script and it will work.
The problem with the previous file of corrections was caused by the \N in the date fields. Python couldn't correctly compare \N with a date.
Note that in the current exported data, the modification date is \N when no modification has taken place, which was part of the problem.
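To illustrate the \N issue, here is a minimal sketch of how the date fields could be parsed before comparing them. It is not the actual script: the function names, the `YYYY-MM-DD HH:MM:SS` format and the idea of treating `0000-00-00 00:00:00` the same as \N are assumptions made for the example.

```python
from datetime import datetime

# Assumption: the export writes a missing value as the two characters \N.
NULL_MARKER = r"\N"
# Assumption: a zero date should be treated the same as a missing one.
ZERO_DATE = "0000-00-00 00:00:00"


def parse_export_date(field):
    """Return a datetime, or None when the field holds no real date."""
    field = field.strip()
    if field in (NULL_MARKER, "", ZERO_DATE):
        return None
    return datetime.strptime(field, "%Y-%m-%d %H:%M:%S")


def modified_after(field, cutoff):
    """True only if the field holds a real date later than cutoff."""
    parsed = parse_export_date(field)
    return parsed is not None and parsed > cutoff
```

With something like this, a sentence whose modification date is \N is simply treated as "never modified", instead of being compared as a string against a real date.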
For these corrections, I used the first export after the crash (2017-07-01), so that corrections made by Aiji and adopted by him correctly give him credit. Some, I think, were only adopted, with no corrections needed. He said on the Wall that some of the sentences he had previously adopted were being adopted by other members. It seems only fair to give these sentences to the person who adopted them first.
Attached are 3 files.
There are 1,319 sentences in the latest export between 1 and 6139244 that were not in the 2017-06-10 export. 6139244 was the last sentence in the 2017-06-10 export.
It would probably be safest not to delete these until after the duplicate-merging script has successfully run and merged translations, since some of these deletions might have been done by Horus.
The attached file contains these sentence numbers, if you want to glance through them.
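For anyone who wants to double-check the list, a comparison like this could be sketched as follows. This is not the script that produced the attached file. It assumes each export is tab-separated with the sentence number in the first column, and the function names are made up for the example.

```python
import csv


def read_ids(lines):
    """Collect sentence numbers from the first column of a tab-separated export.

    `lines` can be an open file or any iterable of text lines.
    """
    # QUOTE_NONE: sentence text may contain quote characters that are not CSV quoting.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {int(row[0]) for row in reader if row}


def ids_missing_from_old(old_ids, new_ids):
    """Numbers within the old export's range that appear only in the new export."""
    last_old_id = max(old_ids)
    return sorted(i for i in new_ids if i <= last_old_id and i not in old_ids)
```

Here `old_ids` would come from the 2017-06-10 export and `new_ids` from the latest one. Sentences numbered above the last sentence of the old export are ignored, since they were added after it was made.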
I've deleted sentences that were deleted pre-crash (1396 sentences).
For the record, these are the deleted sentences: