Assorted scripts used for the LSH batch upload. The work originally done on the redux branch (now merged) was a first attempt to clean up the scripts and to include the original steps needed to convert the original csv files into those required as input.
Apart from the exceptions mentioned below, this was produced by Lokal_Profil. Note, however, that the majority of the codebase is old and the coding might make you want to cry (I know that is how I feel when I see it).
The SQL scripts were created by Fredrik Andersson at LSH.
BatchUploadTools can be installed via:

```
pip install git+https://github.com/lokal-profil/BatchUploadTools.git
```

Note: you might have to add the `--process-dependency-links` flag to the above command if you are running a different version of pywikibot from the required one.
This bot does not (yet) support OAuth, so you must use [[Special:BotPasswords]] with a bot account given the following grants:
- Basic rights
- High-volume editing
- Upload new files
For clean-up it is also recommended that it be given:
- Create, edit, and move pages
- Upload, replace, and move files
All of these should be run from the main code directory. Note that everything up to step 10 can be done without the actual image files.
1. Copy `config.example.json` to `config.json` and fill in your username and password. You also need to set up `user_config.py` (from pywikibot) with the same credentials.
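The pywikibot side of that step is a standard user config file; a minimal sketch for Commons, where `YourBot` is a placeholder (not the repo's actual value) for the bot account created via [[Special:BotPasswords]]:

```python
# user-config.py -- minimal pywikibot setup for Commons (sketch).
# 'YourBot' is a placeholder; use your own bot account name.
# Note: 'usernames' is pre-defined by pywikibot when it loads this file.
family = 'commons'
mylang = 'commons'
usernames['commons']['commons'] = 'YourBot'
```

The `config.json` keys are specific to this repo; take them from `config.example.json` rather than this sketch.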
2. `python py_listscraper.py old_connections data`, to make a copy of the latest Commons mappings.
3. Unpack the new csv files and update `csv_config.json` to the new file names.
4. `python py_prepCSVData.py`, to populate a new directory.
5. `python py_analyseCSVData.py`, and fix any errors; repeat until no actionable errors remain.
6. `python py_crunchCSVData.py`, to populate a new directory.
    - Note that this takes some time and that there are three prompts at the start.
    - Individual log files are also written.
7. `python py_filenames.py`, to generate the filenames.
8. `python py_makeMappings.py`, to create mapping tables for Commons.
9. Upload the mapping tables to the right place.
10. Do the actual mappings...
11. `python py_listscraper.py`, to populate a new `connections` directory from the updated on-wiki mapping tables.
    - If filenames are updated then don't run this again until the Commons table has been updated.
12. `python py_prepUpload.py moveHits ../bilder`, where `../bilder` is the relative path to the main image directory. Moves the relevant files to base directories and adds the file extensions.
13. `python py_prepUpload.py makeAndRename ../bilder/m_a batchCat` etc. for each of the new image subdirectories, where `batchCat`, if provided, is a datestamp (e.g. `2015-11`) used for collecting the images in a category. Creates info files and renames the files.
    - See `¤generator.log` for possible problems.
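For illustration, the `batchCat` datestamp is simply a year-month string; a hypothetical helper (not part of the repo's scripts) producing that format:

```python
from datetime import date


def batch_category(day=None):
    """Return a datestamp like "2015-11" for grouping a batch in a category.

    Illustrative only; the real scripts take the value as a CLI argument.
    """
    day = day or date.today()
    return day.strftime('%Y-%m')
```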
14. `python py_prepUpload.py negatives ../bilder/m_a` etc. for each of the image subdirectories containing negatives. Creates a positive version and renames it correctly.
    - Series with negatives are A, B, D, E, G, O.
    - See `¤imageMagick-errors.log` for error reports.
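The positive conversion is done externally with ImageMagick (hence `¤imageMagick-errors.log`), e.g. via its `-negate` option; conceptually it just inverts each pixel value. A toy sketch of that inversion, not the repo's actual implementation:

```python
def negate_pixels(pixels, maxval=255):
    """Invert each channel value, turning a negative into a positive.

    Toy illustration of what ImageMagick's -negate does per pixel.
    """
    return [maxval - p for p in pixels]
```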
15. `python py_prepUpload.py negativeCleanup ../bilder/m_a` etc. for each of the image subdirectories where the previous step was run.
    - See `¤conversion-errors.log` for problematic conversions (fix these manually).
16. `python py_Uploader.py -path:../bilder/m_a` etc. to upload the files.
    - Successful uploads, failed uploads, and uploads with warnings are each logged separately.
    - Details on problematic uploads can be found in `¤uploader.log` (fix manually, often by just trying again...).
17. `python py_postUpload.py purge`, to purge LSH files in Category:Files with broken file links.
    - Look at `postAnalysis/BrokenFileLinks.csv` to identify any remaining files with broken file links. Add any known renames after the pipe (excluding the prefix but including the file extension).
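The pipe-separated rename rows described above could be read back with something like the following sketch (the exact column layout of `BrokenFileLinks.csv` is an assumption based on the description, not taken from the code):

```python
def parse_broken_links(lines):
    """Parse rows like 'Old name.tif|New name.tif' into a rename mapping.

    Hypothetical reader for the pipe-separated format described above;
    rows without a rename after the pipe are skipped.
    """
    renames = {}
    for line in lines:
        line = line.strip()
        if not line or '|' not in line:
            continue
        old, new = line.split('|', 1)
        if new:
            renames[old] = new
    return renames
```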
18. `python py_postUpload.py rename`, to repair file pages linking to renamed files and to update the list of broken links.
    - Remaining entries indicate missing files; these can sometimes be uploaded manually but should otherwise be unlinked.
19. `python py_postUpload.py findMissing`, to check `filenames.csv` for any files not present on Commons.
    - This also generates an export file with photo-id to url links for LSH.
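Conceptually, the `findMissing` check is a set difference between the expected filenames and the titles actually on Commons; a hypothetical sketch of that comparison (names and signature are illustrative, not the script's API):

```python
def find_missing(expected_filenames, commons_titles):
    """Return the expected files that are not present on Commons.

    Illustrative set-difference check; the real script reads the
    expected names from filenames.csv.
    """
    commons = set(commons_titles)
    return sorted(name for name in expected_filenames if name not in commons)
```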
Some basic tests have been added to simplify maintenance/improvement of the code base. Run these with your test runner of choice.
The `maintenance` directory contains scripts which may be useful for one-off actions in relation to the batch upload.
- `upload_dupes` replaces previously uploaded files without changing their description pages.
- `replace_descriptions` attempts to replace previously uploaded descriptions while preserving any changes made by users since the upload.