Split the supplied ebook files (and the accompanying metadata files if present) into folders with consecutive names where each contains the specified number of files. This is a Python port of split-into-folders.sh from ebook-tools written in shell by na--.
⭐ Other related Python projects based on ebook-tools:
- convert-to-txt: convert documents (pdf, djvu, epub, word) to txt
- find-isbns: find ISBNs from ebooks (pdf, djvu, epub) or any string given as input to the script
- ocr: run OCR on documents (pdf, djvu, and images)
- organize-ebooks: automatically organize folders with potentially huge amounts of unorganized ebooks. It leverages the three previous Python scripts.
This is the environment in which the script split_into_folders.py was tested:
- Platform: macOS
- Python: version 3.7
To install the split_into_folders package:
$ pip install git+https://github.com/raul23/split-ebooks-into-folders#egg=split-ebooks-into-folders
Test installation
Test your installation by importing split_into_folders and printing its version:
$ python -c "import split_into_folders; print(split_into_folders.__version__)"
You can also test that you have access to the split_into_folders.py script by showing the program's version:
$ split_into_folders --version
To uninstall the split_into_folders package:
$ pip uninstall split-ebooks-into-folders
To display the list of options of the split_into_folders.py script along with their descriptions:
$ split_into_folders -h
usage: split_into_folders [OPTIONS] {folder_with_books} [{output_folder}]

Split the supplied ebook files (and the accompanying metadata files if present)
into folders with consecutive names that each contain the specified number of
files.

This script is based on the great ebook-tools written in shell by na-- (See
https://github.com/na--/ebook-tools).

General options:
  -h, --help            Show this help message and exit.
  -v, --version         Show program's version number and exit.
  -q, --quiet           Enable quiet mode, i.e. nothing will be printed.
  --verbose             Print various debugging information, e.g. print
                        traceback when there is an exception.
  -d, --dry-run         If this is enabled, no file rename/move/symlink/etc.
                        operations will actually be executed.
  -r, --reverse         If this is enabled, the files will be sorted in
                        reverse (i.e. descending) order. By default, they are
                        sorted in ascending order.
  --log-level {debug,info,warning,error}
                        Set logging level. (default: info)
  --log-format {console,only_msg,simple}
                        Set logging formatter. (default: only_msg)

Split options:
  -s, --start-number START_NUMBER
                        The number of the first folder. (default: 0)
  -f, --folder-pattern PATTERN
                        The print format string that specifies the pattern
                        with which new folders will be created. By default it
                        creates folders like 00000000, 00001000, 00002000,
                        ..... (default: %05d000)
  --fpf, --files-per-folder FILES_PER_FOLDER
                        How many files should be moved to each folder.
                        (default: 100)

Input and output options:
  --ome, --output-metadata-extension EXTENSION
                        This is the extension of the metadata file associated
                        with an ebook. (default: meta)
  folder_with_books     Folder with books which will be recursively scanned
                        for files. The found files (and the accompanying
                        metadata files if present) will be split into folders
                        with consecutive names that each contain the specified
                        number of files.
  -o, --output-folder PATH
                        The output folder in which all the new consecutively
                        named folders will be created. The default value is
                        the current working directory. (default:
                        /Users/test/split_into_folders/test_installation)
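To make the effect of -s/--start-number and -f/--folder-pattern more concrete, here is a minimal sketch of how a printf-style pattern such as %05d000 expands into folder names. It assumes the pattern is applied to consecutive folder numbers with Python's % operator, which matches the folder names listed in the help text; the helper folder_names is hypothetical and is not part of the split_into_folders package.

```python
def folder_names(pattern="%05d000", start_number=0, count=3):
    """Return `count` folder names produced by a printf-style pattern.

    Hypothetical helper for illustration only; not part of the package.
    """
    return [pattern % n for n in range(start_number, start_number + count)]


print(folder_names())                # ['00000000', '00001000', '00002000']
print(folder_names(start_number=1))  # ['00001000', '00002000', '00003000']
```

The second call shows what starting at 1 would change under this assumption: the numbering shifts by one, while the zero padding and the trailing 000 come entirely from the pattern.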
ℹ️ Explaining some of the options/arguments
- -d, --dry-run is a very useful option to simulate how the files will be moved, i.e. the number of folders needed to split them and their names. No move operations are actually executed.
- -o, --output-folder defaults to the working directory from which the script is run; all the files are moved there.
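The idea behind a dry run can be sketched as a planning step that decides which files go to which folder without touching the disk. The function plan_split below is a hypothetical illustration, not the package's implementation: it assumes files are sorted by path and that folder names come from applying the folder pattern to consecutive numbers.

```python
from pathlib import Path


def plan_split(files, files_per_folder=100, start_number=0,
               pattern="%05d000", reverse=False):
    """Map each target folder name to the files it would receive.

    Hypothetical planner for illustration: nothing is moved on disk.
    """
    # Assumption: files are ordered by path (descending if reverse=True)
    ordered = sorted(files, reverse=reverse)
    plan = {}
    for i in range(0, len(ordered), files_per_folder):
        folder = pattern % (start_number + i // files_per_folder)
        plan[folder] = ordered[i:i + files_per_folder]
    return plan


# Made-up file names standing in for a scanned input folder
books = [Path(f"book_{n:02d}.pdf") for n in range(5)]
for folder, members in plan_split(books, files_per_folder=2).items():
    print(folder, [p.name for p in members])
```

With five files and two files per folder, this plan contains three folders, the last one holding the single leftover file.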
To split 1000 ebooks into folders with 12 files each:
$ split_into_folders ~/Data/split/small -o ~/Data/split/output_folder --fpf 12 -s 1
ℹ️ -s 1 makes the folder numbering start at 1 (by default it starts at 0).
Sample output:
Total number of files to be split into folders: 1000
Number of files per folder: 12
Number of splits: 84
Starting splits...
End of splits!
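The reported number of splits is consistent with rounding up the ratio of files to files per folder. The short check below is only arithmetic on the numbers from the sample output; that the last folder receives the leftover files is an assumption inferred from the 84 splits, not something the output states.

```python
import math

total_files = 1000
files_per_folder = 12

# Round up: a partially filled folder still counts as a split
number_of_splits = math.ceil(total_files / files_per_folder)

# Full folders and the files left over for the final folder
full_folders, leftover = divmod(total_files, files_per_folder)

print(number_of_splits)        # 84
print(full_folders, leftover)  # 83 4
```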
To split 1000 ebooks into folders with 12 files each using the API:
from split_into_folders.lib import split

retcode = split('/Users/test/Data/split/small',
                '/Users/test/Data/split/output_folder',
                files_per_folder=12, start_number=1)
By default, the loggers are disabled when using the API. To enable them, call the function setup_log() (with the desired log level in all caps) at the beginning of your code, before calling the function split():
from split_into_folders.lib import split, setup_log

setup_log(logging_level='INFO')
retcode = split('/Users/test/Data/split/small',
                '/Users/test/Data/split/output_folder',
                files_per_folder=12, start_number=1)