This script recursively scans a directory and creates a CSV with details of all Movie and TV media in that directory and its sub-directories. It is aimed at Home Theatre enthusiasts and their archives. However, it can be used for any directory containing video media.
The script uses ffprobe or mediainfo to scan the library. It can automatically detect which of the two programs is installed, or you can choose the one to use with a CLI argument or globally in the .env file.
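The detection order (ffprobe first, then mediainfo as a fallback) could be sketched as below. This is a minimal illustration rather than the script's actual code: the function name `pick_scanner` and its single argument are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: choose a scanner, preferring ffprobe and falling back to mediainfo.
# $1 is an optional explicit choice (for example from a CLI flag or .env).
pick_scanner() {
    preferred="$1"
    if [ -n "$preferred" ]; then
        # An explicit choice is honoured only if that program is installed.
        if command -v "$preferred" >/dev/null 2>&1; then
            printf '%s\n' "$preferred"
            return 0
        fi
        return 1
    fi
    for candidate in ffprobe mediainfo; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

pick_scanner "" || echo "Neither ffprobe nor mediainfo was found" >&2
```

`command -v` is the POSIX way to test whether a program is on the PATH, which keeps the check portable across `sh` implementations.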
This is a POSIX-compliant shell script, so it should run on Linux and macOS systems. Windows users should be able to run it using WSL (see How to run .sh or Shell Script file in Windows 11/10), although this is untested.
The script assumes that you have separate archives for TV and movies. It automatically detects what kind of archive it is scanning and renders the appropriate archive list. You can also define what kind of list you want as a CLI argument or globally in the .env file.
You can define which columns you want in the CSV, and their order, in the settings file. An extra Sort column is automatically added after the last column to allow proper sorting (no other column is 100% reliable).
The resulting CSV contains summary data for Disk Size, Disk Space used and Disk Space free. This allows you to see the free disk space after other files in the archive are taken into account.
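As a rough illustration of where such figures can come from, the sketch below reads the size, used and free space of the filesystem that holds a directory. It is an assumption about one possible approach, not the script's own implementation; `disk_summary` is a hypothetical name, and the values are reported in 1024-byte blocks.

```shell
#!/bin/sh
# Hypothetical helper: print "size,used,free" (in KiB) for the filesystem
# that holds the given directory.
disk_summary() {
    # -P forces the portable one-line-per-filesystem layout; -k reports 1024-byte blocks.
    # Assumes the filesystem name in the first column contains no spaces.
    df -Pk "$1" | awk 'NR == 2 { printf "%s,%s,%s\n", $2, $3, $4 }'
}

disk_summary /
```

Because `df` reports on the whole filesystem, the used figure includes files outside the scanned video list, which matches the intent of the summary described above.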
Note: The scan will be fastest if you have the media info in the filename.
Note: The automatic detection is based on the format of the first video filename that the script parses. If you have extras in your archive, you may want to specify the archive type as a CLI option: -t {tv,movie}.
Note: ffprobe is much faster than mediainfo, however it cannot fetch HDR10 or HDR10+ definitions at present.
This script is intended for people who want to maintain an archive of legitimately backed up or original videos. However, it does contain an optional configuration to list the Release type of a video, which is specific to pirated media (see Pirated movie release types). We do not condone piracy in any way. It is against the law, and this feature has only been added for completeness and nerdiness.
- git
- jq
- ffprobe or mediainfo
sudo apt install git jq ffmpeg   # on Debian/Ubuntu, ffprobe ships in the ffmpeg package
sudo apt install git jq mediainfo
git clone git@github.com:laughingman77/video_list_csv.git
The .env file contains the configuration for various options in the script. The example.env file contains all of the default settings. Copy example.env to .env and, if needed, configure .env to your requirements:
cd video_list_csv && cp example.env .env
Maintain a catalogue of your video media in a spreadsheet:
- Copy the Movie Archive.xlsx spreadsheet into your home directory.
- In your new spreadsheet, duplicate the Archive Template sheet, and give it a meaningful name.
- Run the script:
sh video_list_csv.sh /path/to/archive/dir/ > ~/archive.csv
Or
./video_list_csv.sh /path/to/archive/dir/ > ~/archive.csv
- Import archive.csv into your spreadsheet program.
- Copy the cells from the imported CSV data and paste them into your archive sheet at cell A4.
- Select the cells for the media data and sort by the Sort column.
- Format the sheet to your preference.
Maintain a catalogue of your video media in Tellico, see tellico/README.md.
CLI options allow you to override the values in .env:
- -a, --trim-release-type: Trim any Release type words from the Edition column (0 or 1).
- -b, --trim-resolution: Trim any Resolution words from the Edition column (0 or 1).
- -h, -?, --help: Display the help text.
- -i, --default-stream: Display only the default streams for audio and video (0 or 1).
- -s, --scanner: Set the scanner program (ffprobe or mediainfo).
- -t, --type: Set the archive type (tv or movie).
- -f, --force: Force detection of the media metadata from the file (0 or 1).
- -d, --detect: Detect the media metadata if not in the filename (0 or 1).
- -e, --season: Display season only when episode is #1 (0 or 1).
- -r, --series: Display series only when season is #1 and episode is #1 (0 or 1).
- -x, --movie_columns: Define the Movie columns.
- -z, --tv_columns: Define the TV columns.
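The override behaviour can be pictured with a small sketch in which a value loaded from .env acts as the default and a CLI flag replaces it. The names `resolve_type` and `env_type` are hypothetical and only stand in for whatever the script uses internally; only the `-t`/`--type` flag is taken from the option list.

```shell
#!/bin/sh
# Hypothetical sketch: a value from .env is the default, a CLI flag overrides it.
resolve_type() {
    archive_type="${env_type:-}"    # default, as if it had been loaded from .env
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -t|--type)
                [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                archive_type="$2"
                shift 2
                ;;
            *)
                # Anything else (other flags, the archive path) is skipped here.
                shift
                ;;
        esac
    done
    printf '%s\n' "$archive_type"
}

env_type=movie
resolve_type /path/to/archive/dir/          # prints: movie
resolve_type -t tv /path/to/archive/dir/    # prints: tv
```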
- scanner: (ffprobe, mediainfo) Select the preferred scanning program globally. If not set, ffprobe takes preference, but the script will fall back to mediainfo if ffprobe is not detected.
- type: (tv or movie) Set the archive media type globally.
- detect_if_not_in_filename: (0 or 1) If the video/audio formats or resolution are not detected in the filename, then automatically detect them.
- trim_release_type: (0 or 1) Trim any Release type words from the Edition column.
- trim_resolution: (0 or 1) Trim any Resolution words from the Edition column.
- default_stream: Only display the default streams (reverts to displaying all streams if no stream is set as default). This affects the Audio, Video and Resolution columns.
- force_detect: (0 or 1) Force detection of the video streams on all videos (this will override detect_if_not_in_filename and ignore any values found in the filename for the Resolution/Video/Audio columns).
- display_season_for_1: (0 or 1) Only extract the season number if the episode is 01. This makes a TV list more readable.
- display_series_for_1: (0 or 1) Only extract the series name if the season and episode are 01. This makes a TV list more readable.
- tv_columns: TV archive columns to render, and their order.
- movie_columns: Movie archive columns to render, and their order.
By configuring tv_columns and movie_columns, you can dictate which columns are rendered and in what order.
The column names are separated by the | character.
The possible columns are:
- Title: (Only for Movies) the Movie title.
- Edition: (Only for Movies) the release edition, e.g. Director's Cut, Cinematic Cut, Special Edition, Unrated, Uncut, etc.
- Series: (Only for TV series) the TV series title.
- Season: (Only for TV series) the TV series season.
- Episode: (Only for TV series) the TV series episode.
- Number: (Only for TV series) the TV series episode (this is for Tellico integration).
- Year: Release date.
- Production Year: Release date (this is for Tellico integration).
- Resolution: Video resolution (480p, 720p, 1080p, 2160p, etc).
- Aspect Ratio: Video resolution (480p, 720p, 1080p, 2160p, etc) (this is for Tellico integration).
- Video: The video codec and colouration streams, e.g. DV, AVC, HEVC, HDR10+, etc.
- Video Tracks: The video codec and colouration streams, e.g. DV, AVC, HEVC, HDR10+, etc. (this is for Tellico integration).
- Audio: The audio codec, channel layout and language.
- Audio Tracks: The audio codec, channel layout and language (this is for Tellico integration).
- Subtitles: (not in the default configuration) The list of subtitle stream/s.
- Subtitle Languages: (not in the default configuration) The list of subtitle stream/s (this is for Tellico integration).
- Release Type: (not in the default configuration) Pirated release type - NOT recommended.
- Size (GB): File size in GB.
- Size (MB): File size in MB.
- Size (KB): File size in KB.
- Size (B): File size in B.
- Filename: Filename.
- Full Path: Absolute filepath and filename (this includes the mount path if the archive disk is an external disk).
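To illustrate the | separator, the sketch below splits a column setting into individual names in POSIX shell. The function name `split_columns` and the sample value are assumptions for the example; the exact syntax used inside .env may differ.

```shell
#!/bin/sh
# Hypothetical sketch: print each column name from a |-separated setting on its own line.
split_columns() {
    old_ifs="$IFS"
    IFS='|'
    # The unquoted expansion below is deliberate: the shell splits it on IFS.
    # set -f stops names containing * or ? from being expanded as globs.
    set -f
    for column in $1; do
        printf '%s\n' "$column"
    done
    set +f
    IFS="$old_ifs"
}

split_columns "Title|Edition|Year|Resolution|Size (GB)"
```

Because only | is in IFS during the split, a name that contains spaces, such as Size (GB), stays in one piece.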
The script is designed for the directory and file naming structure of Jellyfin, Plex and Kodi.
The script assumes a separator of space or period between words in the filename, and will do its best to detect items. The hyphen could not be added to the detection, because it produced too many false positives.
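A minimal sketch of this kind of word-based detection is shown below, using resolution as the example. It is not the script's actual parser: `resolution_from_filename` is a hypothetical helper, and the pattern (three or four digits followed by p) is an assumption chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: return the first resolution-like word (e.g. 1080p) in a
# filename, treating spaces and periods as the only word separators.
resolution_from_filename() {
    printf '%s\n' "$1" \
        | tr ' .' '\n\n' \
        | grep -E -i -x '[0-9]{3,4}p' \
        | head -n 1
}

resolution_from_filename "Some.Movie.2019.1080p.HEVC.mkv"   # prints: 1080p
resolution_from_filename "Some-Movie-1080p.mkv"             # prints nothing: hyphens are not separators
```

Matching whole words only (`grep -x`) is what keeps a year such as 2019 from being mistaken for a resolution.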
All TV episodes should be in the format of S[0-9]{2}E[0-9]{2} (case-insensitive), examples:
- S01E01
- s01e01
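The episode pattern can be tried out with a short sketch like the one below. The helper name `episode_tag` is hypothetical, and normalising the result to upper case is a choice made for the example rather than something the script is known to do.

```shell
#!/bin/sh
# Hypothetical helper: extract the SxxEyy tag from a filename (case-insensitive)
# and print it in upper case. Prints nothing if no tag is present.
episode_tag() {
    printf '%s\n' "$1" \
        | sed -n 's/.*\([sS][0-9][0-9][eE][0-9][0-9]\).*/\1/p' \
        | tr '[:lower:]' '[:upper:]'
}

episode_tag "Some.Show.s01e02.720p.mkv"   # prints: S01E02
```

The pattern is written as a basic regular expression with explicit character classes so that it works with any POSIX `sed`.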
If the script falls back to probing the video file:
- If there is only one stream, it will list only the codec, as if it were in the filename, eg:
"AVC DV HDR10+ (en)"
- If there are multiple streams, it will list each stream number and its codec in a comma separated list, eg:
"stream_1: DTS 5.1 (en), stream_2: AC3 2.0 (za)"
A script has been created to manually lint all files. Run:
cd video_list_csv
./test.sh
Expected output:
$ ./test.sh
.env does not exist, generating the default .env...
Checking ./archive_list.sh
OK
Checking ./includes/ffprobe.sh
OK
Checking ./includes/functions.sh
OK
Checking ./includes/archive_list.sh
OK
Checking ./includes/mediainfo.sh
OK
Checking ./includes/progressbar.sh
OK
Checking ./video_list_csv.sh
OK
Checking ./test.sh
OK
CI/CD linting is implemented using GitHub Actions. You can run the pipelines locally, using nektos/act:
apt install act
cd video_list_csv
sudo act
This should give output similar to:
...
| beginning shell linting...
| not excluding any dirs
| finding and linting all shell scripts/files via shellcheck...
| [PASS]: shellcheck - successfully linted: ./ffprobe.sh
| [PASS]: shellcheck - successfully linted: ./archive_list.sh
| [PASS]: shellcheck - successfully linted: ./test.sh
| [PASS]: shellcheck - successfully linted: ./mediainfo.sh
| [PASS]: shellcheck - successfully linted: ./progressbar.sh
| finding and linting all files with shell shebangs via shellcheck...
| looking for subdirectories of bin directories that are not usable via PATH...
| looking for programs in PATH that have a filename suffix
| done
...
Awesome online applications used in development and testing:
- JSONLint: https://jsonlint.com/
- JSON Pretty Print: https://jsonformatter.org/json-pretty-print
- jq kung fu: https://jqkungfu.com/
Technical experts:
- Progressbar inspiration: https://github.com/albertomosconi/posixbar
- Parse command line options for a shell script (POSIX): https://gist.github.com/deshion/10d3cb5f88a21671e17a
- Pseudo arrays: https://gist.github.com/biiont/290341b29657c0bb2df6
- Padding a string: https://stackoverflow.com/a/74964817
- Validation of dependencies: https://stackoverflow.com/questions/592620/how-can-i-check-if-a-program-exists-from-a-bash-script
- Line count in a variable: https://unix.stackexchange.com/questions/482893/how-to-posix-ly-count-the-number-of-lines-in-a-string-variable
- Suppress Permission Denied messages: https://stackoverflow.com/questions/762348/how-can-i-exclude-all-permission-denied-messages-from-find