[feature request] Enhancements for logging #22

Closed
calebj opened this issue Apr 1, 2021 · 5 comments

calebj (Contributor) commented Apr 1, 2021

I've been using this for a while as part of an archiving cronjob, and it's worked great. My one gripe is how chatty the logs are, since the app prints information for every single episode. Altogether, each run generates over 5MB of output, most of it redundant. I suppose I could diff the logs on each run and store the differences, but it would be great if there were a flag to make podcast-dl only output (and/or act upon) changes in the feed since the last run.

Currently I use the following script to automate the invocation of podcast-dl for each feed:

#!/usr/bin/env python3
"""Run podcast-dl once for each feed defined in podcast-dl.ini."""

from configparser import ConfigParser, ExtendedInterpolation
import subprocess

import fasteners

PDL_BIN = 'podcast-dl'

cp = ConfigParser(interpolation=ExtendedInterpolation(), delimiters=('=',))

with open('podcast-dl.ini') as f:
    cp.read_file(f)

for sec in cp.sections():
    print(f"Running for {sec}")
    conf = cp[sec]
    args = [PDL_BIN]

    # url and outdir are required; skip the section if either is missing.
    if not conf.get('url'):
        print(f"Missing url for section {sec}")
        continue
    elif not conf.get('outdir'):
        print(f"Missing outdir for section {sec}")
        continue

    args.extend(("--url", conf['url']))
    args.extend(("--out-dir", conf['outdir']))

    # Optional keys map directly onto podcast-dl flags.
    if 'archive' in conf:
        args.extend(("--archive", conf['archive']))

    if 'template' in conf:
        args.extend(("--episode-template", conf['template']))

    if 'regex' in conf:
        args.extend(("--episode-regex", conf['regex']))

    if 'extra_options' in conf:
        args.extend(conf['extra_options'].split())

    # A per-feed lock file keeps overlapping cron runs from colliding.
    lock = fasteners.InterProcessLock(f"{conf['outdir']}/.pdl_lock")

    if not lock.acquire(blocking=False):
        print("lock exists, skipping")
        continue

    try:
        subprocess.run(args)
    finally:
        lock.release()

It's nothing special, but it makes the cron entry very straightforward. All that's missing is a concise way to monitor what happens in each run, without the repeated entries or the fancy progress indicator, so the logs are easier to review.

lightpohl (Owner) commented Apr 1, 2021

Hey @calebj!

Thanks for opening an issue. I think this is a good feature request. I've been meaning to implement a --log-level flag with the following settings:

  • normal: Leave as-is. Default logging output (all logs, warnings, and critical errors).
  • quiet: Only important info (download started, download complete, critical errors).
  • silent: Only critical errors that cause the script to fail.

Sounds like --log-level quiet would fit your use case. Let me know what you think!

Edit: If you're saving to the same directory (or using the --archive flag) it should skip over any previously downloaded items. We'll just need to silence that output with the quieter log level. 👍
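If that lands, a wrapper like your script could forward the level from the ini. A minimal sketch, assuming a per-feed loglevel key (the flag name and values above are still a proposal, nothing is shipped yet):

# Hypothetical: forwards a "loglevel" ini key to the proposed --log-level
# flag (normal, quiet, or silent).
if 'loglevel' in conf:
    args.extend(("--log-level", conf['loglevel']))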

calebj (Contributor, Author) commented Apr 7, 2021

You're right, I think a quiet level would fit perfectly.

One extra thing that might make it even better isn't related to logging: caching the RSS and doing a quick check of the Last-Modified and/or ETag headers could make subsequent runs much quicker for large feeds (assuming the last run fetched everything it needed).
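To sketch the mechanics (in Python like the wrapper above, using the requests library; this is illustrative, not podcast-dl code):

# Conditional feed fetch: send back the validators from the previous run
# and skip re-downloading when the server answers 304 Not Modified.
import requests

def fetch_feed(url, etag=None, last_modified=None):
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    resp = requests.get(url, headers=headers)
    if resp.status_code == 304:
        return None, etag, last_modified  # feed unchanged since last run
    resp.raise_for_status()
    # Remember the new validators for the next run.
    return resp.text, resp.headers.get("ETag"), resp.headers.get("Last-Modified")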

lightpohl (Owner) commented Apr 10, 2021

Added support for the LOG_LEVEL environment variable in the latest release. I ran it through a few feeds with the quiet setting, and I think it should greatly reduce the log output size you saw.
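In a wrapper like yours it can be set per invocation, for example:

# Only LOG_LEVEL and its quiet value come from this thread; the rest
# mirrors the wrapper script above.
import os
subprocess.run(args, env={**os.environ, "LOG_LEVEL": "quiet"})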

Regarding caching RSS feeds: Let me do some thinking/research on it. I'm slightly hesitant because it starts to move the project into more of a "service" category, but I definitely see the value for large feeds. 👍

calebj (Contributor, Author) commented Apr 10, 2021

This is much better, thank you!

As for caching, it seems like a free optimization to me. If the server provides headers that indicate cacheable content, I don't see why podcast-dl shouldn't take advantage of them. Conversely, if certain values are present for Cache-Control, the client knows that it shouldn't cache anything. It's reasonable to leave it up to the server and cache what it allows us to, and I don't think doing so changes the category of the program at all.

lightpohl (Owner) commented

Certainly. I think a good first, simple version could exclusively check for the ETag and save a podcast.cache.json for subsequent runs. I'll spin this conversation out into a separate issue for tracking: #23
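Something like this, as a rough sketch (the file name is from above; the exact shape, a map of feed URL to last seen ETag, is still TBD):

# Illustrative sketch: podcast.cache.json maps each feed URL to the last
# ETag the server returned, so the next run can send If-None-Match.
import json
from pathlib import Path

CACHE_FILE = Path("podcast.cache.json")

def load_etag(feed_url):
    if CACHE_FILE.exists():
        return json.loads(CACHE_FILE.read_text()).get(feed_url)
    return None

def store_etag(feed_url, etag):
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    cache[feed_url] = etag
    CACHE_FILE.write_text(json.dumps(cache))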
