[feature request] Enhancements for logging #22

Closed
calebj opened this issue Apr 1, 2021 · 5 comments

calebj (Contributor) commented Apr 1, 2021

I've been using this for a while as part of an archiving cronjob, and it's worked great. My one gripe is how chatty the logs are, since the app prints information for every single episode. Altogether, each run generates over 5MB of output, most of it redundant. I suppose I could diff the logs on each run and store the differences, but it would be great if there were a flag to make podcast-dl only output (and/or act upon) changes in the feed since the last run.

Currently I use the following script to automate the invocation of podcast-dl for each feed:

#!/usr/bin/env python3
"""Run podcast-dl once for each feed defined in podcast-dl.ini."""

from configparser import ConfigParser, ExtendedInterpolation
import subprocess

import fasteners

PDL_BIN = 'podcast-dl'

cp = ConfigParser(interpolation=ExtendedInterpolation(), delimiters=('=',))

with open('podcast-dl.ini') as f:
    cp.read_file(f)

for sec in cp.sections():
    print(f"Running for {sec}")
    conf = cp[sec]
    args = [PDL_BIN]

    # url and outdir are required; skip the section if either is missing.
    if not conf.get('url'):
        print(f"Missing url for section {sec}")
        continue
    elif not conf.get('outdir'):
        print(f"Missing outdir for section {sec}")
        continue

    args.extend(("--url", conf['url']))
    args.extend(("--out-dir", conf['outdir']))

    # Optional keys map directly onto podcast-dl flags.
    if 'archive' in conf:
        args.extend(("--archive", conf['archive']))

    if 'template' in conf:
        args.extend(("--episode-template", conf['template']))

    if 'regex' in conf:
        args.extend(("--episode-regex", conf['regex']))

    if 'extra_options' in conf:
        args.extend(conf['extra_options'].split())

    # A per-feed lock file keeps overlapping cron runs from colliding.
    lock = fasteners.InterProcessLock(f"{conf['outdir']}/.pdl_lock")

    if not lock.acquire(blocking=False):
        print("lock exists, skipping")
        continue

    try:
        subprocess.run(args)
    finally:
        lock.release()

It's nothing special, but it makes the cron entry very straightforward. All that's missing is a concise way to monitor what happens in each run, without the repeated entries or the fancy progress indicator, so the logs are easier to review.

lightpohl (Owner) commented Apr 1, 2021

Hey @calebj!

Thanks for opening an issue. I think this is a good feature request. I've been meaning to implement a --log-level flag with the following settings:

  • normal: Leave as-is. Default logging output (all logs, warnings, and critical errors).
  • quiet: Only important info (download started, download complete, critical errors).
  • silent: Only critical errors that cause the script to fail.

Sounds like --log-level quiet would fit your use case. Let me know what you think!

Edit: If you're saving to the same directory (or using the --archive flag) it should skip over any previously downloaded items. We'll just need to silence that output with the quieter log level. 👍
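If that lands, a wrapper like your script could forward the level from the ini. A minimal sketch, assuming a per-feed loglevel key (the flag name and values above are still a proposal, nothing is shipped yet):

# Hypothetical: forwards a "loglevel" ini key to the proposed --log-level
# flag (normal, quiet, or silent).
if 'loglevel' in conf:
    args.extend(("--log-level", conf['loglevel']))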

calebj (Contributor, Author) commented Apr 7, 2021

You're right, I think a quiet level would fit perfectly.

One extra thing that might make it even better isn't related to logging: caching the RSS and doing a quick check of the Last-Modified and/or ETag headers could make subsequent runs much quicker for large feeds (assuming the last run fetched everything it needed).
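To sketch the mechanics (in Python like the wrapper above, using the requests library; this is illustrative, not podcast-dl code):

# Conditional feed fetch: send back the validators from the previous run
# and skip re-downloading when the server answers 304 Not Modified.
import requests

def fetch_feed(url, etag=None, last_modified=None):
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    resp = requests.get(url, headers=headers)
    if resp.status_code == 304:
        return None, etag, last_modified  # feed unchanged since last run
    resp.raise_for_status()
    # Remember the new validators for the next run.
    return resp.text, resp.headers.get("ETag"), resp.headers.get("Last-Modified")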

lightpohl (Owner) commented Apr 10, 2021

Added support for the LOG_LEVEL environment variable in the latest release. I ran it through a few feeds with the quiet setting, and I think it should greatly reduce the log output size you saw.
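In a wrapper like yours it can be set per invocation, for example:

# Only LOG_LEVEL and its quiet value come from this thread; the rest
# mirrors the wrapper script above.
import os
subprocess.run(args, env={**os.environ, "LOG_LEVEL": "quiet"})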

Regarding caching RSS feeds: Let me do some thinking/research on it. I'm slightly hesitant because it starts to move the project into more of a "service" category, but I definitely see the value for large feeds. 👍

calebj (Contributor, Author) commented Apr 10, 2021

This is much better, thank you!

As for caching, it seems like a free optimization to me. If the server provides headers that indicate cacheable content, I don't see why podcast-dl shouldn't take advantage of them. Conversely, if certain values are present for Cache-Control, the client knows that it shouldn't cache anything. It's reasonable to leave it up to the server and cache what it allows us to, and I don't think doing so changes the category of the program at all.

lightpohl (Owner) commented

Certainly. I think a good first, simple version could exclusively check for the ETag and save a podcast.cache.json for subsequent runs. I'll spin this conversation out into a separate issue for tracking: #23
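Something like this, as a rough sketch (the file name is from above; the exact shape, a map of feed URL to last seen ETag, is still TBD):

# Illustrative sketch: podcast.cache.json maps each feed URL to the last
# ETag the server returned, so the next run can send If-None-Match.
import json
from pathlib import Path

CACHE_FILE = Path("podcast.cache.json")

def load_etag(feed_url):
    if CACHE_FILE.exists():
        return json.loads(CACHE_FILE.read_text()).get(feed_url)
    return None

def store_etag(feed_url, etag):
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    cache[feed_url] = etag
    CACHE_FILE.write_text(json.dumps(cache))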
