Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some minor podcast parsing issues #234

Merged
merged 4 commits into from
Feb 2, 2019

Conversation

mw9
Copy link
Contributor

@mw9 mw9 commented Jan 28, 2019

Herewith three proposed changes to Formats::XML to fix up some podcast browsing issues that I have encountered.

  • Formats::XML - Podcasts: Fix iTunes image tag recognition

The RSS parser has the ability to use an iTunes image tag if an RSS image tag is not present. Except that it currently looks in the wrong place.

  • Formats::XML - Podcasts: Badly formed RSS image URL breaks Jive

I encountered this about 15 months ago, and it was very confusing while it lasted. This change adds a sanitization check to the discovered feed image tag to ensure that we have not been misled by a poorly formed podcast feed and XMLin's interpretation of it.

Try forcing $feed{'image'}={} if you wish to observe Jive's curious, and unwelcome, behaviour in this circumstance. It makes a mess of the feed display.

  • Formats::XML - Podcasts: Use episode specific image

A minor enhancement (depending on your point of view) to make use of iTunes' episode specific images if they are present in the feed. Not many podcasts that I have encountered make use of this, but, where they do, we have the possibility of adding a little variety to Jive's display of the feed. But at the expense of a little more image processing during the rendering phase.

Copy link
Member

@michaelherger michaelherger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Looks like a nice little set of improvements. Could you please see my individual comments?

# really *do* have a scalar.
# E.g. a broken podcast that provides an empty 'url' tag is known to
# trigger this unwelcome behaviour.
if (ref $feed{'image'}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we treat this case in other places and would just pick the first of a list, in case image was a list. Could you please incorporate this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've revised the approach in line with your suggestion.
As I usually muck up revisions to these things, I've prepare a new branch here:
https://github.com/mw9/slimserver/tree/feature/podcast_issues_revised

But I've not had a chance to check them in running code yet...

If you're happy with the revised approach, and they do run as expected, I propose that I then push up the revised branch. Make sense ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've revised the changes to take up what I think you're after.

In the process I've added a "tidy up" commit to the "Lage der Nation" issue where we had a list of images to deal with. I did this because I became confused. I have checked that the tidied up version stills works as expected.

All other changes seem to be working fine.

So you should be seeing four replacement (forced) commits, added today.

I hope I haven't confused matters...

Slim/Formats/XML.pm Outdated Show resolved Hide resolved
An itunes:image tag exists at the channel level of the feed, *not* the
top level.

We retain the existing (historic) check at the top level in case some
feeds still use that approach. But we ensure that we have the expected
hash ref.

Refer:
https://help.apple.com/itc/podcasts_connect/#/itcb54353390
Conventional RSS feeds provide a single ‘image’ element containing,
amongst other things, a URL which points to the image. ‘XMLin’ will
provide us with a hash ref in this circumstance.

Some RSS feeds provide multiple image elements. In this case ‘XMLin’
will provide an array of such ‘image’ elements. We arbitrarily pick
the first one.

Refer original commit: baea48e
A broken podcast may provide a malformed image url tag. If, for any
reason, we get an empty url tag, or worse, XMLin may interpret it as
a (possibly empty) hash reference, or something else other than a scalar
value. This will break podcast browsing in Jive.

So we add some checks to our retrieved image URLs to ensure that they
are, indeed, scalars. The appended (historic) example is illustrative.

An alternative solution might be available by adapting the parameters
passed to XMLin (in xmlToHash) that control its rendering of the XML.
But that may inadvertently break some other aspect of the parsing.

Example:

Observed in September 2017, although no longer present in the errant
feed. The (edited) feed was as follows:

    <channel>
        <title>A title</title>
        <description>Some description</description>
        <link>A_Link.com</link>
        <pubDate>Sometime in Sept 2017</pubDate>
        <image>
            <url> </url>
            <title>Another title</title>
            <link>Another_Link.com</link>
        </image>
    etc.

If the channel/image/url tag is simply empty, as above, xmlToHash will
return $xml->{'channel'}->{'image'}->{'url'} as an *empty hash ref*,
( image->url = {} ) and *not* a zero length string, or 'undef', as one
might expect.

Jive browsing then breaks when attempting to fetch artwork in
SlimBrowserApplet, because it requires that the icon variable be a
string. It breaks on the structured icon object, even though empty. The
podcast browse menu is not properly rendered, and presents a bewildering
display. Not a good user experience.
At present, each episode of a podcast is displayed in the various browse
menus with an accompanying image. This image is taken from the 'channel'
level of the underlying RSS feed, and is identical for each episode. The
standard RSS format makes no provision for images to be provided at the
'item' (episode) level.

But Apple/iTunes extends the format to, optionally, provide a separate
image for each individual episode of a podcast. LMS could make use of
these images, thereby adding a little more variety to the feed display.

This proposed change achieves that.

The episode image, if it exists, is found in the 'href' attribute of an
'itunes:image' tag, at the 'item' level of the feed.

Reference:
https://help.apple.com/itc/podcasts_connect/#/itcb54353390
@mherger mherger merged commit e682f9f into LMS-Community:public/7.9 Feb 2, 2019
@mherger
Copy link
Contributor

mherger commented Feb 2, 2019

Thanks! Looking good to me.

@mw9
Copy link
Contributor Author

mw9 commented Feb 3, 2019

Great, thanks.

@mw9 mw9 deleted the feature/podcast_issues branch February 3, 2019 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants