RSS "source" attribute #229

@nash-an

Description

Expected behavior

I expected gofeed.Item to have a field called "Source" (rss.Item has this field, but gofeed.Item does not for some reason)

or

the Source value to exist in the gofeed.Item.Custom map

Actual behavior

gofeed.Item has no field or method called "Source"

gofeed.Item.Custom is empty

Steps to reproduce the behavior

News aggregators, for example news.google.com, use the <source> element rather than <title> or <author>.
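
For readers unfamiliar with the element, here is a minimal sketch of what an RSS item carrying <source> looks like and how it decodes with the standard library alone. The struct names, the helper parseItem, and the sample item are assumptions made up for this illustration; they are not taken from gofeed or from any real feed.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rssSource mirrors the RSS 2.0 <source> sub-element of <item>: the text is
// the name of the originating channel, and the url attribute points at it.
// (Hypothetical type for this sketch.)
type rssSource struct {
	URL   string `xml:"url,attr"`
	Title string `xml:",chardata"`
}

// rssItem holds only the fields this sketch needs.
type rssItem struct {
	Title  string     `xml:"title"`
	Source *rssSource `xml:"source"` // stays nil when the item has no <source>
}

// sampleItem is an invented item, not copied from a real feed.
const sampleItem = `<item>
  <title>Example headline</title>
  <source url="https://example.com/feed.xml">Example Publisher</source>
</item>`

// parseItem decodes a single <item> element.
func parseItem(data string) (*rssItem, error) {
	var it rssItem
	if err := xml.Unmarshal([]byte(data), &it); err != nil {
		return nil, err
	}
	return &it, nil
}

func main() {
	it, err := parseItem(sampleItem)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if it.Source != nil {
		fmt.Printf("%s (via %s, %s)\n", it.Title, it.Source.Title, it.Source.URL)
	}
}
```

The pointer field is what lets a consumer tell "no <source> present" apart from "empty <source>".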

Activity

mmcdole (Owner) commented on Jan 2, 2025

The Source field is specific to RSS feeds and isn't included in the universal gofeed.Feed type because it's not a concept that exists across all feed formats (Atom and JSON Feed don't have equivalent fields).

The universal feed type is designed to contain only fields that are common across feed formats, providing a consistent interface regardless of the underlying feed type.

If you specifically need RSS-only fields like Source, you can use the RSS parser directly:

rssParser := &rss.Parser{}
feed, err := rssParser.Parse(reader)
if err != nil {
	// handle the parse error instead of discarding it
}
// Now you have access to feed.Items[i].Source

infogulch (Contributor) commented on Jan 3, 2025

Given that the <source> tag is optional [1] and the rss.Item.Source field is nil if the tag is missing (I assume), it might be reasonable to consider the other feed types as having always "omitted" the field.

Source *Source `json:"source,omitempty"`

I guess this comes down to API design philosophy... are the general Feed/Item structs a strict subset of fields that are defined in all formats, or could they include fields that are optional in one format to maximize data preservation?
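
To make the "optional in one format" option concrete, here is a rough sketch of a universal item type that carries a nullable Source. The Source and Item types and the encode helper are hypothetical stand-ins written for this illustration, not gofeed's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Source and Item are hypothetical stand-ins, not the library's real types.
type Source struct {
	Title string `json:"title,omitempty"`
	URL   string `json:"url,omitempty"`
}

type Item struct {
	Title string `json:"title,omitempty"`
	// Source would only ever be populated from RSS; items translated from
	// Atom or JSON Feed would leave it nil.
	Source *Source `json:"source,omitempty"`
}

// encode marshals an item and returns "" on failure.
func encode(it Item) string {
	b, err := json.Marshal(it)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fromRSS := Item{
		Title:  "A",
		Source: &Source{Title: "Example Publisher", URL: "https://example.com/feed.xml"},
	}
	fromAtom := Item{Title: "B"} // no Source: the key is omitted entirely

	fmt.Println(encode(fromRSS))
	fmt.Println(encode(fromAtom)) // prints {"title":"B"}
}
```

With omitempty on a pointer field, an item from a format without the concept serializes exactly as it would if the field did not exist.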

Footnotes

  1. https://www.rssboard.org/rss-specification#ltsourcegtSubelementOfLtitemgt

mmcdole (Owner) commented on Jan 3, 2025

Yeah, both this and #228, as well as a few other issues, have me questioning whether I should loosen the reins on the common feed elements.

Necoro (Contributor) commented on Jan 3, 2025

> As well as a few other issues have me questioning if I should loosen the reins on the common feed elements.

As a random commenter, I find this problematic: Fields in the common API suggest that they indeed have the possibility to be filled by any feed. I personally dislike fields in APIs that sound generally plausible but turn out to be only filled in specific circumstances.

I'd rather have the Atom and RSS items accessible from the general Item so that, if you need such a field, you don't have to know before parsing but rather can do (item.Specific.(*rss.Item)).Source or such.
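
A minimal sketch of that idea, assuming a hypothetical exported Specific field on the general item. None of the types below are gofeed's real definitions; RSSItem, AtomEntry, GeneralItem, and sourceOf are invented for illustration.

```go
package main

import "fmt"

// Hypothetical stand-ins for the format-specific item types.
type RSSItem struct {
	Title  string
	Source string
}

type AtomEntry struct {
	Title string
}

// GeneralItem is the format-independent view. Specific holds a pointer to
// the format-specific value it was translated from.
type GeneralItem struct {
	Title    string
	Specific any
}

// sourceOf returns the RSS-only source when the item came from an RSS feed.
// The comma-ok assertion avoids a panic for Atom or JSON items.
func sourceOf(it GeneralItem) (string, bool) {
	rssItem, ok := it.Specific.(*RSSItem)
	if !ok {
		return "", false
	}
	return rssItem.Source, true
}

func main() {
	items := []GeneralItem{
		{Title: "A", Specific: &RSSItem{Title: "A", Source: "Example Publisher"}},
		{Title: "B", Specific: &AtomEntry{Title: "B"}},
	}
	for _, it := range items {
		if src, ok := sourceOf(it); ok {
			fmt.Printf("%s: source %q\n", it.Title, src)
		} else {
			fmt.Printf("%s: no RSS source\n", it.Title)
		}
	}
}
```

The bare form (item.Specific.(*rss.Item)).Source panics when the feed was not RSS, so callers that do not know the format in advance would probably want the two-value assertion shown here.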

infogulch (Contributor) commented on Jan 3, 2025

> I personally dislike fields in APIs that sound generally plausible but turn out to be only filled in specific circumstances.

Yeah that's the argument for the other side, but the consequences of these choices are very asymmetric:

When the API is "generous" with the fields it supports, you assume that because a field is present it is always set (which already doesn't make sense in this scenario since the field is nullable), and then you find out that the feed doesn't have the data in the first place. There's nothing to do, you're just sad. 🤷 Maybe this could be mitigated by documenting on the field that it is only set when it's an RSS type feed.

When the API is strict, then the data may be present, but you can't get it because the generalized feed parser throws it away. So now you have to rewrite the whole feed selection logic from scratch, manually select the correct parser, get the field you want out, then either convert it back into the general Feed struct for processing or 3x duplicate your feed handling logic.

IMO the purpose of the generalized Feed API should not be to educate programmers on what fields are available in all feed types, it should be to abstract data in the specialized APIs and make it available with as little friction as possible.


I wonder if a better way to solve this problem would be to add a private field to the general Feed struct with a pointer to the original item struct, then a method like Item.OriginalItem() any that returns one of rss.Item, atom.Item, json.Item that you can type switch on to extract whatever additional fields you want. It would return nil if the item has been serialized and deserialized, to reduce data duplication.
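
Roughly, that proposal could look like the sketch below. Everything here is an assumption for illustration: feedItem, rssItem, atomEntry, describe, and roundTrip are invented names, and OriginalItem is the method proposed in this comment, not an existing gofeed API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical stand-ins for the format-specific item types.
type rssItem struct{ Title, Source string }
type atomEntry struct{ Title string }

// feedItem is a hypothetical general item. The unexported field is ignored
// by encoding/json, so the original is not duplicated when serialized.
type feedItem struct {
	Title    string `json:"title"`
	original any
}

// OriginalItem returns the format-specific item, or nil if it is unavailable.
func (it *feedItem) OriginalItem() any { return it.original }

// describe type-switches on the original to reach format-specific fields.
func describe(it *feedItem) string {
	switch o := it.OriginalItem().(type) {
	case *rssItem:
		return "rss, source=" + o.Source
	case *atomEntry:
		return "atom"
	case nil:
		return "original not available"
	default:
		return "unknown"
	}
}

// roundTrip serializes and deserializes an item through JSON.
func roundTrip(it *feedItem) (*feedItem, error) {
	b, err := json.Marshal(it)
	if err != nil {
		return nil, err
	}
	var out feedItem
	if err := json.Unmarshal(b, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

func main() {
	it := &feedItem{Title: "A", original: &rssItem{Title: "A", Source: "Example Publisher"}}
	fmt.Println(describe(it))

	back, err := roundTrip(it)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(describe(back))
}
```

The trade-off is visible in the second call: after a JSON round trip the common fields survive, but the format-specific data does not.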

reopened this on Jan 5, 2025

spacecowboy (Contributor) commented on Jan 6, 2025

My two cents, as the author of a Feed reader app, I only use the common parse method. I don't care if it's an RSS, Atom, or JSON feed. So I won't ever do ParseAtom or ParseRSS separately.

But if a specific field is available, I want to be able to access it.

Being able to access the underlying RSS/Atom/.. item would be best as some fields are very type-specific and wouldn't make sense to add in the common data structure IMO.

        RSS "source" attribute · Issue #229 · mmcdole/gofeed