Automatically add SkoHub blog posts to team site #485

acka47 · 2022-12-07T08:57:03Z

While working on #484, I noticed that the last three or four posts from the SkoHub blog are missing at http://lobid.org/product/skohub. (I have added the missing presentations with 36d54d2, though.) As we will be publishing more frequently in the coming months, we should think about automating the addition of these posts.

This could be implemented by either @sroertgen or @fsteeg, I guess.


sroertgen · 2023-01-25T15:26:50Z

So I had a first look: what we could do is fetch the XML feed of each blog and build the publications from there.
This would then have to happen on the client side, I guess.
Is this the kind of automated addition you have in mind?
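
A minimal sketch of the feed-based idea, assuming the blogs expose an RSS 2.0 feed whose `item` elements carry `title`, `link` and `pubDate`. The function name `parse_rss_items`, the sample feed and the output keys are illustrative assumptions, not taken from the lobid code base. It is written in Python only to show the parsing logic; a client-side version would do the same in the browser.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; a real one would be fetched from the blog's feed URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>https://example.org/first-post/</link>
      <pubDate>Mon, 05 Dec 2022 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second-post/</link>
      <pubDate>Tue, 24 Jan 2023 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(xml_text):
    """Return one dict per <item> of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default="").strip(),
            "url": item.findtext("link", default="").strip(),
            # RSS uses RFC 822 dates; keep only the calendar date in ISO form.
            "date": parsedate_to_datetime(pub_date).date().isoformat() if pub_date else None,
        })
    return items


for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["date"], entry["title"], entry["url"])
```

Note that a plain RSS 2.0 item like this carries no tags or author IDs, so anything beyond title, link and date would have to come from elsewhere.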

fsteeg · 2023-01-26T08:46:47Z

Hm, so I think our goal should be to add files in gatsby/lobid/static/publication to have a uniform database. That could happen from within the repo here, as you describe, by fetching the feeds and creating the files for them here (if that's what you mean).

However, I think the cleanest approach would be to keep the creation of these files out of scope for this repo and create them elsewhere instead. Maybe triggered by a GitHub Action when we push to the blogs, which then runs some conversion and pushes the files here? Not sure if that makes sense, just some thoughts.
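
The conversion step mentioned here could look roughly like the following sketch. It assumes feed entries have already been reduced to dicts with `title`, `url` and `date`; the JSON field names (`name`, `url`, `datePublished`) and the slug-based file naming are hypothetical and would have to be adapted to the actual format of the files in gatsby/lobid/static/publication.

```python
import json
import re
import tempfile
from pathlib import Path


def slugify(title):
    """Turn a title into a lowercase, file-name-safe slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def write_publication_files(items, out_dir):
    """Write one JSON file per item and return the names of newly created files."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for item in items:
        target = out_path / f"{slugify(item['title'])}.json"
        if target.exists():
            continue  # leave existing (possibly hand-edited) files untouched
        record = {  # hypothetical field names, not the repo's real schema
            "name": item["title"],
            "url": item["url"],
            "datePublished": item["date"],
        }
        target.write_text(json.dumps(record, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
        written.append(target.name)
    return written


# Demo in a throwaway directory instead of the real publication folder.
with tempfile.TemporaryDirectory() as tmp:
    items = [{"title": "SkoHub: A new post!", "url": "https://example.org/new-post/", "date": "2023-01-24"}]
    print(write_publication_files(items, tmp))  # ['skohub-a-new-post.json']
    print(write_publication_files(items, tmp))  # [] because the file already exists
```

Skipping existing files keeps repeated runs idempotent, which matters if such a script were triggered on every push to a blog.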

acka47 · 2023-01-26T11:00:25Z

I first liked the RSS approach as it may be independent of the actual blog software (we will have to integrate two Gatsby blogs and one Jekyll blog). However, after taking a short look at the RSS XML of the SkoHub blog, I am afraid that the RSS doesn't convey important structured data from the YAML frontmatter like author and tags. Or am I missing something? If the RSS could be tweaked to include this, the approach might work after all; otherwise we will have to fetch the structured data from elsewhere. Also, the HTML of the blog post does not include structured data. I guess this might be configured with Gatsby (a schema.org plugin maybe, see https://snappywebdesign.net/blog/how-to-add-structured-data-to-blog-posts-in-gatsby/). Otherwise we could think about @fsteeg's approach to fetch it/push it directly from the git repo.

sroertgen · 2023-01-26T13:10:30Z

> I am afraid that the RSS doesn't convey important structured data from the YAML frontmatter like author and tags

I think this can be configured, e.g. the lobid blog feed (https://blog.lobid.org/feed.xml) also contains author information. There are no author IDs given; I would have to check how far this can be configured.

> [...] otherwise we will have to fetch the structured data from elsewhere. Also the HTML of the blog post does not include structured data.

I think this is a good hint. We should add structured data to the blog posts; then we can use the RSS feeds to get the links and fetch the structured data from the linked posts.
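
As a rough sketch of the second half of that idea: once a post's HTML embeds schema.org metadata as JSON-LD, it can be pulled out with the standard library alone. This assumes the blogs would emit a `<script type="application/ld+json">` block; the sample markup, the author name and the keywords below are invented for illustration, and `JsonLdExtractor` and `extract_json_ld` are hypothetical names.

```python
import json
from html.parser import HTMLParser

# Hypothetical post markup; real posts would be fetched via the links in the feed.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BlogPosting",
 "headline": "Example post", "author": {"@type": "Person", "name": "Jane Doe"},
 "keywords": ["skohub", "vocabularies"]}
</script>
</head><body><h1>Example post</h1></body></html>"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of JSON-LD script blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_json_ld(html_text):
    """Return all JSON-LD objects embedded in an HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.blocks


post = extract_json_ld(SAMPLE_HTML)[0]
print(post["author"]["name"], post["keywords"])
```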

If you agree, @acka47, I will open issues in our three blog systems (lobid, metafacture, skohub) and add the structured metadata there. Then I will continue on this issue and pull the structured data from there.

@fsteeg I get your point as well: fetching the blog posts dynamically would lead to an inconsistent publication database, since not every publication could be found there. However, the approach we want to take depends on how important it is that this database contains all data. If it is meant to be authoritative, we should switch to an approach where these files are created. If it is okay to have the complete data only on the website, fetching dynamically is sufficient (we could also think about adding structured data about all the publications there after they have been fetched).

I'm open to both, though I think the RSS approach is easier to implement.

acka47 · 2023-01-27T07:29:22Z

> the lobid-blog contains also author information: https://blog.lobid.org/feed.xml There are no author ids given, I would have to look how far this can be configured.

There are no IDs in the YAML frontmatter of the lobid blog so this is fine. See e.g. https://github.com/hbz/lobid-blog/blob/master/_posts/2022-08-19-job-projektkoordinatorin.md

> If you agree, @acka47, I will open issues in our three blog systems (lobid, metafacture, skohub) and add the structured metadata there. Then I will continue on this issue and pull the structured data from there.

+1. I think only the tags are missing in the lobid blog feed, so there is not much to be done there.

> I'm open to both, though I think the RSS approach is easier to implement.

@fsteeg let us know if you still have problems with this approach. Then we should schedule a 30 min meeting to discuss this.

fsteeg · 2023-01-27T08:38:37Z

I like the idea of using the RSS; my point was more about what we do with it (create JSON files) and where (not in this repo). I don't think it would be a nice solution to create the publication list on https://lobid.org/team both from files and from RSS feeds, if that's the suggestion, since that whole system is based on the files, the queries against the files etc. But maybe it's worth reconsidering that whole 'knowledge graph' approach to the website.

fsteeg · 2023-01-27T11:18:55Z

Maybe it makes sense to approach this from a different angle: we could set up a new page to list the team publications, which uses the https://lobid.org/team/feed.xml RSS feed, and other feeds like the SkoHub blog, to create a complete list of publications (which we should publish as RSS again).

That way, we basically have two separate things: 1) a list of publications aggregated from different RSS sources and 2) a system to publish JSON files as RSS (our current setup). All sources that already publish RSS could come in via 1), and for all sources that we have no RSS for, we create JSON files in 2).
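
The aggregation in 1) could be sketched as follows, assuming every source (the team feed built from the JSON files, the blog feeds) has already been turned into a list of dicts with `title`, `url` and an ISO `date`. The function name `merge_publications`, the sample entries and the choice to deduplicate by URL are assumptions made for illustration.

```python
def merge_publications(*sources):
    """Merge publication lists, drop duplicate URLs and sort newest first."""
    by_url = {}
    for source in sources:
        for pub in source:
            # The first source wins on duplicates, so pass curated entries first.
            by_url.setdefault(pub["url"], pub)
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(by_url.values(), key=lambda pub: pub["date"], reverse=True)


# Hypothetical entries standing in for the team feed and a blog feed.
team_feed = [
    {"title": "Conference talk", "url": "https://example.org/talk", "date": "2022-11-01"},
    {"title": "Blog post (curated)", "url": "https://example.org/post-1", "date": "2022-12-05"},
]
blog_feed = [
    {"title": "Blog post", "url": "https://example.org/post-1", "date": "2022-12-05"},
    {"title": "Newer blog post", "url": "https://example.org/post-2", "date": "2023-01-24"},
]

for pub in merge_publications(team_feed, blog_feed):
    print(pub["date"], pub["title"])
```

The merged list could then be serialized as RSS again, as suggested above.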

acka47 assigned fsteeg and sroertgen Dec 7, 2022

acka47 added this to Backlog in lobid board via automation Dec 7, 2022

sroertgen unassigned fsteeg Jan 16, 2023

sroertgen assigned acka47 and unassigned acka47 Jan 27, 2023

acka47 assigned fsteeg and unassigned sroertgen Jan 27, 2023

fsteeg removed their assignment Jan 27, 2023

sroertgen self-assigned this Jan 30, 2023
