Retrieve metadata from online sources #11

gotson · 2019-10-08T09:25:21Z

Mimicking Plex, Komga could manage metadata for series and books, and retrieve metadata from online providers.

See also #48 for manual metadata editing.

Potential providers:

for Comics:
- ComicVine which offers a public API
- League Of Comic Geeks has no API
for French BD:
- AppBubble which has a nice (private) API, but cannot redistribute its data as it's coming from a third-party provider (ORB)
- Bedetheque has no API, but a scraper exists
for Manga:
- Kitsu
- MyAnimeList

In addition Komga should be able to:

Manually override metadata that was retrieved from online sources

bayang · 2019-10-25T20:37:23Z

I use https://leagueofcomicgeeks.com/ for series information and also for tracking comics (like trakt for comics) but unfortunately, no API provided 😞

gotson · 2019-10-26T00:32:29Z

> I use https://leagueofcomicgeeks.com/ for series information and also for tracking comics (like trakt for comics) but unfortunately, no API provided 😞

Indeed, the AJAX methods return HTML directly :(

It also doesn't have the completeness information about a series (whether a series is ongoing, finished, abandoned, or on hiatus), which is one of the most important pieces of metadata I am looking for!

The7thSage · 2019-10-26T23:35:31Z

Oh, I just read the last checkbox, so you can disregard the rest of this...
Then "manually override" is the same as "editing" metadata. Same result.

This is some pie-in-the-sky stuff, brace yourself...
What about local metadata? As long as it is read from an editable file (not a database?) at some point, you don't have to worry too much about sources.

As long as you give us a template and location to write the data. That would let the user leverage any data-scraping (or elbow-grease copy-pasting) to write metadata not covered by direct functionality (and hopefully mildly future-proofing the feature).

Another example of a pre-set metadata format would be ComicRack's ComicInfo.xml file within a .cbz. I am only suggesting this one because I use it alongside a pickier .json (both provided by HDoujin Downloader).

Just throwing out some ideas, no pressure.
Not in any hurry, I am using a separate database/reader for tagged manga, and reading in

If I had any idea how plugins work and how to code them, I'd have made a plugin already (probably leveraging HPX, my other database I'm running alongside this).

gotson · 2019-10-29T03:43:08Z

Hi all, I would be interested to know a bit more about the metadata you are after, and how you are using it, so I get a better idea of what to implement and how to implement it.

Could you give some insights about:

- what kind of metadata fields are you interested in (Author, Description...)
- is that metadata for individual books, or for a series
- how you use (or plan to use) this metadata, and how it impacts your workflow or reading experience

I'll throw in some that are of particular interest for me:

- I would like to have the completeness status of a Series, which is whether it is complete (all books published), ongoing, or on hiatus/abandoned, so that I can filter my library and start reading completed series only.
- I would like to have the list of all books in a Series, and match this with the list of books I have in my library, so that I can know which books I am missing.
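
To illustrate the second point, matching a provider's book list against the local library can be reduced to a set difference on issue numbers. This is only a sketch: the function name `missing_books` and the assumption that both sides are available as issue-number strings are hypothetical, not something Komga provides.

```python
def missing_books(provider_numbers, library_numbers):
    """Return the issue numbers listed by a provider but absent from the library.

    Both arguments are iterables of issue numbers as strings (an assumed
    input shape); numbers are normalized so "01" and "1" compare equal.
    """
    def normalize(number):
        # Strip whitespace and leading zeros, but keep "0" itself intact.
        stripped = number.strip().lstrip("0")
        return stripped or "0"

    owned = {normalize(n) for n in library_numbers}
    # Preserve the provider's ordering so the result reads like a checklist.
    return [n for n in provider_numbers if normalize(n) not in owned]
```

For example, `missing_books(["1", "2", "3", "4"], ["01", "3"])` reports issues 2 and 4 as missing.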

bayang · 2019-10-29T13:03:33Z

For me, if we follow the Plex analogy, given a filename/folder hierarchy, Komga should be able to automatically retrieve series information and the information for each book in a series.
At least:
For series:
Authors, title, description/summary, list of all books, completeness status (also like pull lists for comics), and a picture (cover/thumbnail) that a client can display
In series, for each book:
Authors, title, description/summary, number in series, and a picture (cover/thumbnail) that a client can display

Ideally komga should provide everything a client would need to satisfy at least basic needs.
The client would be in charge of the reading part only.

Considering tracking (like trakt for videos or anilist and kitsu for manga/anime), I'm not sure if it is the responsibility of komga or of the client.

Currently I use YACReader to manage my comics library and for reading, but it has no information about books or series, no metadata, and no tracking.
And on mobile I use Tachiyomi for manga (which also serves my tracking purpose), but a desktop client is missing (I'm currently having a look at https://github.com/xgi/houdoku, and I think Komga could be implemented as a plugin for it).

So if I could use komga on desktop AND mobile (with the tachiyomi plugin), for comics AND manga, that would be nice.

gotson · 2019-10-30T02:52:22Z

Thanks for the detailed answer!

For the thumbnail, at the moment Komga generates one from the first page of each book. For a series, it's the thumbnail of the first book in the series. Do you think there would be a need to have a thumbnail coming from external sources?

> Considering tracking (like trakt for videos or anilist and kitsu for manga/anime), I'm not sure if it is the responsibility of komga or of the client.

I will add tracking (read status) in Komga at some point. It might be manual to start with, because the implementation in the clients is not in my hands. For example, in Tachiyomi it is not managed as an extension but in the main app. There are also some questions on how to track: should the tracking be done on the matched series/books, using recognized IDs like ComicVine's, or using the internal Komga ID (but those can change if you move your files on disk, for example)?

bayang · 2019-10-30T05:26:01Z

No the thumbnails seem fine.
And tracking is indeed hard to get right; I don't have many ideas for now.

WillowMist · 2020-01-02T15:06:50Z

I'm not sure where you're currently pulling information from, but there are two fairly common sources to check for, which I would suggest honoring before trying to pull from ComicVine (which would require each installation to get a CV API key, and should be throttled so you don't try to pull metadata for over 200 books in an hour):

ComicInfo.xml may exist inside an archive, especially if Mylar or ComicRack have been involved in the process of curating the books.

Additionally, ComicBookLover tags may exist in the zipfile comments (for CBZ only)
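
As a rough sketch of checking those two local sources first, the following reads a ComicInfo.xml entry and the archive comment from a CBZ using only the Python standard library. The function name `read_embedded_metadata` is hypothetical, and the sketch only returns the raw archive comment without assuming anything about how ComicBookLover structures its tags.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_embedded_metadata(cbz_path):
    """Return (comicinfo_fields, zip_comment) for a CBZ archive.

    comicinfo_fields maps the top-level ComicInfo.xml element names to
    their text (empty dict if the file is absent); zip_comment is the raw
    archive comment, decoded as UTF-8 with undecodable bytes replaced.
    """
    fields = {}
    with zipfile.ZipFile(cbz_path) as archive:
        # Match case-insensitively, since capitalization can vary by tool.
        name = next(
            (n for n in archive.namelist()
             if n.rsplit("/", 1)[-1].lower() == "comicinfo.xml"),
            None,
        )
        if name is not None:
            root = ET.fromstring(archive.read(name))
            fields = {child.tag: (child.text or "").strip() for child in root}
        comment = archive.comment.decode("utf-8", errors="replace")
    return fields, comment
```

Only when both come back empty would a lookup against an online provider be needed, which also keeps the number of throttled API calls down.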

frameset · 2020-01-03T18:14:09Z

Mylar plus a viewer such as Ubooquity or one of the ComicStreamer forks is a common usage scenario for many of us, so our collections come with metadata embedded in the file.

I'd love to replace Ubooquity with Komga as Komga is open source and has a seemingly much friendlier developer. 😉

I've got both running side by side for now, and I'm excited to see Komga get even better.

WillowMist · 2020-01-03T18:16:39Z

Yeah, if we get some control over how the OPDS is presented to the client (like with custom filters, or reading lists, etc) then this will be the perfect complement to Mylar. :)

gotson · 2020-01-03T23:38:21Z

> Yeah, if we get some control over how the OPDS is presented to the client (like with custom filters, or reading lists, etc) then this will be the perfect complement to Mylar. :)

OPDS is quite flexible, and I plan to add reading lists to it later 😊

MI3Guy · 2020-01-04T14:52:11Z

Personally, I would really like to see ComicInfo.xml support. I prefer having metadata embedded in the files.

gotson · 2020-01-05T01:59:45Z

> Personally, I would really like to see ComicInfo.xml support. I prefer having metadata embedded in the files.

Planned in #54

GlassedSilver · 2020-01-13T20:26:59Z

> > Personally, I would really like to see ComicInfo.xml support. I prefer having metadata embedded in the files.
>
> Planned in #54

Excellent, most importantly, I would really like to have some staging going on.

What I mean is: ComicInfo.xml should always override the online source, and maybe the results of a scrape should be written into ComicInfo.xml (if there is none) in the folder/zip/etc. Why? Because it would be really important to be able to move that metadata out of Komga easily. This is one of my biggest gripes with Plex.

That way should a source ever return false information your earlier scrapes are always safe.

I usually like to manually verify the metadata. Knowing that the information shown will not be altered unless I explicitly force an override would be nice. With Plex I'm always a bit skeptical.

Another benefit: even if I have to rebuild the entire library with a new database, the previous scrapes' metadata will be transferred.

BONUS: another key metadata source is doujinshi.org for all the doujinshi collectors out there. :) This would also mean another media "kind". Doujinshi typically aren't published by companies but by "circles", and that's an important piece of nomenclature. Also very important: doujinshi are usually released at conventions such as Comiket (the most famous example). A numbering system is derived from that (C50, for example, is an identifier usually found at the beginning of a doujinshi's digital file name), and it is also important to have a field for the convention at which it was released.

If you look at doujinshi.org at sample entries (NSFW warning, it's very mixed content there and there is definitely no setting to view the site in SFW mode :D) you'll find which metadata is important to include.

Gin-no-kami · 2020-03-23T13:01:06Z

Just wanted to put forward a "better" metadata source for manga, MangaUpdates. It has more up-to-date and detailed information about manga than Kitsu or MyAnimeList.

GlassedSilver · 2020-03-23T21:10:40Z

> Just wanted to put forward a "better" metadata source for manga, MangaUpdates. It has more up-to-date and detailed information about manga than Kitsu or MyAnimeList.

No db "has it all". (I realize you didn't try to imply this)
We should probably try to diversify imho. One day a source might die and then you gotta move the tables again anyhow. With a plug-in design one could even go really gung-ho and have a local db to connect to.

omgthegreenranger · 2020-06-19T14:03:21Z

I use Mylar for my comic post-processing, but it uses a modified ComicTagger script internally. I like CT a lot on its own because it allows more flexibility in data. I wonder if you could incorporate that tool into Komga somehow and give us some customization of the metadata field mapping. Have a basic default, but let us muck about if we want to.

I'd love to see the ability to grab story arc data from ComicVine, including upcoming and previous story arcs, and to collect all of an arc's issues into one collection, or to build a collection based on character appearances, etc. CT seems to fill in all of those details (I don't know if Mylar does).

chubits · 2020-07-01T15:26:44Z

It would be great to add "doujinshi.org" as a source for "Manga/Doujinshi".

GlassedSilver · 2020-07-01T19:34:55Z

> It would be great to add "doujinshi.org" as a source for "Manga/Doujinshi".

Indeed! Two more really good sources are nhentai and exhentai.

For reference, for those two one can look at the HappyPandaX project, specifically the plugins repo:

https://github.com/happypandax/plugins

There's also a plugin that reads metadata files created by two very popular downloaders. ("File Metadata" plugin)

Overall a pretty nifty project for any doujinshi lover and I've been using it for a few weeks now. Right now I'm having a few issues with importing, but I think I borked something. I'll sit down for that issue later.

So far my strategy is to use both Komga and HPX, but to have feature overlap would be terrific, there's a lot each project can learn from the other. I'm very happy both exist! <3

Ludo9743 · 2021-03-11T00:25:00Z

Hi!

Could you add the site manga-news as a source of French-language metadata about manga? It's really complete and has a lot of information about manga sold in French. Unfortunately, the site does not seem to have an API.

Thank you.

MKH-42 · 2021-03-28T11:10:40Z

Read metadata for comics with an ISBN from Google Books.
Comics with a manually or automatically filled ISBN should be looked up in Google Books for their metadata.
It should import:

- title
- authors
- publisher
- publish date
- description
- ISBN-13 when only an ISBN-10 was the input
- language
- (page count)

Google Books API:
https://www.googleapis.com/books/v1/volumes?q=isbn:1234567890123
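
For the "ISBN-13 when only an ISBN-10 was the input" item, the conversion can be done locally before querying, since it is just a prefix plus a recomputed check digit. A minimal sketch follows; the function names `isbn10_to_isbn13` and `google_books_isbn_url` are made up for illustration, and the URL is the one quoted above.

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to its ISBN-13 form (978 prefix, new check digit)."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected 10 characters, got %r" % isbn10)
    # Drop the old check digit (which may be "X") and prepend the 978 prefix.
    body = "978" + digits[:9]
    if not body.isdigit():
        raise ValueError("non-digit characters in %r" % isbn10)
    # ISBN-13 check digit: weights alternate 1, 3 over the first 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


def google_books_isbn_url(isbn):
    """Build the volumes query URL for an ISBN, as quoted in this thread."""
    return "https://www.googleapis.com/books/v1/volumes?q=isbn:" + isbn
```

For example, `isbn10_to_isbn13("0-306-40615-2")` gives `"9780306406157"`, which can then be passed to `google_books_isbn_url`.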

AniUrbz · 2021-05-03T23:14:58Z

Hi, could you please add AniList to the list for manga metadata? In practice AniList has been more complete for manga and manhwa than MyAnimeList, from one-shots to independent artists.

Here is the documentation and the api of the site. Thanks for your attention.
https://github.com/AniList/ApiV2-GraphQL-Docs

Bitwolfies · 2021-06-03T22:40:29Z

Would this feature embed the data into the CBZ like ComicTagger? Or would it exist only in Komga? (Or an option for either.)

MKH-42 · 2021-06-03T23:06:09Z

My wish is to add it to Komga only.
Maybe we can also create a feature request for exporting the metadata to ComicInfo.xml and including it in the comics.
For me, the automatic request is only the initial step during the registration or import of books.
Then you can also edit it manually.

Bitwolfies · 2021-06-03T23:10:19Z

> My wish is to add it to Komga only.
> Maybe we can also create a feature request for exporting the metadata to ComicInfo.xml and including it in the comics.
> For me, the automatic request is only the initial step during the registration or import of books.
> Then you can also edit it manually.

Personally I'd like the opposite, and would like to embed if possible, especially when his new comic metadata standard is ready. But both should be options.

Kussie · 2021-06-03T23:21:42Z

The easiest approach from a developer standpoint would probably be to start by adding it to Komga only as the first phase. The second phase would then probably be an export function that writes formats like ComicInfo.xml into the book files. The third would be the ability to automatically export to book files when metadata is changed.

Bitwolfies · 2021-06-03T23:45:58Z

> The easiest approach from a developer standpoint would probably be to start by adding it to Komga only as the first phase. The second phase would then probably be an export function that writes formats like ComicInfo.xml into the book files. The third would be the ability to automatically export to book files when metadata is changed.

Sounds about right. Normally I would prefer just Komga data, but I feel like books are a format where metadata should be embedded, much like music.

gotson · 2021-06-04T01:48:22Z

> Would this feature embed the data into the CBZ like ComicTagger?

Already requested here: #82

Inervo · 2022-01-12T16:09:18Z

Hi. As Komga is getting better and better with each update, with a nice metadata feature, I'm curious if this feature is still being considered (I hope 🤞 ).
What prerequisites do you want to have in place before implementing this feature?
Can we help somehow? :)

Thank you for this marvelous software

tomandocubatas · 2022-02-14T10:33:55Z

Hello,
Any progress with this functionality?
I think it would be a giant step in the functionality of the application.
Being able to fill in the metadata based on certain online websites is very, very interesting.

In any case, the work you are doing seems incredible to me. Thanks a lot!!!

Pfuenzle · 2022-02-21T17:36:38Z

As there is no integration in Komga for an Anime Metadata provider, I made my own using the metadata from Anisearch.
https://github.com/Pfuenzle/AnisearchKomga. It supports all languages that are available on Anisearch and pushes the metadata directly to Komga.
If someone wants to help me to port it to Java to implement it in Komga it would be great

Inervo · 2022-02-21T18:21:24Z

> As this is no integration in Komga for Metadata provider, I made my own using the metadata from Anisearch. https://github.com/Pfuenzle/AnisearchKomga. It supports all languages that are available on Anisearch and pushes the metadata directly to Komga. If someone wants to help me to port it to Java to implement it in Komga it would be great

Great work!

If there's the same sort of metadata for comics and BD (French/Belgian comics), I would love this!

gotson · 2022-02-21T23:59:58Z

> If someone wants to help me to port it to Java to implement it in Komga it would be great

Komga is in Kotlin 😉

The metadata retrieval is much more than hitting an API and mapping fields though. That bit is probably only 10% of what I envision for metadata retrieval.

Inervo · 2022-02-22T09:35:33Z

Thanks for your reply gotson :)

Can you share with us the main components or behavior you envision for metadata retrieval?
Maybe some of us can help you a bit for some part of the code ;) And by doing so, speed up the release date of this feature

Inervo · 2023-01-12T13:37:28Z

> Thanks for your reply gotson :)
>
> Can you share with us the main components or behavior you envision for metadata retrieval? Maybe some of us can help you a bit for some part of the code ;) And by doing so, speed up the release date of this feature

Hi @gotson. Happy new year !! :)
If that's okay with you, could you share with us your vision regarding the main components for metadata retrieval?
Maybe some of us can help you a bit for some part of the code ;) And by doing so, speed up the release date of this incredible feature and contribute to the great software you created :)

chu-shen · 2023-02-08T11:18:23Z

Bangumi metadata scraper for Komga👉https://github.com/chu-shen/BangumiKomga

Inspired by https://github.com/Pfuenzle/AnisearchKomga Thanks❤️

Inervo · 2023-02-15T22:00:14Z

In the meantime, for our French friends who wish to refresh their BD metadata from Bedetheque, here is a small metadata scraper I've written 👉 https://github.com/Inervo/BedethequeKomga

Inspired by chu-shen/BangumiKomga and aubustou/bedetheque_scraper. Thanks a lot ❤️

NB: it's been ages since I've written any code, so it's far from perfect. Don't hesitate to raise any issue or to contribute :)

knguyen1 · 2023-12-13T05:39:55Z

> > If someone wants to help me to port it to Java to implement it in Komga it would be great
>
> Komga is in Kotlin 😉
>
> The metadata retrieval is much more than hitting an api and mapping fields though. That bit is probably only 10% of what I envision for metadata retrieval.

Don't let "perfect" be the enemy of "good". ;) I'm sure if you start something others will contribute.

Lreaper · 2024-04-05T20:07:34Z

I consider this the most crucial feature still missing in Komga. Besides the obvious benefits of metadata scraping this would also greatly assist in tracking the current status of a series.

Blazeflack · 2024-06-01T10:36:06Z

Having this feature built-in would be very nice. I currently use the Komf server and userscript to give me the possibility to identify and import metadata for a series. It can also auto-identify an entire library, but I am much too scared to use that functionality, so I prefer the manual single-identify personally :)

Edit: It would be nice if this feature also has the possibility to merge info from multiple sources when importing metadata. No metadata provider has all wanted information when it comes to manga. While a preferred provider has the best descriptions, it may not present tags for that series, while other providers do. So merging in information like tags and authors/artists from other providers is really helpful. This is something Komf supports right now, and is something Komga would need to be able to do if it wants to shine in this area too :)
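
To make the merging idea concrete, here is one possible rule set, sketched under assumptions: providers are listed in order of preference, scalar fields such as the description come from the first provider that has them, and list fields such as tags and authors are unioned. The function `merge_metadata` and the field names are hypothetical and do not describe how Komf or Komga actually work.

```python
def merge_metadata(results, list_fields=("tags", "authors")):
    """Merge provider results given in priority order (most preferred first).

    Scalar fields take the first non-empty value; fields named in
    list_fields are unioned across providers, keeping first-seen order.
    """
    merged = {}
    for result in results:
        for key, value in result.items():
            if not value:
                continue  # an empty value never overrides or contributes
            if key in list_fields:
                combined = merged.setdefault(key, [])
                for item in value:
                    if item not in combined:
                        combined.append(item)
            elif key not in merged:
                merged[key] = value
    return merged
```

With a preferred provider that has a good summary but no tags, and a second provider that has tags, the merged result keeps the first summary and picks up the second provider's tags.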

gotson added the enhancement New feature or request label Oct 8, 2019

gotson mentioned this issue Dec 13, 2019

Features roadmap #22

Closed

16 tasks

gotson mentioned this issue Dec 20, 2019

Request: Group by Publisher #34

Closed

This was referenced Dec 31, 2019

Numbering for series with unusual issue numbering #43

Closed

Edit metadata manually #48

Closed

gotson mentioned this issue Feb 19, 2020

bdgest.com / bedetheque.com to be added as a scrapper #97

Closed

gotson mentioned this issue May 12, 2020

[Feature Request] ComicVine Taging #157

Closed

gotson mentioned this issue Jun 3, 2020

Restrict content by age rating per user #178

Closed

gotson changed the title ~~Metadata management~~ Retrieve metadata from online sources Jun 10, 2020

gotson added C: Server and removed C: Server labels Jun 29, 2020

gotson added the pinned label Mar 22, 2021

gotson mentioned this issue Mar 28, 2021

[Feature Request] Read Metadata from Google Books using ISBN #484

Closed

gotson mentioned this issue Jul 22, 2021

[Feature Request] Show in book view the information includes the book a comicinfo.xml #574

Closed

Kussie mentioned this issue Aug 25, 2021

[Feature Request] Add plugin system to support downloading chapters and metadata from manga websites #633

Closed

Jerrk mentioned this issue Oct 18, 2021

[Feature Request] Add tracking support for AL/MU/Kitsu/MAL/... #711

Closed

gotson removed the pinned label May 2, 2022


Repository owner locked and limited conversation to collaborators Jun 4, 2024

gotson converted this issue into discussion #1577 Jun 4, 2024

This issue was moved to a discussion.
