Request: Bookmark Summary #80

Open
iamhenry opened this issue Apr 12, 2024 · 7 comments

@iamhenry

This will probably require a lot of work, but for years I've been looking for an app that can take my bookmarks and create a summary from them.

Use case: take all my audio bookmarks and transcribe them to text, with a maximum duration of 60 seconds per bookmark to transcribe.

Basically what the Snipd podcast app is doing: it takes all my bookmarks from a podcast, generates a list, and provides them as notes for me to review and dive deeper into that topic.

lmk what you think 😊

(screenshot attached)

@rasmuslos
Owner

The idea is pretty cool, but this would depend on ABS providing transcriptions. I have looked into whisper and whisper.cpp to transcribe audio files, but I have not found the time to implement anything yet. Transcription would have to be added to ABS first, then transcripts shown in the now playing view, and only after that bookmark summaries.

I would also recommend opening an issue in the ABS repo for this feature, as this should probably be implemented server-side, too.

@iamhenry
Author

Is that the only solution? Is it possible to use an LLM API in the cloud to generate summaries on the fly, without having transcriptions?

@iamhenry
Author

Looks like there's a discussion around this that has gone a bit stale due to a lack of engineering resources.

Someone does mention Snipd, which is exactly what I was hoping we could have for ABS/ShelfPlayer:

advplyr/audiobookshelf#1723

@rasmuslos
Owner

While it is possible to upload the audio file to an LLM provider like OpenAI and prompt it to generate a short summary, it's really not ideal.
I am pretty sure this gets expensive really fast if you upload large audio files, which is required to give the model enough context. I am also not sure about the legal implications, e.g. whether you are even allowed to upload copyrighted works.
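
For illustration only, a minimal sketch of what that cloud approach might look like with the openai Python SDK; the clip path, model names and prompt are placeholders, not anything that exists in ABS or ShelfPlayer today:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Upload a short bookmarked clip for transcription (billed per minute of audio).
with open("bookmark_clip.mp3", "rb") as audio:  # placeholder file
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

# Ask a chat model to turn the transcript into a short bookmark note.
completion = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model choice
    messages=[
        {"role": "system", "content": "Summarise this audiobook excerpt in two sentences."},
        {"role": "user", "content": transcript.text},
    ],
)
print(completion.choices[0].message.content)
```

A short clip keeps the cost down but also limits the context the model sees, which is part of the trade-off described above.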

I have looked into whisper & whisper.cpp, tools that can be used to transcribe an item, and they work pretty well. While word-synced transcripts are not really possible, extracting timestamped sentences works fine. But I could not find the time to implement anything in audiobookshelf yet.
Using something like https://github.com/jzhang38/TinyLlama would probably suffice to then create summaries, but this requires the transcripts to exist in the first place.
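
As a rough sketch of that local route, the openai-whisper Python package (whisper.cpp exposes the same idea via its CLI) already returns timestamped segments; the file name and model size below are placeholders:

```python
import whisper

# "base" is one of the smaller checkpoints (~140 MB); larger ones transcribe better.
model = whisper.load_model("base")
result = model.transcribe("chapter_03.mp3")  # placeholder audio file

# Each segment carries start/end timestamps in seconds plus its text,
# which is enough for timestamped sentences (no word-level sync here).
for segment in result["segments"]:
    print(f'[{segment["start"]:8.1f} -> {segment["end"]:8.1f}] {segment["text"].strip()}')
```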

And including an open-source multimodal model to do the transcription locally is not really an option: the app is around 15 MB right now, and including even a small model would inflate that to at least 6 GB.

@iamhenry
Author

iamhenry commented May 3, 2024

I think someone in the ABS community will be attempting to solve this issue with an initial prototype.

I've been tracking the conversation here: advplyr/audiobookshelf#1723 (comment)

@iamhenry
Author

iamhenry commented Jul 12, 2024

Snipd just released a huge update related to this. I was curious to see what you thought and whether you have any aspirations to add this feature: https://x.com/snipd_app/status/1811024587292864948

The feature allows me to upload any audio file and convert it into chapters and a transcript, while also being able to create highlights with auto-generated AI titles.

I understand this is a huge task, but no other app I have checked is even thinking about this enhancement, and it could be a game changer for this app.

Attaching a few screenshots of the highlights and generated chapters:

(three screenshots attached)

@rasmuslos
Owner

I think the actual features are easy enough to implement. Generating a transcript using whisper and then feeding it, together with a timestamp and a good prompt, into an LLM like Llama is not that hard; the question is where you run these models.

The Snipd app is around 120 MB, but I don't think the models are included (Whisper Base is around 140 MB, Llama even bigger), so including them in the app binary is not possible. The memory consumption is also considerable (500 MB for whisper and multiple GB for Llama).
Snipd runs them on its servers, which is why it charges a subscription, a business model not suitable for ShelfPlayer. The AI features would have to be implemented in ABS, where large binaries, huge memory consumption and long program runtimes are possible. I looked into doing this but was so unfamiliar with the codebase that I didn't pull through.
I may try again in the winter, but until someone adds these features to ABS it's not feasible to add them to ShelfPlayer.
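
To make the "transcript + timestamp + prompt" step concrete, here is a hypothetical helper (the function name, window size and prompt wording are made up for this sketch) that picks the Whisper-style segments around a bookmark and builds a prompt a Llama-class model could summarise server-side:

```python
def bookmark_prompt(segments, bookmark_s, window_s=60.0):
    """Build a summary prompt from Whisper-style segments around a bookmark.

    segments: list of dicts with "start", "end" (seconds) and "text".
    bookmark_s: bookmark position in seconds.
    window_s: how much audio context to include on each side of the bookmark.
    """
    nearby = [
        s["text"].strip()
        for s in segments
        if s["end"] >= bookmark_s - window_s and s["start"] <= bookmark_s + window_s
    ]
    excerpt = " ".join(nearby)
    return (
        "Summarise the following audiobook excerpt in one or two sentences, "
        "suitable as a bookmark note:\n\n" + excerpt
    )

# Example: a bookmark placed 42 minutes 10 seconds into the chapter.
# prompt = bookmark_prompt(result["segments"], bookmark_s=42 * 60 + 10)
```

The heavy parts (Whisper and the LLM itself) would still have to run server-side in ABS, as described above.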
