Why is the flatpak app so big? #52

Does it have Whisper and the other apps included? Is it not better to move those to a download mode, just like the language data?
Exactly, the flatpak package contains most (but not all) of the dependencies. This includes dozens of Python libraries, the CUDA and partial ROCm frameworks, etc. The libraries have to be shipped inside the Speech Note package because the flatpak sandbox blocks any use of system libraries. In detail, it looks like this (these are the biggest ones):
The most problematic are the Python libraries. They have a tendency to be ridiculously huge.
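If you are curious, you can peek inside the sandbox yourself (a quick illustration, assuming the Flathub package is installed; the exact layout may differ between releases):

```sh
# Open a shell inside the app's sandbox. /usr here is the Flatpak
# runtime, not the host's /usr, so nothing from the host can be reused.
flatpak run --command=sh net.mkiol.SpeechNote

# Inside the sandbox: the bundled libraries live under /app.
du -sh /app/lib
```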
Good idea, but not so easy to implement. I really don't want to create a separate "package system" only for this app. Maybe the solution is to distribute in a non-sandboxed package format, in which case the app could use libraries installed in the system. Actually, here is a pull request adding an AUR package. It works quite well, I must say. Maybe next will be a package for Debian.
Cool. Thank you for this overview. Very interesting.
Right now the app is too big. I don't have Nvidia, and my GPU is integrated (an AMD APU), so I don't think ROCm is even applicable to me. About integration: I am not a flatpak dev, but when I try to install the Steam client I see things in flatpak like app/com.valvesoftware.Steam/x86_64/stable, as if Steam is the main app and the others are add-ons you can install on top. So maybe you could separate CUDA and the rest into add-ons in flatpak, and they would get integrated (or could be integrated) with the main app if the user installs them. Another way to do it is to use flatpak for the main app but ship the CUDA libs and the others as separate downloads hosted on GitHub that the user fetches inside the app (not system-installed), just like the language model data is right now.
Thanks for pointing that out. I wasn't aware of this possibility. In flatpak vocabulary it is called "extensions". I will investigate what can be done.
flatpak/flatpak-docs#18 (comment) When I searched for "extension" and "flatpak" I got a lot of GNOME extensions results, which wasn't helpful. This is way better: https://old.reddit.com/r/flatpak/comments/hoeenw/example_of_extension_packages_with_extradata/
@rezad1393 Thank you for sharing. There is very little documentation regarding this functionality.
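From the examples linked above, an extension point in a flatpak-builder manifest seems to look roughly like this (a sketch with hypothetical IDs and paths, not an actual Speech Note manifest):

```yaml
# Sketch of an optional extension point in a flatpak-builder manifest.
# The app ID, extension ID and paths are hypothetical.
app-id: org.example.SpeechApp
add-extensions:
  org.example.SpeechApp.Addon.cuda:
    directory: addons/cuda   # mounted under /app/addons/cuda when installed
    add-ld-path: lib         # adds that subdirectory to the library search path
    no-autodownload: true    # only installed when the user explicitly asks
    autodelete: true         # removed automatically with the main app
```

The add-on itself would then be built and published as a separate flatpak that installs into that directory, so users who don't need CUDA never download it.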
I've pushed the 4.3.0 release to Flathub. The updated package is even bigger than the previous one - sorry. I'm still working on implementing the "modular approach" but it is not yet ready. To address the problem, I'm publishing a "Tiny" flatpak package as well. You can download it from the Releases page. This "Tiny" version is much smaller and contains only the basic features. Comparison between the "Flathub" and "Tiny" flatpak packages:
Thank you. I am not in a hurry, and I am playing with the speech-to-text part. I can wait if you are going to release a modular version. By the way, sometimes GitHub will ban/censor you for hosting large files (like the add-ons), so please consider hosting them as add-ons on Flathub. This has the added benefit of providing customized and working versions of the add-ons (engines and data).
"Modular" version has been released 🎉 It looks like this (section from README): Starting from v4.4.0, the app distributed via Flatpak (published on Flathub) consists of the following packages:
Comparison between Base, Tiny and Add-ons Flatpak packages:
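In practice, installing the base app and an optional add-on from Flathub looks like this (the add-on ID below is illustrative; the real names can be listed with flatpak remote-ls):

```sh
# Install the base app from Flathub.
flatpak install flathub net.mkiol.SpeechNote

# Optionally install a GPU add-on. The ID below is illustrative;
# list the actual add-on names with:
#   flatpak remote-ls flathub | grep SpeechNote
flatpak install flathub net.mkiol.SpeechNote.Addon.nvidia
```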
Thank you for the new release. By the way, can you add the Tiny version to Flathub? Also, what exactly is "Punctuation restoration"?
Adding it as a new app to Flathub would be problematic. This "Tiny" package is de facto "Base" without everything that depends on Python. Actually, I'm thinking about creating an additional add-on with all the Python dependencies, so "Tiny" would become "Base". This might work 🤔 On the other hand, I don't want to make too many add-ons, so as not to confuse users too much.
Yeah, the function of this is definitely not well explained in the app. "Punctuation restoration" is an extra text-processing step after STT. This processing uses an additional ML model (you need to download it in the model browser) to guess the right punctuation marks (,.?!:) in the text. It is only needed, and only enabled, for DeepSpeech, Vosk and a few April models. Whisper and Faster Whisper natively support punctuation, therefore this feature is not used for those models. I've made "Punctuation restoration" an option (vs. always enabled) because the currently configured ML model for restoring punctuation is quite slow and memory-hungry. If you are looking for speed, this feature should be disabled.
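To give an idea of what this step does, here is a tiny illustration using a publicly available punctuation-restoration library (not necessarily the model or code the app actually uses):

```python
# Illustration of ML-based punctuation restoration as a post-STT step.
# Uses the publicly available "deepmultilingualpunctuation" package
# (pip install deepmultilingualpunctuation); Speech Note's configured
# model may be different.
from deepmultilingualpunctuation import PunctuationModel

model = PunctuationModel()  # downloads a transformer model on first use

raw_stt_output = "how are you today i am fine thank you"
restored = model.restore_punctuation(raw_stt_output)
print(restored)  # marks like , . ? are inserted back into the text
```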
Thanks for the answer. Is any one of them better than the others? I use Whisper because I saw that one first, and the others I tried were not good, even with English.
The license is not the same for all models. In v4.4.0 you can check the individual license in the model browser. Not all models have license information because I wasn't able to label all of them. Simply, there are too many models! In general, models can be divided into two groups: "free to use" and "free to use only for non-commercial purposes". A non-commercial model should have the correct license attached in the model browser. When the license is missing, most likely the model is "free to use". Which TTS model is the best? As with everything, it depends :) My favorite is
I incorrectly said TTS. Is there any model that gives the best results for English speech-to-text? Thank you for the TTS part of the answer.
Right 😄 I won't tell you anything you don't know. The best STT models are from the "Whisper" family. Usually I use "Whisper Large-v2" because it has outstanding accuracy and, with GPU acceleration enabled, it is perfectly usable in terms of speed. If you can't use a GPU, I would go with "Distil-FasterWhisper Small".