Optimise Event Processing: First Frame as Key Frame, Improved Memory Prompt, and Gemini Error catching #251
Conversation
1. Saves the event to the timeline no matter how important it is. 2. Starts analysis in parallel, to avoid missing the initial clips.

Added as a draft, as I'm still doing some real-life testing with the new version.
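The "start analysis in parallel" idea from the description can be sketched with asyncio: kick off the stream analysis immediately and save the event to the timeline while it runs. This is only an illustrative sketch; `save_event_to_timeline` and `analyze_stream` are hypothetical stand-ins, not functions from the integration.

```python
import asyncio

# Hypothetical stand-ins for the integration's real coroutines.
async def save_event_to_timeline(event: dict) -> None:
    """Persist the event regardless of its 'importance' score."""
    await asyncio.sleep(0)  # placeholder for the actual timeline write

async def analyze_stream(event: dict) -> str:
    """Run the LLM analysis on the event's clip."""
    await asyncio.sleep(0)  # placeholder for the actual analysis
    return "analysis result"

async def handle_event(event: dict) -> str:
    # Start the analysis immediately so the first seconds of the clip
    # are not lost while the event is being saved.
    analysis_task = asyncio.create_task(analyze_stream(event))
    await save_event_to_timeline(event)
    return await analysis_task

result = asyncio.run(handle_event({"camera": "front_door"}))
```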
Appreciate all your work!

I'd absolutely love to, but I'm keen to hear your thoughts on hardcoding the initial frame as a key frame. The existing "important" feature currently adds a significant 1–2 second delay to the stream analyzer unless it is handled concurrently, either via blueprints or Python. Let me know your thoughts!

I think always including the first frame for analysis is a great idea, and using the first frame as the key frame as well! Also, I really appreciate the translation! You're right. Let me know what you think, and thank you for your work; it is much appreciated!
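The "first frame as key frame" idea discussed above can be sketched as a small selection helper: the first frame is always included (it is available immediately, so analysis can start without waiting for a classifier), and remaining slots go to the highest-scoring frames. `frames` and `scores` are hypothetical inputs, not the integration's actual media-handler data structures.

```python
def select_frames(frames, scores, max_frames=3):
    """Pick frames for analysis, always keeping the first frame.

    `frames` and `scores` are hypothetical: decoded frames and their
    motion/importance scores from a media handler.
    """
    if not frames:
        return []
    # The first frame is always a key frame: it is available immediately,
    # so analysis can start without waiting for classification.
    selected = [0]
    # Fill remaining slots with the highest-scoring other frames,
    # then restore chronological order.
    rest = sorted(range(1, len(frames)), key=lambda i: scores[i], reverse=True)
    selected += rest[: max_frames - 1]
    return [frames[i] for i in sorted(selected)]
```

With six frames scored `[0, 5, 1, 9, 2, 3]` and `max_frames=3`, the helper keeps frame 0 plus the two best-scoring others (indices 3 and 1), returned in order.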
That's great to hear! To further personalise it, I believe running some basic CV before sending the image and prompt to the LLM could greatly improve accuracy and usability. It would also be good if error handling could be added to the blueprints (my free Gemini model usually exceeds its quota).

I'm sure using CV before sending would increase accuracy, but it would also increase latency. I think it's best to keep the integration minimal and focused on one thing. Personally, I use Frigate (which does all the CV processing, e.g. detecting people), and the automation only triggers if Frigate detects something.

The other problem with more sophisticated preprocessing is raw performance. A big part of CV is optimizing it to run well on different hardware, with all the vendor-specific hardware acceleration. In my opinion, it's better to offer tighter integration with tools like Frigate that already do that.

I 100% agree on the error handling. I'm actually pretty new to blueprints, so I don't know if there is a proper way to catch errors; so far I haven't found anything. Graceful error handling is definitely something that would be nice! The Gemini issue you mentioned doesn't seem limited to this integration; it also affects the native 'Google Generative AI' integration. There is an issue here: #262 (another reason error handling would indeed be great!)
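The graceful error handling discussed here could take the shape of a small wrapper that catches quota errors and logs instead of failing the whole automation. This is a hedged sketch only: `QuotaExceededError` and `safe_analyze` are hypothetical names, not part of the integration or any provider SDK, which each raise their own error types.

```python
import logging

_LOGGER = logging.getLogger(__name__)

class QuotaExceededError(Exception):
    """Hypothetical marker for a provider 429 / quota-exhausted response."""

def safe_analyze(call, *args, **kwargs):
    """Run a provider request and degrade gracefully instead of
    aborting the automation. `call` is any provider request function."""
    try:
        return call(*args, **kwargs)
    except QuotaExceededError:
        _LOGGER.warning("Gemini quota exceeded; skipping analysis for this event")
        return None
    except Exception as err:  # provider SDKs raise many error types
        _LOGGER.error("Vision analysis failed: %s", err)
        return None
```

Returning `None` (rather than raising) lets a blueprint treat "no analysis" the same as "nothing detected", which keeps notifications flowing even when the free-tier quota runs out.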
Great insights!
Do you mind sharing your email address or discord? I'm happy to connect and talk about it in other applications if you are keen.
Sure! valentinfrlch on Discord.
Done!
valentinfrlch
left a comment
Thank you! LGTM.
Unrelated, but while looking through this I realized there are way too many consts. A single API_KEY would be enough (instead of every provider having its own const). I'll fix that soon.
Blueprints Updates
- Fixed issues with Gemini model selection
- Added a default prompt to enhance the "memory" function
- Tried to run `stream_analyzer` in parallel with the `image_analyzer` that classifies "important", but no luck: `stream_analyzer` only starts processing after `image_analyzer` has finished classifying "important or not". So the "important" and `image_analyzer` features were removed.

Media Handler Changes