
Conversation

@lich2000117
Contributor

@lich2000117 lich2000117 commented Mar 6, 2025

Blueprints Updates

  1. Fixed issues with Gemini model selection

    • The default model should be empty to prevent incorrect behavior.
  2. Added a default prompt to enhance the "memory" function

    • This ensures better AI assistance when identifying persons/pets.
  3. Tried running stream_analyzer in parallel with the image_analyzer that classifies "important", but had no luck, so the "important" and "image_analyzer" features were removed.

  • This ensures no initial clips/frames are missed; previously, stream_analyzer only started processing after image_analyzer finished classifying "important or not".

Media Handler Changes

  1. Prioritized the first frame of the stream as the most important by hardcoding its similarity score (see the sketch after this list)
    • Since the first frame triggers the automation, it should always be considered important.
    • This reduces edge cases where objects that appear for a short duration might otherwise be missed.

1. Saves the event to the timeline regardless of how important it is.
2. Starts analysis in parallel to avoid missing initial clips.
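
For illustration, a minimal sketch of the first-frame prioritisation (function and variable names are illustrative, not the integration's actual media handler API):

```python
# Sketch only: select_key_frames and compute_similarity are stand-in names.

def select_key_frames(frames, compute_similarity, max_frames=3):
    """Pick frames to send for analysis, always including the first frame."""
    if not frames:
        return []

    # The first frame triggered the automation, so it is always kept,
    # regardless of its similarity score.
    selected = [frames[0]]

    # Score the remaining frames; here a lower similarity to the previous
    # frame is assumed to mean "more new information".
    scored = sorted(
        ((compute_similarity(frames[i], frames[i - 1]), i) for i in range(1, len(frames))),
        key=lambda pair: pair[0],
    )
    selected.extend(frames[i] for _, i in scored[: max_frames - 1])
    return selected
```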
@lich2000117
Contributor Author

Added as a draft, as I'm still doing some real-life testing with the new version.

@valentinfrlch
Owner

Appreciate all your work!
Just wanted to check if you still want to contribute to this repository or maintain your own fork. Both are fine, of course. Also let me know if you require any assistance!

@lich2000117
Contributor Author

Appreciate all your work! Just wanted to check if you still want to contribute to this repository or maintain your own fork. Both are fine, of course. Also let me know if you require any assistance!

I'd absolutely love to, but I'm keen to hear your thoughts on hardcoding the initial frame as a key frame.

The existing "important" feature currently adds a significant 1-2 second delay to the stream analyzer unless it is handled concurrently, either in the Blueprint or in Python.
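
Roughly what I mean by handling it concurrently on the Python side (a sketch only; analyze_stream and classify_important are stand-in names, not real functions from this integration):

```python
import asyncio

async def analyze_stream(frames):
    # Stand-in: start summarizing frames immediately so the initial clips
    # are not skipped while waiting on the importance check.
    return f"summary of {len(frames)} frames"

async def classify_important(first_frame):
    # Stand-in: decide whether the event warrants a notification at all.
    return True

async def handle_event(frames):
    # Run the stream analysis and the "important" check at the same time,
    # instead of starting the analysis only after the check has returned.
    summary, important = await asyncio.gather(
        analyze_stream(frames),
        classify_important(frames[0]),
    )
    return summary, important
```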

Lmk your thoughts!

@valentinfrlch
Owner

I think always including the first frame for analysis is a great idea, and using the first frame as the keyframe as well! Also, really appreciate the translation!

You're right, the important feature adds some delay. I don't use it myself. Maybe it could be improved, though since sending the actual notification depends on what "important" returns, I'm not sure it can be parallelized. The feature is a bit of a gimmick (which is why it is labelled 'experimental'), but I like the idea. It is optional, after all.

Let me know what you think, and thank you for your work, it is much appreciated!

@lich2000117
Contributor Author

lich2000117 commented Mar 19, 2025

That's great to hear! To personalise it further, I believe running some basic CV before sending the image and prompt to the LLM could greatly improve accuracy and usability.

It would also be good if error handling could be added to the blueprints (my free Gemini model usually exceeds its quota).

Edit:
BTW, the link and author information in manifest.json were changed for my local testing of the integration (I set up my own HACS integration to pull updates from my forked branch). I have reverted them and will help create a PR soon.

@valentinfrlch
Owner

I'm sure using CV before sending would increase accuracy, but it would also increase latency. I think it's best to keep the integration minimal and focused on one thing. Personally, I use Frigate (which does all the CV processing, e.g. detecting people), and the automation only triggers if Frigate detects something.

The other problem with more sophisticated preprocessing is raw performance. A big part of CV is optimizing it to run well on different hardware, with all the vendor-specific hardware acceleration. In my opinion, it's better to offer tighter integration with tools like Frigate that already do that.

I 100% agree on the error handling. I'm actually pretty new to blueprints and so I don't know if there is a proper way to catch errors. So far I haven't found anything. Graceful error handling is definitely something that would be nice!

The Gemini issue you mentioned doesn't seem limited to this integration; it also affects the native 'Google Generative AI' integration. There is an issue here: #262 (another reason error handling would indeed be great!).
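
On the integration side, graceful handling would probably look something like this (a sketch only; call_provider and notify_user are made-up names, not the actual API):

```python
# Sketch: call_provider and notify_user are illustrative stand-ins.

async def analyze_gracefully(call_provider, notify_user, payload):
    try:
        return await call_provider(payload)
    except Exception as err:  # e.g. Gemini returning a quota-exceeded error
        # Surface the failure instead of letting the automation fail silently.
        await notify_user(f"LLM Vision request failed: {err}")
        return None
```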

@lich2000117
Contributor Author

lich2000117 commented Mar 19, 2025 via email

@lich2000117 lich2000117 changed the title from "Optimise Event Processing: Parallel Stream Analysis, Improved Memory Prompt, and Gemini Fixes" to "Optimise Event Processing: First Frame as Key Frame, Improved Memory Prompt, and Gemini Error catching" on Mar 19, 2025
@lich2000117 lich2000117 marked this pull request as ready for review March 19, 2025 23:29
@valentinfrlch
Owner

Sure! valentinfrlch on Discord.

@lich2000117
Contributor Author

lich2000117 commented Mar 23, 2025 via email

@valentinfrlch valentinfrlch changed the base branch from main to 1.4.2-beta March 25, 2025 10:58
Owner

@valentinfrlch valentinfrlch left a comment


Thank you! LGTM.

Unrelated, but while I was looking through this, I realized there are way too many consts. One API_KEY would be enough (instead of every provider having their own const). I'll fix that soon.
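
Roughly what I have in mind (a sketch of const.py, not the current file):

```python
# const.py (sketch): one shared config key for every provider ...
CONF_API_KEY = "api_key"

# ... instead of a separate const per provider, e.g.:
# CONF_OPENAI_API_KEY = "openai_api_key"
# CONF_GOOGLE_API_KEY = "google_api_key"
# CONF_ANTHROPIC_API_KEY = "anthropic_api_key"
```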

@valentinfrlch valentinfrlch merged commit a43d126 into valentinfrlch:1.4.2-beta Mar 25, 2025
4 checks passed