
Conversation


@regexer regexer commented Sep 25, 2025


@jrobble jrobble left a comment


@jrobble reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @regexer)


python/LlamaVideoSummarization/plugin-files/descriptor/descriptor.json line 74 at r1 (raw file):
Update to:

timeline within the "desired" range of TIMELINE_CHECK_TARGET_THRESHOLD


python/LlamaVideoSummarization/tests/test_llama_video_summarization.py line 416 at r1 (raw file):

        component = LlamaVideoSummarizationComponent()

        drone_timeline_segment_1 = copy.deepcopy(DRONE_TIMELINE_SEGMENT_1)

Good catch on needing the copy so as not to modify the original dicts.


python/LlamaVideoSummarization/tests/test_llama_video_summarization.py line 441 at r1 (raw file):

        self.assertEqual(mpf.DetectionError.DETECTION_FAILED, cm.exception.error_code)
        self.assertIn("Timeline event start time occurs too soon before segment start time. (179.9798 - 0.0) > 20.", str(cm.exception))

Why remove this check?


python/LlamaVideoSummarization/tests/test_llama_video_summarization.py line 449 at r1 (raw file):

        self.assertEqual(mpf.DetectionError.DETECTION_FAILED, cm.exception.error_code)
        self.assertIn("Timeline event end time occurs too late after segment stop time. (381.17 - 299.96633333333335) > 20.", str(cm.exception))

Why remove this check?


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 222 at r1 (raw file):

                         f'abs({segment_start_time} - {min_event_start}) > {target_threshold}.'))
                if abs(segment_start_time - min_event_start) > accept_threshold:
                    acceptable_checks['after_seg_start'] = False

A better name than "after_seg_start" would be "near_seg_start" since we're comparing against an absolute value.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 231 at r1 (raw file):

                            f'abs({max_event_end} - {segment_stop_time}) > {target_threshold}.'))
                if abs(max_event_end - segment_stop_time) > accept_threshold:
                    acceptable_checks['before_seg_stop'] = False

A better name than "before_seg_stop" would be "near_seg_stop" since we're comparing against an absolute value.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 234 at r1 (raw file):

        if not hard_error:
            if list({v for v in acceptable_checks.values()}) == [True]:

According to AI this can be simplified using all(). See first example:

# Example dictionary with all truthy values
dict1 = {"key1": True, "key2": 1, "key3": "hello"}
result1 = all(dict1.values())
print(f"Are all values in dict1 truthy? {result1}")

# Example dictionary with at least one falsy value
dict2 = {"keyA": True, "keyB": 0, "keyC": "world"}
result2 = all(dict2.values())
print(f"Are all values in dict2 truthy? {result2}")

# Example dictionary with only falsy values
dict3 = {"keyX": False, "keyY": 0, "keyZ": ""}
result3 = all(dict3.values())
print(f"Are all values in dict3 truthy? {result3}")

# Example of an empty dictionary (returns True)
dict4 = {}
result4 = all(dict4.values())
print(f"Are all values in dict4 truthy? {result4}")

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 235 at r1 (raw file):

        if not hard_error:
            if list({v for v in acceptable_checks.values()}) == [True]:
                self._store_acceptable_response(response_json)

It's a little odd that this is stored as a class variable. Consider instead returning a second value from this method, a boolean that indicates if the response was acceptable or not. Use that to keep track of the acceptable response json in the calling method:

response_json = {}
acceptable_response_json = {}  # NEW
error = None
while max(attempts.values()) < max_attempts:


@jrobble jrobble left a comment


Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @regexer)


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 121 at r1 (raw file):

            acceptable_timeline = self._fetch_acceptable_response()
            if acceptable_timeline is not None:
                return acceptable_timeline

Log a warning here saying the target criteria couldn't be reached so falling back to acceptable criteria.


jrobble commented Sep 26, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 121 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Log a warning here saying the target criteria couldn't be reached so falling back to acceptable criteria.

Actually, make that an info line.
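A minimal sketch of the fallback logging this suggests, assuming a module-level logger (the function and variable names are illustrative, not the component's actual code):

```python
import logging

logger = logging.getLogger('LlamaVideoSummarization')

def choose_timeline(target_timeline, acceptable_timeline):
    # Prefer a timeline that met the target criteria; otherwise note the
    # fallback at info level and use the acceptable one.
    if target_timeline is not None:
        return target_timeline
    if acceptable_timeline is not None:
        logger.info('Target timeline criteria could not be met; '
                    'falling back to acceptable criteria.')
        return acceptable_timeline
    return None
```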


@regexer regexer left a comment


@regexer reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @jrobble)


python/LlamaVideoSummarization/tests/test_llama_video_summarization.py line 441 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Why remove this check?

The new logic handles acceptable thresholds, soft errors (which fall through to the acceptable timelines), and hard errors (which always fail). With these changes to timeline validation, the previous test timeline entries were no longer caught by the previously expected error.

I was unable to create a test timeline that would trigger these specific errors, but I did not want to remove the error logic.


python/LlamaVideoSummarization/tests/test_llama_video_summarization.py line 449 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Why remove this check?

See above.


jrobble commented Sep 26, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 207 at r1 (raw file):

            if (timestamp_end - segment_stop_time) > target_threshold:
                soft_error = (f'Timeline event end time occurs too late after segment stop time. '

After further discussion, this check may no longer be needed. It seems to be covered by:

"Max timeline event end time not close enough to segment stop time."


jrobble commented Sep 26, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 200 at r1 (raw file):

            
            if (segment_start_time - timestamp_start) > target_threshold:
                soft_error = (f'Timeline event start time occurs too soon before segment start time. '

After further discussion, this check may no longer be needed. It seems to be covered by:

"Min timeline event start time not close enough to segment start time."


jrobble commented Sep 26, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 200 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

After further discussion, this check may no longer be needed. It seems to be covered by:

"Min timeline event start time not close enough to segment start time."

The new code returns hard errors first. The old code could return soft errors first. Whenever this soft error would have occurred the hard error would have also occurred. It's just that the old code never got there.


jrobble commented Sep 26, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 200 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

The new code returns hard errors first. The old code could return soft errors first. Whenever this soft error would have occurred the hard error would have also occurred. It's just that the old code never got there.

Scratch that. This isn't related to hard vs. soft. It has more to do with this code:

if len(minmax_errors) > 0:
    soft_error = minmax_errors.pop()

The minmax errors are returned over all other soft errors.

Still though, a minmax error should always occur if a "Timeline event start time occurs too soon before segment start time." is to occur.


jrobble commented Sep 29, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 222 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

A better name than "after_seg_start" would be "near_seg_start" since we're comparing against an absolute value.

After further discussion, we don't need this check. "Min timeline event start time not close enough to segment start time."


jrobble commented Oct 1, 2025

python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 121 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Actually, make that an info line.

I pushed this change.


@jrobble jrobble left a comment


@jrobble reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @regexer)


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 200 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Scratch that. This isn't related to hard vs. soft. It has more to do with this code:

if len(minmax_errors) > 0:
    soft_error = minmax_errors.pop()

The minmax errors are returned over all other soft errors.

Still though, a minmax error should always occur if a "Timeline event start time occurs too soon before segment start time." is to occur.

I addressed this in the changes I pushed.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 207 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

After further discussion, this check may no longer be needed. It seems to be covered by:

"Max timeline event end time not close enough to segment stop time."

I pushed this change.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 222 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

After further discussion, we don't need this check. "Min timeline event start time not close enough to segment start time."

I pushed this change.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 231 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

A better name than "before_seg_stop" would be "near_seg_stop since we're comparing against an absolute value.

I pushed this change.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 234 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

According to AI this can be simplified using all(). See first example:

# Example dictionary with all truthy values
dict1 = {"key1": True, "key2": 1, "key3": "hello"}
result1 = all(dict1.values())
print(f"Are all values in dict1 truthy? {result1}")

# Example dictionary with at least one falsy value
dict2 = {"keyA": True, "keyB": 0, "keyC": "world"}
result2 = all(dict2.values())
print(f"Are all values in dict2 truthy? {result2}")

# Example dictionary with only falsy values
dict3 = {"keyX": False, "keyY": 0, "keyZ": ""}
result3 = all(dict3.values())
print(f"Are all values in dict3 truthy? {result3}")

# Example of an empty dictionary (returns True)
dict4 = {}
result4 = all(dict4.values())
print(f"Are all values in dict4 truthy? {result4}")

I pushed this change.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 235 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

It's a little odd that this is stored as a class variable. Consider instead returning a second value from this method, a boolean that indicates if the response was acceptable or not. Use that to keep track of the acceptable response json in the calling method:

response_json = {}
acceptable_response_json = {}  # NEW
error = None
while max(attempts.values()) < max_attempts:

I pushed this change.


python/LlamaVideoSummarization/plugin-files/descriptor/descriptor.json line 74 at r1 (raw file):

Previously, jrobble (Jeff Robble) wrote…

Update to:

timeline within the "desired" range of TIMELINE_CHECK_TARGET_THRESHOLD

I pushed this change.


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 174 at r1 (raw file):

        # start with passing checks, then fail as secondary checks are conducted
        acceptable_checks = dict(

I changed the logic so these are initialized to false and must be explicitly set to true. That way we avoid potential logic errors where one of these checks is never evaluated and the default assumes the response is acceptable.
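As a sketch of the fail-closed pattern described here (the key names and threshold handling are illustrative):

```python
def evaluate_checks(diff_start, diff_stop, accept_threshold):
    # Fail closed: every check starts False and must be explicitly proven
    # True, so a check that is never evaluated can never pass by default.
    acceptable_checks = dict(
        near_seg_start=False,
        near_seg_stop=False,
    )
    if abs(diff_start) <= accept_threshold:
        acceptable_checks['near_seg_start'] = True
    if abs(diff_stop) <= accept_threshold:
        acceptable_checks['near_seg_stop'] = True
    return acceptable_checks
```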


python/LlamaVideoSummarization/llama_video_summarization_component/__init__.py line 384 at r1 (raw file):

def _get_timestamp_value(seconds: Any) -> float:
    if isinstance(seconds, str):
        if re.match(r"\s*\d+(\.\d*)?\s*[Ss]?", seconds):

This regex doesn't work for "7:12", which is used by one of the unit tests. The reason is that the leading "7" by itself is detected as a valid match at the start of the string. To get around this, I had to use the "^" and "$" symbols at the beginning and end of the regex to ensure it matches the entire string.
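A small sketch of the difference, using the pattern from the diff above (the variable names are illustrative):

```python
import re

# Unanchored: re.match() accepts "7:12" because the leading "7" alone is a
# valid match at the start of the string.
loose = re.compile(r"\s*\d+(\.\d*)?\s*[Ss]?")
assert loose.match("7:12") is not None

# Anchored with ^ and $: the pattern must now consume the entire string, so
# "7:12" is rejected while plain seconds like "7.5s" still pass.
strict = re.compile(r"^\s*\d+(\.\d*)?\s*[Ss]?$")
assert strict.match("7:12") is None
assert strict.match("7.5s") is not None
```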

@jrobble jrobble merged commit 89d79b0 into master Oct 1, 2025
2 checks passed
@jrobble jrobble deleted the hf/llama-acceptable-threshold branch October 1, 2025 21:34
hhuangMITRE pushed a commit that referenced this pull request Oct 14, 2025
…HOLD (#409)

* Validate timestamps.

---------

Co-authored-by: jrobble <jrobble@mitre.org>