# Execution in Google Colab
Welcome to this Jupyter notebook! Its primary goal is to measure the delay between the moment an explosion becomes visible in a video and the moment its sound is heard. From this delay, we can estimate the distance between the explosion and the camera, which helps with geolocation tasks.

Advantages of using this notebook:

- Simplicity: No need for a local Python setup. Run everything directly in Google Colab.
- Accessibility: Easily calculate distance based on sound and visual delays without diving deep into the codebase.

http://colab.research.google.com/github/davidnewschool/sound-delay/blob/develop/colab.ipynb

### Initial Setup
To make sure we use the latest methods for our analysis, we clone the `develop` branch of the sound-delay repository, which contains the most recent code for the distance calculation.

In [None]:
!git clone -b develop https://github.com/davidnewschool/sound-delay.git

In [None]:
import os
if not os.getcwd().endswith('/sound-delay'):
    %cd sound-delay

# Show the files
!ls

In [None]:
!pip install -r colab.txt

## Distance Calculation
The central premise here is the difference in speed between light and sound. When an explosion occurs, we typically see the explosion (light travels faster) before we hear it (sound travels slower). By analyzing this delay, we can make an educated guess about how far the camera was from the explosion, helping in geolocation efforts.
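
As a rough worked example of this premise, the sketch below converts a measured delay into a distance. It treats the light travel time as negligible and uses a common linear approximation for the speed of sound in dry air. The function names `speed_of_sound` and `estimate_distance` and the 20 °C default are assumptions made for illustration; they are not taken from the repository's `distance.py`.

```python
def speed_of_sound(temperature_c=20.0):
    """Approximate speed of sound in dry air (m/s) at the given temperature in Celsius."""
    return 331.3 + 0.606 * temperature_c


def estimate_distance(delay_s, temperature_c=20.0):
    """Distance in metres that sound travels during delay_s seconds."""
    if delay_s < 0:
        raise ValueError("the sound cannot arrive before the flash")
    return speed_of_sound(temperature_c) * delay_s


# A delay of 2.5 s between flash and bang at an assumed 20 degrees Celsius
distance_m = estimate_distance(2.5)
print(f"Estimated distance: {distance_m:.0f} m")
```

Since the visual cue can only be timed to the nearest video frame, a 30 fps recording limits the resolution to roughly 11 m per frame (about 343 m/s divided by 30).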

### Analyze the Delay in a Video

In [None]:
# Remove the # on the next line if you run this locally and want the matplotlib window as an interactive popup. This does not work on Google Colab.
# %matplotlib tk

# The script requires a video file to process. You can specify the video file in two ways:
# Directly within the script (as shown below):
# %run plot_delay.py example/video.mp4

# If you don't specify here, don't worry! The script will prompt you for the file path later.
%run plot_delay.py

## Experimental Analysis
This section is experimental. It attempts to automate two steps:

1. Visual Detection: Identifying the exact moment the explosion is seen in the video.
2. Auditory Detection: Pinpointing when the explosion sound is captured in the audio.

This experimental approach applies certain mathematical concepts to achieve automation. However, it's essential to approach the results with caution. The current methods might not be entirely reliable, especially with videos that are handheld or zoomed during recording.

#### Mathematical Concepts Used in Automation and Audio Analysis

The notebook employs specific mathematical techniques to automate the identification of visual and auditory cues in videos:

##### 1. Rate of Change (Derivative) for Visual Cues

- The visual detection primarily relies on the rate of change in the data. This is obtained by calculating the derivative of the data with respect to time.
- The derivative helps in determining how data values change over a small interval. In the context of video data, a significant spike in the derivative might indicate a sudden change in visual intensity, such as the occurrence of an explosion.
- Refer to the `compute_derivative` function for the implementation of this concept, where the rate of change of data values over time is computed.
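
As a minimal sketch of this idea, the example below applies a finite-difference derivative to a synthetic red-intensity signal. The signal, the 30 fps frame rate, and the helper name `finite_difference` are assumptions made for illustration; the helper follows the same approach as the notebook's `compute_derivative` but is written independently here.

```python
import numpy as np


def finite_difference(data, time):
    """Rate of change between consecutive samples, paired with the left-hand sample times."""
    return np.diff(data) / np.diff(time), time[:-1]


frame_rate = 30.0                          # assumed frame rate
time_video = np.arange(90) / frame_rate    # 3 seconds of frames
red_intensity = np.full(90, 0.2)           # dark scene
red_intensity[45:] = 0.9                   # flash becomes visible in frame 45

derivative, time_deriv = finite_difference(red_intensity, time_video)
spike_index = int(np.argmax(derivative))   # largest frame-to-frame increase
print(f"Largest increase starts at frame {spike_index} (t = {time_deriv[spike_index]:.3f} s)")
```

Because each derivative value is paired with the left-hand sample, the reported time is that of the last frame before the jump, so the result is accurate to about one frame.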

##### 2. Standard Deviation for Loudness Spikes Detection in Auditory Cues

- The notebook uses the concept of standard deviation to detect significant spikes in loudness. Standard deviation measures the dispersion or variability of a set of values. In this context, values with a high deviation from the mean could indicate significant events like the sound of an explosion.
  
- Specifically, a threshold is determined based on the standard deviation of the loudness derivative:
    ```python
    threshold = 3 * np.std(loudness_derivative)
    ```
  This threshold helps detect rapid changes in sound intensity. A value in the loudness derivative that surpasses this threshold is considered a significant spike, potentially indicating the auditory cue of the explosion.
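
The thresholding step can be tried on synthetic data. The sketch below assumes a noisy loudness derivative with a single abrupt jump; the random data, the seed, and the position of the jump are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)                    # fixed seed so the sketch is repeatable
loudness_derivative = rng.normal(0.0, 1.0, 500)   # background fluctuations
loudness_derivative[300] = 50.0                   # one abrupt jump in loudness

threshold = 3 * np.std(loudness_derivative)
above = loudness_derivative > threshold

# np.argmax on a boolean array returns the index of the first True,
# but it also returns 0 when nothing is True, so check that a spike exists.
first_spike_index = int(np.argmax(above)) if above.any() else None
print(f"threshold = {threshold:.2f}, first spike at index {first_spike_index}")
```

Note that the spike itself raises the standard deviation, so a very loud event also raises the threshold. The factor of 3 is a tuning choice rather than a fixed rule.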

##### 3. Derivative Calculation for Rapid Changes Detection in Auditory Cues

- The derivative is also employed in the audio analysis to identify rapid changes in the data. The function `compute_derivative` calculates the derivative of the loudness with respect to time, helping detect moments where the loudness changes abruptly.
    ```python
    loudness_derivative, _ = compute_derivative(adjusted_loudness, adjusted_time_audio)
    ```
  In the context of audio analysis, a significant spike in the derivative might indicate a sudden increase in volume, such as the sound of an explosion.

By using these mathematical techniques, the notebook can automatically identify the visual occurrence of an explosion and its corresponding sound, facilitating the calculation of the delay between them.
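
The sketch below combines these steps on synthetic signals to show why the audio search is restricted to the time after the visual cue. All data, sample rates, and the helper name `find_first_spike` are assumptions made for illustration, not the notebook's own implementation.

```python
import numpy as np


def find_first_spike(signal, time, start_time=0.0, k=3.0):
    """Time of the first derivative value above k standard deviations at or after start_time, or None."""
    start = int(np.searchsorted(time, start_time))   # first sample at or after start_time
    t, s = time[start:], signal[start:]
    derivative = np.diff(s) / np.diff(t)
    above = derivative > k * np.std(derivative)
    return float(t[np.argmax(above)]) if above.any() else None


# Synthetic video: 30 fps, flash visible from t = 1.0 s
time_video = np.arange(150) / 30.0
red_intensity = np.where(time_video >= 1.0, 0.9, 0.2)

# Synthetic loudness: 100 samples per second, a click at 0.5 s and the blast at 3.0 s
time_audio = np.arange(500) / 100.0
loudness = np.full(500, 0.05)
loudness[50] = 1.0      # unrelated noise before the flash
loudness[300:] = 1.0    # explosion sound arrives

red_derivative = np.diff(red_intensity) / np.diff(time_video)
flash_time = float(time_video[np.argmax(red_derivative)])

sound_time = find_first_spike(loudness, time_audio, start_time=flash_time)
unwindowed_time = find_first_spike(loudness, time_audio)   # picks up the earlier click instead
delay = sound_time - flash_time
print(f"flash {flash_time:.2f} s, sound {sound_time:.2f} s, delay {delay:.2f} s")
```

Without the `start_time` restriction, the search stops at the click that precedes the flash, which would produce a negative and meaningless delay.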


### Experimental Visual Analysis

In [None]:
import plotly.graph_objects as go

In [None]:
# Get min and max of time axis
time_min = time_audio[0]
time_max = time_audio[-1]

# Get min and max of the red intensity axis, padded by 10% of the data range on each side
amp_min = np.min(red_intensity) - 0.1*( np.max(red_intensity) - np.min(red_intensity) )
amp_max = np.max(red_intensity) + 0.1*( np.max(red_intensity) - np.min(red_intensity) )

In [None]:
def add_highlight_rectangle(fig, x0, x1, fillcolor="grey", opacity=0.3, layer="below", line_width=0):
    """
    Add a rectangular shape to a plotly figure to highlight a region.

    Parameters:
    - fig (plotly.graph_objs._figure.Figure): The figure to add the rectangle to.
    - x0 (float): The starting x-coordinate of the rectangle.
    - x1 (float): The ending x-coordinate of the rectangle.
    - fillcolor (str): The color of the rectangle. Default is "grey".
    - opacity (float): The opacity of the rectangle. Default is 0.3.
    - layer (str): Whether to place the rectangle below or above the traces. Default is "below".
    - line_width (int): The width of the rectangle's line. Default is 0.
    """
    
    fig.add_shape(
        type="rect",
        xref="x",
        yref="paper",  # relative to the entire height of the plot
        x0=x0,
        x1=x1,
        y0=0,
        y1=1,
        fillcolor=fillcolor,
        opacity=opacity,
        layer=layer,
        line_width=line_width,
    )

# Example usage:
# add_highlight_rectangle(fig, slowdown_time, first_spike_time)


In [None]:
def configure_frame_ticks(fig, time_audio, frame_rate):
    """
    Configures the x-axis of a plotly figure to have minor gridlines for every frame and major gridlines for every second.

    Parameters:
    - fig (plotly.graph_objs._figure.Figure): The figure to update.
    - time_audio (list or array-like): A list or array containing time points for the audio. The last item is assumed to represent the total duration.
    - frame_rate (float): Frame rate of the video, in frames per second.
    """
    
    # Update the layout
    fig.update_layout(
        xaxis=dict(
            ticklen=10,  # Length of major ticks
            showgrid=True,  # Gridlines
        )
    )

    # Calculate total duration in seconds (round up)
    total_seconds = int(np.ceil(time_audio[-1]))

    # Update x-axis for minor ticks representing each frame within a second when zoomed in
    fig.update_xaxes(
        minor_tickmode="linear",
        minor_tick0=0,
        minor_dtick=1/frame_rate,
        minor_ticklen=0,  # Length of minor ticks
        minor_showgrid=True,
        minor_nticks=int(frame_rate * total_seconds)  # Maximum number of minor ticks
    )

# Example usage:
# configure_frame_ticks(fig, time_audio_list, frame_rate_value)

In [None]:
def compute_derivative(data, time):
    """Compute the derivative and return the adjusted time."""
    derivative = np.diff(data) / np.diff(time)
    adjusted_time = time[:-1]
    return derivative, adjusted_time

def normalize_data(data):
    """Normalize data between 0 and 1 based on its range."""
    return (data - np.min(data)) / (np.max(data) - np.min(data))

In [None]:
# Create a figure
fig = go.Figure()

# Add Loudness and Red Intensity traces
fig.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='loudness', line=dict(color='blue'), yaxis='y2'))
fig.add_trace(go.Scatter(x=time_video, y=red_intensity, mode='lines', name='red intensity', line=dict(color='red')))

# Update the layout
fig.update_layout(
    xaxis=dict(title="Time [seconds]", range=[time_min, time_max], rangeslider=dict(visible=True), type='linear'),
    yaxis_title="Red Intensity",
    yaxis_range=[amp_min, amp_max],
    yaxis2=dict(title="Loudness", overlaying='y', side='right'),
    showlegend=True,
    title="Comparison of Loudness and Red Intensity over Time"
)

configure_frame_ticks(fig, time_audio, frame_rate)

# Save the plot to the same path/name as the input video
# output_image_path = video_path[:-3] + 'png'
# fig.write_image(output_image_path)  # Requires the kaleido package for static image export: pip install -U kaleido

# Display the plot
fig.show()

In [None]:
# ---- BOOM ----

# Differentiate red_intensity to detect rapid changes which may correspond to visual anomalies (e.g., flash from an explosion).
red_intensity_derivative, time_video_deriv = compute_derivative(red_intensity, time_video)

# Identify the spike in the derivative, which likely indicates the start of the explosion.
first_spike_deriv_red_index = np.argmax(red_intensity_derivative)

# Detect the point where the increase in intensity begins to slow down post-explosion.
first_decreasing_point_after_spike = np.argmax(red_intensity_derivative[first_spike_deriv_red_index:] < red_intensity_derivative[first_spike_deriv_red_index])
slowdown_time = time_video_deriv[first_spike_deriv_red_index + first_decreasing_point_after_spike]

# ---- SOUND ----

# Adjust the audio data to focus on the timeframe after the visual anomaly was detected.
start_idx_audio = np.where(time_audio >= slowdown_time)[0][0]
adjusted_time_audio, adjusted_loudness = time_audio[start_idx_audio:], loudness[start_idx_audio:]

# Differentiate the loudness to detect rapid changes in sound intensity.
loudness_derivative, _ = compute_derivative(adjusted_loudness, adjusted_time_audio)

# Define a threshold to detect the significant spike in loudness which likely corresponds to the sound from the explosion.
threshold = 3 * np.std(loudness_derivative)
# Note: np.argmax returns 0 when no value exceeds the threshold, so an index of 0 may mean that no spike was found.
first_spike_index = np.argmax(loudness_derivative > threshold)
first_spike_time, first_spike_value = adjusted_time_audio[first_spike_index], adjusted_loudness[first_spike_index]

In [None]:
# ---- PLOTTING AND ANNOTATING ----

# Create a figure
fig = go.Figure()

# Add Red Intensity and Loudness traces
fig.add_trace(go.Scatter(x=time_video, y=red_intensity, mode='lines', name='Red Intensity', line=dict(color='red')))
fig.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='Loudness', line=dict(color='blue'), yaxis='y2'))

# Add annotations
fig.add_annotation(x=slowdown_time, y=red_intensity[np.where(time_video == slowdown_time)[0][0]], 
                   text=f'Derivative starts decreasing at {slowdown_time:.2f} s', showarrow=True, arrowhead=2, yref="y")
fig.add_annotation(x=first_spike_time, y=first_spike_value, 
                   text=f'First Loudness Spike at {first_spike_time:.2f} s', showarrow=True, arrowhead=2, yref="y2")

# Update the layout
fig.update_layout(
    xaxis=dict(title="Time [seconds]", rangeslider=dict(visible=True), type='linear'),
    yaxis_title="Red Intensity",
    yaxis_range=[amp_min, amp_max],
    yaxis2=dict(title="Loudness", overlaying='y', side='right'),
    showlegend=True,
    title="Comparison of Loudness and Red Intensity over Time"
)

add_highlight_rectangle(fig, slowdown_time, first_spike_time)
configure_frame_ticks(fig, time_audio, frame_rate)

# Display the visual analysis.
fig.show()

### Experimental: Further Sound Analysis
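
The cells in this section compute a spectrogram of the audio track and average the power below a cutoff frequency, on the assumption that explosions carry most of their energy at low frequencies. The sketch below shows the same idea on a synthetic signal. The sample rate, the tones, the window sizes, and the name `cutoff_hz` are all made up for illustration.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000                                         # assumed sample rate in Hz
t = np.arange(4 * fs) / fs                        # 4 seconds of audio
sig = 0.1 * np.sin(2 * np.pi * 1000 * t)          # steady high-pitched background tone
burst = (t >= 2.0) & (t < 2.5)
sig[burst] += np.sin(2 * np.pi * 60 * t[burst])   # low-frequency rumble for half a second

frequencies, times, Sxx = spectrogram(sig, fs=fs, nperseg=1024, noverlap=512)

cutoff_hz = 200                                   # hypothetical cutoff, adjust as needed
low_band_power = Sxx[frequencies < cutoff_hz, :].mean(axis=0)

peak_time = float(times[np.argmax(low_band_power)])
print(f"Low-frequency power peaks near t = {peak_time:.2f} s")
```

The steady 1 kHz tone does not affect the result, because only rows of the spectrogram below the cutoff contribute to the average.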

In [None]:
from scipy.io import wavfile
from scipy.signal import spectrogram
from plotly.subplots import make_subplots

In [None]:
# Load the audio signal from the audio_path
fs, sig = wavfile.read(audio_path)

# Check if the audio signal has multiple channels (e.g., stereo). If so, use only one channel for simplicity.
if len(sig.shape) > 1:
    sig = sig[:, 0]

# Compute the spectrogram using scipy
frequencies, times, Sxx = spectrogram(sig, fs=fs, nperseg=4096, noverlap=2048, scaling='density')

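# Convert power spectral density to dB (the small offset avoids log10(0) for completely silent segments)
Sxx_db = 10 * np.log10(Sxx + np.finfo(float).eps)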

# Plot the original signals and the spectrogram
fig_spec = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=('Original Signals', 'Spectrogram'))

# Add traces for the original signals
fig_spec.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='Loudness', line=dict(color='blue')), row=1, col=1)

# Add the spectrogram heatmap
fig_spec.add_trace(go.Heatmap(x=times, y=frequencies, z=Sxx_db, colorscale='Viridis'), row=2, col=1)

# Update layout
fig_spec.update_layout(title="Original Signals and Spectrogram")
fig_spec.update_yaxes(title_text="Frequency (Hz)", row=2, col=1)
fig_spec.update_xaxes(title_text="Time (s)",rangeslider=dict(visible=True), row=2, col=1)

fig_spec.show()

In [None]:
# Extract data for frequencies below the 'frequency' threshold (e.g., 200 Hz, 300 Hz) - where you would only expect sounds like explosions.
# Note: Adjust the 'frequency' variable as needed.
frequency = 200
mask = frequencies < frequency
low_freq_amplitudes = Sxx[mask, :]

In [None]:
# Compute the average amplitude for frequencies below the 'frequency' threshold for each time step.
avg_low_freq_amplitude = np.mean(low_freq_amplitudes, axis=0)

# Plot the original signals, spectrogram, and the low-frequency amplitude
fig_combined = make_subplots(rows=3, cols=1, shared_xaxes=True, 
                             subplot_titles=('Original Signals', 'Spectrogram', f'Amplitude (<{frequency} Hz)'))

# Add traces for the original signals
fig_combined.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='Loudness', line=dict(color='blue')), row=1, col=1)
fig_combined.add_trace(go.Scatter(x=times, y=avg_low_freq_amplitude, mode='lines', name=f'Amplitude (<{frequency} Hz)', line=dict(color='green')), row=3, col=1)

# Add the spectrogram heatmap
fig_combined.add_trace(go.Heatmap(x=times, y=frequencies, z=Sxx_db, colorscale='Viridis', showscale=False), row=2, col=1)

# Update layout
fig_combined.update_layout(title="Original Signals, Spectrogram, and Low-Frequency Amplitude")
fig_combined.update_yaxes(title_text="Frequency (Hz)", row=2, col=1)
fig_combined.update_xaxes(title_text="Time (s)", row=3, col=1)

fig_combined.show()

In [None]:
# Normalize the avg_low_freq_amplitude to match loudness amplitude range
avg_low_freq_amplitude_norm = (avg_low_freq_amplitude - np.min(avg_low_freq_amplitude)) / (np.max(avg_low_freq_amplitude) - np.min(avg_low_freq_amplitude))
avg_low_freq_amplitude_norm = avg_low_freq_amplitude_norm * (np.max(loudness) - np.min(loudness)) + np.min(loudness)

# Plot the original signals, spectrogram, and the normalized low-frequency amplitude
fig_combined = make_subplots(rows=3, cols=1, shared_xaxes=True, 
                             subplot_titles=('Original Signals', 'Spectrogram', f'Amplitude (<{frequency} Hz)'))

# Add traces for the original signals
fig_combined.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='Loudness', line=dict(color='blue')), row=1, col=1)
fig_combined.add_trace(go.Scatter(x=times, y=avg_low_freq_amplitude_norm, mode='lines', name=f'Normalized Amplitude (<{frequency} Hz)', line=dict(color='green')), row=3, col=1)

# Add the spectrogram heatmap
fig_combined.add_trace(go.Heatmap(x=times, y=frequencies, z=Sxx_db, colorscale='Viridis', showscale=False), row=2, col=1)

# Update layout
fig_combined.update_layout(title="Original Signals, Spectrogram, and Normalized Low-Frequency Amplitude")
fig_combined.update_yaxes(title_text="Frequency (Hz)", row=2, col=1)
fig_combined.update_xaxes(title_text="Time (s)", row=3, col=1)

fig_combined.show()

In [None]:
# ---- BOOM ----

# Differentiate red_intensity to detect rapid changes which may correspond to visual anomalies (e.g., flash from an explosion).
red_intensity_derivative, time_video_deriv = compute_derivative(red_intensity, time_video)

# Normalize the derivative so it's in the same amplitude range as the original signals
red_intensity_derivative_norm = (red_intensity_derivative - np.min(red_intensity_derivative)) / (np.max(red_intensity_derivative) - np.min(red_intensity_derivative))
red_intensity_derivative_norm = red_intensity_derivative_norm * (np.max(red_intensity) - np.min(red_intensity)) + np.min(red_intensity)

# Identify the spike in the derivative, which likely indicates the start of the explosion.
first_spike_deriv_red_index = np.argmax(red_intensity_derivative)

# Detect the point where the increase in intensity begins to slow down post-explosion.
first_decreasing_point_after_spike = np.argmax(red_intensity_derivative[first_spike_deriv_red_index:] < red_intensity_derivative[first_spike_deriv_red_index])
slowdown_time = time_video_deriv[first_spike_deriv_red_index + first_decreasing_point_after_spike]

# ---- SOUND ----

loudness_low = avg_low_freq_amplitude_norm
start_idx_audio = np.where(times >= slowdown_time)[0][0]
adjusted_time_audio, adjusted_loudness_low = times[start_idx_audio:], loudness_low[start_idx_audio:]

# Compute derivative and threshold for loudness
loudness_low_derivative = np.diff(adjusted_loudness_low) / np.diff(adjusted_time_audio)
threshold = 3 * np.std(loudness_low_derivative)

# Detect first significant spike (np.argmax returns 0 if nothing exceeds the threshold)
first_spike_index = np.argmax(loudness_low_derivative > threshold)
first_spike_time, first_spike_value = adjusted_time_audio[first_spike_index], adjusted_loudness_low[first_spike_index]

# Calculate standard deviation
std_dev_loudness_low = np.std(adjusted_loudness_low[:first_spike_index])

# ---- PLOTTING AND ANNOTATING ----

fig_all = go.Figure()

# Plot signals
fig_all.add_trace(go.Scatter(x=times, y=loudness_low, mode='lines', name=f'Amplitude (<{frequency} Hz)', line=dict(color='green'), yaxis='y2'))
fig_all.add_trace(go.Scatter(x=time_video, y=red_intensity, mode='lines', name='Red Intensity', line=dict(color='red')))
fig_all.add_trace(go.Scatter(x=time_audio, y=loudness, mode='lines', name='Loudness', line=dict(color='blue'), yaxis='y2'))

# Annotations
derivative_y = red_intensity_derivative_norm[first_spike_deriv_red_index + first_decreasing_point_after_spike]
fig_all.add_annotation(x=slowdown_time, y=derivative_y, text=f'Derivative starts decreasing at {slowdown_time:.2f} s', showarrow=True, arrowhead=2, yref="y", bgcolor="rgba(255,255,255,0.7)")
fig_all.add_annotation(x=first_spike_time, y=first_spike_value, text=f'First Amplitude Spike at {first_spike_time:.2f} s', showarrow=True, arrowhead=2, yref="y2", bgcolor="rgba(255,255,255,0.7)")

# Layout
fig_all.update_layout(
    title="Comparison of Loudness, Amplitude below Frequency, and Red Intensity Over Time",
    xaxis=dict(title="Time (s)", rangeslider=dict(visible=True)),
    yaxis=dict(title="Red Intensity"),
    yaxis2=dict(title="Sound", overlaying='y', side='right'),
    legend=dict(x=0.5, y=-0.7, xanchor='center', orientation='h')
)

# Display the plot
fig_all.show()


In [None]:
add_highlight_rectangle(fig_all, slowdown_time, first_spike_time)
configure_frame_ticks(fig_all, time_audio, frame_rate)

# Display the updated plot
fig_all.show()

In [None]:
slowdown_time, first_spike_time

In [None]:
%run distance.py

## Quick Helper for handling video

### Download an MP4 file from a URL
You can enter any URL that links directly to an MP4 file to download it into the file storage of this running Colab session. Files are not stored long term and are deleted when the runtime stops or times out.

Use online services such as TwitterVideoDownloader.com or ttvdl.com (TikTok) to get a direct .mp4 link.
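
Links from such services often carry query parameters (for example access tokens) after the file name, and taking the text after the last `/` would then produce an awkward local filename. The sketch below shows one way to derive a cleaner name with the standard library. The helper `filename_from_url` and the fallback name `download.mp4` are hypothetical and not part of this notebook.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download.mp4"):
    """Derive a local filename from the path component of a URL."""
    path = unquote(urlparse(url).path)   # ignores "?query" and "#fragment"
    name = posixpath.basename(path)
    return name if name else default     # fall back when the URL ends in "/"


print(filename_from_url("https://example.com/media/clip.mp4?token=abc123"))
```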

In [None]:
import requests
from IPython.display import display, HTML

# Ask user for the URL
url = input("Please enter the URL of the MP4 file: ")

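# Define a suitable filename based on the URL
filename = url.split('/')[-1]  # This will take the last part of the URL as the filename.

# Download the file and stop with an error if the server does not return it
response = requests.get(url, timeout=60)
response.raise_for_status()
with open(filename, 'wb') as f:
    f.write(response.content)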

# Display a success message in the notebook
display(HTML(f"<span style='color: green;'>File downloaded successfully as <b>{filename}</b></span>"))

### Trimming video
This Python script trims a video to the segment between a start time and an end time and saves that segment as a new clip. Use it to extract specific portions of a video for further editing or sharing.

How to Use

- Input Video File: The script prompts you for the path of the MP4 video file you want to trim. Provide the full file path, including the file extension (e.g., example/video.mp4).
- Video Duration: After loading the video, the script displays its total duration in seconds. This helps you choose the range for trimming.
- Specify Trimming Times: Enter the start and end times (in seconds) of the portion you want to keep. Leave an input empty to use the default (0 for the start, the video duration for the end).
- Output Video File: The trimmed video is saved with "-cut" and the start and end times appended to the original filename. For example, trimming "video.mp4" from 5 to 12.5 seconds produces "video-cut_5.0_12.5.mp4".
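
The script below does not check the entered times against the video duration. As a hedged sketch, a small helper could validate them before trimming; `parse_trim_times` is a hypothetical name and is not used by the script.

```python
def parse_trim_times(start_input, end_input, duration):
    """Turn the two text inputs into a validated (start, end) pair in seconds."""
    start = float(start_input) if start_input.strip() else 0.0
    end = float(end_input) if end_input.strip() else float(duration)
    if not 0 <= start < end <= duration:
        raise ValueError(
            f"need 0 <= start < end <= {duration:.2f}, got start={start}, end={end}"
        )
    return start, end


print(parse_trim_times("", "", 30.0))        # both defaults
print(parse_trim_times("5", "12.5", 30.0))   # explicit range
```

Raising an error early is easier to understand than a failure inside the video library when, for instance, the end time lies before the start time.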

In [None]:
from moviepy.video.io.VideoFileClip import VideoFileClip

# Ask the user for the path of the MP4 file
video_path = input("Please enter the path of the MP4 file: ")

# Load the video clip
video_clip = VideoFileClip(video_path)

# Give the user information about the duration of the clip
video_duration = video_clip.duration
print(f"The video duration is {video_duration:.2f} seconds.")

# Define the start and end times for trimming (in seconds)
start_input = input("Please enter the start time where you want to cut (default is 0): ")
start_time = float(start_input) if start_input else 0  # Start time of the trimmed portion

end_input = input(f"Please enter the end time where you want to cut (default is video duration, {video_duration:.2f}): ")
end_time = float(end_input) if end_input else video_duration  # End time of the trimmed portion

# Trim the video clip
trimmed_clip = video_clip.subclip(start_time, end_time)

# Generate the output path with a "-cut" suffix followed by the start and end times
output_path = video_path.replace(".mp4", f"-cut_{start_time}_{end_time}.mp4")

# Save the trimmed video with audio
trimmed_clip.write_videofile(output_path, codec="libx264")

# Close the original video clip
video_clip.close()

### Video Cropping
This Python script crops a video by cutting a percentage of the frame from the top, bottom, left, and right sides. Use it to remove unwanted portions of the frame and focus on specific content.

How to Use

- Input Video File: The script prompts you for the path of the MP4 video file you want to edit. Provide the full file path, including the file extension (e.g., example/video.mp4).
- Percentage to Cut: You are asked for the percentage (0-100) to cut from each side (top, bottom, left, and right). Higher percentages crop more, lower percentages retain more of the original frame. Opposite sides together must remove less than 100% of the frame.
- Output Video File: The edited video is saved with the cropping percentages appended to the filename. For example, if you entered 10 for top, 5 for bottom, 15 for left, and 20 for right, the output file would be named original-video-edit_10.0_5.0_15.0_20.0.mp4.
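
The conversion from percentages to pixel coordinates can be sketched as a standalone function that also rejects impossible inputs. The helper `crop_box` is hypothetical and only illustrates the arithmetic; the script below computes the same values inline without validation.

```python
def crop_box(frame_width, frame_height, top, bottom, left, right):
    """Convert percentages cut from each side into pixel coordinates (x1, y1, x2, y2)."""
    for value in (top, bottom, left, right):
        if not 0 <= value <= 100:
            raise ValueError("each percentage must be between 0 and 100")
    if top + bottom >= 100 or left + right >= 100:
        raise ValueError("opposite sides together must remove less than 100% of the frame")
    x1 = round(frame_width * left / 100)
    x2 = frame_width - round(frame_width * right / 100)
    y1 = round(frame_height * top / 100)
    y2 = frame_height - round(frame_height * bottom / 100)
    return x1, y1, x2, y2


# A 1920x1080 frame with 10% cut from the top, 5% bottom, 15% left, 20% right
print(crop_box(1920, 1080, top=10, bottom=5, left=15, right=20))
```

For this input the remaining frame is 1248 by 918 pixels.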

In [None]:
from moviepy.video.io.VideoFileClip import VideoFileClip
import moviepy.video.fx.all as vfx

# Input video file path
input_file_path = input("Please enter the path of the MP4 file: ")

# Load the video clip
video_clip = VideoFileClip(input_file_path)

# Get the dimensions of the video frame
frame_width, frame_height = video_clip.size

# Ask the user for the percentage to cut from each side
top_percentage = float(input("Enter the percentage to cut from the top (0-100): "))
bottom_percentage = float(input("Enter the percentage to cut from the bottom (0-100): "))
left_percentage = float(input("Enter the percentage to cut from the left (0-100): "))
right_percentage = float(input("Enter the percentage to cut from the right (0-100): "))

# Output video file path with percentages
output_file_path = input_file_path.replace(".mp4", f"-edit_{top_percentage}_{bottom_percentage}_{left_percentage}_{right_percentage}.mp4")

# Calculate the pixel values to cut
top_cut = int(frame_height * (top_percentage / 100))
bottom_cut = int(frame_height * (bottom_percentage / 100))
left_cut = int(frame_width * (left_percentage / 100))
right_cut = int(frame_width * (right_percentage / 100))

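# Crop the video clip (calling the effect through vfx works even if moviepy.editor has not been imported in this session)
cropped_clip = vfx.crop(video_clip, y1=top_cut, y2=frame_height - bottom_cut, x1=left_cut, x2=frame_width - right_cut)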

# Write the edited video to the output file
cropped_clip.write_videofile(output_file_path, codec="libx264")

print("Video editing complete. Saved as", output_file_path)