# Script pipeline

## Before starting

Collect your videos in MP4 format, give each one a unique commercial ID (each file name must match the corresponding `commercial_id` value) and put them in the `videos` folder.

Then fill in the CSV file `initial_data/commercials_initial_metadata.csv` with the metadata of each video:

- `commercial_id`
- `title`
- `brand`
- `nice_class`
- `product_type_key`
- `year`
- `lustrum`
- `source`
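
Before running the pipeline, it can help to check that every video has a metadata row and vice versa. The following is a minimal sketch, not part of the pipeline scripts: the helper name `check_video_ids` is hypothetical, and it assumes each file is stored as `<commercial_id>.mp4`.

```python
import csv
from pathlib import Path


def check_video_ids(videos_dir, metadata_csv):
    """Compare MP4 file names against the `commercial_id` column.

    Hypothetical helper: assumes each video is stored as `<commercial_id>.mp4`.
    Returns two sorted lists: videos without a metadata row, and
    metadata rows without a video.
    """
    video_ids = {p.stem for p in Path(videos_dir).glob("*.mp4")}
    with open(metadata_csv, newline="", encoding="utf-8") as f:
        csv_ids = {row["commercial_id"] for row in csv.DictReader(f)}
    return sorted(video_ids - csv_ids), sorted(csv_ids - video_ids)
```

Calling `check_video_ids("videos", "initial_data/commercials_initial_metadata.csv")` should return two empty lists when the folder and the CSV file agree.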

## 1. Color and Thumb Extraction

- Analyze each video and collect new data (`avg_frame_rate`, `aspect_ratio`), add them to `commercials_initial_metadata.csv` and save the result as `general/commercials.csv`.
- Split each video into “scenes”.
- For each scene, extract the representative median frame and save it at a reduced size (180 px high), in WebP format, in a folder named after the `commercial_id` inside the `thumbnails` folder.
- From each median frame, extract a color palette of at most 32 colours and save it in a CSV file named `general/commercial_palettes.csv` with these data:
    - `commercial_id`
    - `scene`: the progressive number ID of the scene.
    - `scene_size`: the scene duration measured in frames.
    - `start_frame`: the initial frame number of the scene.
    - `end_frame`: the final frame number of the scene.
    - `hex_code`: the hexadecimal representation of the original colour extracted.
    - `frequency_within_the_scene`: the frequency of the original colour in the scene.
    - `closest_color_ext_pal`: the closest colour from the extended palette.
    - `closest_color_ess_pal`: the closest colour from the essential palette.
    - `closest_color_bas_pal`: the closest colour from the basic palette.
    - `scene_size_norm`: the normalized scene size.
    - `frequency_within_the_commercial`: the frequency of the original colour within the video (`frequency_within_the_scene` × `scene_size_norm`).
    - `tf`: the term frequency of the `closest_color_ext_pal` value in the video.
- Save the info about each scene detected in each video into `general/scenes.csv`.
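
The derived palette columns can be illustrated with a small sketch. It is not the script's actual code and rests on several assumptions: the closest reference colour is found by squared Euclidean distance in RGB, `scene_size_norm` is the scene size divided by the total number of frames across scenes, and `tf` is the summed `frequency_within_the_commercial` of each reference colour. The function names and the three-colour palette are hypothetical.

```python
from collections import defaultdict


def hex_to_rgb(hex_code):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    h = hex_code.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def closest_color(hex_code, palette):
    """Return the name of the palette entry nearest to `hex_code`.

    Assumption: squared Euclidean distance in RGB; the pipeline may use
    a different colour space or metric.
    """
    r, g, b = hex_to_rgb(hex_code)

    def dist(name):
        pr, pg, pb = hex_to_rgb(palette[name])
        return (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2

    return min(palette, key=dist)


def add_commercial_frequencies(rows, palette):
    """Annotate, in place, the palette rows of ONE commercial.

    Each row is a dict with `scene`, `scene_size`, `hex_code` and
    `frequency_within_the_scene`.
    """
    # Assumption: scene sizes are normalized by the total frame count.
    scene_sizes = {row["scene"]: row["scene_size"] for row in rows}
    total = sum(scene_sizes.values())
    tf = defaultdict(float)
    for row in rows:
        row["closest_color_ext_pal"] = closest_color(row["hex_code"], palette)
        row["scene_size_norm"] = row["scene_size"] / total
        row["frequency_within_the_commercial"] = (
            row["frequency_within_the_scene"] * row["scene_size_norm"]
        )
        tf[row["closest_color_ext_pal"]] += row["frequency_within_the_commercial"]
    # Assumption: tf is the summed frequency of each reference colour.
    for row in rows:
        row["tf"] = tf[row["closest_color_ext_pal"]]
    return rows


# Toy data: two scenes of 30 and 90 frames, hypothetical palette.
palette = {"black": "#000000", "white": "#ffffff", "red": "#ff0000"}
rows = [
    {"scene": 1, "scene_size": 30, "hex_code": "#101010", "frequency_within_the_scene": 0.5},
    {"scene": 1, "scene_size": 30, "hex_code": "#f0f0f0", "frequency_within_the_scene": 0.5},
    {"scene": 2, "scene_size": 90, "hex_code": "#0a0a0a", "frequency_within_the_scene": 1.0},
]
for row in add_commercial_frequencies(rows, palette):
    print(row["hex_code"], row["closest_color_ext_pal"], row["tf"])
```

In this toy example the long, dark second scene dominates, so `black` ends up with a much higher `tf` than `white`.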

In [1]:
%run '1_color_and_thumb_extraction.py'


#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 1. Parse the videos
➡️ Get video info, split the video in scenes, save screenshots,
➡️ extract colors and save them in a .pal.csv file.
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
📼 Video 1/4: 7cf5ef492fe149249fb0743058b4b7a9_030


[swscaler @ 0x150470000] No accelerated colorspace conversion found from yuv420p to bgr24.

🎨 Saved `commercial_palettes/7cf5ef492fe149249fb0743058b4b7a9_030.pal.csv`
------------------------------------------------
📄 Updated `general/commercials.csv`
------------------------------------------------
📼 Video 2/4: 732dc471d65b4b14bd64321c1a537c88_030


[swscaler @ 0x3283c8000] No accelerated colorspace conversion found from yuv420p to bgr24.

🎨 Saved `commercial_palettes/732dc471d65b4b14bd64321c1a537c88_030.pal.csv`
------------------------------------------------
📄 Updated `general/commercials.csv`
------------------------------------------------
📼 Video 3/4: gSnVME7YCVQ


[swscaler @ 0x3202f8000] No accelerated colorspace conversion found from yuv420p to bgr24.

🎨 Saved `commercial_palettes/gSnVME7YCVQ.pal.csv`
------------------------------------------------
📄 Updated `general/commercials.csv`
------------------------------------------------
📼 Video 4/4: qI6VEpzHQ-M


[swscaler @ 0x120610000] No accelerated colorspace conversion found from yuv420p to bgr24.

🎨 Saved `commercial_palettes/qI6VEpzHQ-M.pal.csv`
------------------------------------------------
📄 Updated `general/commercials.csv`

#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 2. Enrich the .pal.csv file with reference colors
➡️ Assign the closest reference palette colors to each color extracted,
➡️ remove the first and/or the last scene if `black` (from essential palette) is the predominant color,
➡️ add for each color the frequency within the scene,
➡️ update the .pal.csv file.
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
📼 Video 1/4: 7cf5ef492fe149249fb0743058b4b7a9_030
🎨 Updated `commercial_palettes/7cf5ef492fe149249fb0743058b4b7a9_030.pal.csv`
------------------------------------------------
📼 Video 2/4: 732dc471d65b4b14bd64321c1a537c88_030
🎨 Updated `commercial_palettes/732dc471d65b4b14bd64321c1a537c88_030.pal.csv`
------------------------------------------------
📼 Video 3/4: g

## 2. Reference Palette IDF Calculation

Calculate the IDF (inverse document frequency) of each color in each reference palette and save the results as:
- `colors/basic_palette_idfs.csv`
- `colors/essential_palette_idfs.csv`
- `colors/extended_palette_idfs.csv`
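
As a rough illustration of what an IDF value expresses here, the sketch below treats each commercial as one "document" and each reference colour as one "term". It assumes the textbook formula idf = ln(N / df) with no smoothing; the script may use a different variant, and the function name `palette_idfs` is hypothetical.

```python
import math


def palette_idfs(commercial_colors):
    """Compute an IDF value for each reference colour.

    `commercial_colors` maps each commercial_id to the reference colours
    that appear in it (each commercial counts as one document).
    Assumption: idf = ln(N / df), without smoothing.
    """
    n_docs = len(commercial_colors)
    doc_freq = {}
    for colors in commercial_colors.values():
        for color in set(colors):
            doc_freq[color] = doc_freq.get(color, 0) + 1
    return {color: math.log(n_docs / df) for color, df in doc_freq.items()}


# Toy data with hypothetical IDs: `black` appears in every commercial,
# so its IDF is 0, while the rarer `red` gets a positive value.
idfs = palette_idfs({
    "c1": {"black", "white"},
    "c2": {"black", "red"},
    "c3": {"black"},
    "c4": {"black", "white"},
})
print(idfs)
```

A colour that occurs in every commercial carries no discriminating information under this formula, which is why its weight drops to zero.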

In [2]:
%run '2_ref_palette_idf_calculation.py'


#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 Calculate the idfs (Inverse Document Frequencies) of each color for each reference palette
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
➡️ Calculate the idfs,
➡️ save them in a CSV file (one for each reference palette).
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
📄 Saved `colors/basic_palette_idfs.csv`
------------------------------------------------
📄 Saved `colors/essential_palette_idfs.csv`
------------------------------------------------
📄 Saved `colors/extended_palette_idfs.csv`


## 3. Audio Feature Extraction

Export the 19 audio features of each video to the `audio/features` folder.

In [3]:
%run '3_audio_feature_extraction.py'


#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 Export the 19 audio features of each video
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
📼 Video 1/4: qI6VEpzHQ-M


  y, sr = librosa.load(audio_path, sr=None)  # Load audio
	Deprecated as of librosa version 0.10.0.
	It will be removed in librosa version 1.0.
  y, sr_native = __audioread_load(path, offset, duration, dtype)
  y, sr = librosa.load(file, sr=11025)
	Deprecated as of librosa version 0.10.0.
	It will be removed in librosa version 1.0.
  y, sr_native = __audioread_load(path, offset, duration, dtype)


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 179ms/step
🎶 Saved `audio/features/chroma_stft_files/7cf5ef492fe149249fb0743058b4b7a9_030.chroma_stft.npz`
🎶 Saved `audio/features/chroma_cqt_files/7cf5ef492fe149249fb0743058b4b7a9_030.chroma_cqt.npz`
🎶 Saved `audio/features/chroma_cens_files/7cf5ef492fe149249fb0743058b4b7a9_030.chroma_cens.npz`
🎶 Saved `audio/features/melspectrogram_files/7cf5ef492fe149249fb0743058b4b7a9_030.melspectrogram.npz`
🎶 Saved `audio/features/mfcc_files/7cf5ef492fe149249fb0743058b4b7a9_030.mfcc.npz`
🎶 Saved `audio/features/rms_files/7cf5ef492fe149249fb0743058b4b7a9_030.rms.npz`
🎶 Saved `audio/features/spectral_centroid_files/7cf5ef492fe149249fb0743058b4b7a9_030.spectral_centroid.npz`
🎶 Saved `audio/features/spectral_bandwidth_files/7cf5ef492fe149249fb0743058b4b7a9_030.spectral_bandwidth.npz`
🎶 Saved `audio/features/spectral_contrast_files/7cf5ef492fe149249fb0743058b4b7a9_030.spectra

  y, sr = librosa.load(audio_path, sr=None)  # Load audio


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 36ms/step
🎶 Saved `audio/features/chroma_stft_files/732dc471d65b4b14bd64321c1a537c88_030.chroma_stft.npz`
🎶 Saved `audio/features/chroma_cqt_files/732dc471d65b4b14bd64321c1a537c88_030.chroma_cqt.npz`
🎶 Saved `audio/features/chroma_cens_files/732dc471d65b4b14bd64321c1a537c88_030.chroma_cens.npz`
🎶 Saved `audio/features/melspectrogram_files/732dc471d65b4b14bd64321c1a537c88_030.melspectrogram.npz`
🎶 Saved `audio/features/mfcc_files/732dc471d65b4b14bd64321c1a537c88_030.mfcc.npz`
🎶 Saved `audio/features/rms_files/732dc471d65b4b14bd64321c1a537c88_030.rms.npz`
🎶 Saved `audio/features/spectral_centroid_files/732dc471d65b4b14bd64321c1a537c88_030.spectral_centroid.npz`
🎶 Saved `audio/features/spectral_bandwidth_files/732dc471d65b4b14bd64321c1a537c88_030.spectral_bandwidth.npz`
🎶 Saved `audio/features/spectral_contrast_files/732dc471d65b4b14bd64321c1a537c88_030.spectral

## 4. Audio Transcription and Lemmatization

- Detect the presence of “Speech” in each video, save it as `audio/speech_class_confidence_score.csv`, and transcribe any speech found. All transcriptions are saved in `text/transcriptions.csv`.
- Lemmatize each transcription and save lemmas (alphabetically ordered by video) as `text/lemmas.csv`.
- Calculate the tf-idf values for each lemma and update `text/lemmas.csv`.
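
The tf-idf step can be sketched as follows. This is an illustrative approximation rather than the script's code: it assumes tf is the relative frequency of a lemma among a video's non-stopword lemmas, that idf = ln(N / df), and that stopwords receive no score at all. The function name `lemma_tfidf` and the sample lemmas are hypothetical.

```python
import math
from collections import Counter


def lemma_tfidf(lemmas_by_video, stopwords=frozenset()):
    """Return {video_id: {lemma: tf-idf}}.

    Assumptions: tf is the relative frequency of the lemma among the
    video's non-stopword lemmas, and idf = ln(N / df). Lemmas listed in
    `stopwords` are skipped and get no score.
    """
    kept = {
        vid: [lem for lem in lemmas if lem not in stopwords]
        for vid, lemmas in lemmas_by_video.items()
    }
    n_docs = len(kept)
    # Document frequency: in how many videos each lemma occurs.
    doc_freq = Counter(lem for lemmas in kept.values() for lem in set(lemmas))
    scores = {}
    for vid, lemmas in kept.items():
        counts = Counter(lemmas)
        total = len(lemmas)
        scores[vid] = {
            lem: (count / total) * math.log(n_docs / doc_freq[lem])
            for lem, count in counts.items()
        }
    return scores


# Toy data with hypothetical video IDs and lemmas.
scores = lemma_tfidf(
    {
        "v1": ["il", "raffreddore", "aspirina"],
        "v2": ["il", "tempo", "aspirina"],
    },
    stopwords={"il"},
)
print(scores)
```

Under these assumptions a lemma shared by every video scores 0, and a stopword such as `il` does not appear in the result.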

In [4]:
%run '4_audio_transcription_and_lemmatization.py'


#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 1. Audio transcription
➡️ Find the “Speech” presence in each video,
➡️ transcribe the found speech.
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
📂️ Create `tmp_wav` folder
------------------------------------------------
📂️ Create `text` folder
------------------------------------------------
📂️ Create `text/transcriptions` folder
------------------------------------------------
⏳️ Load YAMNet model from TensorFlow Hub (please wait…)
------------------------------------------------
⏳️ Initialize Whisper model (please wait…)
------------------------------------------------
📼 Video 1/4: 7cf5ef492fe149249fb0743058b4b7a9_030
📄 Exported `wav_file_path`
📝 Found 1 segment(s)
[Segment 1] Ci sono persone che non possono permettersi di stare a casa anche quando fa brutto tempo. Hai primi sintomi di raffreddore o di influenza? Presto Aspirina.

Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.10.0.json: 434kB [00:00, 25.0MB/s]
2025-10-28 22:09:35 INFO: Downloaded file to /Users/dfadda/stanza_resources/resources.json
2025-10-28 22:09:35 INFO: Downloading default packages for language: it (Italian) ...
2025-10-28 22:09:36 INFO: File exists: /Users/dfadda/stanza_resources/it/default.zip
2025-10-28 22:09:38 INFO: Finished downloading models and saved to /Users/dfadda/stanza_resources
2025-10-28 22:09:38 INFO: Checking for updates to resources.json in case models have been updated.  Note: this behavior can be turned off with download_method=None or download_method=Downlo

------------------------------------------------
⏳️ Load Italian NLP model (please wait…)


Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.10.0.json: 434kB [00:00, 22.8MB/s]
2025-10-28 22:09:38 INFO: Downloaded file to /Users/dfadda/stanza_resources/resources.json
2025-10-28 22:09:38 INFO: Loading these models for language: it (Italian):
| Processor | Package           |
---------------------------------
| tokenize  | combined          |
| mwt       | combined          |
| pos       | combined_charlm   |
| lemma     | combined_nocharlm |
| depparse  | combined_charlm   |
| ner       | fbk               |

202

------------------------------------------------
📼 Video 1/4: 7cf5ef492fe149249fb0743058b4b7a9_030
------------------------------------------------
📼 Video 2/4: 732dc471d65b4b14bd64321c1a537c88_030
------------------------------------------------
📼 Video 3/4: gSnVME7YCVQ

#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*
🔵 3. Calculate the tf-idf values of each lemma
➡️ Calculate the tf-idf values,
➡️ update `lemmas.csv`.
*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#*#
------------------------------------------------
⏳️ Download nltk  package `punkt` (please wait…)
------------------------------------------------
⏳️ Download nltk  package `stopwords` (please wait…)
------------------------------------------------
⚠️ Lemmas without tf-idf: {'anche', 'uno', 'il', 'non', 'di', 'e', 'a', 'che', 'ci', 'si', 'o'}
------------------------------------------------
📄 Updated `text/lemmas.csv`


[nltk_data] Downloading package punkt to /Users/dfadda/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/dfadda/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


## Finally

You can use `text/transcriptions.csv` as input for further text analysis (e.g. LIWC analysis).