# GENERATE METADATA FOR SELECTS

Follow the instructions below to create a CSV file containing metadata for every video in a folder.

## Step 1

Run the cell below.

You will be prompted to follow a link to authorize access to Google Drive. Click the link and sign in as ptrstoryteam@gmail.com. Copy the authorization token, return to this page, paste the token into the box provided, and press Enter. This completes the authorization.

In [None]:
# LIBRARIES

import os
import shutil
import glob
import csv
from datetime import datetime

from google.colab import drive
import pandas as pd

# MOUNTING THE DRIVE/AUTOMATION

drive.mount('/content/drive/', force_remount=True)

Mounted at /content/drive/
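
Once the Drive is mounted, it can help to confirm that the folders the notebook relies on are actually visible before going further. The following is only a sketch and not part of the original notebook: `find_missing_paths` is a hypothetical helper name, and the example path is simply the mount point used above.

```python
import os

def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Example: the mount point used above (it will be reported as missing
# if this is run outside Colab or before the Drive is mounted)
missing = find_missing_paths(["/content/drive/"])
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All expected paths are visible.")
```

The same check could be pointed at the Footage Library folders once they are defined in Step 2.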


## Step 2

Run the cell below to complete initial setup.

In [None]:
# DIRECTORIES

# Path to top-level folder in Footage Library
FULL_PATH = "/content/drive/My Drive/Protect the Results Story Team/10 PTR DISTRIBUTION/Footage Library/"

# Name of metadata folder
METADATA_FOLDER = "METADATA/"

# Name of selects folder
SELECTS_FOLDER = "selects/"

# Full path to metadata
FULL_PATH_METADATA = FULL_PATH + METADATA_FOLDER

# Full path to selects
FULL_PATH_SELECTS = FULL_PATH + SELECTS_FOLDER

# FILES

# Master CSV file 
NAME_MASTER_CSV = "master_csv_test.csv"

# Tuple of accepted video file extensions
LIST_VIDEO_FORMATS = (".mov", ".mp4")

# Name for CSV generated in outgoing folder
NAME_GENERATED_CSV = "metadata.csv"

# OTHER CONSTANTS

# Columns of interest for the output
COLUMN_NAMES = ["hash_id", "project_id", "contributor_email", "contributor_name", "time_uploaded",
                "shot_title", "caption", "media_type", "latitude", "longitude",
                "video_duration_seconds", "video_duration_minutes", "tags"]

# FUNCTIONS

def metadata_from_videos_in_folder(name_folder):
  """Write a metadata CSV for the videos in the given selects folder."""
  folder_path = os.path.join(FULL_PATH_SELECTS, name_folder)

  # All filenames in the target directory
  all_filenames_in_dir = os.listdir(folder_path)

  # Filenames of videos with an accepted extension (case-insensitive match)
  all_video_filenames_in_dir = [f for f in all_filenames_in_dir if f.lower().endswith(LIST_VIDEO_FORMATS)]

  # The hash ID is the last underscore-separated token of the filename, minus its extension
  all_hashid_in_dir = [os.path.splitext(f)[0].split("_")[-1] for f in all_video_filenames_in_dir]

  # Master CSV (read once)
  df_master_csv = pd.read_csv(os.path.join(FULL_PATH_METADATA, NAME_MASTER_CSV))

  # Set of hash IDs in the master CSV, for fast membership tests
  set_hashid_master = set(df_master_csv["hash_id"])

  # Hash IDs found in the master CSV; these select the rows to export
  list_hashid_subset = []

  # Print the hash IDs that are not in the master CSV
  for hashid in all_hashid_in_dir:
    if hashid not in set_hashid_master:
      print(hashid + " not in the master csv")
    else:
      list_hashid_subset.append(hashid)

  # Subset the master dataframe; .copy() lets the new column be added below
  # without triggering pandas' SettingWithCopyWarning
  df_subset = df_master_csv[df_master_csv["hash_id"].isin(list_hashid_subset)].copy()

  # Reshape the final dataframe before writing it out
  df_subset["folder_name"] = "FOLDER NAME"

  df_sorted = df_subset[["folder_name", "hash_id", "time_uploaded", "video_duration", "caption", "contributor_name"]]
  df_sorted.columns = ["Folder Name", "Clip ID (contained within clip name)", "Clip Date", "Clip Duration", "Caption/Location", "Videographer Name"]

  # Write the CSV into the target folder
  df_sorted.to_csv(os.path.join(folder_path, NAME_GENERATED_CSV), index=False)
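
To make the filename convention concrete: the function above treats the last underscore-separated token of each video filename (minus its extension) as the clip's hash ID, then separates the IDs that appear in the master CSV from those that do not. Below is a minimal standalone sketch of that logic. The helper names and the sample filenames are made up for illustration and do not come from the Footage Library, and the case-insensitive extension match is an illustrative choice.

```python
import os

VIDEO_EXTENSIONS = (".mov", ".mp4")  # mirrors LIST_VIDEO_FORMATS

def extract_hash_id(filename):
    """Return the last underscore-separated token of the filename stem."""
    stem, _extension = os.path.splitext(filename)
    return stem.split("_")[-1]

def split_by_master(filenames, master_hash_ids):
    """Split video hash IDs into (found, missing) relative to the master list."""
    master = set(master_hash_ids)
    found, missing = [], []
    for name in filenames:
        if not name.lower().endswith(VIDEO_EXTENSIONS):
            continue  # skip non-video files such as metadata.csv
        hash_id = extract_hash_id(name)
        if hash_id in master:
            found.append(hash_id)
        else:
            missing.append(hash_id)
    return found, missing

# Hypothetical filenames and master IDs, for illustration only
sample_files = ["rally_downtown_a1b2c3.mp4", "march_x9y8z7.mov", "metadata.csv"]
found, missing = split_by_master(sample_files, ["a1b2c3"])
print(found)    # ['a1b2c3']
print(missing)  # ['x9y8z7']
```

A filename with no underscore yields its whole stem as the "hash ID", so such files would show up in the missing list rather than raise an error.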

## Step 3

In the cell below, replace the current folder name

```
11.05.2020_06.00PM
```
with the name of the folder where your selects are stored. Be sure to preserve the quotation marks and the trailing slash.


In [None]:
TARGET_FOLDER = "11.05.2020_06.00PM/"
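
The instructions above ask for the trailing slash to be preserved. As a hedged illustration, a small guard could normalize the folder name so that a forgotten or doubled slash does not matter. `with_trailing_slash` is a hypothetical helper, not part of the original notebook.

```python
def with_trailing_slash(folder_name):
    """Return folder_name without surrounding whitespace, ending in exactly one '/'."""
    cleaned = folder_name.strip().rstrip("/")
    if not cleaned:
        raise ValueError("Folder name is empty")
    return cleaned + "/"

print(with_trailing_slash("11.05.2020_06.00PM"))    # 11.05.2020_06.00PM/
print(with_trailing_slash("11.05.2020_06.00PM//"))  # 11.05.2020_06.00PM/
```

In the notebook this could be applied as `TARGET_FOLDER = with_trailing_slash(TARGET_FOLDER)` before running Step 4.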

## Step 4

Run the cell below to generate the metadata CSV in the target folder.

In [None]:
metadata_from_videos_in_folder(TARGET_FOLDER)

