<a href="https://colab.research.google.com/github/fentresspaul61B/DATA102_Final_project_data/blob/main/Jubler_data_processing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [14]:
from google.colab import drive
drive.mount("/content/gdrive")

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [15]:
# These plugins are used to slice a video into smaller videos. 
!pip install moviepy
!pip3 install imageio==2.4.1
!pip install --upgrade imageio-ffmpeg

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting imageio==2.4.1
  Downloading imageio-2.4.1.tar.gz (3.3 MB)
     |████████████████████████████████| 3.3 MB 5.0 MB/s 
Building wheels for collected packages: imageio
  Building wheel for imageio (setup.py) ... done
  Created wheel for imageio: filename=imageio-2.4.1-py3-none-any.whl size=3303886 sha256=9213d5f7b420cfba912fcb8025262a0414d1a2bdcf2e3e41839cae95b26fed37
  Stored in directory: /root/.cache/pip/wheels/be/7b/04/4d8d56f1d503e5c404f0de6018c0cfa592c71588a39b49e002
Successfully built imageio
Installing collected packages: imageio
  Attempting uninstall: imageio
    Found existing installation: imageio 2.9.0
    Uninstalling imageio-2.9.0:
      Successfully uninstalled imageio-2.9.0
Successfully installed imageio-2.4.1
Looking in indexes: https://pypi.org/simple

In [16]:
!mkdir video_test_files

In [17]:
# Input data paths, a video and corresponding text file. 
TEST_VIDEO_PATH = "/content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.mp4"
TEXT_FILE_PATH = "/content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.txt"

# These regexes are used to clean the input .txt file. 
TIME_REGEX = r"(\b[0-9]+\b)"
LABEL_REGEX = r"[a-zA-Z]+"

# These directories are used to store the sliced videos, audio, and images. 
VIDEO_TEST_FOLDER = "/content/video_test_files"
AUDIO_TEST_FOLDER = "/content/audio_test_files"
IMAGE_TEST_FOLDER = "/content/image_test_files"

In [18]:
import pandas as pd
import re

def load(text_file_path=TEXT_FILE_PATH):
    """
    Loads in Jubler .txt file and creates a pandas dataframe from it. 

    Args:
        text_file_path: Path to the Jubler .txt file to be transformed into 
        a pandas data frame. 

    Returns:
        pandas dataframe with 3 columns: 
        - start: start time. 
        - stop: stopping time. 
        - label: label associated with the starting and stopping time.
    """
    
    
    # 1. Load the .txt data into pandas dataframe. 
    df = pd.read_csv(text_file_path, header=None)

    # 2. Define data cleaning functions. 
    extract_time_function = lambda test_str: tuple(re.findall(TIME_REGEX, test_str))
    convert_string_to_int_function = lambda test_str: int(test_str)
    extract_label_function = lambda test_str: re.findall(LABEL_REGEX, test_str)
    clean_labels_function = lambda test_str: test_str[0] if len(test_str) else "skip"

    # 3. Apply data cleaning functions. 
    df["start"], df["stop"]  = zip(*df[0].apply(extract_time_function))
    # Divide by 10 because the MPL2 format stores times in tenths of a second 
    # rather than whole seconds. 
    df["start"] = df["start"].apply(convert_string_to_int_function) / 10
    df["stop"] = df["stop"].apply(convert_string_to_int_function) / 10
    df["label"] = df[0].apply(extract_label_function)
    df["label"] = df["label"].apply(clean_labels_function)

    # 4. Drop unneeded column. 
    df = df.drop([0], axis=1)
    return df 

load().head()

Unnamed: 0,start,stop,label
0,0.0,7.2,Joy
1,7.2,9.2,Joy
2,9.2,11.2,Indifference
3,11.2,15.1,Indifference
4,15.1,19.7,Indifference


Now I want to create the nested folder structure based on the labels extracted from the text file. This will simplify the training process. 

In [19]:
from os import mkdir
import os
TEST_PATH = "/content/test_path"
TEST_LABELS = ["Joy", "Indifference", "skip"]

def create_nested_folder_structure_for_training(parent_path, labels):
    """
    Creates the ideal file structure for ML training:
    parent:
        label_1:
            file_with_label_1
            file_with_label_1
            ...
        label_2:
            file_with_label_2
            file_with_label_2
            ...
        ...
        label_n:
            file_with_label_n
            ...
            last_file

    Args:
        parent_path: The path to start the nested directory structure. 
        labels: The labels which will be the sub folders. 

    Returns:
        parent_path   
    """
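    # os.makedirs with exist_ok=True creates the parent if needed and skips 
    # folders that already exist, so one existing label folder does not stop 
    # the remaining labels from being created. 
    for label in labels:
        os.makedirs(os.path.join(parent_path, label), exist_ok=True)
    return parent_path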

create_nested_folder_structure_for_training(TEST_PATH, TEST_LABELS)


'/content/test_path'
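
Since this cell is likely to be rerun, a minimal sketch of an idempotent variant may help. It assumes `os.makedirs(..., exist_ok=True)` is acceptable here. The name `make_label_folders` is hypothetical, and a temporary directory stands in for `/content`.

```python
import os
import tempfile


def make_label_folders(parent_path, labels):
    """Create parent_path/<label> for every label, skipping existing ones."""
    for label in labels:
        os.makedirs(os.path.join(parent_path, label), exist_ok=True)
    return parent_path


with tempfile.TemporaryDirectory() as workspace:
    parent = os.path.join(workspace, "dataset")
    make_label_folders(parent, ["Joy", "Indifference", "skip"])
    # A second call with an overlapping label list only adds what is missing.
    make_label_folders(parent, ["Joy", "Surprise"])
    created = sorted(os.listdir(parent))

print(created)
```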

In [20]:
image_test_folder = create_nested_folder_structure_for_training("/content/image_test_files", TEST_LABELS)
image_test_folder

'/content/image_test_files'

In [21]:
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip

def slice_video(
    start_time, 
    end_time, 
    label, 
    folder_path=VIDEO_TEST_FOLDER, 
    video_file=TEST_VIDEO_PATH
    ):
    """
    Takes in a video file, and slices a smaller video from it, then 
    saves that sliced video to the desired folder_path. 

    Args:
        start_time: Time in seconds at which the slice begins.
        end_time: Time in seconds at which the slice ends.
        label: Label that corresponds to that video slice. Used for ML model
            training. 
        folder_path: Destination for the video slice to be saved to. 
        video_file: Path to the video file to be sliced. 

    Returns:
        slice_video_path: The path to the new sliced video. This will be 
        added to the jubler data frame for easy editing.    
    """
    # 1. Create video and sliced video path names. 
    video_name = str(start_time) + "_" + str(end_time) + "_" + label + ".mp4"
    slice_video_path = folder_path + "/" + label + "/" + video_name

    # 2. Slice the video and save it to desired path. 
    ffmpeg_extract_subclip(video_file, 
                           start_time, 
                           end_time, 
                           targetname = slice_video_path)
    
    # 3. Return the path to the new sliced video. 
    print(video_name + " Succsefully sliced and saved to: " + folder_path)
    return slice_video_path




Testing the slice video function to create a single slice. 

In [22]:
slice_video(0, 7.2, "Joy", folder_path = "/content/test_path")


[MoviePy] Running:
>>> /usr/bin/ffmpeg -y -i /content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.mp4 -ss 0.00 -t 7.20 -vcodec copy -acodec copy /content/test_path/Joy/0_7.2_Joy.mp4
... command successful.
0_7.2_Joy.mp4 Succsefully sliced and saved to: /content/test_path


'/content/test_path/Joy/0_7.2_Joy.mp4'
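
The MoviePy log above shows the ffmpeg command that was run: `-ss` takes the start time and `-t` takes the duration (stop minus start), not the stop time. The sketch below rebuilds that argument list for illustration. The helper name `build_slice_command` is hypothetical, and the flag order simply mirrors the logged command rather than any documented MoviePy internals.

```python
def build_slice_command(video_file, start_time, end_time, target_path):
    """Return an ffmpeg argument list shaped like the one in the log above."""
    duration = end_time - start_time
    return [
        "ffmpeg", "-y",
        "-i", video_file,
        "-ss", f"{start_time:.2f}",   # where the slice starts, in seconds
        "-t", f"{duration:.2f}",      # how long the slice lasts, in seconds
        "-vcodec", "copy",            # copy the video stream without re-encoding
        "-acodec", "copy",            # copy the audio stream without re-encoding
        target_path,
    ]


command = build_slice_command("input.mp4", 7.2, 9.2, "Joy/7.2_9.2_Joy.mp4")
print(" ".join(command))
```

Because both streams are copied rather than re-encoded, ffmpeg can generally only cut on keyframes, so the slice boundaries may be approximate.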

In [23]:
def slice_entire_video(
    jubler_data_frame,
    folder_path=VIDEO_TEST_FOLDER, 
    video_file=TEST_VIDEO_PATH):
    """
    Takes in the jubler dataframe created from load method, then slices 
    the video_file into slices based on the "start" and "stop" columns from
    the jubler dataframe. Saves the sliced videos to folder_path, then adds the 
    paths back into the respective rows in the jubler_data_frame.  

    Args:
        jubler_data_frame: Pandas Dataframe containing the columns: 
        - "start": Where to begin the slice.
        - "stop": Where to end the slice.
        - "label": The label for ML corresponding to the sliced time frame. 
        folder_path: Where the sliced videos will be saved. 
        video_file: the input video to be sliced. 

    Returns:
        jubler_data_frame: The same dataframe that was passed in, now with an
        additional column "video_path" holding each sliced video's path.    
    """

    video_paths = []

    for index, row in jubler_data_frame.iterrows():
        start, stop, label = row.start, row.stop, row.label
        # Pass video_file through so a non-default input video is honoured. 
        new_video_path = slice_video(start, stop, label, 
                                     folder_path=folder_path, 
                                     video_file=video_file)
        video_paths.append(new_video_path)

    jubler_data_frame["video_path"] = video_paths

    return jubler_data_frame


jubler_df = load()
jubler_df_with_paths = slice_entire_video(jubler_df, folder_path = "/content/test_path")       


[MoviePy] Running:
>>> /usr/bin/ffmpeg -y -i /content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.mp4 -ss 0.00 -t 7.20 -vcodec copy -acodec copy /content/test_path/Joy/0.0_7.2_Joy.mp4
... command successful.
0.0_7.2_Joy.mp4 Succsefully sliced and saved to: /content/test_path

[MoviePy] Running:
>>> /usr/bin/ffmpeg -y -i /content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.mp4 -ss 7.20 -t 2.00 -vcodec copy -acodec copy /content/test_path/Joy/7.2_9.2_Joy.mp4
... command successful.
7.2_9.2_Joy.mp4 Succsefully sliced and saved to: /content/test_path

[MoviePy] Running:
>>> /usr/bin/ffmpeg -y -i /content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.mp4 -ss 9.20 -t 2.00 -vcodec copy -acodec copy /content/test_path/Indifference/9.2_11.2_Indifference.mp4
... command successful.
9.2_11.2_Indifference.mp4 Succsefully sliced and saved to: /content/test_path

[MoviePy] Running:
>>> /usr/bin/ffmpeg -y -i /content/gdrive/

In [24]:
jubler_df_with_paths.head()

Unnamed: 0,start,stop,label,video_path
0,0.0,7.2,Joy,/content/test_path/Joy/0.0_7.2_Joy.mp4
1,7.2,9.2,Joy,/content/test_path/Joy/7.2_9.2_Joy.mp4
2,9.2,11.2,Indifference,/content/test_path/Indifference/9.2_11.2_Indif...
3,11.2,15.1,Indifference,/content/test_path/Indifference/11.2_15.1_Indi...
4,15.1,19.7,Indifference,/content/test_path/Indifference/15.1_19.7_Indi...
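
The single-slice test earlier wrote `0_7.2_Joy.mp4`, while the dataframe run wrote `0.0_7.2_Joy.mp4`. The difference comes from `str(0)` versus `str(0.0)`. A minimal sketch of a naming helper that normalises both cases follows. The name `slice_file_name` and the one-decimal format are assumptions, chosen because the MPL2 times have a resolution of a tenth of a second.

```python
def slice_file_name(start, stop, label, extension="mp4"):
    """Build '<start>_<stop>_<label>.<ext>' with times fixed to one decimal."""
    return f"{float(start):.1f}_{float(stop):.1f}_{label}.{extension}"


# Integer and float inputs now produce the same file name.
print(slice_file_name(0, 7.2, "Joy"))
print(slice_file_name(0.0, 7.2, "Joy"))
```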


In [25]:
# 1. Load the data. This extracts the start, stop and label extracted from the 
# .txt data and adds it to a df. ✅

# 2. Create set from labels column in df as a tuple. ✅ 

# 3. create_nested_folder_structure_for_training with the tuple of labels. ✅

# -----------------------------------------------------------------------------

# 4. Extract the long audio from the long video. 

# 5. Split the long audio into shorter audio, based on the df with start and 
# stop times. add this data to the data frame. 

# 6. Use the sort data function to sort the audio in the df into the correct 
# folder structure. 

# -----------------------------------------------------------------------------

# 7. Split long video into shorter videos, based on df with start and stop 
# times, add to df. ✅

# 8. Split short videos into images, create new df where each row has a label 
# and image. 

# 9. Use the sort data function to sort the images in the df into the correct 
# folder structure.

In [26]:
import subprocess
import os
import sys

def convert_video_to_audio_ffmpeg(video_file, output_ext="wav"):
    """
    Converts a single video into an audio file using the ffmpeg command. 

    Args:
        video_file: the file path for the video to be converted to audio. 
        output_ext: the file extension for the audio. By default it is "wav". 

    Returns:
        new_file_name: the path to the new audio file.  
    """
    # Splitting the extension from the file name. For example, it will remove 
    # the ".mp4" at the end of the string and assign it to the "ext" variable. 
    filename, ext = os.path.splitext(video_file)

    # Creating a new file name to store the audio. 
    new_file_name = f"{filename}.{output_ext}"

    # Calling the ffmpeg command to extract audio from a video. 
    subprocess.call(["ffmpeg", "-y", "-i", video_file, new_file_name], 
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.STDOUT)
    
    return new_file_name

In [27]:
LONG_AUDIO_PATH = convert_video_to_audio_ffmpeg(TEST_VIDEO_PATH)
LONG_AUDIO_PATH

'/content/gdrive/MyDrive/jubler_test_data/Rp_1474507527_2022_08_06_12_24_02_780.wav'
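
The conversion above ignores ffmpeg's exit status, so a failed conversion still returns a path. Below is a hedged sketch of a variant that checks the return code. The names `audio_path_for` and `convert_video_to_audio_checked` are hypothetical, and it assumes `ffmpeg` is on the `PATH`, as it is in the Colab runtime used here.

```python
import os
import subprocess


def audio_path_for(video_file, output_ext="wav"):
    """Swap the video extension for the audio one, e.g. clip.mp4 -> clip.wav."""
    stem, _ = os.path.splitext(video_file)
    return f"{stem}.{output_ext}"


def convert_video_to_audio_checked(video_file, output_ext="wav"):
    """Run ffmpeg and raise if it reports a failure."""
    target = audio_path_for(video_file, output_ext)
    result = subprocess.run(
        ["ffmpeg", "-y", "-i", video_file, target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )
    if result.returncode != 0:
        raise RuntimeError(f"ffmpeg failed for {video_file}")
    return target


# Only the path logic is exercised here; no video is converted.
print(audio_path_for("/content/test_path/Joy/0.0_7.2_Joy.mp4"))
```

`os.path.splitext` splits on the last dot only, so the dots inside `0.0_7.2` are preserved.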

In [28]:
import librosa
from IPython.display import Audio

# y, sr = librosa.load(LONG_AUDIO_PATH)
# Audio(data=y, rate=sr)

In [29]:
# def split_video_into_images():
#     pass 



# import soundfile as sf
# def split_audio_into_shorter_audio(original_audio_path, 
#                                    slice_audio_path, 
#                                    start, 
#                                    stop, 
#                                    label):

#     # 1. Load audio data.
#     data, fs = librosa.load(original_audio_path)

#     # 2. Convert the start and stop times to samples. 
#     start_sample = librosa.time_to_samples(start, fs)
#     stop_sample = librosa.time_to_samples(stop, fs)

#     # 3. Extract the desired portion of the audio. 
#     sliced_audio = data[start_sample:stop_sample]

#     # 5. Save the sliced audio file to a new file. 
#     sf.write(slice_audio_path, sliced_audio, fs)

#     print("Complete.")
#     return slice_audio_path, sliced_audio, fs


# def create_path_for_audio_file(audio_folder_path, 
#                                start, 
#                                stop, 
#                                label):
#     audio_name = str(start) + "_" + str(stop) + "_" + label + ".wav"
#     slice_audio_path = audio_folder_path + "/" + label + "/" + audio_name
#     return slice_audio_path





In [30]:
# testing audio slicing function. 
# audio_slice_path = create_path_for_audio_file(TEST_PATH, 0,	72,	"Joy")
# print(audio_slice_path)
# slice_audio_path, sliced_audio, fs = split_audio_into_shorter_audio(LONG_AUDIO_PATH, audio_slice_path, 0, 7.2, "Joy")
# Audio(data=sliced_audio, rate=fs)
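
The commented-out audio slicer converts start and stop times into sample indices before slicing. The sketch below shows the same arithmetic on a plain list, without librosa. The name `slice_samples` is hypothetical, and `round` is used rather than `int` so that floating-point error (for example in `9.2 * 22050`) does not drop a sample.

```python
def slice_samples(samples, sample_rate, start, stop):
    """Return the samples between start and stop (both in seconds)."""
    start_index = round(start * sample_rate)
    stop_index = round(stop * sample_rate)
    return samples[start_index:stop_index]


# Ten seconds of fake audio at 10 samples per second.
samples = list(range(100))
clip = slice_samples(samples, sample_rate=10, start=1.5, stop=3.0)
print(len(clip), clip[0], clip[-1])
```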

In [31]:
# Testing audio extraction process. 
test_video_slice_path = "/content/test_path/Indifference/151_197_Indifference.mp4"
convert_video_to_audio_ffmpeg(test_video_slice_path)

'/content/test_path/Indifference/151_197_Indifference.wav'

In [32]:
def convert_all_video_to_audio(jubler_data_frame):
    """
    Converts all videos from the video_path column in the jubler dataframe, into
    audio files. Then adds those audio paths to the jubler df. 

    Args:
        jubler_data_frame: The df that is being used to store file paths. 

    Returns:
        jubler_data_frame
    """
    # Iterating over all the video paths, converting them to audio, and 
    # returning a list of audio paths. 
    audio_paths = [convert_video_to_audio_ffmpeg(video_path) for video_path in jubler_data_frame.video_path]

    # Adding a new column to the jubler df with the audio paths. 
    jubler_data_frame["audio_path"] = audio_paths
    return jubler_data_frame


jubler_df_with_paths = convert_all_video_to_audio(jubler_df_with_paths)
jubler_df_with_paths

Unnamed: 0,start,stop,label,video_path,audio_path
0,0.0,7.2,Joy,/content/test_path/Joy/0.0_7.2_Joy.mp4,/content/test_path/Joy/0.0_7.2_Joy.wav
1,7.2,9.2,Joy,/content/test_path/Joy/7.2_9.2_Joy.mp4,/content/test_path/Joy/7.2_9.2_Joy.wav
2,9.2,11.2,Indifference,/content/test_path/Indifference/9.2_11.2_Indif...,/content/test_path/Indifference/9.2_11.2_Indif...
3,11.2,15.1,Indifference,/content/test_path/Indifference/11.2_15.1_Indi...,/content/test_path/Indifference/11.2_15.1_Indi...
4,15.1,19.7,Indifference,/content/test_path/Indifference/15.1_19.7_Indi...,/content/test_path/Indifference/15.1_19.7_Indi...
5,19.7,23.3,Indifference,/content/test_path/Indifference/19.7_23.3_Indi...,/content/test_path/Indifference/19.7_23.3_Indi...
6,23.3,27.9,Indifference,/content/test_path/Indifference/23.3_27.9_Indi...,/content/test_path/Indifference/23.3_27.9_Indi...
7,27.9,31.8,Joy,/content/test_path/Joy/27.9_31.8_Joy.mp4,/content/test_path/Joy/27.9_31.8_Joy.wav
8,31.8,35.2,Indifference,/content/test_path/Indifference/31.8_35.2_Indi...,/content/test_path/Indifference/31.8_35.2_Indi...
9,35.2,41.1,Indifference,/content/test_path/Indifference/35.2_41.1_Indi...,/content/test_path/Indifference/35.2_41.1_Indi...


In [33]:
# Algorithm: 
# 1. Extract all of the unique classes from the jubler dataframe. Each one of 
# these classes has a folder to be sorted into. 

# 2. Create the paths for the sub directories based on the set of labels and 
# the corresponding parent path. 

# 3. Iterate over the paths and labels from the jubler df, and move each file 
# into the destination path based on the label. 


import shutil

def sort_data(jubler_df_with_paths, path_column, parent_path):
    """
    Sorts data stored in the path_column (example: all the audio files), into
    sub directories inside the parent_path, based on their labels. 

    Args:
        jubler_df_with_paths: pandas df. 
        path_column: column to be sorted. 
        parent_path: the parent directory that has sub directories based on the
        classes in the jubler df. 

    Returns:
        None. 
    """
    # Extract the unique labels from the jubler df. 
    set_of_labels = set(jubler_df_with_paths.label)
    class_directories = {}

    # Iterate over the labels. 
    for label in set_of_labels:

        # Saving path of sub directory to dictionary that has the label as the 
        # key. 
        class_directories[label] = f"{parent_path}/{label}"

    # Defining the labels and paths to iterate over. 
    labels = jubler_df_with_paths.label
    paths =  jubler_df_with_paths[path_column]

    # Sorting the data by moving files into the directory sub folder based on 
    # their label. 
    for label, file_path in zip(labels, paths):
        destination_path = class_directories[label]
        shutil.move(file_path, destination_path)

    return # Return nothing. 



In [34]:
# testing data sorting function. 
audio_test_folder = create_nested_folder_structure_for_training("/content/audio_test_files", TEST_LABELS)
sort_data(jubler_df_with_paths, "audio_path", audio_test_folder)
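
For reference, here is a small self-contained sketch of the same sorting idea, run in a temporary directory instead of `/content`. The name `sort_files_by_label` and the empty placeholder files are hypothetical. Unlike `sort_data`, it builds the full destination file path and returns the new locations.

```python
import os
import shutil
import tempfile


def sort_files_by_label(labelled_paths, parent_path):
    """Move each (label, file_path) pair into parent_path/<label>/."""
    moved = []
    for label, file_path in labelled_paths:
        destination_dir = os.path.join(parent_path, label)
        os.makedirs(destination_dir, exist_ok=True)
        destination = os.path.join(destination_dir, os.path.basename(file_path))
        shutil.move(file_path, destination)
        moved.append(destination)
    return moved


with tempfile.TemporaryDirectory() as workspace:
    source = os.path.join(workspace, "unsorted")
    parent = os.path.join(workspace, "audio_sorted")
    os.makedirs(source)
    labelled = []
    # Create empty placeholder files standing in for the real .wav slices.
    for label, name in [("Joy", "0.0_7.2_Joy.wav"),
                        ("Indifference", "9.2_11.2_Indifference.wav")]:
        path = os.path.join(source, name)
        open(path, "wb").close()
        labelled.append((label, path))
    moved = sort_files_by_label(labelled, parent)
    all_present = all(os.path.exists(path) for path in moved)
    folders = [os.path.basename(os.path.dirname(path)) for path in moved]

print(folders, all_present)
```

Returning the new paths makes it possible to update the `audio_path` column afterwards, which otherwise still points at the old locations.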

In [35]:
from moviepy.editor import VideoFileClip
import imageio 

def extract_images_from_video(video_path, 
                              start, 
                              stop, 
                              label, 
                              image_folder_path):
    """
    Extracts one image per second from a video clip.  

    Args:
        video_path: the path to the video to be sliced. 
        start: time when video starts in the original video. 
        stop: time when video ends in the original video. 
        label: label corresponding to video slice. 
        image_folder_path: directory path to store images. 

    Returns:
        images: a list of the image paths generated from the input video. 
         
    """
    # Create a video clip object to enable slicing. 
    clip = VideoFileClip(video_path)
    images = []

    # Iterate in seconds based on the length of the clip. 
    for t in range(0, int(clip.duration)):

        # Extract the image from second t. 
        frame = clip.get_frame(t)

        # Creating the path to save the image. 
        frame_name = f"{start}_{stop}_{label}_{t}.png"
        frame_path = f"{image_folder_path}/{label}/{frame_name}"

        # Saving the image to the desired destination. 
        imageio.imwrite(frame_path, frame)

        images.append(frame_path)

    # Return a list of image paths. 
    return images 



Imageio: 'ffmpeg-linux64-v3.3.1' was not found on your computer; downloading it now.
Try 1. Download from https://github.com/imageio/imageio-binaries/raw/master/ffmpeg/ffmpeg-linux64-v3.3.1 (43.8 MB)

In [36]:
# Testing the video splitting function. 
test_video_for_image_extraction = "/content/test_path/Joy/0.0_7.2_Joy.mp4"
start, stop = 0, 7.2
label = "Joy"
image_folder_path = "/content/image_test_files"
extract_images_from_video(test_video_for_image_extraction,
                          start=start,
                          stop=stop,
                          label=label,
                          image_folder_path=image_folder_path)

['/content/image_test_files/Joy/0_7.2_Joy_0.png',
 '/content/image_test_files/Joy/0_7.2_Joy_1.png',
 '/content/image_test_files/Joy/0_7.2_Joy_2.png',
 '/content/image_test_files/Joy/0_7.2_Joy_3.png',
 '/content/image_test_files/Joy/0_7.2_Joy_4.png',
 '/content/image_test_files/Joy/0_7.2_Joy_5.png',
 '/content/image_test_files/Joy/0_7.2_Joy_6.png']
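
The 7.2-second clip yields seven images (`_0` to `_6`) because the loop runs over `range(0, int(clip.duration))`. The sketch below reproduces only that timestamp logic, without MoviePy. The name `frame_times` and the `frames_per_second` parameter are hypothetical additions, included to show how a denser sampling rate could be expressed.

```python
def frame_times(duration, frames_per_second=1):
    """Return the timestamps (in seconds) at which frames would be grabbed."""
    step = 1 / frames_per_second
    count = int(duration * frames_per_second)
    return [round(index * step, 3) for index in range(count)]


print(frame_times(7.2))       # one frame per second
print(frame_times(2.0, 2))    # two frames per second
print(frame_times(0.9))       # clip shorter than one second
```

With one frame per second, any clip shorter than a second produces no images at all.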

The next step is to apply this image extraction function to every video in the df, and save those images to the correct folder. 

In [37]:
image_folder_path = "/content/image_test_files"

def add_image_paths_to_df(dataframe, image_folder_path):
    row_data = dataframe.start, dataframe.stop, dataframe.label, dataframe.video_path

    images= []

    for start, stop, label, video_path in zip(*row_data):
        try:
            sliced_images = extract_images_from_video(video_path,
                            start=start,
                            stop=stop,
                            label=label,
                            image_folder_path=image_folder_path)
            images.append(sliced_images)   
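        except Exception:
            print(f"Unable to convert {video_path} to images.")
            images.append(["Unable to Convert Image"])

    # Store each row's list of image paths on the dataframe, as the function 
    # name promises. 
    dataframe["image_paths"] = images
    print("Image extraction complete.")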
        


add_image_paths_to_df(jubler_df_with_paths, image_folder_path)

Unable to convert /content/test_path/Indifference/103.6_105.6_Indifference.mp4 to images.
Unable to convert /content/test_path/Indifference/105.6_109.1_Indifference.mp4 to images.
Unable to convert /content/test_path/Indifference/145.9_151.4_Indifference.mp4 to images.
Unable to convert /content/test_path/skip/157.3_159.3_skip.mp4 to images.
Image extraction complete.
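
Step 8 of the plan calls for a table where each row holds one image and its label. A minimal sketch of that flattening step follows, using plain lists so it runs without pandas. The name `flatten_image_rows` is hypothetical and the sample paths are made up. The failure marker is the string appended by the `except` branch above.

```python
def flatten_image_rows(labels, image_lists,
                       failure_marker="Unable to Convert Image"):
    """Turn per-video image lists into one {image_path, label} row per image."""
    rows = []
    for label, image_paths in zip(labels, image_lists):
        for image_path in image_paths:
            # Skip the placeholder recorded for videos that failed to convert.
            if image_path != failure_marker:
                rows.append({"image_path": image_path, "label": label})
    return rows


# Made-up inputs shaped like the lists built in add_image_paths_to_df.
labels = ["Joy", "Indifference", "skip"]
image_lists = [
    ["Joy/0.0_7.2_Joy_0.png", "Joy/0.0_7.2_Joy_1.png"],
    ["Unable to Convert Image"],
    ["skip/1.0_3.0_skip_0.png"],
]
rows = flatten_image_rows(labels, image_lists)
print(len(rows))
```

The resulting list of dictionaries can be passed straight to `pd.DataFrame(rows)` to get the per-image dataframe.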


In [38]:
class JublerDataProcessing:
      def __init__(self, 
                   jubler_text_file_path,
                   video_chunk_directory_path,
                   audio_chunk_directory_path, 
                   image_chunk_directory_path):
          self.jubler_text_file_path = jubler_text_file_path
          self.video_chunk_directory_path = video_chunk_directory_path
          self.audio_chunk_directory_path = audio_chunk_directory_path
          self.image_chunk_directory_path = image_chunk_directory_path 
          
      def load(self):
          """
          Loads in Jubler .txt file and creates a pandas dataframe from it. 

          Args:
            jubler_text_file_path: The .txt file path to be transformed into 
            pandas data frame. 

          Returns:
            pandas dataframe with 3 columns: 
            - start: start time. 
            - stop: stopping time. 
            - label: label associated with the starting and stopping time.

          """

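          # Delegate to the module-level load() defined earlier in this 
          # notebook, since data_frame is otherwise undefined here. 
          return load(self.jubler_text_file_path)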

      def process_1(self, data_frame):
          """
          Cleans the labels if they have typos or added characters. 

          Args:
            data_frame: The dataframe created after loading the jubler .txt 
            file. 

          Returns:
            data_frame: An edited version of the dataframe. 
          """
          return data_frame

      def process_2(self, data_frame):
          "do some transformation on the data1"
          return data_frame

      def process_3(self, data_frame):
          "do some transformation on the data2"
          return data_frame

      def run(self):
          data = self.load()
          data = self.process_1(data)
          data = self.process_2(data)
          data = self.process_3(data)
          return data

In [39]:
# from google.cloud import storage


# def list_blobs(bucket_name):
#     """Lists all the blobs in the bucket."""
#     # bucket_name = "your-bucket-name"

#     storage_client = storage.Client()

#     # Note: Client.list_blobs requires at least package version 1.17.0.
#     blobs = storage_client.list_blobs(bucket_name)

#     # Note: The call returns a response only when the iterator is consumed.
#     for blob in blobs:
#         print(blob.name)

# bucket_name = 'https://console.cloud.google.com/storage/browser?forceOnBucketsSortingFiltering=false&project=devaworld-282317&prefix=&forceOnObjectsSortingFiltering=false'
# list_blobs(bucket_name)

In [40]:
# def export_folder_to_gcp():
#     # This function should return the script required to pull the data from GCP
#     # as well. 
#     pass 

In [41]:
!pip install pipreqs
!pip install nbconvert

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [46]:
!jupyter nbconvert --output-dir="./reqs" --to script /content/Jubler_data_processing.ipynb



This application is used to convert notebook files (*.ipynb)
        to various other formats.


Options
The options below are convenience aliases to configurable class-options,
as listed in the "Equivalent to" description-line of the aliases.
To see all configurable class-options for some <cmd>, use:
    <cmd> --help-all

--debug
    set log level to logging.DEBUG (maximize logging output)
    Equivalent to: [--Application.log_level=10]
--show-config
    Show the application's configuration (human-readable format)
    Equivalent to: [--Application.show_config=True]
--show-config-json
    Show the application's configuration (json format)
    Equivalent to: [--Application.show_config_json=True]
--generate-config
    generate default config file
    Equivalent to: [--JupyterApp.generate_config=True]
-y
    Answer yes to any questions instead of prompting.
    Equivalent to: [--JupyterApp.answer_yes=True]
--execute
    Execute the notebook prior to export.
    Equivalent to: [--ExecutePr

In [43]:
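# Note: "!cd" runs in a subshell, so the notebook's working directory does 
# not change. Use "%cd reqs" to change it persistently. 
!cd reqs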

In [44]:
!ls

audio_test_files  image_test_files  sample_data  video_test_files
gdrive		  reqs		    test_path


In [45]:
%pwd

'/content'

In [48]:
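# Note: pipreqs expects a project directory rather than a notebook file, 
# which is why this call fails. 
!pipreqs /content/Jubler_data_processing.ipynb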

Traceback (most recent call last):
  File "/usr/local/bin/pipreqs", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/pipreqs/pipreqs.py", line 488, in main
    init(args)
  File "/usr/local/lib/python3.8/dist-packages/pipreqs/pipreqs.py", line 478, in init
    generate_requirements_file(path, imports, symbol)
  File "/usr/local/lib/python3.8/dist-packages/pipreqs/pipreqs.py", line 157, in generate_requirements_file
    with _open(path, "w") as out_file:
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.8/dist-packages/pipreqs/pipreqs.py", line 81, in _open
    file = open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/content/Jubler_data_processing.ipynb/requirements.txt'


In [49]:
!pip freeze > requirements.txt

In [50]:
!git status 

fatal: not a git repository (or any of the parent directories): .git
