Dataset Check
To examine the dataset, I asked ChatGPT-4o to write a Python script that checks whether all files listed in an Excel column are present in the audio folder. Here is the answer:
To check if all files listed in an Excel column are present in an audio folder, we can use the pandas library to read the Excel file and the os library to check for the presence of the files in the specified folder.
Requirements:
Install the required libraries if you haven't already:
pip install pandas openpyxl
Script:
import os
import pandas as pd


def check_files_in_folder(excel_file_path, sheet_name, column_name, folder_path):
    # Read the Excel file
    df = pd.read_excel(excel_file_path, sheet_name=sheet_name)

    # Extract the list of files from the specified column
    files_list = df[column_name].tolist()

    # List all files in the directory
    folder_files = os.listdir(folder_path)

    # Check for missing files
    missing_files = [file for file in files_list if file not in folder_files]

    if not missing_files:
        print("All files listed in the Excel column are present in the folder.")
    else:
        print("The following files are missing in the folder:")
        for file in missing_files:
            print(file)


if __name__ == "__main__":
    # Input parameters
    excel_file_path = input("Enter the path to the Excel file: ")
    sheet_name = input("Enter the sheet name: ")
    column_name = input("Enter the column name that contains the file names: ")
    folder_path = input("Enter the path to the folder containing audio files: ")

    # Check if all files listed in the Excel column are in the folder
    check_files_in_folder(excel_file_path, sheet_name, column_name, folder_path)
How to Use:
Save the script to a file, for example, check_files.py.
Run the script:
python check_files.py
When prompted, enter:
- The path to the Excel file.
- The sheet name containing the file list.
- The column name where the file names are listed.
- The path to the folder containing your audio files.
The script will then check if all files listed in the specified Excel column are present in the specified folder and will print the result.
Notes:
Ensure the Excel file is in a format readable by pandas (e.g., .xlsx).
Ensure the folder path and file names are correct and correspond to the actual files you want to check.
This script assumes that the file names listed in the Excel column exactly match the names of the files in the folder. If there are extensions or case differences, you might need to adjust the script accordingly.
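If the Excel column and the folder do differ in case or extensions, one possible adjustment is to compare normalized names instead of raw strings. The sketch below is not part of the original answer: the helper names `normalize` and `find_missing` are hypothetical, and it assumes that ignoring case and dropping the final extension is the matching rule you want. It works on plain lists, so it can be tried without an Excel file.

```python
from pathlib import Path


def normalize(name):
    """Lower-case a file name and drop its final extension.

    Assumption: 'Clip01.WAV' and 'clip01' should count as the same file.
    """
    return Path(str(name).strip()).stem.lower()


def find_missing(listed_names, folder_names):
    """Return the listed names that have no match in the folder after normalization."""
    available = {normalize(n) for n in folder_names}
    return [n for n in listed_names if normalize(n) not in available]


if __name__ == "__main__":
    # Hypothetical sample data standing in for the Excel column and the folder listing
    listed = ["Clip01", "clip02.wav", "clip03"]
    in_folder = ["clip01.WAV", "CLIP02.wav"]
    print(find_missing(listed, in_folder))
```

To plug this into the script above, `listed_names` could come from `df[column_name].dropna().astype(str).tolist()` and `folder_names` from `os.listdir(folder_path)`. Note that a listed name containing a dot but no extension (for example `track.1`) would lose everything after the last dot, so this rule may need tightening for such datasets.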
The program works like a charm. For the past few weeks, I have turned to ChatGPT whenever I need a small Python script to move my projects forward. I find it efficient, up to date and reliable.