# 1. Gathering information about the files in the source folder:

The following information is displayed after scanning the `source-dir`:
* Total number of files
* All file formats (extensions) found
* Whether each filename matches one of the `MASTER_REGEX` patterns

The same information is also written to a CSV file named `info-YYYYMMDD-HHMMSS.csv` in the log directory once the `source-dir` has been scanned.

## [1.1] Get `configs.ini` : 

In [1]:
# https://stackoverflow.com/questions/8884188/how-to-read-and-write-ini-file-with-python3
import configparser
config = configparser.ConfigParser()

In [4]:
config.read('configs.ini')
source_dir      = config['INFOGATHER']['source_dir']
destination_dir = config['INFOGATHER']['destination_dir_pv'] 
log_dir         = config['INFOGATHER']['loginfo_dir'] 

MASTER_REGEX_PHOTOS_1 = config['NOEDIT']['MASTER_REGEX_PHOTOS_1'] 
MASTER_REGEX_VIDEOS_1 = config['NOEDIT']['MASTER_REGEX_VIDEOS_1'] 
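
The cell above only works if `configs.ini` contains the two sections and five keys it reads. The sketch below shows one layout that would satisfy it; the layout is an assumption reconstructed from the keys, with values copied from the outputs printed in this notebook, and the real file may hold more options. The names `SAMPLE_INI`, `sample`, and `no-such-configs.ini` are hypothetical. It also shows that `ConfigParser.read()` does not raise on a missing file, which is worth checking before indexing into sections.

```python
import configparser

# Assumed layout of configs.ini, reconstructed from the keys read above;
# the real file may contain more sections or options.
SAMPLE_INI = r"""
[INFOGATHER]
source_dir = D:/Photo-Project/R_PhotosUnsorted/Plain-01
destination_dir_pv = D:/Photo-Project/PythonPhotoSort/temp/destination-dir
loginfo_dir = D:/Photo-Project/PythonPhotoSort/temp/loginfo-dir

[NOEDIT]
MASTER_REGEX_PHOTOS_1 = ^[iImMgG]{3}[-_]([0-9]{8})[-_].*\.(?:jpg|jpeg)$
MASTER_REGEX_VIDEOS_1 = ^[VvIiDdvideo]{3,}[-_]([0-9]{8})[-_].*\.(?:mp4)$
"""

sample = configparser.ConfigParser()
sample.read_string(SAMPLE_INI)
sample_source_dir = sample["INFOGATHER"]["source_dir"]
sample_photo_regex = sample["NOEDIT"]["MASTER_REGEX_PHOTOS_1"]

# ConfigParser.read() skips files it cannot open and returns the list of
# files it actually parsed, so an empty list signals a missing config.
loaded = configparser.ConfigParser().read("no-such-configs.ini")
if not loaded:
    print("configs.ini was not found; section lookups would raise KeyError")
```

Checking the return value of `read()` gives a clearer error than the `KeyError` raised later by `config['INFOGATHER']`.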


In [5]:
print("Source Directory: ",source_dir)
print("Destination Directory: ",destination_dir)
print("Log Info Directory: ",log_dir)

Source Directory:  D:/Photo-Project/R_PhotosUnsorted/Plain-01
Destination Directory:  D:/Photo-Project/PythonPhotoSort/temp/destination-dir
Log Info Directory:  D:/Photo-Project/PythonPhotoSort/temp/loginfo-dir


In [6]:
print("MASTER_REGEX_PHOTOS_1 : ",MASTER_REGEX_PHOTOS_1)
print("MASTER_REGEX_VIDEOS_1 : ",MASTER_REGEX_VIDEOS_1)

MASTER_REGEX_PHOTOS_1 :  ^[iImMgG]{3}[-_]([0-9]{8})[-_].*\.(?:jpg|jpeg)$
MASTER_REGEX_VIDEOS_1 :  ^[VvIiDdvideo]{3,}[-_]([0-9]{8})[-_].*\.(?:mp4)$
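
To see what these two patterns accept, the following sketch runs them against a few file names. The patterns are copied from the output above; the helper `extract_date` and the sample names (other than the `IMG_20150829_141244.jpg` form used later in this notebook) are hypothetical. It also illustrates that the patterns are case-sensitive on the extension, so an upper-case `.JPG` only matches when `re.IGNORECASE` is passed.

```python
import re

# Patterns copied from the printed config values above.
PHOTOS_RE = r"^[iImMgG]{3}[-_]([0-9]{8})[-_].*\.(?:jpg|jpeg)$"
VIDEOS_RE = r"^[VvIiDdvideo]{3,}[-_]([0-9]{8})[-_].*\.(?:mp4)$"


def extract_date(pattern, filename, flags=0):
    """Return the 8-digit date captured by group 1, or None if no match."""
    match = re.search(pattern, filename, flags)
    return match.group(1) if match else None


photo_ts = extract_date(PHOTOS_RE, "IMG_20150829_141244.jpg")
video_ts = extract_date(VIDEOS_RE, "VID_20150829_141244.mp4")

# The extension alternatives are lower-case only, so ".JPG" is rejected
# unless the match is made case-insensitive.
upper_ts = extract_date(PHOTOS_RE, "IMG_20150829_141244.JPG")
upper_ts_ci = extract_date(PHOTOS_RE, "IMG_20150829_141244.JPG", re.IGNORECASE)

print(photo_ts, video_ts, upper_ts, upper_ts_ci)
```

Note that the prefix parts are character classes rather than literal words, so `[iImMgG]{3}` also accepts prefixes such as `GMI`, not only `IMG`.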


## [1.2] Loop through folder and fetch info: 


In [7]:
import re
import os
from datetime import datetime
import shutil

In [8]:
input_folder_path  = source_dir
output_folder_path = destination_dir


In [9]:
# Log file name 
temp = "info-" + datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"
csv_file           = os.path.join(log_dir, temp)
print("Info CSV File: ", csv_file) 

Info CSV File:  D:/Photo-Project/PythonPhotoSort/temp/loginfo-dir\info-20240519-140725.csv
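
As a quick check of the file name format, the sketch below applies the same `strftime` pattern to a fixed timestamp instead of `datetime.now()`. The variable names `fixed` and `demo_name` are hypothetical; the timestamp is the one visible in the output above.

```python
from datetime import datetime

# A fixed timestamp stands in for datetime.now() so the result is repeatable.
fixed = datetime(2024, 5, 19, 14, 7, 25)

# Same pattern as the cell above: year, month, day, then hour, minute, second.
demo_name = "info-" + fixed.strftime("%Y%m%d-%H%M%S") + ".csv"
print(demo_name)
```

The backslash before the file name in the printed path comes from `os.path.join`, which inserts the Windows separator, while the configured `log_dir` uses forward slashes.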


In [10]:
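FILE_COUNT = 0          # total number of files seen
FILE_EXT_LIST = set()   # distinct file extensions
FILE_EXT_COUNTER = []   # one entry per file, used for per-extension counts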

In [11]:
csvfile_handle = open(csv_file, "w", encoding="utf-8")
csvfile_handle.write("Filename;Extension;regex_photos_match;regex_videos_match;Extracted TS;Path;Null\n")
print()




In [12]:
src = input_folder_path
dst = output_folder_path

# os.walk() generates the file names in a directory tree (including nested subfolders) by walking the tree either top-down or bottom-up.
for root, subdirs, files in os.walk(src):
    for file in files:
        path = os.path.join(root, file)
        
        _filenameonly = file   # e.g. IMG_20150829_141244.jpg
        _extension = os.path.splitext(file)[1] # e.g. .jpg (the leading dot is included)

        regex_photos_match = False
        regex_videos_match = False
        
        _extracted_ts = "Null" # extracted timestamp 

        m1 = re.search(MASTER_REGEX_PHOTOS_1, file)
        if m1: 
            _extracted_ts = m1.group(1)
            regex_photos_match = True

        m2 = re.search(MASTER_REGEX_VIDEOS_1, file)
        if m2: 
            _extracted_ts = m2.group(1)
            regex_videos_match = True
            
        _fullfilepath = path   # e.g. H:/myfolder/IMG_20150829_141244.jpg
        
        # Columns: Filename;Extension;regex_photos_match;regex_videos_match;Extracted TS;Path;Null
        # (no trailing ";" so that each row has the same seven fields as the header)
        csv_line = ";".join([_filenameonly, _extension, str(regex_photos_match), str(regex_videos_match),
                             _extracted_ts, _fullfilepath, "NOTHING"]) + "\n"
        
        csvfile_handle.write(csv_line)
        
        # Increment file count
        FILE_COUNT += 1
        # Add file extension to set
        FILE_EXT_LIST.add(_extension)
        # Group counter
        FILE_EXT_COUNTER.append(_extension)
 

csvfile_handle.close()
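
The loop above builds each CSV line by string concatenation, which breaks if a file name or path itself contains a `;`. A possible alternative, sketched here under that assumption, is to let the standard `csv` module handle quoting and to use a `with` block so the file is closed even if the walk fails. The function `write_info_csv`, the demo rows, and the temporary file are hypothetical and not part of the notebook.

```python
import csv
import os
import tempfile

HEADER = ["Filename", "Extension", "regex_photos_match",
          "regex_videos_match", "Extracted TS", "Path", "Null"]


def write_info_csv(csv_path, rows):
    """Write the header and rows as a ';'-separated file with proper quoting."""
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(HEADER)
        writer.writerows(rows)


# Hypothetical rows; the second file name contains the delimiter on purpose.
demo_rows = [
    ["IMG_20150829_141244.jpg", ".jpg", True, False, "20150829",
     "H:/myfolder/IMG_20150829_141244.jpg", "NOTHING"],
    ["odd;name.png", ".png", False, False, "Null",
     "H:/myfolder/odd;name.png", "NOTHING"],
]

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "info-demo.csv")
    write_info_csv(demo_path, demo_rows)
    with open(demo_path, encoding="utf-8", newline="") as handle:
        read_back = list(csv.reader(handle, delimiter=";"))

print(len(read_back), "lines read back, including the header")
```

Reading the file back with `csv.reader` and the same delimiter returns the file name `odd;name.png` as a single field, because the writer quotes it.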

## [1.3] Print Info: 

In [23]:
print("Number of files: ",FILE_COUNT)
print("List of file extensions: ",FILE_EXT_LIST)
print("Info written to CSV File : ", csv_file) 

Number of files:  7779
List of file extensions:  {'.jpg', '.JPG', '.mp4', '.jpeg', '.png'}
Info written to CSV File :  D:/Photo-Project/PythonPhotoSort/temp/loginfo-dir\info-20240519-140725.csv


In [28]:
from collections import Counter
import pandas as pd
filecount_by_ext = Counter(FILE_EXT_COUNTER)
df = pd.DataFrame.from_records(list(filecount_by_ext.items()), columns=['extensions','count'])
df.sort_values(by=['count'],inplace=True, ascending=False)
df.head()


|   | extensions | count |
|---|------------|-------|
| 0 | .jpg       | 7278  |
| 2 | .mp4       | 481   |
| 1 | .JPG       | 15    |
| 4 | .png       | 4     |
| 3 | .jpeg      | 1     |
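
The counts above treat `.jpg` and `.JPG` as different extensions because the comparison is case-sensitive. If a case-insensitive summary is wanted, one option is to lower-case each extension before counting. The sketch below does this on the counts shown in the table; the variable names `observed`, `normalised`, and `ranked` are hypothetical.

```python
from collections import Counter

# Counts copied from the table above.
observed = Counter({".jpg": 7278, ".mp4": 481, ".JPG": 15, ".png": 4, ".jpeg": 1})

# Merge extensions that differ only in letter case.
normalised = Counter()
for ext, count in observed.items():
    normalised[ext.lower()] += count

# most_common() returns (extension, count) pairs, largest count first.
ranked = normalised.most_common()
print(ranked)
```

In the notebook itself the same effect could be had by appending `_extension.lower()` to `FILE_EXT_COUNTER`, at the cost of no longer seeing how many files use the upper-case form.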
