# The File Renamer

Do you have tons of files with random names that give no hint of what is inside them?

Wouldn't it be great if you could wave a magic wand and have them all renamed? Worry not: this notebook tries to do exactly that with a short piece of code. All we need to know is which line of each document holds information that can serve as the file's title. For PDFs it is even simpler: we just hope that whoever created the PDF filled in its title property properly.

__Here, I present a simple code sequence that runs through all the files in the master folder and its subfolders (one level within the master folder), looks for Word documents, Notepad text files and PDFs, and renames them with relevant titles found inside each document.__
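
To make the "master folder plus one level of subfolders" idea concrete, here is a minimal sketch of just the search step. The helper name `iter_target_files` and the constant `TARGET_SUFFIXES` are made up for this illustration and are not part of the notebook's code; the sketch uses `pathlib` rather than the `os.chdir` approach used further down.

```python
from pathlib import Path

# Extensions the renamer cares about (PDFs, Word documents, Notepad files).
TARGET_SUFFIXES = {".pdf", ".docx", ".txt"}


def iter_target_files(master_folder):
    """Yield matching files in master_folder and in its immediate subfolders."""
    master = Path(master_folder)
    # "*" covers the master folder itself, "*/*" covers exactly one level below it.
    for pattern in ("*", "*/*"):
        for path in sorted(master.glob(pattern)):
            if path.is_file() and path.suffix.lower() in TARGET_SUFFIXES:
                yield path
```

Files that sit two or more levels below the master folder are never visited, which matches the one-level limit described above.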

## Object Oriented Approach

Here, the target is to rename PDFs, text files and Word documents. The modules and methods needed to read these file types differ, so each needs its own code, but the operating-system work (searching through subfolders, renaming, handling errors and so on) is common to all of them.
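
One way to picture this split is a base class that owns the shared work and small subclasses that only know how to read a title. The sketch below is an illustration under that assumption, not the structure used in the cells that follow: `BaseRenamer`, `TxtRenamer`, `extract_title` and `build_new_name` are hypothetical names, and title cleanup is left out to keep it short.

```python
import os


class BaseRenamer:
    """Hypothetical base class holding the work shared by every file type."""

    def extract_title(self, file_path):
        # Each file type overrides this with its own reading logic.
        raise NotImplementedError

    def build_new_name(self, file_path):
        # Shared step: append the extracted title to the existing file name.
        folder, old_name = os.path.split(file_path)
        stem, extension = os.path.splitext(old_name)
        title = self.extract_title(file_path).strip()
        new_name = f"{stem} {title}{extension}" if title else old_name
        return os.path.join(folder, new_name)


class TxtRenamer(BaseRenamer):
    """Reads the title from a chosen line of a plain-text file."""

    def __init__(self, title_line=1):
        self.title_line = title_line

    def extract_title(self, file_path):
        with open(file_path, encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if number == self.title_line:
                    return line
        # The file is shorter than the requested line number.
        return ""
```

A PDF or .docx variant would override only `extract_title`, while the naming logic stays in one place.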

## Here we go

This is going to be as concise as possible. For background, please refer to my GitHub profile, where I have a series of articles written in Jupyter notebooks that explain Python programming basics, from data types to loops, functions, file handling, exception handling and finally OOP.

GitHub Python Programming Repo Link (PyPro_ahhp): https://github.com/arvindhhp/PyPro_ahhp

In [None]:
#Importing all the necessary packages before we start
#Install the packages for PDF and Docs using pip install
#pip install python-docx
#pip install PyPDF2

import os
import PyPDF2
import docx
import re

In [None]:
#Defining the class that extracts a new name from a PDF's title property

class pdf_rename:

    def pdf_new_name(self, file_name):

        #PdfReader/metadata replace PdfFileReader/getDocumentInfo, which PyPDF2 3.0 removed
        with open(file_name,'rb') as pdfObject:
            pdf = PyPDF2.PdfReader(pdfObject)
            information = pdf.metadata

            #If the PDF has no metadata or no title, fall back to the current file name
            if information is not None and information.title:
                title_=information.title
            else:
                title_=os.path.splitext(file_name)[0]

        #Using RegEx to ensure the file name is alpha-numeric
        #Characters other than word characters and spaces are skipped
        new_name=''.join(re.findall(r'(\w| )',title_))

        return new_name

In [None]:
#Defining the class that extracts a new name from a chosen line of a .docx file

class doc_rename:
    
    def __init__(self):
        self.doc_title_line=int(input('Enter the line number of the Title for .docx Files : '))
    
    def doc_new_name(self, file_name, line_number):
        doc = docx.Document(file_name)
        title_ = str(doc.paragraphs[line_number-1].text)
        
        new_name=''
        reg_ex=re.findall(r'(\w| )',title_)
        
        for p in reg_ex:
            new_name+=p
            
        return new_name

In [None]:
#Defining the class that extracts a new name from a chosen line of a .txt file

class txt_rename:
    def __init__(self):
        self.txt_title_line=int(input('Enter the line number of the Title for .txt Files : '))
    
    def txt_new_name(self, file_name, line_number):
        lines_list=[]
        
        f=open(file_name)
        for line in f:
            lines_list.append(line)
        f.close()
        
        title_=lines_list[line_number-1]
        new_name=''
        reg_ex=re.findall(r'(\w| )',title_)
        
        for p in reg_ex:
            new_name+=p
            
        return new_name
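
Each of the classes above filters the title down to word characters and spaces before returning it. As a rough sketch, that shared step could live in one helper. The name `clean_title` and the `max_length` cap are assumptions made for this example; the notebook itself sets no length limit.

```python
import re


def clean_title(raw_title, max_length=100):
    """Keep word characters and spaces, collapse repeated spaces, cap the length."""
    # Drop everything that is neither a word character nor a space (this also removes newlines).
    kept = re.sub(r"[^\w ]", "", raw_title)
    # Collapse runs of spaces and trim the ends so the file name stays tidy.
    collapsed = re.sub(r" +", " ", kept).strip()
    # max_length is a hypothetical safety cap against overly long file names.
    return collapsed[:max_length].rstrip()
```

For example, `clean_title("Annual Report: 2020/21\n")` gives `"Annual Report 202021"`, because the colon, the slash and the newline are all dropped.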

In [None]:
#Instantiating the renaming objects for the different file types

pdf_obj=pdf_rename()
doc_obj=doc_rename()
txt_obj=txt_rename()

In [None]:
#Class to take care of directory traversal and renaming of the files found

class file_rename_process:

    #Getting the master folder directory path as input argument

    def __init__(self, master_path):
        #Store an absolute path so that later os.chdir calls do not break it
        self.master_dir_path=os.path.abspath(master_path)
        self.new_name=''
        self.rename(self.master_dir_path)
        self.subfolder_rename()

    #Method that renames the supported files directly inside the given folder

    def rename(self, path):
        os.chdir(path)

        for item in os.listdir():

            #Skip sub-folders; they are handled by subfolder_rename
            if not os.path.isfile(item):
                continue

            #Split off the extension so that only true file suffixes match
            base, ext = os.path.splitext(item)
            kind=ext.lower()

            try:
                if kind=='.pdf':
                    self.new_name=pdf_obj.pdf_new_name(item)
                elif kind=='.docx':
                    self.new_name=doc_obj.doc_new_name(item, doc_obj.doc_title_line)
                elif kind=='.txt':
                    self.new_name=txt_obj.txt_new_name(item, txt_obj.txt_title_line)
                else:
                    continue
                os.rename(item, base+' '+self.new_name+ext)
            except Exception as error:
                #Typical causes: unreadable file, missing title line, new name too long
                print(f'Could not rename {item}: {error}')

    #Method that applies rename() to each sub-folder one level within the master folder

    def subfolder_rename(self):

        for item in os.listdir(self.master_dir_path):

            #Only sub-folders are processed here; plain files are skipped
            sub_folder=os.path.join(self.master_dir_path, item)
            if os.path.isdir(sub_folder):
                self.rename(sub_folder)
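
The renaming step relies on `os.rename`, and one case worth keeping in mind is two files that end up with the same new name (for example, two PDFs sharing a title). On Windows `os.rename` raises an error if the target already exists, while on Unix-like systems it can silently replace the existing file. The sketch below shows one possible guard; `unique_target` is a hypothetical helper that is not used anywhere in the code above.

```python
import os


def unique_target(path):
    """Return path if it is free, else add ' (1)', ' (2)', ... before the extension."""
    if not os.path.exists(path):
        return path
    stem, extension = os.path.splitext(path)
    counter = 1
    # Keep counting until a name is found that no existing file uses.
    while os.path.exists(f"{stem} ({counter}){extension}"):
        counter += 1
    return f"{stem} ({counter}){extension}"
```

Passing the intended new name through such a helper before calling `os.rename` would keep an existing file from being overwritten.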

In [None]:
#File Renaming object instantiation

master_directory_path=input('Enter the Master Directory Path : ')
file=file_rename_process(master_directory_path)
