Python is a popular interpreted and dynamically typed programming language for building web services, desktop apps, automation scripts, and machine learning projects. Programmers often have to access the operating system’s file system when they work with Python-based software projects.

For example, we use text files as inputs, write text files as outputs, and process binary files often. Like any other popular, general-purpose programming language, Python also offers cross-platform file handling features. Python provides file handling features via several inbuilt functions and standard modules.

In this article, I will explain everything you need to know about Python file handling, including:

* Reading files
* Writing files
* Reading file attributes
* Creating new Python directories
* Reading Python directory contents
* Removing files or directories
* Performing file searches
* Processing binary files
* Creating and extracting data from Python archives
* Copying and moving files
* Best practices

# Write in File 

In [1]:
file = open("demofile.txt","w")
file.write(" Hello this file is created for learning purpose of file handling.")
file.close() # this will clean up resource after writing in file 

# Read File

In [2]:
file = open("demofile.txt","r")
print(file.read())

 Hello this file is created for learning purpose of file handling.


## Read a fixed number of characters

In [3]:
file = open("demofile.txt","r")
print(file.read(12))
print(file.read(8)) # in text mode, the argument is the number of characters to read

 Hello this 
file is 
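
In text mode the argument to read counts characters, while in binary mode it counts bytes. The following is a minimal sketch of that difference. The file name `chars_vs_bytes.txt` and the sample text are made up for illustration, and the file is created in a temporary directory so nothing in the working directory is touched.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "chars_vs_bytes.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("héllo wörld")  # 'é' and 'ö' each take two bytes in UTF-8

# Text mode: read(5) returns five characters.
with open(path, "r", encoding="utf-8") as f:
    text_chunk = f.read(5)

# Binary mode: read(5) returns five bytes, which is fewer characters here.
with open(path, "rb") as f:
    byte_chunk = f.read(5)

print(repr(text_chunk))
print(byte_chunk)
```

For plain ASCII text, such as the contents of demofile.txt, both counts are the same.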


## Seek 

In [4]:
file = open("demofile.txt","r")
print(file.read(28))
file.seek(0) # this will reset the cursor back to the beginning of the file
print(file.read(28))

 Hello this file is created 
 Hello this file is created 


# Append

In [5]:
file = open("demofile.txt","a")
file.write("my name is rani rathore.") # returns the number of characters written


24

The Python interpreter raises a SyntaxError if the quotation mark at the end of a string literal is missing (older versions report it as "EOL while scanning string literal"). We can easily fix this problem by making sure the closing quotation marks are in place.
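
As a small illustration of that error, the sketch below compiles a piece of source code whose closing quotation mark is missing. The broken code is kept inside a string (the hypothetical `broken_source`) so that the example itself remains valid Python. The exact wording of the message depends on the Python version.

```python
# Source code with a missing closing quotation mark, stored as a string.
broken_source = 'message = "hello'

caught = None
try:
    compile(broken_source, "<example>", "exec")
except SyntaxError as err:
    caught = err
    print("SyntaxError:", err.msg)
```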

# writelines

In [42]:
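file = open("demofile.txt","a")
# writelines expects an iterable of strings; a single string also works because it is written character by character
file.writelines("\n my name is rani rathore.\n my father name is dilip singh rathore. \n i am good girl, with very beautifull heart. \n i am in love with coding.")
file.close()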
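
The writelines method is usually called with a list (or another iterable) of strings, and it does not add line separators on its own. Here is a minimal sketch under those assumptions. The file name `lines_demo.txt` and the sample lines are hypothetical.

```python
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")

lines = ["first line", "second line", "third line"]

with open(path, "w") as f:
    # writelines does not append newlines, so add them explicitly.
    f.writelines(line + "\n" for line in lines)

with open(path) as f:
    read_back = f.read().splitlines()

print(read_back)
```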
 

# enumerate

In [8]:
file = open("demofile.txt","r")
for i, line in enumerate(file.readlines()):
    print(i, line)

0  Hello this file is created for learning purpose of file handling.my name is rani rathore.

1  my name is rani rathore.

2  my father name is dilip singh rathore. 

3  i am good girl, with very beautifull heart. 

4  i am in love with coding.
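
The blank lines in the output above appear because each line still carries its trailing newline and print adds another one. A possible variation, sketched below with a hypothetical file `numbered_demo.txt`, iterates over the file object directly, starts counting at 1, and strips the newline before printing.

```python
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "numbered_demo.txt")

with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

numbered = []
with open(path) as f:
    # A file object is iterable line by line, so readlines() is optional.
    for number, line in enumerate(f, start=1):
        clean = line.rstrip("\n")
        numbered.append((number, clean))
        print(number, clean)
```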


# Reading file attributes in Python

The os.stat function returns a file's metadata: st_size gives the file size in bytes, st_atime the last access timestamp, and st_mtime the last modification timestamp.

In [9]:
import os 

In [13]:
stat = os.stat("demofile.txt")
print(type(stat.st_size))

<class 'int'>


In [15]:
size = os.path.getsize("demofile.txt")
print(size)

234
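
The timestamps returned by os.stat are seconds since the epoch, so they are easier to read after converting them to datetime objects. The sketch below shows one way to do that. The file `stat_demo.txt` is a hypothetical example created in a temporary directory.

```python
import os
import tempfile
from datetime import datetime

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "stat_demo.txt")
with open(path, "w") as f:
    f.write("12345")

info = os.stat(path)
size_in_bytes = info.st_size
modified = datetime.fromtimestamp(info.st_mtime)  # last modification time
accessed = datetime.fromtimestamp(info.st_atime)  # last access time

print(size_in_bytes)
print("modified:", modified.isoformat())
print("accessed:", accessed.isoformat())
```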


# creating directory in python

In [59]:
os.mkdir("MyFolder") # this will create new folder in current directory 

In [60]:
# os.mkdir cannot create nested directories in one call; it fails if a parent directory is missing. For that we need os.makedirs:

In [61]:
os.makedirs("MyFolder/sampleDirectory")
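
To make the difference concrete, the sketch below tries both functions on a nested path whose parent does not exist yet. It runs inside a temporary directory, and it also uses the `exist_ok=True` argument of os.makedirs, which avoids an error when the directory is already there.

```python
import os
import tempfile

base = tempfile.mkdtemp()
nested = os.path.join(base, "MyFolder", "sampleDirectory")

# os.mkdir fails because the parent "MyFolder" does not exist yet.
try:
    os.mkdir(nested)
except FileNotFoundError:
    mkdir_failed = True
else:
    mkdir_failed = False

# os.makedirs creates the missing parent directories as well.
os.makedirs(nested)

# Calling it again would raise FileExistsError unless exist_ok=True is passed.
os.makedirs(nested, exist_ok=True)

print(mkdir_failed, os.path.isdir(nested))
```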

# Reading Python directory Contents

In [20]:
currdir = os.getcwd() # current working directory
currdir

'C:\\Users\\rani\\30_Days_of_Data_Science'

In [29]:
entries = os.listdir(currdir)
entries

[' Time_Module.ipynb',
 '.ipynb_checkpoints',
 'demofile.txt',
 'File_Handling.ipynb',
 'MyFolder']

In [30]:
for i in entries:
    print(i)

 Time_Module.ipynb
.ipynb_checkpoints
demofile.txt
File_Handling.ipynb
MyFolder
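
The os.listdir function returns names only, so it does not tell files and directories apart. One option, sketched below, is os.scandir, whose entries expose an is_dir method. The directory layout is a hypothetical one built in a temporary directory.

```python
import os
import tempfile

# Build a small hypothetical layout: one folder and one file.
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "MyFolder"))
with open(os.path.join(base, "demofile.txt"), "w") as f:
    f.write("demo")

with os.scandir(base) as entries:
    listing = sorted(
        (entry.name, "dir" if entry.is_dir() else "file") for entry in entries
    )

for name, kind in listing:
    print(kind, name)
```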


In [32]:
for root, sub_dirs, files in os.walk(os.getcwd()):
    relative_root = os.path.relpath(root)
    print(relative_root)
    print("____________________________________")
    for j in sub_dirs + files:
        print(j)

.
____________________________________
.ipynb_checkpoints
MyFolder
 Time_Module.ipynb
demofile.txt
File_Handling.ipynb
.ipynb_checkpoints
____________________________________
 Time_Module-checkpoint.ipynb
File_Handling-checkpoint.ipynb
MyFolder
____________________________________
sampleDirectory
MyFolder\sampleDirectory
____________________________________


# Removing Files or Directories in Python

In [None]:
import os

file_to_remove = "demofile.txt"

if os.path.exists(file_to_remove):
    os.remove(file_to_remove)
else:
    print("no such file")

In [45]:
import os

dir_to_remove = "MyFolder/"

if os.path.exists(dir_to_remove):
    os.rmdir(dir_to_remove) # removes the directory only if it is empty
else:
    print("%s doesn't exist!" % dir_to_remove)

MyFolder/ doesn't exist!


In [39]:
import os, shutil

dir_to_remove = "MyFolder/"

if os.path.exists(dir_to_remove):
    shutil.rmtree(dir_to_remove) # Recursively remove all entries
else:
    print("%s doesn't exist!" % dir_to_remove)
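
The difference between os.rmdir and shutil.rmtree shows up when the directory is not empty. The sketch below demonstrates this inside a temporary directory, so no real data is deleted. The folder names mirror the ones used earlier but are otherwise arbitrary.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "MyFolder")
os.makedirs(os.path.join(target, "sampleDirectory"))

# os.rmdir refuses to remove a directory that still has entries.
try:
    os.rmdir(target)
except OSError:
    rmdir_failed = True
else:
    rmdir_failed = False

# shutil.rmtree removes the directory together with everything inside it.
shutil.rmtree(target)

print(rmdir_failed, os.path.exists(target))
```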

# Performing File Searches in Python

We can search for files in several ways:

* Finding all entries with the os.listdir function and checking each entry with an if condition inside a for loop
* Finding all entries recursively with the os.walk function and validating each entry with an if condition inside a for loop
* Querying all entries with the glob.glob function and obtaining only the entries you need


Overall, the third approach is best for most scenarios because it has inbuilt filtering support, very good performance, and requires minimal code from the developer’s end (more Pythonic). Let’s implement a file search with the Python glob module.

In [53]:
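import glob
# The query is a glob pattern, so it may contain wildcards such as * and ?.
# recursive=True only has an effect when the pattern contains "**".
query = "C:/Users/rani/Jupyter_Notebook_Python_Code/ Day_29_Pipeline_part_1.ipynb"
entries = glob.glob(query,recursive=True)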
no_of_entries  = len(entries)
if no_of_entries==0:
    print("no result")
else:
    print(no_of_entries,query)
    
for i  in entries:
    print(i)
    

1 C:/Users/rani/Jupyter_Notebook_Python_Code/ Day_29_Pipeline_part_1.ipynb
C:/Users/rani/Jupyter_Notebook_Python_Code/ Day_29_Pipeline_part_1.ipynb
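
The query above names a single file, so it does not use wildcards. As a sketch of a wildcard search, the example below builds a small hypothetical directory tree in a temporary directory and uses the `**` pattern together with `recursive=True` to find every .txt file at any depth.

```python
import glob
import os
import tempfile

# Build a hypothetical tree with text files at three levels and one CSV file.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "notes", "archive"))
sample_files = [
    "a.txt",
    os.path.join("notes", "b.txt"),
    os.path.join("notes", "archive", "c.txt"),
    os.path.join("notes", "d.csv"),
]
for rel_path in sample_files:
    with open(os.path.join(base, rel_path), "w") as f:
        f.write("x")

# "**" matches zero or more directory levels when recursive=True.
pattern = os.path.join(base, "**", "*.txt")
matches = sorted(glob.glob(pattern, recursive=True))
found_names = sorted(os.path.basename(m) for m in matches)

print(found_names)
```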


# Processing binary files

In [55]:
my_binary_file = open("my_file.bin","wb")
data = bytearray([80 ,121, 116 ,104, 111, 110]) # ASCII codes for "Python"; the name bytes would shadow the built-in type
my_binary_file.write(data)
my_binary_file.close()

In [56]:
my_binary_file = open("my_file.bin","rb")
data = my_binary_file.read() # returns a bytes object
my_binary_file.close()
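
The cells above write and read the bytes but do not show the result. The following sketch repeats the round trip with the with syntax and decodes the bytes back into text. It writes to a temporary directory, and the variable names are chosen for illustration only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "my_file.bin")
payload = bytes([80, 121, 116, 104, 111, 110])

with open(path, "wb") as f:
    written = f.write(payload)  # number of bytes written

with open(path, "rb") as f:
    data = f.read()

decoded = data.decode("ascii")
print(written, data, list(data), decoded)
```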


# Creating and extracting from Python archives

In [62]:
import shutil

output_file = "myArchive"
input_dir = "MyFolder"

shutil.make_archive(output_file,"zip",input_dir)

'C:\\Users\\rani\\30_Days_of_Data_Science\\myArchive.zip'

Extracting the contents of the zip archive into a new directory:

In [63]:
input_file = "myArchive.zip"
output_dir = "My_new_Folder"
shutil.unpack_archive(input_file,output_dir)
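
To check that the archive really contains what was packed, the sketch below runs the full round trip in a temporary directory: it creates a folder with one file, archives it, extracts it elsewhere, and lists the result. The file `note.txt` is a hypothetical example.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()

# Hypothetical input folder with a single file.
input_dir = os.path.join(base, "MyFolder")
os.mkdir(input_dir)
with open(os.path.join(input_dir, "note.txt"), "w") as f:
    f.write("archived")

# make_archive adds the ".zip" extension and returns the archive path.
archive_path = shutil.make_archive(os.path.join(base, "myArchive"), "zip", input_dir)

# unpack_archive creates the output directory if it does not exist.
output_dir = os.path.join(base, "My_new_Folder")
shutil.unpack_archive(archive_path, output_dir)

extracted = sorted(os.listdir(output_dir))
print(os.path.basename(archive_path), extracted)
```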

# Copying and moving files

In [66]:
shutil.copy("C:/Users/rani/Jupyter_Notebook_Python_Code/ E_D_A.ipynb","copy_eda.ipynb")

'copy_eda.ipynb'

In [67]:
shutil.move("copy_eda.ipynb","MyFolder/")

'MyFolder/copy_eda.ipynb'

In [69]:
shutil.copytree("MyFolder/","My_new_Fold/")

'My_new_Fold/'
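
The three calls above depend on files that exist only on the author's machine. A self-contained sketch of the same sequence is shown below. It works in a temporary directory, and the file and folder names are hypothetical.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
source = os.path.join(base, "report.txt")
with open(source, "w") as f:
    f.write("data")

# copy keeps the source and returns the path of the new file.
copy_path = shutil.copy(source, os.path.join(base, "report_copy.txt"))

# move into an existing directory keeps the file name and returns the new path.
folder = os.path.join(base, "MyFolder")
os.mkdir(folder)
moved_path = shutil.move(copy_path, folder)

# copytree copies a whole directory; the destination must not exist yet.
tree_copy = shutil.copytree(folder, os.path.join(base, "MyFolder_copy"))

print(os.path.basename(moved_path), os.listdir(tree_copy))
```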

# Best Practices

Programmers follow various coding practices. Similarly, Python programmers also follow different coding practices when they handle files.

For example, some programmers use a try-finally block and close file handlers manually. Some programmers let the garbage collector close the file handler by omitting the close method call, which is not a good practice. Meanwhile, other programmers use the with syntax to work with file handlers.
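
The try-finally style and the with style can be compared side by side. The sketch below reads the same hypothetical file (`styles_demo.txt`, created in a temporary directory) both ways and checks that each handler ends up closed.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "styles_demo.txt")
with open(path, "w") as f:
    f.write("hello")

# Style 1: try-finally with a manual close call.
handle = open(path)
try:
    manual_content = handle.read()
finally:
    handle.close()

# Style 2: the with syntax closes the file when the block ends.
with open(path) as managed:
    with_content = managed.read()

print(handle.closed, managed.closed)
```

Both styles close the file even if reading raises an exception. The with version simply needs less code.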

In this section, I will summarize some best practices for file handling in Python. First, look at the following code that follows file handling best practices.


In [71]:
def print_file_content(filename):
    with open(filename) as f:
        content = f.read()
        print(content)

file_to_read = "File_Handling.ipynb"

try:
    print_file_content(file_to_read)
except OSError:
    print("unable to open file")
else:
    print("successfully printed %s" % file_to_read)

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "03fab350",
   "metadata": {},
   "source": [
    "# Write in File "
   ]
  },
  ...


Here, we used the with keyword to implicitly close the file handler. Also, we handle possible exceptions with a try-except block. While you are working with Python file handling, make sure that your code follows these points.

* Never ignore exceptions, especially in long-running Python processes. It’s okay to leave exceptions unhandled in simple utility scripts, because an unhandled exception stops the script from continuing further
* If you are not using the with syntax, make sure to close opened file handlers properly. The Python garbage collector will eventually clean up unclosed file handlers, but it’s always better to close a file handler in our code to avoid unwanted resource usage
* Use one file handling syntax consistently across your codebase. For example, if you use the with keyword for handling files, use the same syntax everywhere you handle files
* Avoid reopening the same file when you need to both read and write it. Instead, use the flush and seek methods, as shown below:


In [72]:
def process_file(file_name):
    with open(file_name, "w+") as file:
        file.write("create a new file or if already exits then write into that file")
        print("cursor position ", file.tell())

        # write any buffered data to the file
        file.flush()

        # set the cursor to the beginning
        file.seek(0)
        print("cursor position ", file.tell())

        # print the content
        content = file.read()
        print(content)
        print("cursor position ", file.tell())

file_to_read = "file.txt"

try:
    process_file(file_to_read)
except OSError:
    print("Unable to process file %s " % file_to_read)
else:
    print("Successfully processed %s" % file_to_read)
    
        

cursor position  63
cursor position  0
create a new file or if already exits then write into that file
cursor position  63
Successfully processed file.txt


The above code first saves a string to the file and then reads the newly added content back through the same handle. The flush method writes the data held in the in-memory buffer to the file, so the content is on disk before we read it. We also need the seek(0) call to move the cursor back to the beginning, because the write method leaves it at the end.
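
The role of seek(0) can be seen by reading once before and once after moving the cursor. The following sketch does that with a hypothetical file `seek_demo.txt` in a temporary directory. The first read starts at the end of the file and therefore returns an empty string.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")

with open(path, "w+") as f:
    f.write("hello")
    position_after_write = f.tell()  # the cursor sits at the end after writing

    without_seek = f.read()  # nothing left to read from the end of the file

    f.seek(0)  # move the cursor back to the beginning
    with_seek = f.read()

print(position_after_write, repr(without_seek), repr(with_seek))
```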