# Python Project

# Removing the Header from CSV Files

Say you have the boring job of removing the first line from several hundred
CSV files. Maybe you’ll be feeding them into an automated process that
requires just the data and not the headers at the top of the columns. You
could open each file in Excel, delete the first row, and resave the file—but
that would take hours.

A Python program can do the job for you in a few seconds.

The program will need to open every file with the .csv extension in the
current working directory, read in the contents of the CSV file, and write
the contents without the first row to a file of the same name in a
headerRemoved folder. The original files are left untouched.

For this project, open a new file editor window and save it as
removeCsvHeader.py.

We will do this project in three steps.

# Step 1: Loop Through Each CSV File

The first thing your program needs to do is loop over a list of all CSV filenames for the current working directory.

In [2]:
import csv, os
os.makedirs('headerRemoved', exist_ok=True)

In [10]:
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue # skip non-csv files
    print('Removing header from ' + csvFilename + '...')

Removing header from example.csv...


The os.makedirs() call will create a headerRemoved folder where all the headless CSV files will be written.
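As a small aside on why exist_ok=True is passed, the sketch below calls os.makedirs() twice on the same path inside a throwaway temporary folder. The names demo_dir and target are made up for this illustration and are not part of the project code.

```python
import os
import tempfile

# Work inside a temporary folder so nothing on disk is left behind.
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, 'headerRemoved')

    os.makedirs(target, exist_ok=True)  # creates the folder
    os.makedirs(target, exist_ok=True)  # folder already exists: no error
    created = os.path.isdir(target)

    try:
        os.makedirs(target)             # default is exist_ok=False
        raised = False
    except FileExistsError:
        raised = True                   # re-creating without exist_ok fails

print(created, raised)
```

With exist_ok=True, rerunning the program should not fail just because the folder is left over from an earlier run.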

A for loop on os.listdir('.') gets you partway
there, but it will loop over all files in the working directory, so you’ll need
to add some code at the start of the loop that skips filenames that don’t end
with .csv. 

The continue statement makes the for loop move on to the next
filename when it comes across a non-CSV file.

Finally, print out a message
saying which CSV file the program is working on. Then, add some TODO comments for what the rest of the program should do.
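The filtering step can be tried on its own, away from your real files. The sketch below wraps it in a hypothetical helper, list_csv_filenames(), and runs it against a temporary folder holding three empty, made-up files. Note that endswith('.csv') is case sensitive, so a file named DATA.CSV would be skipped.

```python
import os
import tempfile

def list_csv_filenames(folder):
    """Return the sorted names of entries in folder that end with .csv."""
    names = []
    for name in os.listdir(folder):
        if not name.endswith('.csv'):
            continue  # skip non-CSV files, same test as in the loop above
        names.append(name)
    return sorted(names)

# Demo: create three empty files in a throwaway folder and filter them.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('a.csv', 'notes.txt', 'b.csv'):
        open(os.path.join(demo_dir, name), 'w').close()
    found = list_csv_filenames(demo_dir)

print(found)
```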

# Step 2: Read in the CSV File

In [None]:
# Read the CSV file in (skipping first row).

csvRows = []
csvFileObj = open(csvFilename)
readerObj = csv.reader(csvFileObj)
for row in readerObj:
    if readerObj.line_num == 1:
        continue # skip first row
    csvRows.append(row)
csvFileObj.close()

The Reader object’s line_num attribute can be used to determine which
line in the CSV file it is currently reading. Another for loop will loop over
the rows returned from the CSV Reader object, and all rows but the first will
be appended to csvRows.

As the for loop iterates over each row, the code checks whether
readerObj.line_num is set to 1. If so, it executes a continue to move on to the
next row without appending it to csvRows. For every row afterward, the
condition will always be False, and the row will be appended to csvRows.
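To see line_num in action without touching a file, here is a minimal sketch that reads made-up CSV text from an io.StringIO object. It also shows one common alternative, calling next() on the reader to discard the first record. The variable names and sample data are assumptions for illustration only.

```python
import csv
import io

# Made-up sample data: one header line and two data lines.
sample = io.StringIO('name,qty\napples,3\npears,5\n')

# Approach used in this project: skip the row read while line_num is 1.
reader = csv.reader(sample)
kept_rows = []
for row in reader:
    if reader.line_num == 1:
        continue  # header row
    kept_rows.append(row)

# Alternative: discard the first record with next(), then keep the rest.
sample.seek(0)
reader2 = csv.reader(sample)
next(reader2, None)  # the None default avoids StopIteration on an empty file
rows_via_next = list(reader2)

print(kept_rows)
print(rows_via_next)
```

Both approaches give the same rows here. One difference to keep in mind: line_num counts lines read from the source, not records, so the two can diverge if a quoted header field itself contains a line break.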

# Step 3: Write Out the CSV File Without the First Row

Now that csvRows contains all rows but the first row, the list needs to be
written out to a CSV file in the headerRemoved folder. Add the following
to your code:

In [15]:
# Write out the CSV file.
csvFileObj = open(os.path.join('headerRemoved', csvFilename), 'w',
                  newline='')
csvWriter = csv.writer(csvFileObj)
for row in csvRows:
    csvWriter.writerow(row)
csvFileObj.close()

The CSV Writer object will write the list to a CSV file in headerRemoved
using csvFilename (which we also used in the CSV reader). Because the new file goes into the headerRemoved folder, the original file is not overwritten.

After the code is executed, the outer for loop will loop to the next
filename from os.listdir('.'). When that loop is finished, the program will
be complete.
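The writing step can also be sketched with with statements, which close the file even if an error occurs, and with writerows(), which writes a whole list of rows in one call. This is a variation on the code above, not a replacement for it. The rows list and the temporary folder are made up for the demonstration.

```python
import csv
import os
import tempfile

# Made-up data standing in for csvRows.
rows = [['apples', '3'], ['pears', '5']]

with tempfile.TemporaryDirectory() as demo_dir:
    out_dir = os.path.join(demo_dir, 'headerRemoved')
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, 'example.csv')

    # newline='' lets the csv module control line endings itself.
    with open(out_path, 'w', newline='') as out_file:
        csv.writer(out_file).writerows(rows)

    # Read the file back to confirm what was written.
    with open(out_path, newline='') as in_file:
        round_trip = list(csv.reader(in_file))

print(round_trip)
```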

# Run the Code Below in the Directory That Contains Your CSV Files

In [16]:
# The complete program

import csv, os
os.makedirs('headerRemoved', exist_ok=True)
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue # skip non-csv files
    print('Removing header from ' + csvFilename + '...')

    # Read the CSV file in (skipping first row).

    csvRows = []
    csvFileObj = open(csvFilename)
    readerObj = csv.reader(csvFileObj)
    for row in readerObj:
        if readerObj.line_num == 1:
            continue # skip first row
        csvRows.append(row)
    csvFileObj.close()

    # Write out the CSV file.

    csvFileObj = open(os.path.join('headerRemoved', csvFilename), 'w', newline='')
    csvWriter = csv.writer(csvFileObj)
    for row in csvRows:
        csvWriter.writerow(row)
    csvFileObj.close()

Removing header from example.csv...
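If you would like to check the result before running the program on real data, one option is to wrap the same logic in a function and try it on a throwaway folder. The sketch below does that with a hypothetical remove_csv_headers() function and a made-up example.csv. It uses next() to skip the header rather than line_num, which is a substitution on my part and not the approach taken in the program above.

```python
import csv
import os
import tempfile

def remove_csv_headers(folder):
    """Write a headerless copy of each .csv file in folder to folder/headerRemoved."""
    out_dir = os.path.join(folder, 'headerRemoved')
    os.makedirs(out_dir, exist_ok=True)
    for name in os.listdir(folder):
        if not name.endswith('.csv'):
            continue  # skip non-CSV entries, including the headerRemoved folder
        with open(os.path.join(folder, name), newline='') as in_file:
            reader = csv.reader(in_file)
            next(reader, None)  # discard the header row if there is one
            rows = list(reader)
        with open(os.path.join(out_dir, name), 'w', newline='') as out_file:
            csv.writer(out_file).writerows(rows)

# Demo on a temporary folder containing one made-up CSV file.
with tempfile.TemporaryDirectory() as demo_dir:
    src = os.path.join(demo_dir, 'example.csv')
    with open(src, 'w', newline='') as f:
        f.write('name,qty\r\napples,3\r\n')

    remove_csv_headers(demo_dir)

    with open(src, newline='') as f:
        original_text = f.read()
    with open(os.path.join(demo_dir, 'headerRemoved', 'example.csv'), newline='') as f:
        headless_text = f.read()

print(repr(original_text))
print(repr(headless_text))
```

The original file still contains its header, while the copy in headerRemoved holds only the data row.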


# Credits: Al Sweigart's Automate the Boring Stuff with Python