We have a CSV file with two columns, longitude and latitude. Each coordinate pair marks the center of a volcano somewhere in the world, and the dataset contains 1,509 volcanoes. The original coordinate reference system is geographic (longitude and latitude in degrees) on the WGS84 datum. We want to transform these points to World Mercator. Transforming each coordinate by hand, as we did in the earlier notebooks, would take far too long, so the code in this notebook reads the CSV file and writes a new CSV file containing the projected coordinates.
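To make concrete what transforming a single point by hand involves, here is a minimal sketch of the forward ellipsoidal Mercator formula on the WGS84 ellipsoid. It assumes a central meridian of 0 degrees, a scale factor of 1 and no false easting or northing, which is how World Mercator is usually defined. The function name `world_mercator` and the sample coordinate are illustrative and do not come from the dataset. In practice pyproj does this work, so treat the sketch as an explanation rather than a replacement.

```python
import math

# WGS84 ellipsoid parameters
WGS84_A = 6378137.0                           # semi-major axis in meters
WGS84_F = 1 / 298.257223563                   # flattening
WGS84_E = math.sqrt(WGS84_F * (2 - WGS84_F))  # first eccentricity


def world_mercator(lon_deg, lat_deg):
    """Forward ellipsoidal Mercator: degrees in, meters out (illustrative helper)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    e_sin = WGS84_E * math.sin(phi)
    # x grows linearly with longitude
    x = WGS84_A * lam
    # y is the isometric latitude scaled by the semi-major axis
    y = WGS84_A * math.log(
        math.tan(math.pi / 4 + phi / 2)
        * ((1 - e_sin) / (1 + e_sin)) ** (WGS84_E / 2)
    )
    return x, y


# Illustrative longitude/latitude pair (an assumption, not a row from the file)
x, y = world_mercator(-155.287, 19.421)
print(round(x), round(y))
```

The formula is undefined at the poles, where the tangent term goes to zero or infinity, so latitudes of exactly 90 or -90 degrees would need separate handling.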

Check that in_path and out_path point to the directory that contains the CSV file. In this example, the volcanoes file (volc_longlat.csv) is in the directory ch2-5 on the user's Desktop. Run the code. The process is finished when the message "process completed" and the execution time are printed:

In [1]:
import csv
import pyproj
from functools import partial
from os import listdir, path

# time the execution of the code
import time
start_time = time.time()


# Silence warnings (pyproj 2+ flags Proj(init=...) and pyproj.transform as deprecated)
import warnings
warnings.simplefilter('ignore')

#Define some constants at the top

lon = 'LONGITUDE' #name of longitude field in original files
lat = 'LATITUDE' #name of latitude field in original files
f_x = 'x' #name of new x value field in new projected files
f_y = 'y' #name of new y value field in new projected files
in_path = u'/Users/nestor/Desktop/ch2-5' #input directory
out_path = u'/Users/nestor/Desktop/ch2-5' #output directory
input_projection = 'EPSG:4326' #WGS84
output_projection = 'EPSG:3395' #World Mercator

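#Get CSVs to reproject from input path, skipping files this script
#already wrote (in_path and out_path may be the same directory)
files = [f for f in listdir(in_path)
         if f.endswith('.csv') and not f.endswith('_project.csv')]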

#Define partial function for use later when reprojecting
project = partial(
    pyproj.transform,
    pyproj.Proj(init=input_projection),
    pyproj.Proj(init=output_projection))

for csvfile in files:
    #open a writer, appending '_project' onto the base name
    with open(path.join(out_path, csvfile.replace('.csv','_project.csv')), 'w', newline='') as w:
        #open the reader
        with open(path.join(in_path, csvfile), 'r', newline='') as r:
            reader = csv.DictReader(r, dialect='excel')
            #Create new fieldnames list from reader
            # replacing lon and lat fields with 
            # x and y fields
            fn = [x for x in reader.fieldnames]
            fn[fn.index(lon)] = f_x
            fn[fn.index(lat)] = f_y
            writer = csv.DictWriter(w, fieldnames=fn)
            #Write the output
            writer.writeheader()
            for row in reader:
                try:
                    #Parse the coordinates, then project the point
                    x, y = float(row[lon]), float(row[lat])
                    #Add x,y keys and remove lon, lat keys
                    row[f_x], row[f_y] = project(x, y)
                    row.pop(lon, None)
                    row.pop(lat, None)
                    writer.writerow(row)
                except Exception as e:
                    #If a value cannot be parsed or projected,
                    #skip the row and print the error
                    print(e)
print('process completed')
end_time = time.time()
print("it took {} seconds to run the code".format(end_time-start_time))

process completed
it took 55.04537224769592 seconds to run the code
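The row loop above can also be written as a small function that takes the projection as a parameter, which makes it easy to try on a few in-memory rows before running it on a whole directory. The sketch below is one possible arrangement, not the notebook's own code. The names `reproject_csv` and `placeholder_project` are hypothetical, and the placeholder only scales the numbers so that the example runs without pyproj.

```python
import csv
import io


def reproject_csv(src, dst, project, lon="LONGITUDE", lat="LATITUDE",
                  f_x="x", f_y="y"):
    """Copy rows from src to dst, replacing the lon/lat fields with x/y.

    `project` is any callable that takes (lon, lat) and returns (x, y).
    Returns the number of rows skipped because they could not be parsed.
    """
    reader = csv.DictReader(src)
    # Keep the column order, swapping only the two coordinate names
    fieldnames = [f_x if name == lon else f_y if name == lat else name
                  for name in reader.fieldnames]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    skipped = 0
    for row in reader:
        try:
            x, y = project(float(row.pop(lon)), float(row.pop(lat)))
        except (KeyError, TypeError, ValueError) as err:
            print("skipping row:", err)
            skipped += 1
            continue
        row[f_x], row[f_y] = x, y
        writer.writerow(row)
    return skipped


def placeholder_project(lon, lat):
    # Stand-in for a real projection (assumption: for demonstration only)
    return lon * 1000.0, lat * 1000.0


# Two made-up rows; the second cannot be parsed and is skipped
src = io.StringIO("LONGITUDE,LATITUDE\n10.5,45.25\nnot-a-number,3.0\n")
dst = io.StringIO()
skipped = reproject_csv(src, dst, placeholder_project)
print(dst.getvalue())
print("rows skipped:", skipped)
```

With pyproj 2 or later, one callable that should fit the `project` argument is `pyproj.Transformer.from_crs("EPSG:4326", "EPSG:3395", always_xy=True).transform`. It builds the transformation once and reuses it for every row, which may shorten the run time noticeably compared with calling `pyproj.transform` per point.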


This run took about 55 seconds. Open the newly created CSV file (the input name with _project appended, for example volc_longlat_project.csv) and notice that the coordinates are now listed in meters. The EPSG code of the output coordinate reference system is set in output_projection; you can change this variable to another EPSG code and rerun the script. Note that the code is written so that every input CSV file in the directory undergoes the coordinate transformation.
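One quick way to check the new file is to look at the extent of the x and y columns. In World Mercator, x should stay within roughly ±20,037,508 m (π times the WGS84 semi-major axis), so values far outside that range, or non-finite values, suggest a problem. The helper `xy_extent` below is a hypothetical sketch, and the sample rows are made up for illustration rather than taken from the volcano file.

```python
import csv
import io
import math


def xy_extent(csv_file, f_x="x", f_y="y"):
    """Return (row_count, min_x, max_x, min_y, max_y) for a projected CSV.

    Rows whose coordinates are not finite (inf or nan) are left out.
    """
    xs, ys = [], []
    for row in csv.DictReader(csv_file):
        x, y = float(row[f_x]), float(row[f_y])
        if math.isfinite(x) and math.isfinite(y):
            xs.append(x)
            ys.append(y)
    if not xs:
        raise ValueError("no finite coordinates found")
    return len(xs), min(xs), max(xs), min(ys), max(ys)


# Made-up sample standing in for a file such as volc_longlat_project.csv
sample = io.StringIO(
    "x,y\n-17286000.0,2190000.0\n1670000.0,4570000.0\ninf,inf\n"
)
count, min_x, max_x, min_y, max_y = xy_extent(sample)
print(count, min_x, max_x, min_y, max_y)
```

To run the same check on a real file, open it with `open(file_path, newline='')` and pass the file object to `xy_extent`.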