We have a CSV file with two columns: longitude and latitude. Each coordinate pair marks the center of a volcano somewhere in the world, and the dataset contains 1,509 volcanoes. The original coordinate reference system is geographic (longitude/latitude) on the WGS84 datum. We want to transform these points to World Mercator. Transforming that many coordinates by hand, as we did in the previous notebooks, would take far too long. Instead, our new code reads the CSV file and writes the transformed coordinates to a new CSV file.
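
To make the transformation itself less of a black box, the sketch below implements the standard forward equations of the ellipsoidal Mercator projection on the WGS84 ellipsoid, using only the standard library. It is meant for spot-checking a value by hand, not as a replacement for pyproj. The function name `mercator_forward` is hypothetical, and the sketch assumes a central meridian of 0 with no false easting or northing, which is how World Mercator (EPSG:3395) is commonly defined.

```python
import math

# WGS84 ellipsoid constants
WGS84_A = 6378137.0                           # semi-major axis in meters
WGS84_F = 1 / 298.257223563                   # flattening
WGS84_E = math.sqrt(WGS84_F * (2 - WGS84_F))  # first eccentricity


def mercator_forward(lon_deg, lat_deg):
    """Project lon/lat in degrees to ellipsoidal Mercator x/y in meters.

    Hypothetical helper for illustration; assumes central meridian 0
    and no false easting/northing.
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    e_sin = WGS84_E * math.sin(phi)
    x = WGS84_A * lam
    y = WGS84_A * math.log(
        math.tan(math.pi / 4 + phi / 2)
        * ((1 - e_sin) / (1 + e_sin)) ** (WGS84_E / 2)
    )
    return x, y


# Arbitrary sample point, not taken from the dataset
x, y = mercator_forward(-155.6, 19.48)
print(x, y)
```

The result is in meters, which is why the output file of the script holds large numbers rather than degrees. Note that y grows without bound as latitude approaches the poles, so points at exactly ±90° cannot be projected.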

Check that the src_dir and dst_dir variables match the directory that holds the CSV file. In this example, the volcanoes file (volc_longlat.csv) is in the directory data/ch2-5. Run the code. You will know the process has finished when the message "process completed" and the execution time are printed:

In [1]:
# New API
# Import libraries
import csv, time
from os import path
from pyproj import Transformer, CRS

src_file = 'volc_longlat.csv'
dst_file = 'volc_projected.csv'

src_dir = path.abspath('../data/ch2-5') # input directory
dst_dir = path.abspath('../data/ch2-5') # output directory

src_path = path.join(src_dir, src_file)
dst_path = path.join(dst_dir, dst_file)

src_crs = CRS("EPSG:4326") #WGS84
dst_crs = CRS("EPSG:3395") #World Mercator

# create coordinate transformer
# always_xy=True makes projector.transform() accept lon, lat (GIS order) instead of lat, lon
# for more info see the doc https://pyproj4.github.io/pyproj/stable/api/transformer.html?highlight=transformer#pyproj.transformer.Transformer.from_crs
projector = Transformer.from_crs(src_crs, dst_crs, always_xy=True)

# source csv file has lon, lat columns
src_header = ['LONGITUDE', 'LATITUDE']

# destination csv file will have x, y columns
dst_header = ['x', 'y']

# start benchmark timer
start_time = time.time()

# open destination file in write mode
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open(dst_path, 'w', newline='') as w:
    # open source file in read mode
    with open(src_path, 'r', newline='') as r:
        reader = csv.reader(r, dialect='excel')
        input_headers = next(reader) # read and skip first header row ['LONGITUDE', 'LATITUDE']

        writer = csv.writer(w, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(dst_header)   # Write the output header
        for row in reader:
            try:
                # convert string values inside row into float values
                lon, lat = [float(val) for val in row]
                x, y = projector.transform(lon, lat)
                writer.writerow([ x, y ])
            except Exception as e:
                # If a row cannot be parsed or transformed,
                # skip it and print the error
                print(e)

# stop benchmarking
end_time = time.time()

print('process completed')
print("it took {} seconds to run the code".format(end_time-start_time))

process completed
it took 0.012376785278320312 seconds to run the code


It takes less than a second to run this code. Check the newly created CSV file and notice that you now have a listing of coordinates in meters. The EPSG code of the output coordinate reference system is set in the variable dst_crs. You can change this variable to another EPSG code and rerun the script.
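
Checking the output file can also be done in code. The sketch below is one possible way to do it, assuming the file has the `x`, `y` header written by the script. The helper name `summarize_projected_csv` is hypothetical. It counts the data rows and lists any whose values are missing, non-numeric, or non-finite (inf or NaN), which would indicate a point that could not be projected.

```python
import csv
import math


def summarize_projected_csv(csv_path):
    """Return (row_count, bad_rows) for a CSV with 'x' and 'y' columns.

    Hypothetical helper for illustration only.
    """
    row_count = 0
    bad_rows = []  # 1-based data-row numbers whose x or y is not a finite number
    with open(csv_path, newline='') as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            row_count += 1
            try:
                x, y = float(row['x']), float(row['y'])
            except (TypeError, ValueError, KeyError):
                # missing column, empty cell, or non-numeric text
                bad_rows.append(i)
                continue
            if not (math.isfinite(x) and math.isfinite(y)):
                bad_rows.append(i)
    return row_count, bad_rows
```

For example, calling `summarize_projected_csv(dst_path)` with the `dst_path` defined in the script and comparing the returned row count against the 1,509 input volcanoes shows whether any rows were skipped during the transformation.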