# Reading CSV Files

In [25]:
import csv

The preferred way to read CSV files is with the `csv.DictReader` class, which reads each row and creates a dictionary from it, with column names as *keys* and column values as *values*. Let's see how to read a file using `csv.DictReader()`.

In [26]:
import os
data_pkg_path = 'data'
filename = 'uk_airport_coord.csv'
path = os.path.join(data_pkg_path, filename)

In [27]:
f = open(path, 'r', encoding='utf-8')
csv_reader = csv.DictReader(f, delimiter=',', quotechar='"')
print(csv_reader)
f.close()

<csv.DictReader object at 0x0000019AD940C210>
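
Printing the reader shows only the object itself, because rows are read lazily as we iterate. The following is a minimal sketch of how to inspect the header and pull a single row. It uses a hypothetical in-memory sample (built with `io.StringIO`) instead of the real file, with two rows copied from the airport data.

```python
import csv
import io

# Hypothetical in-memory sample that mimics the layout of the airport file.
sample = io.StringIO(
    "NAME,ICAO,Latitude,Longitude\n"
    "HONINGTON,EGXH,52.342611,0.772939\n"
    "WELSHPOOL,EGCW,52.628611,-3.153333\n"
)

reader = csv.DictReader(sample)

# Accessing fieldnames reads the header row if it has not been read yet.
print(reader.fieldnames)

# Rows are produced one at a time; next() fetches the first data row.
first_row = next(reader)
print(first_row['ICAO'])
```

This prints the list of column names followed by `EGXH`.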


We can use `enumerate()` on any iterable object to get a tuple containing an index and the corresponding value on each iteration. Let's use it to print the first 5 rows from the `DictReader` object.

In [29]:
f = open(path, 'r', encoding='utf-8')
csv_reader = csv.DictReader(f, delimiter=',', quotechar='"')
for index, row in enumerate(csv_reader):
    print(row)
    if index == 4:
        break
f.close()

{'NAME': 'HONINGTON', 'ICAO': 'EGXH', 'Latitude': '52.342611', 'Longitude': '0.772939'}
{'NAME': 'WELSHPOOL', 'ICAO': 'EGCW', 'Latitude': '52.628611', 'Longitude': '-3.153333'}
{'NAME': 'CRANFIELD', 'ICAO': 'EGTC', 'Latitude': '52.072222', 'Longitude': '-0.616667'}
{'NAME': 'KEMBLE', 'ICAO': 'EGBP', 'Latitude': '51.668056', 'Longitude': '-2.056944'}
{'NAME': 'PERRANPORTH', 'ICAO': 'EGTP', 'Latitude': '50.331667', 'Longitude': '-5.1775'}
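
As an alternative to `enumerate()` with a manual `break`, the standard library's `itertools.islice()` can take the first few rows of any iterator. This is only a sketch on a hypothetical in-memory sample built from the rows printed above, not a change to the approach used in this notebook.

```python
import csv
import io
from itertools import islice

# Hypothetical in-memory sample using the rows shown in the output above.
sample = io.StringIO(
    "NAME,ICAO,Latitude,Longitude\n"
    "HONINGTON,EGXH,52.342611,0.772939\n"
    "WELSHPOOL,EGCW,52.628611,-3.153333\n"
    "CRANFIELD,EGTC,52.072222,-0.616667\n"
    "KEMBLE,EGBP,51.668056,-2.056944\n"
    "PERRANPORTH,EGTP,50.331667,-5.1775\n"
)

reader = csv.DictReader(sample)

# islice stops after 3 rows without reading the rest of the input.
first_three = list(islice(reader, 3))
for row in first_three:
    print(row['NAME'])
```

The reader keeps its position, so further iteration would continue from the fourth row.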


## Using the `with` statement


File handling code requires that we open a file, do something with the file object, and then close the file. That is tedious, and it is easy to forget to call `close()` on the file. If the processing code raises an error, the file is not closed properly, which may result in bugs, especially when writing files.

The preferred way to work with file objects is using the `with` statement. It results in simpler and cleaner code, and it ensures file objects are closed properly even in case of errors.

As you see below, we open the file and use the file object `f` in a `with` statement. Python takes care of closing the file when the execution of code within the statement is complete.

In [30]:
with open(path, 'r', encoding='utf-8') as f:
    csv_reader = csv.DictReader(f)
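
To illustrate the claim about errors, here is a small sketch that raises an exception inside a `with` block and then checks `f.closed`. The temporary file, its contents, and the simulated error are all made up for the demonstration. Note that the rows are read inside the block, since the reader can no longer be used once the file is closed.

```python
import csv
import os
import tempfile

rows = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a small throwaway CSV file so the example is self-contained.
    tmp_path = os.path.join(tmp_dir, 'sample.csv')
    with open(tmp_path, 'w', encoding='utf-8', newline='') as out:
        out.write('NAME,ICAO\nKEMBLE,EGBP\n')

    try:
        with open(tmp_path, 'r', encoding='utf-8', newline='') as f:
            # Consume the reader while the file is still open.
            rows = list(csv.DictReader(f))
            raise ValueError('simulated processing error')
    except ValueError:
        pass

# The file was closed when the with block exited, despite the error.
print(f.closed)
print(rows)
```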

## Filtering rows

We can use a conditional statement while iterating over the rows to select and process rows that meet certain criteria. Let's count how many airports with a particular name are present in the file.

Replace the value of the `name_aiport` variable below with the name of an airport of your choice.

In [50]:
name_aiport = 'HEATHROW'
num_aiport = 0

with open(path, 'r', encoding='utf-8') as f:
    csv_reader = csv.DictReader(f)

    for row in csv_reader:
        if row['NAME'] == name_aiport:
            num_aiport += 1
            
print(num_aiport)

1
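
Filters are not limited to string equality. Every value returned by `DictReader` is a string, so numeric comparisons need an explicit conversion first. The sketch below assumes a hypothetical in-memory sample made of the five rows printed earlier and selects the airports north of 52 degrees latitude; the threshold is arbitrary.

```python
import csv
import io

# Hypothetical in-memory sample using the five rows printed earlier.
sample = io.StringIO(
    "NAME,ICAO,Latitude,Longitude\n"
    "HONINGTON,EGXH,52.342611,0.772939\n"
    "WELSHPOOL,EGCW,52.628611,-3.153333\n"
    "CRANFIELD,EGTC,52.072222,-0.616667\n"
    "KEMBLE,EGBP,51.668056,-2.056944\n"
    "PERRANPORTH,EGTP,50.331667,-5.1775\n"
)

# Convert the latitude string to float before comparing numerically.
north_of_52 = [
    row['NAME']
    for row in csv.DictReader(sample)
    if float(row['Latitude']) > 52.0
]
print(north_of_52)
```

This prints `['HONINGTON', 'WELSHPOOL', 'CRANFIELD']`.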


## Calculating distance

Let's apply the skills we have learnt so far to solve a complete problem. We want to read the `uk_airport_coord.csv` file, find the airports that match a chosen name, calculate the distance from a reference airport to each of them, and write the results to a new CSV file. First, we look up the coordinates of the reference airport.

In [53]:
aiport = 'CITY'

aiport_coordinates = ()

with open(path, 'r', encoding='utf-8') as f:
    csv_reader = csv.DictReader(f)
    for row in csv_reader:
        # The airport name is stored in the 'NAME' column
        if row['NAME'] == aiport:
            Latitude = row['Latitude']
            Longitude = row['Longitude']
            # Coordinates are kept as strings, exactly as read from the file
            aiport_coordinates = (Latitude, Longitude)
            break

print(aiport_coordinates)

('51.505278', '0.055278')


Now we can loop through the file, find the airports whose name matches `name_aiport`, and call the `geopy.distance.geodesic()` function to calculate the distance from the reference airport. The code below stops after the first 5 matches.

In [54]:
from geopy import distance

counter = 0
with open(path, 'r', encoding='utf-8') as f:
    csv_reader = csv.DictReader(f)
    for row in csv_reader:
        # Select the target airport, skipping the reference airport itself
        if (row['NAME'] == name_aiport and
            row['NAME'] != aiport):
            airport_t_coordinates = (row['Latitude'], row['Longitude'])
            airport_distance = distance.geodesic(
                airport_t_coordinates, aiport_coordinates).km
            print(row['NAME'], airport_distance)
            counter += 1
            
        if counter == 5:
            break
            

HEATHROW 36.01714619169927
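
If `geopy` is not available, a rough cross-check is possible with the haversine formula using only the standard library. This is a different technique from `geodesic()`: it treats the Earth as a sphere (the radius constant below is an assumed mean value), so results usually differ from the ellipsoidal geodesic distance by a fraction of a percent. The helper name `haversine_km` is hypothetical, and the two points are the HONINGTON row and the reference coordinates printed earlier.

```python
from math import asin, cos, radians, sin, sqrt

# Assumed mean Earth radius for the spherical model, in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (latitude, longitude) pairs.

    Accepts numbers or numeric strings, since CSV values arrive as strings.
    """
    lat1, lon1 = (radians(float(v)) for v in point_a)
    lat2, lon2 = (radians(float(v)) for v in point_b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


honington = ('52.342611', '0.772939')   # row printed earlier
reference = ('51.505278', '0.055278')   # reference coordinates printed above
print(round(haversine_km(honington, reference), 1))
```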
