# Exercise: Regular Expressions

In [11]:
import collections
import csv
import re

## Step-1: Load the airport data set
+ Write the function `open_data_file()`:
    - with the named input parameter, `filename`
    - with the default value for the input parameter to be "../data_raw/us_airports.csv"
+ Use a `try`-`except` structure
+ Return the open file object on success, or `None` otherwise

In [261]:
# YOUR CODE GOES HERE
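def open_data_file(filename='../data_raw/us_airports.csv'):
    fp = None
    try:
        fp = open(filename, encoding='utf-8')
    except OSError as e:
        # Report the problem and fall through with fp still set to None
        print(e)
    # Return after the try block: a return inside finally would
    # silently swallow any exception other than OSError
    return fp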

Here is the output from the function call(s):

In [259]:
fp = open_data_file("airports.dat")
if fp is None:
    print("Failed to open the data file.")

[Errno 2] No such file or directory: 'airports.dat'
Failed to open the data file.


In [260]:
fp = open_data_file()

[Errno 2] No such file or directory: '../data_raw/us_airports.csv'
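
Because `open_data_file()` returns an open file, the caller is responsible for closing it. The sketch below shows one way to do that with a `with` block on the returned object. The temporary file, its one-line content, and the names `sample_path` and `missing` are assumptions made only so the example runs anywhere; they are not part of the exercise data.

```python
import os
import tempfile


def open_data_file(filename='../data_raw/us_airports.csv'):
    """Return an open file object, or None if the file cannot be opened."""
    try:
        return open(filename, encoding='utf-8')
    except OSError as e:
        print(e)
        return None


# Create a small throwaway file (hypothetical sample content).
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('airportid,name\n')
    sample_path = tmp.name

fp = open_data_file(sample_path)
if fp is not None:
    # A file object is its own context manager: it is closed when the block exits.
    with fp:
        first_line = fp.readline().strip()
    print(first_line, fp.closed)

os.remove(sample_path)

# A path that should not exist: the helper prints the error and returns None.
missing = open_data_file(
    os.path.join(tempfile.gettempdir(), 'no_such_dir_for_demo', 'airports.dat'))
print(missing)
```

Checking `fp is not None` before entering the `with` block keeps the failure path explicit, which matches the "return `None` on failure" contract of the exercise.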


## Step-2: Check the input parameters
+ Write the function `is_valid_input()`
+ Input parameters: `name`, `country`, and `airport_code` of type `str`
+ Check that at least one of the input parameters is ___not___ empty; otherwise, return `False`
+ Check that the airport code consists of exactly three letters (lower or upper case is ok)
    - Use _regular expressions_
    - Raise a `ValueError` if the airport code is incorrect
    - Return `True` if the airport code matches

In [176]:
def is_valid_input(name="", country="", airport_code=""):
    if name == "" and country == "" and airport_code == "":
        print("Provide an airport name (name=) or country (country=) or 3-letter airport code (airport_code)")
        return False
    # [A-Za-z] accepts letters only; the range [A-z] would also accept [ \ ] ^ _ and `
    # fullmatch() requires the whole string to match, so a trailing newline is rejected
    if re.fullmatch("[A-Za-z]{3}", airport_code):
        return True
    raise ValueError("Invalid airport code: should be three letters")

Here are some example outputs upon testing the function:

In [177]:
try:
    ret = is_valid_input ()
    print(ret)
except ValueError as err:
    print('Error with {ac}: {err}'.format(ac=airport_code, err=err))

Provide an airport name (name=) or country (country=) or 3-letter airport code (airport_code)
False


In [178]:
airport_code= 'RDU'
try:
    ret = is_valid_input (airport_code=airport_code)
    print(ret)
except ValueError as err:
    print('Error with {ac}: {err}'.format(ac=airport_code, err=err))

True


In [179]:
airport_code= 'rdu'
try:
    ret = is_valid_input (airport_code=airport_code)
    print(ret)
except ValueError as err:
    print('Error with {ac}: {err}'.format(ac=airport_code, err=err))

True


In [181]:
airport_code= 'Raleigh'
try:
    ret = is_valid_input (airport_code=airport_code)
    print(ret)
except ValueError as err:
    print('Error with {ac}: {err}'.format(ac=airport_code, err=err))

Error with Raleigh: Invalid airport code: should be three letters
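
A character class written as `[A-z]` covers every ASCII code point from `A` to `z`, which includes the six punctuation characters that sit between `Z` and `a`. Separately, `re.match()` with a `$` anchor still accepts a string that ends in a newline. The sketch below compares that loose pattern with a letters-only pattern used through `re.fullmatch()`. The names `LOOSE`, `STRICT`, and `is_three_letters` are hypothetical and exist only for this illustration.

```python
import re

LOOSE = re.compile(r"^[A-z]{3}$")    # range A..z also spans [ \ ] ^ _ `
STRICT = re.compile(r"[A-Za-z]{3}")  # ASCII letters only


def is_three_letters(code):
    """Hypothetical helper: True only if code is exactly three ASCII letters."""
    return STRICT.fullmatch(code) is not None


for candidate in ["RDU", "rdu", "Raleigh", "123", "R_U", "RDU\n"]:
    loose_ok = LOOSE.match(candidate) is not None
    print(repr(candidate), loose_ok, is_three_letters(candidate))
```

Both patterns agree on the first four candidates. Only the loose pattern accepts `'R_U'` and `'RDU\n'`.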


## Step-3: Find an airport by its `IATA_FAA` code
+ Write the function `find_airport()`:
    - Input parameters: `name`, `country`, `airport_code` and `filename`
        * the `filename` has the default value of `../data_raw/airports.csv`
+ Open the input file defined by filename using your previous function
+ Check if the other input parameters are valid using your previous function
+ If the airport_code is found to be valid:
    - read each line from the csv file: `csv.reader(fp)` 
    - check if the airport code is in the `IATA_FAA` field.  
        * If so, then add the line to the list of the results found
    - Return the list of resulting line(s)

In [306]:
def find_airport(name='', country='', airport_code='', filename='../data_raw/airports.csv'):
    # Open the data file for reading:
    fp = open_data_file(filename)
    if fp is None:
        return
    # Check if input parameters are valid
    try:
        ret = is_valid_input(name, country, airport_code=airport_code)
    except ValueError as err:
        print('Error with {ac}: {err}'.format(ac=airport_code, err=err))
        ret = False
    if not ret:
        fp.close()  # do not leak the open file on the early exit
        return
    # Generate results
    results = []
    try:
        # Remove the \ufeff byte order mark only if it is actually present
        headings = fp.readline().strip().lstrip('\ufeff').split(',')
        attributes_tuple = ' '.join([heading.strip() for heading in headings])
        print("HEADER: {}".format(attributes_tuple))
        Airport = collections.namedtuple('Airport', attributes_tuple)

        # Process one row at a time from the iterator returned by csv.reader()
        reader = csv.reader(fp)
        for row in reader:
            curr_code = row[4]  # column 4 is the IATA_FAA field
            if curr_code == airport_code.upper():
                print("FOUND: ", row)
                airport_str = row[1] + ' (' + row[2] + ', ' + row[3] + ') Abbr: ' + row[4]
                results.append(airport_str)
        print("RESULTS: ")
        return results
    except Exception as e:
        print(e)
    finally:
        # Runs on every exit path from the try block, including the return above
        fp.close()

Here are example outputs for different invocations of the function:

In [307]:
# invalid inputs
find_airport()

Provide an airport name (name=) or country (country=) or 3-letter airport code (airport_code)


In [308]:
# invalid filename
find_airport(filename="us_airports.dat")

[Errno 2] No such file or directory: 'us_airports.dat'


In [309]:
# invalid airport code
find_airport(airport_code="Raleigh")

Error with Raleigh: Invalid airport code: should be three letters


In [310]:
find_airport(airport_code="OHR")

HEADER: airportid name city country IATA_FAA ICAO latitude longitude altitude timezone dst tz
FOUND:  ['8770', 'Wyk auf Foehr', 'Wyk', 'Germany', 'OHR', '\\N', '54.411', '8.3145', '24', '1', 'E', 'Europe/Berlin']
RESULTS: 


['Wyk auf Foehr (Wyk, Germany) Abbr: OHR']

In [311]:
find_airport(airport_code='ord')

HEADER: airportid name city country IATA_FAA ICAO latitude longitude altitude timezone dst tz
FOUND:  ['3830', 'Chicago Ohare Intl', 'Chicago', 'United States', 'ORD', 'KORD', '41.978603', '-87.904842', '668', '-6', 'A', 'America/Chicago']
RESULTS: 


['Chicago Ohare Intl (Chicago, United States) Abbr: ORD']

In [312]:
find_airport(airport_code='ORD')

HEADER: airportid name city country IATA_FAA ICAO latitude longitude altitude timezone dst tz
FOUND:  ['3830', 'Chicago Ohare Intl', 'Chicago', 'United States', 'ORD', 'KORD', '41.978603', '-87.904842', '668', '-6', 'A', 'America/Chicago']
RESULTS: 


['Chicago Ohare Intl (Chicago, United States) Abbr: ORD']

In [313]:
find_airport(airport_code='abcd')

Error with abcd: Invalid airport code: should be three letters


In [314]:
find_airport(airport_code='123')

Error with 123: Invalid airport code: should be three letters


In [315]:
find_airport(airport_code='foo')

HEADER: airportid name city country IATA_FAA ICAO latitude longitude altitude timezone dst tz
RESULTS: 


[]
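
The lookup can also be written against column names instead of positional indices. The sketch below is one possible variant, not the exercise's required solution: it uses `csv.DictReader` so the `IATA_FAA` field is addressed by name, and the `utf-8-sig` codec so a leading byte order mark is dropped during decoding. The two data rows are copied from the outputs above and held in memory; `find_by_code` and `open_sample` are hypothetical helper names. With a real file, the equivalent call would be `open(filename, encoding='utf-8-sig', newline='')`.

```python
import csv
import io

# Header plus two rows from the outputs above, prefixed with a UTF-8 byte order mark.
RAW = (
    "\ufeffairportid,name,city,country,IATA_FAA,ICAO,latitude,longitude,"
    "altitude,timezone,dst,tz\n"
    "3830,Chicago Ohare Intl,Chicago,United States,ORD,KORD,41.978603,"
    "-87.904842,668,-6,A,America/Chicago\n"
    "8770,Wyk auf Foehr,Wyk,Germany,OHR,\\N,54.411,8.3145,24,1,E,Europe/Berlin\n"
).encode("utf-8")


def open_sample():
    # utf-8-sig strips a leading BOM if present; newline='' is what the csv module expects.
    return io.TextIOWrapper(io.BytesIO(RAW), encoding="utf-8-sig", newline="")


def find_by_code(stream, airport_code):
    """Hypothetical helper: return formatted matches for a 3-letter code."""
    wanted = airport_code.upper()
    reader = csv.DictReader(stream)  # first row supplies the field names
    return [
        "{name} ({city}, {country}) Abbr: {IATA_FAA}".format(**row)
        for row in reader
        if row["IATA_FAA"] == wanted
    ]


with open_sample() as stream:
    print(find_by_code(stream, "ord"))
```

Addressing the column by name means the code keeps working if the column order changes, and decoding with `utf-8-sig` avoids slicing the header by hand.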