# Efficient Vaccine Distribution
### Delivering Excess Vaccine Supply to Most Needed Locations

The CDC publishes extensive daily data on vaccine delivery, availability, and administration. From this, we can approximate the "excess" supply for each US state. Combined with standard data on vaccine expiration, we can also estimate how much of that excess is at risk of expiring. *What if we could donate the excess supply to needy countries?* Doing so would put an already sunk cost to good use, vaccinate more people globally, and reduce the risk of newer, more deadly variants that could harm us all.
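As a minimal sketch of the "excess" idea, assume excess is simply doses delivered minus doses administered, and that "at risk" means the excess expires within a fixed window. The field names (`state`, `delivered`, `administered`, `expires_on`), the 14-day window, and the sample rows are all hypothetical and are not taken from the CDC file.

```python
from datetime import date, timedelta

# Hypothetical per-state records; names and numbers are illustrative only.
state_rows = [
    {"state": "State A", "delivered": 1000, "administered": 700, "expires_on": date(2021, 7, 10)},
    {"state": "State B", "delivered": 500, "administered": 500, "expires_on": date(2021, 7, 5)},
    {"state": "State C", "delivered": 800, "administered": 600, "expires_on": date(2021, 9, 1)},
]


def excess_doses(row):
    """Doses delivered but not yet administered (never negative)."""
    return max(row["delivered"] - row["administered"], 0)


def at_risk_doses(rows, today, window_days=14):
    """Map each state to its excess doses that expire within the window."""
    cutoff = today + timedelta(days=window_days)
    return {
        row["state"]: excess_doses(row)
        for row in rows
        if excess_doses(row) > 0 and row["expires_on"] <= cutoff
    }


at_risk = at_risk_doses(state_rows, today=date(2021, 7, 1))
print(at_risk)
```

Only states with a positive excess and an expiration date inside the window are reported; the real calculation would depend on the columns actually present in the vaccine statistics file.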

How would we efficiently get the vaccines to their destinations given tight vaccine expiration timeframes? We can use commercial airline networks! Data on airline networks and routes is public, and we can build a graph from it.

#### Required Data
To achieve this logistical optimization, we'll need several datasets. The eight key datasets are listed below:

* Airlines, IATA Codes, and Callsigns
https://kinetica-community.s3.amazonaws.com/vaccine-distro/airlines.csv

* Airports, Countries serviced, and Geo-coordinates
https://kinetica-community.s3.amazonaws.com/vaccine-distro/airport-to-country-map.csv

* Airport IATA Codes
https://kinetica-community.s3.amazonaws.com/vaccine-distro/airports.csv

* US Vaccine Statistics -- Availability, Usage (to calculate excess)
https://kinetica-community.s3.amazonaws.com/vaccine-distro/vaccine-us.csv

* Map of Country ISO Alpha-2 to Alpha-3 for Cross-Dataset Mapping
https://kinetica-community.s3.amazonaws.com/vaccine-distro/map_iso_alpha2_alpha3.csv

* Global COVID Statistics (to calculate need/demand for vaccine) 
https://kinetica-community.s3.amazonaws.com/vaccine-distro/owid-covid-data.csv

* Airport-to-Airport Routes and Flight Map
https://kinetica-community.s3.amazonaws.com/vaccine-distro/routes.csv

* List of US Airports (to segment "excess supply" locations)
https://kinetica-community.s3.amazonaws.com/vaccine-distro/us-airports.csv    

In [1]:
import csv
import gpudb

First, we'll take the airports and airline routes data above and create a graph from it. More information about creating graphs is available at https://docs.kinetica.com/7.1/graph_solver/network_graph_solver/. Below we work through it step by step. Start by downloading the airports and routes files referenced above (the code below reads them from a `data_originals` directory). Our goal is to create nodes and edges files from which we can build the graph that will drive the logistics.
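The download step could be scripted along these lines. This is a sketch, assuming the files should land in the `data_originals` directory that the paths in the next cell point to; the helper names `plan_downloads` and `fetch_all` are hypothetical, and the URLs are the ones listed under Required Data.

```python
import os
import urllib.request

BASE_URL = "https://kinetica-community.s3.amazonaws.com/vaccine-distro"
DATA_DIR = "data_originals"  # assumed target directory
FILES = ["airports.csv", "routes.csv"]


def plan_downloads(files, base_url=BASE_URL, data_dir=DATA_DIR):
    """Return (url, local_path) pairs without touching the network."""
    return [(f"{base_url}/{name}", os.path.join(data_dir, name)) for name in files]


def fetch_all(files, base_url=BASE_URL, data_dir=DATA_DIR):
    """Download each file that is not already present locally."""
    os.makedirs(data_dir, exist_ok=True)
    for url, path in plan_downloads(files, base_url, data_dir):
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)


for url, path in plan_downloads(FILES):
    print(f"{url} -> {path}")

# Uncomment to actually download the files:
# fetch_all(FILES)
```

Separating the planning step from the network call makes it easy to check the target paths before anything is fetched.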

In [6]:
INPUT_FILE_AIRPORTS = "data_originals/airports.csv"
INPUT_FILE_ROUTES = "data_originals/routes.csv"

OUTPUT_FILE_NODES = "out_nodes.csv"
OUTPUT_FILE_EDGES = "out_edges.csv"

FIELDS_NODES = ["NODE_ID",
                "NODE_X",
                "NODE_Y",
                "NODE_NAME",
                "NODE_WKTPOINT",
                "NODE_LABEL",
                "IATA",
                "ICAO",
                "CITY",
                "COUNTRY"]

FIELDS_EDGES = ["EDGE_ID",
                "EDGE_NODE1_ID",
                "EDGE_NODE2_ID",
                "EDGE_WKTLINE",
                "EDGE_NODE1_X",
                "EDGE_NODE1_Y",
                "EDGE_NODE2_X",
                "EDGE_NODE2_Y",
                "EDGE_NODE1_WKTPOINT",
                "EDGE_NODE2_WKTPOINT",
                "EDGE_NODE1_NAME",
                "EDGE_NODE2_NAME",
                "EDGE_DIRECTION",
                "EDGE_LABEL",
                "EDGE_WEIGHT_VALUESPECIFIED"]

In [7]:
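# Helper function to simplify strings by removing commas and single quotes
def cleanse(in_str):
    # str.replace returns a new string, so the result must be reassigned
    out_str = in_str.replace(",", "")
    out_str = out_str.replace("'", "")
    return out_str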

# TODO: this is just a rough measure, a stand-in for now
# https://stackoverflow.com/questions/19412462/getting-distance-between-two-points-based-on-latitude-longitude
def rough_distance(slon, slat, dlon, dlat):
    """Haversine great-circle distance in km; arguments are longitude first, then latitude."""
    from math import sin, cos, sqrt, atan2, radians
    # approximate radius of earth in km
    R = 6373.0
    # inputs may be strings read from CSV, so convert to float before radians
    lat1 = radians(float(slat))
    lon1 = radians(float(slon))
    lat2 = radians(float(dlat))
    lon2 = radians(float(dlon))
    # use new names for the deltas instead of overwriting the dlon/dlat parameters
    delta_lon = lon2 - lon1
    delta_lat = lat2 - lat1
    a = sin(delta_lat / 2)**2 + cos(lat1) * cos(lat2) * sin(delta_lon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    distance = R * c
    return distance
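A quick sanity check of the haversine formula used above can catch argument-order mistakes, since the function takes longitude before latitude. The sketch below uses a compact standalone copy under the hypothetical name `haversine_km` so that it runs on its own; it assumes the same Earth radius of 6373 km and accepts string inputs, as the CSV values are strings.

```python
from math import atan2, cos, pi, radians, sin, sqrt

EARTH_RADIUS_KM = 6373.0  # same approximate radius as in the cell above


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; longitude first, then latitude."""
    lon1, lat1, lon2, lat2 = (radians(float(v)) for v in (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return EARTH_RADIUS_KM * 2 * atan2(sqrt(a), sqrt(1 - a))


# A point is at distance zero from itself.
same_point = haversine_km("10.0", "20.0", "10.0", "20.0")

# A quarter of the equator should equal radius * pi / 2.
quarter_equator = haversine_km("0", "0", "90", "0")

print(same_point)
print(quarter_equator, EARTH_RADIUS_KM * pi / 2)
```

Both checks have closed-form answers, so they verify the formula without relying on any real airport coordinates.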

In [8]:
lookup_airport = {}
airport_nodes = []

## ------------------------------------------------------------------------
## Create Lookup table for Edge Enrichment

input_file = csv.DictReader(open(INPUT_FILE_AIRPORTS))
for row in input_file:
    lookup_airport[row['AIRPORT_ID']] = {
        'AIRPORT_ID': row['AIRPORT_ID'],
        'NAME': cleanse(row['NAME']),
        'CITY': cleanse(row['CITY']),
        'COUNTRY': cleanse(row['COUNTRY']),
        'IATA': row['IATA'],
        'ICAO': row['ICAO'],
        'LATITUDE': row['LATITUDE'],
        'LONGITUDE': row['LONGITUDE']
    }

    ## ------------------------------------------------------------------------
    ## Create Nodes

    if row['AIRPORT_ID'] == "\\N":
        continue
    nlon = row['LONGITUDE']
    nlat = row['LATITUDE']
    if row['IATA']=="\\N":
        row['IATA']=None
        nodename = f"{row['ICAO']}: {cleanse(row['NAME'])}"
        nodelabel = f"{row['ICAO']}: {cleanse(row['NAME'])}; {cleanse(row['CITY'])}, {cleanse(row['COUNTRY'])}"
    else:
        nodename = f"{row['IATA']}: {cleanse(row['NAME'])}"
        nodelabel = f"{row['IATA']}: {cleanse(row['NAME'])}; {cleanse(row['CITY'])}, {cleanse(row['COUNTRY'])}"

    persistable = {
        "NODE_ID": row['AIRPORT_ID'],
        "NODE_X": nlon,
        "NODE_Y": nlat,
        "NODE_NAME": nodename,
        "NODE_WKTPOINT": f"POINT({nlon} {nlat})",
        "NODE_LABEL": nodelabel,
        "IATA": row['IATA'],
        "ICAO": row['ICAO'],
        "CITY": cleanse(row['CITY']),
        "COUNTRY": cleanse(row['COUNTRY'])
    }
    airport_nodes.append(persistable)
    #print(f"Adding node {persistable['NODE_LABEL']}")

print(f"Writing {len(airport_nodes)} rows")

with open(OUTPUT_FILE_NODES, 'w', newline='\n') as csvfile:        
    writer = csv.DictWriter(csvfile, fieldnames=FIELDS_NODES)
    writer.writeheader()
    for n in airport_nodes:
        writer.writerow(n)

## ------------------------------------------------------------------------
## Create Edges

inter_airport_network_edges = []
edge_id = 10000

input_file = csv.DictReader(open(INPUT_FILE_ROUTES))
for row in input_file:
    edge_id = edge_id + 1
    if row['SOURCE_AIRPORT_ID'] == "\\N":
        print(f"Warn source airport {row['SOURCE_AIRPORT_ID']} is Null, skipping...")
        continue
    if row['DEST_AIRPORT_ID'] == "\\N":
        print(f"Warn destination airport {row['DEST_AIRPORT_ID']} is Null, skipping...")
        continue
    if str(row['SOURCE_AIRPORT_ID']) not in lookup_airport:
        print(f"Warn source airport {row['SOURCE_AIRPORT_ID']} not found in Airports lookup table, skipping...")
        continue
    if str(row['DEST_AIRPORT_ID']) not in lookup_airport:
        print(f"Warn destination airport {row['DEST_AIRPORT_ID']} not found in Airports lookup table, skipping...")
        continue
    if lookup_airport[row['SOURCE_AIRPORT_ID']]['COUNTRY'] == lookup_airport[row['DEST_AIRPORT_ID']]['COUNTRY']:
        # skipping domestic flight
        continue
    slon = lookup_airport[row['SOURCE_AIRPORT_ID']]['LONGITUDE']
    slat = lookup_airport[row['SOURCE_AIRPORT_ID']]['LATITUDE']
    dlon = lookup_airport[row['DEST_AIRPORT_ID']]['LONGITUDE']
    dlat = lookup_airport[row['DEST_AIRPORT_ID']]['LATITUDE']
    persistable = {
        "EDGE_ID": edge_id,
        "EDGE_NODE1_ID": row['SOURCE_AIRPORT_ID'],
        "EDGE_NODE2_ID": row['DEST_AIRPORT_ID'],
        "EDGE_WKTLINE": f"LINESTRING({slon} {slat}, {dlon} {dlat})",
        "EDGE_NODE1_X": slon,
        "EDGE_NODE1_Y": slat,
        "EDGE_NODE2_X": dlon,
        "EDGE_NODE2_Y": dlat,
        "EDGE_NODE1_WKTPOINT": f"POINT({slon} {slat})",
        "EDGE_NODE2_WKTPOINT": f"POINT({dlon} {dlat})",
        "EDGE_NODE1_NAME": f"{lookup_airport[row['SOURCE_AIRPORT_ID']]['IATA']}: {lookup_airport[row['SOURCE_AIRPORT_ID']]['NAME']}",
        "EDGE_NODE2_NAME": f"{lookup_airport[row['DEST_AIRPORT_ID']]['IATA']}: {lookup_airport[row['DEST_AIRPORT_ID']]['NAME']}",
        "EDGE_DIRECTION": "0",
        "EDGE_LABEL": f"'{row['AIRLINE']} {row['AIRLINE_ID']} from {lookup_airport[row['SOURCE_AIRPORT_ID']]['IATA']} --> {lookup_airport[row['DEST_AIRPORT_ID']]['IATA']}'",
        "EDGE_WEIGHT_VALUESPECIFIED": rough_distance(slon, slat, dlon, dlat)
    }
    inter_airport_network_edges.append(persistable)
    #print(f"Adding edge {persistable['EDGE_LABEL']}")

print(f"Writing {len(inter_airport_network_edges)} rows")

with open(OUTPUT_FILE_EDGES, 'w', newline='\n') as csvfile:        
    writer = csv.DictWriter(csvfile, fieldnames=FIELDS_EDGES)
    writer.writeheader()
    for i in inter_airport_network_edges:
        writer.writerow(i)

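To illustrate how the edges file could drive routing, here is a minimal sketch of a shortest-path search (Dijkstra's algorithm) over rows shaped like the edge records above. It is a plain-Python stand-in, not the Kinetica graph solver. It assumes each row is a one-way edge from `EDGE_NODE1_ID` to `EDGE_NODE2_ID` weighted by `EDGE_WEIGHT_VALUESPECIFIED`; the function names and the sample rows are hypothetical.

```python
import heapq
from collections import defaultdict


def build_adjacency(edge_rows):
    """Group edges by source node: {source: [(destination, weight), ...]}."""
    adjacency = defaultdict(list)
    for row in edge_rows:
        weight = float(row["EDGE_WEIGHT_VALUESPECIFIED"])
        adjacency[row["EDGE_NODE1_ID"]].append((row["EDGE_NODE2_ID"], weight))
    return adjacency


def shortest_path(adjacency, source, target):
    """Return (total_weight, [nodes]) or (None, []) if target is unreachable."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None, []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Illustrative rows only; real rows would come from the edges CSV.
sample_edges = [
    {"EDGE_NODE1_ID": "A", "EDGE_NODE2_ID": "B", "EDGE_WEIGHT_VALUESPECIFIED": "100"},
    {"EDGE_NODE1_ID": "B", "EDGE_NODE2_ID": "C", "EDGE_WEIGHT_VALUESPECIFIED": "100"},
    {"EDGE_NODE1_ID": "A", "EDGE_NODE2_ID": "C", "EDGE_WEIGHT_VALUESPECIFIED": "500"},
]
graph = build_adjacency(sample_edges)
print(shortest_path(graph, "A", "C"))
```

To try this on the generated data, the rows could be read with `csv.DictReader` from `out_edges.csv` and passed to `build_adjacency` in place of `sample_edges`.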