# Deleting ports with no routes

After cleaning up our routes database, some ports are temporarily left without any routes. We must delete them and clean up all dataframes so that we can work with them in **MongoDB**.

## Importing files and exploration

In [1]:
import pandas as pd

In [4]:
ports = pd.read_csv('puertos_i.csv')

In [5]:
ports.head()

| Unnamed: 0 | ID | name | province | municipality | altitude | gradient | distance | mountain_slope | technical_difficulty | url | peak_coords | photo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Angliru | Asturias | Santa Eulalia | 1570 | 1423 | 18.0 | 7.0 | 528 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.94178,43.221596] | |
| 1 | 1 | Gamoniteiro | Asturias | Pola-Cobertoria | 1772 | 1465 | 15.0 | 9.0 | 492 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.923458,43.18786] | |
| 2 | 2 | Peña Escrita | Granada | Torrecuevas | 1200 | 1150 | 13.0 | 8.0 | 462 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-3.771034,36.818155] | |
| 3 | 3 | Ancares | Lugo | Sª Morela-Balouta | 1670 | 1355 | 36.0 | 3.0 | 427 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-6.818333,42.868532] | |
| 4 | 4 | Pajares-Cuitu Negru | Asturias | Campomanes | 1843 | 1466 | 25.0 | 5.0 | 394 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.788388,42.96824] | |


In [6]:
routes = pd.read_csv('routes_2307_598.csv')

In [7]:
routes.head()

| Unnamed: 0 | ID | name | ccaa | province | start | midpoint | trailrank | distance | gradient | min_alt | max_alt | municipality | mountain_passes_ids | municipalities_ids | difficulty_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 923 | Angliru por Oviedo y Lena. | | | [-6.101982,43.158859] | [-5.939921,43.235847] | 67 | 124 | 3476 | 101 | 1566 | | [0] | [5039, 5027, 5020, 5067] | 8 |
| 1 | 5611 | Angliru y Gamoniteiro por Oviedo y Lena. | | | [-5.8297,43.155729] | [-5.929957,43.288199] | 51 | 118 | 4234 | 102 | 1700 | | [0, 1, 84, 131] | [5039, 5027, 5067] | 10 |
| 2 | 881 | Ancares y Sierra De Morela por Cervantes y Ibias. | | | [-7.157974,42.852246] | [-6.844199,42.889535] | 55 | 130 | 2861 | 289 | 1651 | | [3, 182, 1109] | [4245, 5022] | 7 |
| 3 | 5618 | Pajares-Cuitu Negru y La Cubilla por Lena. | | | [-5.806177,43.128166] | [-5.829091,43.083221] | 42 | 121 | 2917 | 344 | 1824 | | [4, 51, 69, 438] | [5027] | 7 |
| 4 | 3467 | Puerto Camacho y Haza Del Lino por Dúrcal y Ór... | | | [-3.609443,37.156292] | [-3.275512,36.854694] | 89 | 158 | 2450 | 242 | 1186 | | [5, 13] | [2747, 2806, 2818] | 7 |


## Seeking out missing ports

Let's compare a list of all port IDs with the ones present in the *routes* dataframe.

In [8]:
#Creating a list of all ports.

ports_list = ports['ID'].tolist()

In [12]:
#Let's check how many of them are there.

len(ports_list)

1123

In [9]:
# Creating a list of all individual ports present in our routes.
import ast

routes_ports = []

# Each cell holds a list stored as a string; literal_eval parses it without executing arbitrary code.
for ids in routes['mountain_passes_ids']:
    for n in ast.literal_eval(ids):
        if n not in routes_ports:
            routes_ports.append(n)

In [11]:
#Let's check how many ports we have in our routes dataframe.

len(routes_ports)

643
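
The same collection step can also be written with pandas operations instead of nested loops. The following is a minimal sketch on made-up data (`toy_routes` and `toy_routes_ports` are hypothetical names), assuming the `mountain_passes_ids` column holds list literals stored as strings, as the CSV preview suggests:

```python
import ast

import pandas as pd

# Hypothetical toy data mimicking the stringified lists in routes['mountain_passes_ids'].
toy_routes = pd.DataFrame({'mountain_passes_ids': ['[0]', '[0, 1, 84]', '[3, 1]']})

# Parse each string into a real list, then give every ID its own row.
exploded = toy_routes['mountain_passes_ids'].apply(ast.literal_eval).explode()

# unique() keeps the order of first appearance, like the loop does.
toy_routes_ports = [int(n) for n in exploded.unique()]

print(toy_routes_ports)  # [0, 1, 84, 3]
```

On a dataframe of this size either approach is fast enough; the pandas version mainly avoids the repeated `not in` scan over a growing list.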

In [13]:
#Now it's time to create a new list with the missing ones so that we can filter them out.

missing_ports = []

for i in ports_list:
    if i not in routes_ports:
        missing_ports.append(i)

In [15]:
len(missing_ports)

480
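
The counts are consistent: 1123 - 643 = 480. That subtraction only holds if IDs are unique and every ID found in the routes also exists in the ports list. A small sketch of how both facts could be checked with set differences, using made-up IDs (all `toy_` names are hypothetical):

```python
# Hypothetical toy IDs standing in for ports_list and routes_ports.
toy_ports_list = [0, 1, 2, 3, 4, 5]
toy_routes_ports = [0, 1, 3, 4]

# IDs in the ports list that no route references.
toy_missing = sorted(set(toy_ports_list) - set(toy_routes_ports))

# IDs referenced by routes but absent from the ports list (expected to be empty).
toy_unknown = sorted(set(toy_routes_ports) - set(toy_ports_list))

print(toy_missing)  # [2, 5]
print(toy_unknown)  # []

# The simple subtraction of counts matches only when toy_unknown is empty.
assert len(toy_ports_list) - len(toy_routes_ports) == len(toy_missing)
```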

## Dropping our missing ports

In [18]:
# The ID column identifies each port, so we keep only the rows whose ID is not in missing_ports.

ports = ports[~ports['ID'].isin(missing_ports)]

In [23]:
#Checking our results.

ports.head()

| Unnamed: 0 | ID | name | province | municipality | altitude | gradient | distance | mountain_slope | technical_difficulty | url | peak_coords | photo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Angliru | Asturias | Santa Eulalia | 1570 | 1423 | 18.0 | 7.0 | 528 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.94178,43.221596] | |
| 1 | 1 | Gamoniteiro | Asturias | Pola-Cobertoria | 1772 | 1465 | 15.0 | 9.0 | 492 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.923458,43.18786] | |
| 3 | 3 | Ancares | Lugo | Sª Morela-Balouta | 1670 | 1355 | 36.0 | 3.0 | 427 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-6.818333,42.868532] | |
| 4 | 4 | Pajares-Cuitu Negru | Asturias | Campomanes | 1843 | 1466 | 25.0 | 5.0 | 394 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-5.788388,42.96824] | |
| 5 | 5 | Puerto Camacho | Granada | Órgiva-Los Tablones | 1873 | 1551 | 20.0 | 7.0 | 384 | https://www.altimetrias.net/aspbk/verPuerto.as... | [-3.3678194,36.8412261] | |


In [26]:
#Finally, reindexing and exporting our dataframe.

ports = ports.reset_index(drop=True)
ports.to_csv('ports_643.csv', index=False)
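
Since the goal is to work with this data in MongoDB, the cleaned dataframe will eventually need to become a list of documents. The sketch below shows one possible way to prepare it, on a made-up two-row frame (`toy_ports` and `records` are hypothetical names); it assumes `peak_coords` is stored as a stringified list, as the CSV preview suggests, and it does not connect to any database:

```python
import ast

import pandas as pd

# Hypothetical two-row frame with the same kind of columns as ports.
toy_ports = pd.DataFrame({
    'ID': [0, 1],
    'name': ['Angliru', 'Gamoniteiro'],
    'peak_coords': ['[-5.94178,43.221596]', '[-5.923458,43.18786]'],
})

# Turn the stringified coordinates into real lists of floats.
toy_ports['peak_coords'] = toy_ports['peak_coords'].apply(ast.literal_eval)

# One dict per row, the shape a document database expects.
records = toy_ports.to_dict(orient='records')

print(records[0])
```

A list like `records` could then be handed to a driver call such as pymongo's `insert_many`; the connection and collection setup are outside the scope of this notebook.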