# Assigning custom names to our routes

Since our *routes* dataframe contains parsed data from a wide variety of sources the route names can vary greatly both in accuracy and grammar. To solve this issue in this notebook we will be generating new route names based on the mountain passes and town that each route visits.

In [1]:
import pandas as pd

In [3]:
#Importing our routes dataframe.

routes = pd.read_csv('routes_1607_819.csv')

In [4]:
routes.head()

Unnamed: 0,ID,name,ccaa,province,start,midpoint,trailrank,distance,gradient,min_alt,max_alt,municipality,mountain_passes_ids,municipalities_ids
0,923,"ANGLIRU, CIRCULAR DESDE LA PLAZA, TEVERGA",,,"[-6.101982,43.158859]","[-5.939921,43.235847]",67,124,3476,101,1566,,[0],
1,5611,"Pola de Lena, Cobertoria, Gamoniteiro, Tenebre...",,,"[-5.8297,43.155729]","[-5.929957,43.288199]",51,118,4234,102,1700,,"[0, 1, 84, 131]",
2,5490,PEÑA ESCRITA (POR ALMUÑECAR),,,"[-3.743127,36.734975]","[-3.762692,36.818439]",42,45,1481,6,1191,,[2],
3,881,Ancares-Pandozarco,,,"[-7.157974,42.852246]","[-6.844199,42.889535]",55,130,2861,289,1651,,"[3, 182, 1109]",
4,5618,POLA DE LENA - PUERTO DE PAJARES - CUITU NEGRU...,,,"[-5.806177,43.128166]","[-5.829091,43.083221]",42,121,2917,344,1824,,"[4, 51, 69, 438]",


# Adding municipalities to each route

To create a better naming scheme we need to know which towns are near every route. For this purpose we will be using our *towns* dataframe.

In [8]:
towns = pd.read_csv('towns_1807_841n.csv')

In [9]:
towns.head()

Unnamed: 0,ID,municipality,ccaa,province,municipality_inhabitants,geographic_area,radius,routes_number,routes_ids,mountain_passes_ids,coords
0,884,Barcelona,Cataluña,Barcelona,1664182,100.7644,5.663411,4,"[1292, 1732, 6228, 8149]",,"(41.38424664,2.17634927)"
1,7257,València,Comunitat Valenciana,Valencia,800215,139.2687,6.658115,9,"[528, 1469, 1472, 2478, 5225, 7040, 7231, 7734...",,"(39.47534441,-0.37565717)"
2,4547,Málaga,Andalucía,Málaga,578460,395.7069,11.223062,10,"[2933, 4541, 4546, 5035, 5997, 5998, 8379, 841...",,"(36.72034267,-4.41997511)"
3,4613,Murcia,Región de Murcia,Murcia,459403,885.1149,16.785117,7,"[691, 1691, 6089, 6769, 7049, 8099, 8196]",,"(37.98436361,-1.1285408)"
4,7518,Bilbao,País Vasco,Bizkaia,350184,41.3426,3.627634,42,"[32, 181, 182, 1070, 1280, 1282, 1504, 1572, 1...",,"(43.25721957,-2.92390606)"


In [21]:
#We will be using nested loops to add each town ID to the routes dataframe.

for i in range(len(routes)):
    town_list = []
    for n in range(len(towns)):
        for p in eval(towns['routes_ids'].iloc[n]):
            if p == routes['ID'].iloc[i]:
                town_list.append(towns['ID'].iloc[n])
    routes['municipalities_ids'].iloc[i] = str(town_list)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)


In [22]:
routes.head()

Unnamed: 0,ID,name,ccaa,province,start,midpoint,trailrank,distance,gradient,min_alt,max_alt,municipality,mountain_passes_ids,municipalities_ids
0,923,"ANGLIRU, CIRCULAR DESDE LA PLAZA, TEVERGA",,,"[-6.101982,43.158859]","[-5.939921,43.235847]",67,124,3476,101,1566,,[0],"[5039, 5032, 5027, 5020, 5033, 5053, 5052, 506..."
1,5611,"Pola de Lena, Cobertoria, Gamoniteiro, Tenebre...",,,"[-5.8297,43.155729]","[-5.929957,43.288199]",51,118,4234,102,1700,,"[0, 1, 84, 131]","[5039, 5032, 5027, 5033, 5053, 5052, 5067, 504..."
2,5490,PEÑA ESCRITA (POR ALMUÑECAR),,,"[-3.743127,36.734975]","[-3.762692,36.818439]",42,45,1481,6,1191,,[2],[]
3,881,Ancares-Pandozarco,,,"[-7.157974,42.852246]","[-6.844199,42.889535]",55,130,2861,289,1651,,"[3, 182, 1109]","[4245, 5022, 4267, 4277]"
4,5618,POLA DE LENA - PUERTO DE PAJARES - CUITU NEGRU...,,,"[-5.806177,43.128166]","[-5.829091,43.083221]",42,121,2917,344,1824,,"[4, 51, 69, 438]","[5027, 3748]"


# Creating a function that re-names our routes

Now that we have all mountain passes and towns that each route visits we can build a nice naming structure. This naming schema must meet the following criteria:

1. Mention all mountain passes visited.
2. Name all towns that have access to this route.
3. Make sense gramatically.


The schema will vary depending on the number of ports and towns:

**1 port 1 town:** *port* por *town*.

**2 ports 1 town:** *port1* y *port2* por *town*.

**2 ports 2 towns:** *port1* y *port2* por *town1* y *town2*.

**> 2 ports > 2 towns:** *port1*, *port2*, (...) *port_n* por *town1*, *town2*, (...) y *town_n*.

## Importing our dataframes

We are now using a smaller *towns* dataframe since we filtered the destinations in notebook 4.

In [4]:
routes = pd.read_csv('routes_1607_819.csv')

In [6]:
towns = pd.read_csv('towns_1907_201.csv')

In [15]:
routes.head(1)

Unnamed: 0,ID,name,ccaa,province,start,midpoint,trailrank,distance,gradient,min_alt,max_alt,municipality,mountain_passes_ids,municipalities_ids
0,923,"ANGLIRU, CIRCULAR DESDE LA PLAZA, TEVERGA",,,"[-6.101982,43.158859]","[-5.939921,43.235847]",67,124,3476,101,1566,,[0],"[5039, 5027, 5020, 5067]"


In [14]:
towns.head(1)

Unnamed: 0,ID,municipality,ccaa,province,municipality_inhabitants,geographic_area,radius,routes_number,routes_ids,mountain_passes_ids,coords,coords_MDB
0,884,Barcelona,Cataluña,Barcelona,1664182,100.7644,5.663411,4,"[1292, 1732, 6228, 8149]",,"(41.38424664,2.17634927)","[2.17634927,41.38424664]"


## Adding municipalities to each route

We will simply be using the previous loop to add municipalities to each route.

In [11]:
for i in range(len(routes)):
    town_list = []
    for n in range(len(towns)):
        for p in eval(towns['routes_ids'].iloc[n]):
            if p == routes['ID'].iloc[i]:
                town_list.append(towns['ID'].iloc[n])
    routes['municipalities_ids'].iloc[i] = str(town_list)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)


In [17]:
routes.head(1)

Unnamed: 0,ID,name,ccaa,province,start,midpoint,trailrank,distance,gradient,min_alt,max_alt,municipality,mountain_passes_ids,municipalities_ids
0,923,"ANGLIRU, CIRCULAR DESDE LA PLAZA, TEVERGA",,,"[-6.101982,43.158859]","[-5.939921,43.235847]",67,124,3476,101,1566,,[0],"[5039, 5027, 5020, 5067]"


In [22]:
ports = pd.read_csv('puertos_i.csv')

In [23]:
ports.head(1)

Unnamed: 0,ID,name,province,municipality,altitude,gradient,distance,mountain_slope,technical_difficulty,url,peak_coords,photo
0,0,Angliru,Asturias,Santa Eulalia,1570,1423,18.0,7.0,528,https://www.altimetrias.net/aspbk/verPuerto.as...,"[-5.94178,43.221596]",


## Creating the function

Our job will be made easier by the use of two separate functions. The first one will take every route and generate a list of ports and towns, while the second one will generate a name based on those lists and return it.

In [48]:
#Defining our first function:

def name_creator(routes, towns, ports):
    """
    Input : dataframes of routes, towns and ports.
    
    Output: dataframe with the custom names containing both ports and towns.
    
    """
    routes = routes.copy()
    for i in range(len(routes)): #Iterating through each route.
        list_ports = eval(routes['mountain_passes_ids'].iloc[i]) #Generating a list of ports.
        list_towns = eval(routes['municipalities_ids'].iloc[i]) #The same procedure for the towns.
        routes['name'].iloc[i] = composer(list_ports, list_towns, towns, ports) #Assigning the name returned by the second function.
        
    return routes

In [88]:
#Defining the second one:

def composer(list_ports, list_towns, towns, ports):
    """
    Input : two lists of port and town IDs, towns and ports dataframes.
    
    Output: custom name containing those ports and towns.
    
    """
    if len(list_ports) == 1 and len(list_towns) == 1: #First case, 1 port 1 town.
        return ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + '.'
    elif len(list_ports) == 2 and len(list_towns) == 1: #Second case, 2 ports 1 town.
        return ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' y ' + ports[ports['ID'] == list_ports[1]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + '.'
    elif len(list_ports) == 2 and len(list_towns) == 2: #Third case, 2 ports 2 towns.
        return ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' y ' + ports[ports['ID'] == list_ports[1]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + ' y ' + towns[towns['ID'] == list_towns[1]]['municipality'].iloc[0] + '.'
    elif len(list_ports) > 2 or len(list_towns) > 2:
        name = ports[ports['ID'] == list_ports[0]]['name'].iloc[0]
        for i in range(len(list_ports[1:-1])): #Iterating through all ports minus the first and last ones.
            name = name + ', ' + ports[ports['ID'] == list_ports[i]]['name'].iloc[0] #Adding each port's name.
        name = name + ' y' + ports[ports['ID'] == list_ports[-1]]['name'].iloc[0] + ' por ' #Adding the last port.
        name = name + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0]
        for n in range(len(list_towns[1:-1])):
            name = name + ', ' + towns[towns['ID'] == list_towns[i]]['municipality'].iloc[0] #Adding each town's name.
        name = name + ' y' + towns[towns['ID'] == list_towns[-1]]['municipality'].iloc[0] + '.'
    return name

In [100]:
#Defining the second one:

def composer(list_ports, list_towns, towns, ports):
    """
    Input : two lists of port and town IDs, towns and ports dataframes.
    
    Output: custom name containing those ports and towns.
    
    """
    try:
        if len(list_ports) == 1 and len(list_towns) == 1: #First case, 1 port 1 town.
            name_r = ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + '.'
        elif len(list_ports) == 2 and len(list_towns) == 1: #Second case, 2 ports 1 town.
            name_r = ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' y ' + ports[ports['ID'] == list_ports[1]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + '.'
        elif len(list_ports) == 2 and len(list_towns) == 2: #Third case, 2 ports 2 towns.
            name_r = ports[ports['ID'] == list_ports[0]]['name'].iloc[0] + ' y ' + ports[ports['ID'] == list_ports[1]]['name'].iloc[0] + ' por ' + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0] + ' y ' + towns[towns['ID'] == list_towns[1]]['municipality'].iloc[0] + '.'
        elif len(list_ports) > 2 and len(list_towns) == 2:
        
        
        
        elif len(list_ports) > 2 or len(list_towns) > 2:
            name_r = ports[ports['ID'] == list_ports[0]]['name'].iloc[0]
            for i in range(len(list_ports[1:-1])): #Iterating through all ports minus the first and last ones.
                name_r = name_r + ', ' + ports[ports['ID'] == list_ports[i]]['name'].iloc[0] #Adding each port's name.
            name_r = name_r + ' y ' + ports[ports['ID'] == list_ports[-1]]['name'].iloc[0] + ' por ' #Adding the last port.
            name_r = name_r + towns[towns['ID'] == list_towns[0]]['municipality'].iloc[0]
            for n in range(len(list_towns[1:-1])):
                name_r = name_r + ', ' + towns[towns['ID'] == list_towns[n]]['municipality'].iloc[0] #Adding each town's name.
            name_r = name_r + ' y ' + towns[towns['ID'] == list_towns[-1]]['municipality'].iloc[0] + '.'
        return name_r
    except:
        pass

In [101]:
test = name_creator(routes, towns, ports)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_

In [103]:
test.head(10)

Unnamed: 0,ID,name,ccaa,province,start,midpoint,trailrank,distance,gradient,min_alt,max_alt,municipality,mountain_passes_ids,municipalities_ids
0,923,,,,"[-6.101982,43.158859]","[-5.939921,43.235847]",67,124,3476,101,1566,,[0],"[5039, 5027, 5020, 5067]"
1,5611,"Angliru, Angliru, Gamoniteiro y El Cordal por ...",,,"[-5.8297,43.155729]","[-5.929957,43.288199]",51,118,4234,102,1700,,"[0, 1, 84, 131]","[5039, 5027, 5067]"
2,5490,,,,"[-3.743127,36.734975]","[-3.762692,36.818439]",42,45,1481,6,1191,,[2],[]
3,881,,,,"[-7.157974,42.852246]","[-6.844199,42.889535]",55,130,2861,289,1651,,"[3, 182, 1109]","[4245, 5022]"
4,5618,,,,"[-5.806177,43.128166]","[-5.829091,43.083221]",42,121,2917,344,1824,,"[4, 51, 69, 438]",[5027]
5,3467,,,,"[-3.609443,37.156292]","[-3.275512,36.854694]",89,158,2450,242,1186,,"[5, 13]","[2747, 2806, 2818]"
6,5630,,,,"[-6.598849,42.556634]","[-6.708998,42.29604]",61,157,3201,342,1960,,"[6, 17, 32, 407]","[3723, 4910]"
7,4740,,,,"[-3.770706,40.812061]","[-3.996795,40.785964]",46,100,2354,1035,2251,,"[7, 148, 157, 229]",[4416]
8,8329,,,,"[0.15549,40.873539]","[0.368356,40.826121]",85,259,5111,0,1438,,"[8, 435]","[6423, 6784, 1889, 6450]"
9,6550,,,,"[-4.609893,43.15849]","[-4.602693,43.170711]",39,99,2779,94,1101,,"[9, 657, 838]",[5832]


In [65]:
l = [1,2,3,4]
l[1:-1]

[2, 3]

In [85]:
name = ''
name = name + 'hola' + ' ' + 'test'
print(name)

hola test
