# Origin Destination Dataset

This notebook analyzes the origin-destination datasets published by the São Paulo government. The datasets describe how people move through the city on a typical workday, whether for work, school, or leisure. They also include socioeconomic data such as family income, gender, and age, which we use to analyze people's travel behaviour.

## Datasets

We use the datasets from the 2017 Origin Destination census.

## Variables of Interest

The dataset contains dozens of variables; we focus on a small group of them to gain insight into the behaviour of the population. A brief description of each follows. We start with a simple analysis of each variable and then move on to deeper insights about how people move on a typical weekday.

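 - **ZONA_O**: Origin zone;
 - **ZONA_D**: Destination zone;
 - **MODO_PRIN**: Main transport mode used in the trip (appears as `MODOPRIN` in the 2017 file header);
 - **COORD_X_ORIGIN**: X coordinate of the origin (`CO_O_X` in the 2017 file; longitude after reprojection);
 - **COORD_Y_ORIGIN**: Y coordinate of the origin (`CO_O_Y` in the 2017 file; latitude after reprojection);
 - **COORD_X_DESTINATION**: X coordinate of the destination (`CO_D_X` in the 2017 file; longitude after reprojection);
 - **COORD_Y_DESTINATION**: Y coordinate of the destination (`CO_D_Y` in the 2017 file; latitude after reprojection);
 - **FEVIAG**: Expansion factor of the trip (appears as `FE_VIA` in the 2017 file header), i.e. how many trips in the population this surveyed trip represents;
 - **CD_ENTRE/ID_VIAG**: Binary flag recording whether the interviewed person declared a trip. We want to discard records without trips.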
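
To make the role of the expansion factor concrete, the sketch below sums it per main transport mode on a tiny made-up table. The column names `ZONA_O`, `MODOPRIN` and `FE_VIA` are taken from the 2017 file header printed further down in this notebook; every value is invented for illustration, and dropping rows with no origin zone is only one possible way to discard records without trips.

```python
import pandas as pd

# Hypothetical sample: values are invented, only the column names come from the 2017 file.
sample = pd.DataFrame({
    "ZONA_O":   [1, 3, 1, None],
    "MODOPRIN": [16, 16, 1, None],
    "FE_VIA":   [22.0, 22.0, 18.5, None],
})

# Discard records without a declared trip (assumed here: no origin zone recorded).
trips = sample.dropna(subset=["ZONA_O"])

# Each surveyed trip stands for FE_VIA trips in the population,
# so expanded totals are sums of the factor, not row counts.
expanded_by_mode = trips.groupby("MODOPRIN")["FE_VIA"].sum()
print(expanded_by_mode)
```

Counting rows instead of summing the factor would describe the survey sample rather than the population it represents.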

In [1]:
# General Imports
import geopandas as gpd #pip install geopandas descartes
import pandas as pd #pip install pandas
import matplotlib.pyplot as plt # pip install matplotlib
import numpy as np

from multiprocessing import Pool
from dbfread import DBF #pip install dbfread
from simpledbf import Dbf5 #pip install simpledbf

# Local imports
import utils

PyTables is not installed. No support for HDF output.
SQLalchemy is not installed. No support for SQL output.


## Loading the Dataset

First we load the data from the 2017 dataset.

In [2]:
# Read the spec file from 2017
entry2017 = utils.load_spec("../datasets/od2017/od-spec.json")

print(entry2017)

{'title': 'Pesquisa Origem Destino 2017', 'base_path': '../datasets/od2017/', 'zones_shapefile': '../datasets/od2017/raw/Mapas/Shape/Zonas_2017_region.shp', 'trips_dbfile': '../datasets/od2017/raw/Banco de dados/OD_2017.dbf', 'zone_id_attr': 'NumeroZona', 'zone_name_attr': 'NomeZona'}
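
The `utils` module is local to this project and is not shown here. Judging only from the dictionary printed above, `load_spec` may do little more than parse a JSON file. The sketch below is a hypothetical stand-in written under that assumption (the name `load_spec_sketch` is invented), not the project's actual helper.

```python
import json
import os
import tempfile

def load_spec_sketch(path):
    """Hypothetical stand-in for utils.load_spec: parse a JSON spec into a dict."""
    with open(path, encoding="utf-8") as spec_file:
        return json.load(spec_file)

# Write a small spec to a temporary file and read it back.
spec = {
    "title": "Pesquisa Origem Destino 2017",
    "base_path": "../datasets/od2017/",
    "zone_id_attr": "NumeroZona",
    "zone_name_attr": "NomeZona",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    spec_path = os.path.join(tmp_dir, "od-spec.json")
    with open(spec_path, "w", encoding="utf-8") as spec_file:
        json.dump(spec, spec_file)
    loaded = load_spec_sketch(spec_path)

print(loaded["title"])
```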


In [3]:
# Read the DBF file of a given dataset entry and return a DataFrame.
# If chunksize is given, only the first `chunksize` trips are returned.
def read_trips(entry, chunksize=None):
    dbf = Dbf5(entry['trips_dbfile'])

    if chunksize is None:
        # Load the whole file
        raw_trips = dbf.to_dataframe()
    else:
        # With chunksize set, to_dataframe yields DataFrames; keep only the first one
        raw_trips = next(iter(dbf.to_dataframe(chunksize=chunksize)))

    print('Scanned trips from:', entry['title'])
    return raw_trips

### Loading the datasets

We first read the dataset for the year we want to analyze. We can load the entire dataset or just a sample.
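
Note that the sample taken by `read_trips` is the first chunk of the file, not a random draw, so it only covers the households stored first. The plain-Python sketch below illustrates that "first chunk or everything" pattern on a generic iterable; `chunked` and `first_chunk` are hypothetical names used only for this illustration.

```python
from itertools import islice

def chunked(records, chunksize):
    """Yield successive lists of at most `chunksize` records (hypothetical helper)."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk

def first_chunk(records, chunksize=None):
    """Return every record when chunksize is None, otherwise only the first chunk."""
    if chunksize is None:
        return list(records)
    return next(chunked(records, chunksize), [])

all_rows = first_chunk(range(10))
sample_rows = first_chunk(range(10), chunksize=3)
print(len(all_rows), sample_rows)
```

If a representative sample is needed, one option is to load the whole file and sample rows from the resulting DataFrame instead.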

In [57]:
SAMPLE_SIZE = 500
trips2017 = read_trips(entry2017, SAMPLE_SIZE)

Scanned trips from: Pesquisa Origem Destino 2017


In [62]:
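utils.full_print(trips2017.head())
trips2017.to_csv('mytrips.csv', index=False)
# Keep only trips with both origin and destination coordinates.
# .copy() makes reduced_trips an independent DataFrame, so later column
# assignments do not trigger SettingWithCopyWarning.
reduced_trips = trips2017.dropna(subset=['CO_O_X', 'CO_O_Y', 'CO_D_X', 'CO_D_Y']).copy()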

Unnamed: 0,ZONA,MUNI_DOM,CO_DOM_X,CO_DOM_Y,ID_DOM,F_DOM,FE_DOM,DOM,CD_ENTRE,DATA,TIPO_DOM,AGUA,RUA_PAVI,NO_MORAD,TOT_FAM,ID_FAM,F_FAM,FE_FAM,FAMILIA,NO_MORAF,CONDMORA,QT_BANHO,QT_EMPRE,QT_AUTO,QT_MICRO,QT_LAVALOU,QT_GEL1,QT_GEL2,QT_FREEZ,QT_MLAVA,QT_DVD,QT_MICROON,QT_MOTO,QT_SECAROU,QT_BICICLE,NAO_DCL_IT,CRITERIOBR,PONTO_BR,ANO_AUTO1,ANO_AUTO2,ANO_AUTO3,RENDA_FA,CD_RENFA,ID_PESS,F_PESS,FE_PESS,PESSOA,SIT_FAM,IDADE,SEXO,ESTUDA,GRAU_INS,CD_ATIVI,CO_REN_I,VL_REN_I,ZONA_ESC,MUNIESC,CO_ESC_X,CO_ESC_Y,TIPO_ESC,ZONATRA1,MUNITRA1,CO_TR1_X,CO_TR1_Y,TRAB1_RE,TRABEXT1,OCUP1,SETOR1,VINC1,ZONATRA2,MUNITRA2,CO_TR2_X,CO_TR2_Y,TRAB2_RE,TRABEXT2,OCUP2,SETOR2,VINC2,N_VIAG,FE_VIA,DIA_SEM,TOT_VIAG,ZONA_O,MUNI_O,CO_O_X,CO_O_Y,ZONA_D,MUNI_D,CO_D_X,CO_D_Y,ZONA_T1,MUNI_T1,CO_T1_X,CO_T1_Y,ZONA_T2,MUNI_T2,CO_T2_X,CO_T2_Y,ZONA_T3,MUNI_T3,CO_T3_X,CO_T3_Y,MOTIVO_O,MOTIVO_D,SERVIR_O,SERVIR_D,MODO1,MODO2,MODO3,MODO4,H_SAIDA,MIN_SAIDA,ANDA_O,H_CHEG,MIN_CHEG,ANDA_D,DURACAO,MODOPRIN,TIPVG,PAG_VIAG,TP_ESAUTO,VL_EST,PE_BICI,VIA_BICI,TP_ESTBICI,ID_ORDEM
0,1,36,333743,7394463,10001,1,15.416667,1,1,6092017,1,1,1,2,1,100011,1,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,10001101,1,19.532274,1,1,59,2,1,3,1,3,,,,,,,3.0,36.0,333104.0,7394476.0,2.0,2.0,4.0,13.0,1.0,,,,,,,,,,1.0,22.132647,3.0,2,1.0,36.0,333743.0,7394463.0,3.0,36.0,333104.0,7394476.0,,,,,,,,,,,,,8.0,3.0,2.0,2.0,16.0,,,,5.0,45.0,,5.0,55.0,,10.0,16.0,3.0,,,,1.0,,,1
1,1,36,333743,7394463,10001,0,15.416667,1,1,6092017,1,1,1,2,1,100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,10001101,0,19.532274,1,1,59,2,1,3,1,3,,,,,,,3.0,36.0,333104.0,7394476.0,2.0,2.0,4.0,13.0,1.0,,,,,,,,,,2.0,22.132647,3.0,2,3.0,36.0,333104.0,7394476.0,1.0,36.0,333743.0,7394463.0,,,,,,,,,,,,,3.0,8.0,2.0,2.0,16.0,,,,15.0,45.0,,15.0,55.0,,10.0,16.0,3.0,,,,1.0,,,2
2,1,36,333743,7394463,10001,0,15.416667,1,1,6092017,1,1,1,2,1,100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,10001102,1,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,1.0,18.882487,3.0,3,1.0,36.0,333743.0,7394463.0,82.0,36.0,327503.0,7392159.0,,,,,,,,,,,,,8.0,3.0,2.0,2.0,1.0,,,,9.0,0.0,10.0,9.0,50.0,20.0,50.0,1.0,1.0,2.0,,,,,,3
3,1,36,333743,7394463,10001,0,15.416667,1,1,6092017,1,1,1,2,1,100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,10001102,0,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,2.0,18.882487,3.0,3,82.0,36.0,327503.0,7392159.0,84.0,36.0,329431.0,7395939.0,93.0,36.0,329861.0,7397268.0,,,,,,,,,3.0,4.0,2.0,2.0,1.0,4.0,,,17.0,0.0,20.0,18.0,0.0,1.0,60.0,1.0,1.0,2.0,,,,,,4
4,1,36,333743,7394463,10001,0,15.416667,1,1,6092017,1,1,1,2,1,100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,10001102,0,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,3.0,18.882487,3.0,3,84.0,36.0,329431.0,7395939.0,1.0,36.0,333743.0,7394463.0,,,,,,,,,,,,,4.0,8.0,2.0,2.0,12.0,,,,22.0,50.0,1.0,23.0,30.0,1.0,40.0,12.0,2.0,,,,,,,5


In [64]:
### Step 2: Change coordinates projection to ellps:WGS84

# Target projection: longitude/latitude on the WGS84 ellipsoid
wgs84 = {'proj': 'longlat', 'ellps': 'WGS84', 'no_defs': True}

# Create a GeoDataFrame from the origin coordinates CO_O_X and CO_O_Y,
# declaring the source CRS (EPSG:22523) at construction time
geo_trips_origins = gpd.GeoDataFrame(
    reduced_trips[['CO_O_X', 'CO_O_Y']],
    geometry=gpd.points_from_xy(reduced_trips.CO_O_X, reduced_trips.CO_O_Y),
    crs='EPSG:22523')

# Convert origin coordinates to the desired projection
geo_trips_origins.to_crs(wgs84, inplace=True)

# Create a GeoDataFrame from the destination coordinates CO_D_X and CO_D_Y
geo_trips_destinations = gpd.GeoDataFrame(
    reduced_trips[['CO_D_X', 'CO_D_Y']],
    geometry=gpd.points_from_xy(reduced_trips.CO_D_X, reduced_trips.CO_D_Y),
    crs='EPSG:22523')

# Convert destination coordinates to the desired projection
geo_trips_destinations.to_crs(wgs84, inplace=True)

None



In [65]:
# Replace data by the new transformed coordinates
reduced_trips['CO_O_X'] = geo_trips_origins.geometry.x
reduced_trips['CO_O_Y'] = geo_trips_origins.geometry.y
reduced_trips['CO_D_X'] = geo_trips_destinations.geometry.x
reduced_trips['CO_D_Y'] = geo_trips_destinations.geometry.y


In [71]:
reduced_trips['LOCAL_ORIGEM'] = reduced_trips[['CO_O_Y', 'CO_O_X']].apply(lambda x: ','.join(x.astype(str)), axis=1)



In [72]:
reduced_trips['LOCAL_DESTINO'] = reduced_trips[['CO_D_Y', 'CO_D_X']].apply(lambda x: ','.join(x.astype(str)), axis=1)



In [73]:
utils.full_print(reduced_trips)

Unnamed: 0,ZONA,MUNI_DOM,CO_DOM_X,CO_DOM_Y,ID_DOM,F_DOM,FE_DOM,DOM,CD_ENTRE,DATA,TIPO_DOM,AGUA,RUA_PAVI,NO_MORAD,TOT_FAM,ID_FAM,F_FAM,FE_FAM,FAMILIA,NO_MORAF,CONDMORA,QT_BANHO,QT_EMPRE,QT_AUTO,QT_MICRO,QT_LAVALOU,QT_GEL1,QT_GEL2,QT_FREEZ,QT_MLAVA,QT_DVD,QT_MICROON,QT_MOTO,QT_SECAROU,QT_BICICLE,NAO_DCL_IT,CRITERIOBR,PONTO_BR,ANO_AUTO1,ANO_AUTO2,ANO_AUTO3,RENDA_FA,CD_RENFA,ID_PESS,F_PESS,FE_PESS,PESSOA,SIT_FAM,IDADE,SEXO,ESTUDA,GRAU_INS,CD_ATIVI,CO_REN_I,VL_REN_I,ZONA_ESC,MUNIESC,CO_ESC_X,CO_ESC_Y,TIPO_ESC,ZONATRA1,MUNITRA1,CO_TR1_X,CO_TR1_Y,TRAB1_RE,TRABEXT1,OCUP1,SETOR1,VINC1,ZONATRA2,MUNITRA2,CO_TR2_X,CO_TR2_Y,TRAB2_RE,TRABEXT2,OCUP2,SETOR2,VINC2,N_VIAG,FE_VIA,DIA_SEM,TOT_VIAG,ZONA_O,MUNI_O,CO_O_X,CO_O_Y,ZONA_D,MUNI_D,CO_D_X,CO_D_Y,ZONA_T1,MUNI_T1,CO_T1_X,CO_T1_Y,ZONA_T2,MUNI_T2,CO_T2_X,CO_T2_Y,ZONA_T3,MUNI_T3,CO_T3_X,CO_T3_Y,MOTIVO_O,MOTIVO_D,SERVIR_O,SERVIR_D,MODO1,MODO2,MODO3,MODO4,H_SAIDA,MIN_SAIDA,ANDA_O,H_CHEG,MIN_CHEG,ANDA_D,DURACAO,MODOPRIN,TIPVG,PAG_VIAG,TP_ESAUTO,VL_EST,PE_BICI,VIA_BICI,TP_ESTBICI,ID_ORDEM,LOCAL_ORIGEM,LOCAL_DESTINO
0,1,36,333743,7394463,00010001,1,15.416667,1,1,06092017,1,1,1,2,1,000100011,1,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,00010001101,1,19.532274,1,1,59,2,1,3,1,3,,,,,,,3.0,36.0,333104.0,7394476.0,2.0,2.0,4.0,13.0,1.0,,,,,,,,,,1.0,22.132647,3.0,2,1.0,36.0,-46.628785,-23.551369,3.0,36.0,-46.635042,-23.551186,,,,,,,,,,,,,8.0,3.0,2.0,2.0,16.0,,,,5.0,45.0,,5.0,55.0,,10.0,16.0,3.0,,,,1.0,,,1,"-23.551368569079575,-46.62878523102084","-23.551185513618474,-46.63504195391411"
1,1,36,333743,7394463,00010001,0,15.416667,1,1,06092017,1,1,1,2,1,000100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,00010001101,0,19.532274,1,1,59,2,1,3,1,3,,,,,,,3.0,36.0,333104.0,7394476.0,2.0,2.0,4.0,13.0,1.0,,,,,,,,,,2.0,22.132647,3.0,2,3.0,36.0,-46.635042,-23.551186,1.0,36.0,-46.628785,-23.551369,,,,,,,,,,,,,3.0,8.0,2.0,2.0,16.0,,,,15.0,45.0,,15.0,55.0,,10.0,16.0,3.0,,,,1.0,,,2,"-23.551185513618474,-46.63504195391411","-23.551368569079575,-46.62878523102084"
2,1,36,333743,7394463,00010001,0,15.416667,1,1,06092017,1,1,1,2,1,000100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,00010001102,1,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,1.0,18.882487,3.0,3,1.0,36.0,-46.628785,-23.551369,82.0,36.0,-46.690163,-23.571519,,,,,,,,,,,,,8.0,3.0,2.0,2.0,1.0,,,,9.0,0.0,10.0,9.0,50.0,20.0,50.0,1.0,1.0,2.0,,,,,,3,"-23.551368569079575,-46.62878523102084","-23.5715185392017,-46.69016308514412"
3,1,36,333743,7394463,00010001,0,15.416667,1,1,06092017,1,1,1,2,1,000100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,00010001102,0,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,2.0,18.882487,3.0,3,82.0,36.0,-46.690163,-23.571519,84.0,36.0,-46.670847,-23.537594,93.0,36.0,329861.0,7397268.0,,,,,,,,,3.0,4.0,2.0,2.0,1.0,4.0,,,17.0,0.0,20.0,18.0,0.0,1.0,60.0,1.0,1.0,2.0,,,,,,4,"-23.5715185392017,-46.69016308514412","-23.537593971852292,-46.670846791946076"
4,1,36,333743,7394463,00010001,0,15.416667,1,1,06092017,1,1,1,2,1,000100011,0,15.416667,1,2,2,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,4.0,25.0,,,,2732.58,3,00010001102,0,16.663976,2,3,21,2,5,4,1,3,,84.0,36.0,329431.0,7395939.0,2.0,82.0,36.0,327503.0,7392159.0,2.0,2.0,4.0,7.0,2.0,,,,,,,,,,3.0,18.882487,3.0,3,84.0,36.0,-46.670847,-23.537594,1.0,36.0,-46.628785,-23.551369,,,,,,,,,,,,,4.0,8.0,2.0,2.0,12.0,,,,22.0,50.0,1.0,23.0,30.0,1.0,40.0,12.0,2.0,,,,,,,5,"-23.537593971852292,-46.670846791946076","-23.551368569079575,-46.62878523102084"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
487,2,36,333247,7395663,00021723,0,45.166667,1723,1,19092018,1,1,1,1,1,000217231,0,45.166667,1,1,1,1.0,0.0,0.0,2.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,3.0,30.0,,,,2200.00,1,00021723101,0,72.733237,1,1,28,1,5,4,1,1,2200.0,24.0,36.0,333200.0,7393152.0,2.0,2.0,36.0,333247.0,7395663.0,1.0,1.0,5.0,14.0,4.0,,,,,,,,,,2.0,82.416369,5.0,6,82.0,36.0,-46.693515,-23.563265,2.0,36.0,-46.633509,-23.540483,,,,,,,,,,,,,3.0,8.0,2.0,2.0,1.0,,,,10.0,30.0,10.0,11.0,0.0,1.0,30.0,1.0,1.0,1.0,,,,,,488,"-23.563264803090004,-46.693515315949774","-23.54048284464554,-46.633509019744054"
488,2,36,333247,7395663,00021723,0,45.166667,1723,1,19092018,1,1,1,1,1,000217231,0,45.166667,1,1,1,1.0,0.0,0.0,2.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,3.0,30.0,,,,2200.00,1,00021723101,0,72.733237,1,1,28,1,5,4,1,1,2200.0,24.0,36.0,333200.0,7393152.0,2.0,2.0,36.0,333247.0,7395663.0,1.0,1.0,5.0,14.0,4.0,,,,,,,,,,3.0,82.416369,5.0,6,2.0,36.0,-46.633509,-23.540483,6.0,36.0,-46.634069,-23.539032,,,,,,,,,,,,,8.0,5.0,2.0,2.0,16.0,,,,12.0,20.0,,12.0,25.0,,5.0,16.0,3.0,,,,1.0,,,489,"-23.54048284464554,-46.633509019744054","-23.539032139850487,-46.634068948003964"
489,2,36,333247,7395663,00021723,0,45.166667,1723,1,19092018,1,1,1,1,1,000217231,0,45.166667,1,1,1,1.0,0.0,0.0,2.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,3.0,30.0,,,,2200.00,1,00021723101,0,72.733237,1,1,28,1,5,4,1,1,2200.0,24.0,36.0,333200.0,7393152.0,2.0,2.0,36.0,333247.0,7395663.0,1.0,1.0,5.0,14.0,4.0,,,,,,,,,,4.0,82.416369,5.0,6,6.0,36.0,-46.634069,-23.539032,2.0,36.0,-46.633509,-23.540483,,,,,,,,,,,,,5.0,8.0,2.0,2.0,16.0,,,,12.0,35.0,,12.0,40.0,,5.0,16.0,3.0,,,,1.0,,,490,"-23.539032139850487,-46.634068948003964","-23.54048284464554,-46.633509019744054"
490,2,36,333247,7395663,00021723,0,45.166667,1723,1,19092018,1,1,1,1,1,000217231,0,45.166667,1,1,1,1.0,0.0,0.0,2.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1,3.0,30.0,,,,2200.00,1,00021723101,0,72.733237,1,1,28,1,5,4,1,1,2200.0,24.0,36.0,333200.0,7393152.0,2.0,2.0,36.0,333247.0,7395663.0,1.0,1.0,5.0,14.0,4.0,,,,,,,,,,5.0,82.416369,5.0,6,2.0,36.0,-46.633509,-23.540483,24.0,36.0,-46.634250,-23.563150,,,,,,,,,,,,,8.0,4.0,2.0,2.0,16.0,,,,18.0,0.0,,18.0,34.0,,34.0,16.0,3.0,,,,6.0,,,491,"-23.54048284464554,-46.633509019744054","-23.563149733751473,-46.63424961863989"
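
In the rows above, `CO_O_X` is about -46.6 and `CO_O_Y` about -23.5, so after reprojection X holds longitude and Y holds latitude, which is why `LOCAL_ORIGEM` joins `CO_O_Y` before `CO_O_X`. A quick sanity check along these lines can catch swapped or unprojected coordinates. The bounding box below is a rough assumption chosen for illustration, not survey metadata, and both function names are hypothetical.

```python
# Rough bounding box for the Sao Paulo metropolitan region.
# These limits are an assumption chosen for illustration, not survey metadata.
LON_RANGE = (-47.5, -45.5)
LAT_RANGE = (-24.5, -23.0)

def looks_like_lon_lat(x, y):
    """Return True when (x, y) falls inside the assumed longitude/latitude box."""
    return LON_RANGE[0] <= x <= LON_RANGE[1] and LAT_RANGE[0] <= y <= LAT_RANGE[1]

def to_lat_lon_string(x, y):
    """Format a point as 'latitude,longitude', the order used in LOCAL_ORIGEM."""
    return f"{y},{x}"

# First origin from the table above, rounded as displayed.
x, y = -46.628785, -23.551369
print(looks_like_lon_lat(x, y))           # coordinates in the expected order
print(looks_like_lon_lat(y, x))           # swapped coordinates fail the check
print(looks_like_lon_lat(333743, 7394463))  # unprojected values fail as well
print(to_lat_lon_string(x, y))
```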


In [74]:
reduced_trips.to_csv('mytrips.csv', index=False)

In [75]:
# A function to load points from the shapefile of a given entry
def read_zones(entry):
    # Read the shapefile pointed in the spec.json
    print("Reading shapefile: ", entry['zones_shapefile'])
    zones_shape = gpd.read_file(entry['zones_shapefile'], encoding='latin')
    print("Current projection: ", zones_shape.crs)

    # Projection used as Coordinate System, compatible with Cubu lat/lon format
    projection = {'proj': 'longlat', 'ellps': 'WGS84', 'no_defs': True}

    # Change projection for long/lat if different and save to new file
    if(zones_shape.crs != projection):
        print("Changing projection.")
        zones_shape = zones_shape.to_crs(projection)
        zones_shape.to_file(entry['base_path'] + '/processed/regions.shp')

    print('Scanned zones from:', entry['title'], '\n')
    return zones_shape

In [76]:
regions_map2017 = read_zones(entry2017)

Reading shapefile:  ../datasets/od2017/raw/Mapas/Shape/Zonas_2017_region.shp
Current projection:  {'init': 'epsg:22523'}
Changing projection.




Scanned zones from: Pesquisa Origem Destino 2017 



In [78]:
regions_map2017.to_file('2017map.json', driver='GeoJSON')
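
A GeoJSON export of this kind is a `FeatureCollection` whose features carry the shapefile attributes as `properties`. The sketch below shows one way to inspect such a file with the standard library only. It runs on a hypothetical two-zone file with invented geometry rather than on `2017map.json`; the property name `NumeroZona` is the `zone_id_attr` from the spec printed earlier, and `summarize_geojson` is a name made up for this example.

```python
import json
import os
import tempfile

def summarize_geojson(path, id_attr="NumeroZona"):
    """Return (feature count, list of zone ids) for a GeoJSON FeatureCollection."""
    with open(path, encoding="utf-8") as geojson_file:
        collection = json.load(geojson_file)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    features = collection["features"]
    zone_ids = [feature["properties"].get(id_attr) for feature in features]
    return len(features), zone_ids

# Hypothetical two-zone file standing in for '2017map.json'; the geometry is invented.
demo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"NumeroZona": 1, "NomeZona": "Zone A"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[-46.64, -23.56], [-46.62, -23.56],
                                       [-46.62, -23.54], [-46.64, -23.56]]]}},
        {"type": "Feature",
         "properties": {"NumeroZona": 2, "NomeZona": "Zone B"},
         "geometry": None},
    ],
}
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.json")
    with open(demo_path, "w", encoding="utf-8") as geojson_file:
        json.dump(demo, geojson_file)
    count, zone_ids = summarize_geojson(demo_path)

print(count, zone_ids)
```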