The cell below installs the packages the script needs in order to run correctly. We also include the API key that allows us to connect to the Open Data SNCF API. Remove the # at the beginning of each line should you need to install the packages.

In [1]:
#import sys
#!{sys.executable} -m pip install requests
#!{sys.executable} -m pip install pandas
# json and functools are part of the Python standard library, so they need no pip install.

In [25]:
import requests as r
import json as j
import pandas as pd
from requests.auth import HTTPBasicAuth


user = "754d5493-37cf-4298-a34b-5ec027b5feaa"  # API key, sent as the basic-auth username
pw = ""  # the password is left empty
coverage = "fr-idf"  # not used below: the URL hard-codes the "sncf" coverage
url = "https://api.sncf.com/v1/coverage/sncf/physical_modes/physical_mode%3ALongDistanceTrain/stop_points?count=1000"
headers={'Authorization': 'TOK:apk'}  # not used below: the request relies on basic auth instead

Here, we request the data from the SNCF API.

In [26]:
# Authenticate with the API key as username and the (empty) password defined above.
response = r.get(url,
                 auth=HTTPBasicAuth(user, pw))
print(response.status_code)
type(response)

200


requests.models.Response
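The request above can be wrapped so that a failed call raises an error instead of being noticed only through the printed status code. The following is a minimal sketch, not the notebook's own method: `fetch_json` and its `http_get` parameter are hypothetical names introduced here, and the 10-second timeout is an arbitrary assumption.

```python
def fetch_json(url, token, http_get=None, timeout=10):
    """GET `url` with `token` as the basic-auth username and return the decoded JSON.

    `http_get` is a hypothetical hook that lets a stub replace requests.get.
    """
    if http_get is None:
        import requests  # imported lazily so a stub can be used without the package
        http_get = requests.get
    # A (username, password) tuple is shorthand for HTTP basic authentication.
    response = http_get(url, auth=(token, ""), timeout=timeout)
    # Raise an HTTPError for 4xx/5xx answers instead of continuing silently.
    response.raise_for_status()
    return response.json()

# Usage with the variables defined earlier in the notebook:
# json_data = fetch_json(url, user)
```

With this helper, a wrong key or an unavailable endpoint stops the notebook at the request cell rather than surfacing later as a missing key.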

In [27]:
json_data = response.json()  # decode the JSON body into a Python dict
print(type(json_data))
for key in json_data:
    print(key)
json_data

<class 'dict'>
pagination
links
disruptions
feed_publishers
context
stop_points


{'pagination': {'start_page': 0,
  'items_on_page': 388,
  'items_per_page': 1000,
  'total_result': 388},
 'links': [{'href': 'https://api.sncf.com/v1/coverage/sncf/stop_points/{stop_points.id}',
   'type': 'stop_points',
   'rel': 'stop_points',
   'templated': True},
  {'href': 'https://api.sncf.com/v1/coverage/sncf/stop_areas/{stop_area.id}',
   'type': 'stop_area',
   'rel': 'stop_areas',
   'templated': True},
  {'href': 'https://api.sncf.com/v1/coverage/sncf/stop_points/{stop_points.id}/route_schedules',
   'type': 'route_schedules',
   'rel': 'route_schedules',
   'templated': True},
  {'href': 'https://api.sncf.com/v1/coverage/sncf/stop_points/{stop_points.id}/stop_schedules',
   'type': 'stop_schedules',
   'rel': 'stop_schedules',
   'templated': True},
  {'href': 'https://api.sncf.com/v1/coverage/sncf/stop_points/{stop_points.id}/arrivals',
   'type': 'arrivals',
   'rel': 'arrivals',
   'templated': True},
  {'href': 'https://api.sncf.com/v1/coverage/sncf/stop_points/{stop
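The `pagination` block at the top of this response tells us whether `count=1000` was enough to get every record in a single call. A small check along these lines can make that explicit; `all_results_on_page` is a hypothetical helper, and the sample values are copied from the output above.

```python
def all_results_on_page(payload):
    """Return True when the page received already holds every result."""
    pagination = payload.get("pagination", {})
    total = pagination.get("total_result")
    on_page = pagination.get("items_on_page")
    # A missing pagination block is treated as "unknown", hence False.
    return total is not None and on_page == total

# Pagination values copied from the response shown above.
sample = {"pagination": {"start_page": 0,
                         "items_on_page": 388,
                         "items_per_page": 1000,
                         "total_result": 388}}
print(all_results_on_page(sample))  # prints True
```

Here 388 items were returned out of 388 in total, so no further page needs to be requested.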

As we can see, the data were correctly retrieved in JSON format and converted to a Python dictionary. The main issue is that all the records of interest are contained under the "stop_points" key, alongside metadata we do not need. Therefore, we first remove the other keys.

In [28]:
json_data.pop('pagination', None)
json_data.pop('links', None)
json_data.pop('disruptions', None)
json_data.pop('feed_publishers', None)
json_data.pop('context', None)

{'timezone': 'Europe/Paris', 'current_datetime': '20190321T205916'}
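Removing keys with `pop` modifies `json_data` in place, so the cell cannot be rerun on the same object to get the metadata back. As an alternative sketch, the list of interest can be read directly from its key and the original dictionary left intact; `extract_stop_points` is a hypothetical helper name, and the sample record is abridged from the data shown in this notebook.

```python
def extract_stop_points(payload):
    """Return a copy of the list stored under 'stop_points' (empty if absent)."""
    return list(payload.get("stop_points", []))

# Abridged sample built from values shown in this notebook.
sample = {
    "context": {"timezone": "Europe/Paris", "current_datetime": "20190321T205916"},
    "stop_points": [
        {"label": "Agde (Agde)",
         "id": "stop_point:OCE:SP:TGV-87781278",
         "coord": {"lat": "43.317566", "lon": "3.466029"}},
    ],
}

stops = extract_stop_points(sample)
print(len(stops))         # prints 1
print("context" in sample)  # prints True: the source dict is unchanged
```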

We have correctly removed the irrelevant data from the main dictionary. However, the sub-dictionaries that hold the data of interest are contained within a list. Therefore, we first need to extract that list so that each nested dictionary is easier to manipulate.

In [59]:
type(json_data)
json_data

{'stop_points': [{'name': 'Aachen/Aix la Chapelle',
   'links': [],
   'coord': {'lat': '50.767729', 'lon': '6.091261'},
   'label': 'Aachen/Aix la Chapelle (Aachen)',
   'equipments': [],
   'administrative_regions': [{'insee': '',
     'name': 'Aachen-Mitte',
     'level': 9,
     'coord': {'lat': '50.756966', 'lon': '6.092983'},
     'label': 'Aachen-Mitte',
     'id': 'admin:osm:22146',
     'zip_code': ''},
    {'insee': '',
     'name': 'Aachen',
     'level': 8,
     'coord': {'lat': '50.776348', 'lon': '6.083862'},
     'label': 'Aachen',
     'id': 'admin:osm:62564',
     'zip_code': ''}],
   'fare_zone': {'name': '0'},
   'id': 'stop_point:OCE:SP:Thalys-80153452',
   'stop_area': {'codes': [{'type': 'CR-CI-CH', 'value': '0080-153452-BV'},
     {'type': 'CR-CI-CH', 'value': '0080-153452-BV'},
     {'type': 'UIC8', 'value': '80153452'},
     {'type': 'external_code', 'value': 'OCE80153452'}],
    'name': 'Aachen/Aix la Chapelle',
    'links': [],
    'coord': {'lat': '50.767729

In [58]:
# Copy the list of stop points ("gares" = stations) out of the dictionary.
gares = list(json_data["stop_points"])
gares

[{'name': 'Aachen/Aix la Chapelle',
  'links': [],
  'coord': {'lat': '50.767729', 'lon': '6.091261'},
  'label': 'Aachen/Aix la Chapelle (Aachen)',
  'equipments': [],
  'administrative_regions': [{'insee': '',
    'name': 'Aachen-Mitte',
    'level': 9,
    'coord': {'lat': '50.756966', 'lon': '6.092983'},
    'label': 'Aachen-Mitte',
    'id': 'admin:osm:22146',
    'zip_code': ''},
   {'insee': '',
    'name': 'Aachen',
    'level': 8,
    'coord': {'lat': '50.776348', 'lon': '6.083862'},
    'label': 'Aachen',
    'id': 'admin:osm:62564',
    'zip_code': ''}],
  'fare_zone': {'name': '0'},
  'id': 'stop_point:OCE:SP:Thalys-80153452',
  'stop_area': {'codes': [{'type': 'CR-CI-CH', 'value': '0080-153452-BV'},
    {'type': 'CR-CI-CH', 'value': '0080-153452-BV'},
    {'type': 'UIC8', 'value': '80153452'},
    {'type': 'external_code', 'value': 'OCE80153452'}],
   'name': 'Aachen/Aix la Chapelle',
   'links': [],
   'coord': {'lat': '50.767729', 'lon': '6.091261'},
   'label': 'Aachen/

In [64]:
for i in gares:
    print(i["label"], i["id"], i["coord"]["lat"], i["coord"]["lon"])

Aachen/Aix la Chapelle (Aachen) stop_point:OCE:SP:Thalys-80153452 50.767729 6.091261
Agde (Agde) stop_point:OCE:SP:TGV-87781278 43.317566 3.466029
Agen (Agen) stop_point:OCE:SP:TGV-87586008 44.207958 0.620892
Agen (Agen) stop_point:OCE:SP:TGVINOUI-87586008 44.207958 0.620892
Aime-la-Plagne (Aime-la-Plagne) (Aime-la-Plagne) stop_point:OCE:SP:TGV-87741769 45.55436 6.648881
Aime-la-Plagne (Aime-la-Plagne) (Aime-la-Plagne) stop_point:OCE:SP:Thalys-87741769 45.55436 6.648881
Aix-en-Provence-TGV (Aix-en-Provence) stop_point:OCE:SP:Lyria-87319012 43.455151 5.317273
Aix-en-Provence-TGV (Aix-en-Provence) stop_point:OCE:SP:OUIGO-87319012 43.455151 5.317273
Aix-en-Provence-TGV (Aix-en-Provence) stop_point:OCE:SP:TGV-87319012 43.455151 5.317273
Aix-en-Provence-TGV (Aix-en-Provence) stop_point:OCE:SP:TGVINOUI-87319012 43.455151 5.317273
Aix-les-Bains-le-Revard (Aix-les-Bains) stop_point:OCE:SP:TGV-87741132 45.687856 5.909349
Albertville (Albertville) stop_point:OCE:SP:TGV-87741645 45.673177 6.38323

Poitiers (Poitiers) stop_point:OCE:SP:OUIGO-87575001 46.58218 0.333076
Poitiers (Poitiers) stop_point:OCE:SP:TGV-87575001 46.58218 0.333076
Poitiers (Poitiers) stop_point:OCE:SP:TGVINOUI-87575001 46.58218 0.333076
Pornichet (Pornichet) stop_point:OCE:SP:TGV-87481747 47.270506 -2.344768
Quimper (Quimper) stop_point:OCE:SP:TGV-87474098 47.994694 -4.09209
Quimper (Quimper) stop_point:OCE:SP:TGVINOUI-87474098 47.994694 -4.09209
Quimperlé (Quimperlé) stop_point:OCE:SP:TGV-87476317 47.869099 -3.553004
Quimperlé (Quimperlé) stop_point:OCE:SP:TGVINOUI-87476317 47.869099 -3.553004
Rang du Fl. Verton Ber. (Rang-du-Fliers) stop_point:OCE:SP:TGV-87317057 50.415824 1.648103
Rang du Fl. Verton Ber. (Rang-du-Fliers) stop_point:OCE:SP:TGVINOUI-87317057 50.415824 1.648103
Redon (Redon) stop_point:OCE:SP:TGV-87471300 47.651785 -2.087888
Redon (Redon) stop_point:OCE:SP:TGVINOUI-87471300 47.651785 -2.087888
Reims (Reims) stop_point:OCE:SP:TGVINOUI-87171009 49.259056 4.024037
Remiremont (Remiremont) stop_p

Perfect! We managed to print the relevant data. We just need to append each element to a list, and the job will be almost done.

In [104]:
list_of_stations = []
for i in gares:
    a = [i["label"], i["id"], i["coord"]["lat"], i["coord"]["lon"]]
    list_of_stations.append(a)

In [102]:
type(list_of_stations[0])

list

In [105]:
labels = ["gare", "stop", "lat", "lon"]

In [106]:
df = pd.DataFrame.from_records(list_of_stations, columns = labels)
df

,gare,stop,lat,lon
0,Aachen/Aix la Chapelle (Aachen),stop_point:OCE:SP:Thalys-80153452,50.767729,6.091261
1,Agde (Agde),stop_point:OCE:SP:TGV-87781278,43.317566,3.466029
2,Agen (Agen),stop_point:OCE:SP:TGV-87586008,44.207958,0.620892
3,Agen (Agen),stop_point:OCE:SP:TGVINOUI-87586008,44.207958,0.620892
4,Aime-la-Plagne (Aime-la-Plagne) (Aime-la-Plagne),stop_point:OCE:SP:TGV-87741769,45.55436,6.648881
5,Aime-la-Plagne (Aime-la-Plagne) (Aime-la-Plagne),stop_point:OCE:SP:Thalys-87741769,45.55436,6.648881
6,Aix-en-Provence-TGV (Aix-en-Provence),stop_point:OCE:SP:Lyria-87319012,43.455151,5.317273
7,Aix-en-Provence-TGV (Aix-en-Provence),stop_point:OCE:SP:OUIGO-87319012,43.455151,5.317273
8,Aix-en-Provence-TGV (Aix-en-Provence),stop_point:OCE:SP:TGV-87319012,43.455151,5.317273
9,Aix-en-Provence-TGV (Aix-en-Provence),stop_point:OCE:SP:TGVINOUI-87319012,43.455151,5.317273
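One detail worth noting: the API returns `lat` and `lon` as strings, so the two coordinate columns above hold text rather than numbers. The sketch below builds one dictionary per station and converts the coordinates to floats along the way; `station_row` is a hypothetical helper, and the sample records are abridged from the output shown earlier.

```python
def station_row(stop_point):
    """Flatten one stop point into the four fields used in the table."""
    coord = stop_point["coord"]
    return {
        "gare": stop_point["label"],
        "stop": stop_point["id"],
        # The API delivers coordinates as strings; convert them to floats.
        "lat": float(coord["lat"]),
        "lon": float(coord["lon"]),
    }

# Abridged sample records taken from the output shown earlier.
sample_gares = [
    {"label": "Agde (Agde)",
     "id": "stop_point:OCE:SP:TGV-87781278",
     "coord": {"lat": "43.317566", "lon": "3.466029"}},
    {"label": "Quimper (Quimper)",
     "id": "stop_point:OCE:SP:TGV-87474098",
     "coord": {"lat": "47.994694", "lon": "-4.09209"}},
]

rows = [station_row(sp) for sp in sample_gares]
print(type(rows[0]["lat"]).__name__)  # prints float
```

A list of dictionaries like `rows` can be passed straight to `pd.DataFrame(rows)`, which takes the column names from the keys and yields numeric `lat` and `lon` columns.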
