OData4 implementation #8

Open
wants to merge 12 commits into
base: main

J535D165
Owner

@J535D165 J535D165 commented Jun 26, 2019

This pull request proposes an implementation of the new OData4 protocol used by CBS. CBS is migrating from OData version 3 to OData version 4, and the migration brings a number of other changes. Read about them on the CBS website: https://beta.opendata.cbs.nl/OData4/index.html.

All feedback is welcome.

Available functions

 'download_data',
 'get_catalog_info',
 'get_catalog_list',
 'get_data',
 'get_dataset', # alias of get_data
 'get_dataset_info',
 'get_dataset_list',
 'get_metadata',
 'get_observations',

Example

>>> import cbsodata4 as cbsodata

>>> obs = cbsodata.get_data("84120NED")
>>> obs[:3]
[{'Id': 0,
  'Measure': 'M0001185',
  'ValueAttribute': 'None',
  'Value': 121137.0,
  'BelastingenEnWettelijkePremies': 'T001396',
  'Perioden': '1995JJ00',
  'Title': 'Ontvangen belastingen en wett. premies',
  'Unit': 'mln euro',
  'MeasureGroupID': None,
  'BelastingenEnWettelijkePremiesTitle': 'Totaal belastingen en wettelijke premies',
  'BelastingenEnWettelijkePremiesGroupTitle': 'Totalen',
  'PeriodenTitle': '1995',
  'PeriodenGroupTitle': 'Jaren'},
 {'Id': 1,
  'Measure': 'M0001185',
  'ValueAttribute': 'None',
  'Value': 127158.0,
  'BelastingenEnWettelijkePremies': 'T001396',
  'Perioden': '1996JJ00',
  'Title': 'Ontvangen belastingen en wett. premies',
  'Unit': 'mln euro',
  'MeasureGroupID': None,
  'BelastingenEnWettelijkePremiesTitle': 'Totaal belastingen en wettelijke premies',
  'BelastingenEnWettelijkePremiesGroupTitle': 'Totalen',
  'PeriodenTitle': '1996',
  'PeriodenGroupTitle': 'Jaren'},
 {'Id': 2,
  'Measure': 'M0001185',
  'ValueAttribute': 'None',
  'Value': 133619.0,
  'BelastingenEnWettelijkePremies': 'T001396',
  'Perioden': '1997JJ00',
  'Title': 'Ontvangen belastingen en wett. premies',
  'Unit': 'mln euro',
  'MeasureGroupID': None,
  'BelastingenEnWettelijkePremiesTitle': 'Totaal belastingen en wettelijke premies',
  'BelastingenEnWettelijkePremiesGroupTitle': 'Totalen',
  'PeriodenTitle': '1997',
  'PeriodenGroupTitle': 'Jaren'}]

Tidy data/Long data

The new data is in a tidy (or long) format. Read more about tidy data: https://en.wikipedia.org/wiki/Tidy_data.

Transform the long data into a pivot table with pandas.

>>> import pandas as pd

>>> df = pd.DataFrame(obs)
>>> df.pivot(index='PeriodenTitle', columns='BelastingenEnWettelijkePremiesTitle', values='Value')
BelastingenEnWettelijkePremiesTitle  Accijns op alcohol  Accijns op benzine  ...  Ziekengeldkassen  Zorgverzekeringsfonds
PeriodenTitle                                                                ...                                         
1995                                              408.0              2708.0  ...            1610.0                    0.0
1996                                              376.0              2744.0  ...             252.0                    0.0
1997                                              400.0              2902.0  ...               0.0                    0.0
1998                                              426.0              3038.0  ...               0.0                    0.0
1999                                              396.0              3127.0  ...               0.0                    0.0
1999 1e kwartaal                                    NaN                 NaN  ...               0.0                    0.0
1999 2e kwartaal                                    NaN                 NaN  ...               0.0                    0.0
1999 3e kwartaal                                    NaN                 NaN  ...               0.0                    0.0
1999 4e kwartaal                                    NaN                 NaN  ...               0.0                    0.0
2000                                              397.0              3151.0  ...               0.0                    0.0                               
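
To make the reshaping concrete without pandas, here is a minimal plain-Python sketch of the same long-to-wide step. The helper name `pivot` is hypothetical and not part of this library; the three records are copied from the example output above with most fields trimmed.

```python
from collections import defaultdict

# Three observations taken from the example output above (fields trimmed).
TOTAL = "Totaal belastingen en wettelijke premies"
obs = [
    {"PeriodenTitle": "1995", "BelastingenEnWettelijkePremiesTitle": TOTAL, "Value": 121137.0},
    {"PeriodenTitle": "1996", "BelastingenEnWettelijkePremiesTitle": TOTAL, "Value": 127158.0},
    {"PeriodenTitle": "1997", "BelastingenEnWettelijkePremiesTitle": TOTAL, "Value": 133619.0},
]


def pivot(records, index, columns, values):
    """Reshape long records into {row_label: {column_label: value}}.

    Hypothetical helper for illustration only.
    """
    table = defaultdict(dict)
    for rec in records:
        row, col = rec[index], rec[columns]
        if col in table[row]:
            # A (row, column) pair must be unique, otherwise the cell is ambiguous.
            raise ValueError(f"duplicate entry for ({row!r}, {col!r})")
        table[row][col] = rec[values]
    return dict(table)


wide = pivot(obs, "PeriodenTitle", "BelastingenEnWettelijkePremiesTitle", "Value")
```

Each period becomes one row and each dimension title one column, which is the layout `df.pivot` produces above.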

Migration

In a couple of months, we will release version 4.0.0 of this client-side library. That release drops support for OData 3 completely.

cbsodata>=4

Installs version 4 or higher. This version uses OData4 by default.

pip install "cbsodata>=4"

cbsodata<4

Uses OData version 3 by default and includes an experimental implementation of OData version 4.

pip install "cbsodata<4"

Use version 3

>>> import cbsodata

Use version 4

>>> import cbsodata4 as cbsodata
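
During the transition, code that has to run against either install could pick whichever module is importable. The following is a minimal sketch; `load_first_available` is a hypothetical helper, not something this PR provides.

```python
import importlib


def load_first_available(names):
    """Return the first module in `names` that can be imported.

    Hypothetical helper for illustration only.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or failed to import): try the next candidate.
            continue
    raise ImportError(f"none of these modules could be imported: {list(names)}")
```

Called as `cbsodata = load_first_available(["cbsodata4", "cbsodata"])`, this prefers the OData4 module and falls back to the version 3 module when `cbsodata4` is not importable. Note that the two modules do not expose identical functions, so calling code still has to account for the differences.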

Todo

  • More testing
  • Improve code and group handling
  • Filter implementation
  • Top and skip options
  • Save datasets
  • More examples
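
As a rough idea of what the filter, top, and skip items could build on: OData defines the system query options `$filter`, `$top`, and `$skip`, which are passed in the request URL. The sketch below only assembles such a URL. The function name `build_query_url`, its parameters, and the example base URL are assumptions for illustration and do not reflect how this PR will implement the feature.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, top=None, skip=None, filter_expr=None):
    """Append OData $filter/$top/$skip query options to `base_url`.

    Hypothetical helper; names and behaviour are assumptions.
    """
    params = {}
    if filter_expr is not None:
        params["$filter"] = filter_expr
    if top is not None:
        params["$top"] = int(top)
    if skip is not None:
        params["$skip"] = int(skip)
    if not params:
        return base_url
    # Keep "$" and "'" readable; spaces are percent-encoded as %20.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{base_url}?{query}"


# Placeholder base URL, not a real CBS endpoint.
url = build_query_url(
    "https://example.org/odata/Observations",
    top=10,
    skip=20,
    filter_expr="Perioden eq '1995JJ00'",
)
```

With `top=10` and `skip=20`, the request would ask for ten records starting after the first twenty, which is the usual way to page through a large result.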

@jolienoomens

@J535D165
Owner Author

Add warning for version 3 (in the transition period)
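
One possible shape for that warning, as a minimal sketch: the function name and message text are assumptions. `FutureWarning` is used here because Python shows it to end users by default, whereas `DeprecationWarning` is hidden by default outside `__main__`.

```python
import warnings


def warn_odata3_deprecated():
    """Warn that OData 3 support is going away (hypothetical helper)."""
    warnings.warn(
        "OData version 3 support will be dropped in cbsodata 4.0.0; "
        "switch to the OData4 implementation (import cbsodata4).",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
```

Calling this from the version 3 entry points would surface the message once per call site under Python's default warning filters.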
