# How to use Foreign Keys in Data Packages
---
**Intermediate | Python | Datapackage**
---
## Learning Goals:
- Learn how to work with foreign keys in your data packages
- Learn how to validate your data integrity with foreign keys
- Prerequisites: this How-To assumes you know how to create a [data package](TODO: add docs link) and that you have defined foreign keys in your Table Schema (i.e. your data package descriptor defines the ```resources[].schema.foreignKeys``` property).
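If you are unsure whether a descriptor already declares foreign keys, a quick look at the plain dictionary is enough. The helper below is a minimal sketch in plain Python; `list_foreign_keys` and the `example` descriptor are hypothetical names used for illustration and are not part of the datapackage library.

```python
def list_foreign_keys(descriptor):
    """Return (resource name, foreign key) pairs found in a descriptor dict."""
    found = []
    for resource in descriptor.get('resources', []):
        schema = resource.get('schema', {})
        # This sketch only inspects inline (dict) schemas and skips anything else
        if not isinstance(schema, dict):
            continue
        for foreign_key in schema.get('foreignKeys', []):
            found.append((resource.get('name'), foreign_key))
    return found


# Hypothetical descriptor with one foreign key on the 'teams' resource
example = {
    'resources': [
        {
            'name': 'teams',
            'schema': {
                'foreignKeys': [
                    {
                        'fields': 'city',
                        'reference': {'resource': 'cities', 'fields': 'name'},
                    },
                ],
            },
        },
        {'name': 'cities'},
    ],
}

for resource_name, foreign_key in list_foreign_keys(example):
    print(resource_name, '->', foreign_key['reference']['resource'])
# teams -> cities
```

An empty result means no resource in the descriptor defines the property, so there would be nothing for the relation checks in this How-To to validate.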

## Step 0: Installation

In [0]:
!pip install datapackage

## Step 1: Load datapackage

We will work with this example data package:

In [0]:
DESCRIPTOR = {
  'resources': [
    {
      'name': 'teams',
      'data': [
        ['id', 'name', 'city'],
        ['1', 'Arsenal', 'London'],
        ['2', 'Real', 'Madrid'],
        ['3', 'Bayern', 'Munich'],
      ],
      'schema': {
        'fields': [
          {'name': 'id', 'type': 'integer'},
          {'name': 'name', 'type': 'string'},
          {'name': 'city', 'type': 'string'},
        ],
        'foreignKeys': [
          {
            'fields': 'city',
            'reference': {'resource': 'cities', 'fields': 'name'},
          },
        ],
      },
    }, {
      'name': 'cities',
      'data': [
        ['name', 'country'],
        ['London', 'England'],
        ['Madrid', 'Spain'],
      ],
    },
  ],
}

In [0]:
from datapackage import Package

package = Package(DESCRIPTOR)

## Step 2: Check for foreign key violations with the ```resource.check_relations()``` method

In [0]:
teams = package.get_resource('teams')
teams.check_relations()
# tableschema.exceptions.RelationError: Foreign key "['city']" violation in row "4"

We get an error because the lookup table ```cities``` has no row for ```Munich```, yet the ```teams``` resource contains a team from ```Munich```. Row 4 in the error message is the Bayern row, counting the header as row 1.
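To make the idea behind the check concrete, here is a minimal sketch in plain Python of what a foreign key check does conceptually: collect the values available in the lookup table, then report every row whose value is missing from it. The function `find_violations` is a hypothetical helper written for illustration; it is not how the library implements the check.

```python
def find_violations(rows, field, lookup_rows, lookup_field):
    """Return (row_number, value) for rows whose value is absent from the lookup."""
    header, *records = rows
    lookup_header, *lookup_records = lookup_rows
    column = header.index(field)
    lookup_column = lookup_header.index(lookup_field)
    # Every value that a foreign key is allowed to point at
    known = {record[lookup_column] for record in lookup_records}
    # Numbering starts at 2 because the header occupies row 1
    return [
        (number, record[column])
        for number, record in enumerate(records, start=2)
        if record[column] not in known
    ]


teams_rows = [
    ['id', 'name', 'city'],
    ['1', 'Arsenal', 'London'],
    ['2', 'Real', 'Madrid'],
    ['3', 'Bayern', 'Munich'],
]
cities_rows = [
    ['name', 'country'],
    ['London', 'England'],
    ['Madrid', 'Spain'],
]

print(find_violations(teams_rows, 'city', cities_rows, 'name'))
# [(4, 'Munich')]
```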

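## Step 3: Fix the error

In [0]:
# Add the missing city to the lookup table, then commit the descriptor change
package.descriptor['resources'][1]['data'].append(['Munich', 'Germany'])
package.commit()
# Fetch the resource again after the commit before re-running the check
teams = package.get_resource('teams')
teams.check_relations()
# True

With ```Munich``` added to ```cities```, the foreign key check on ```teams``` now passes.

## Step 4: (Optional) Dereference resource relations with the ```resource.read()``` method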

In [0]:
teams.read(keyed=True, relations=True)
# [{'id': 1, 'name': 'Arsenal', 'city': {'name': 'London', 'country': 'England'}},
#  {'id': 2, 'name': 'Real', 'city': {'name': 'Madrid', 'country': 'Spain'}},
#  {'id': 3, 'name': 'Bayern', 'city': {'name': 'Munich', 'country': 'Germany'}}]

Instead of just the city name, each row now holds a dictionary with the referenced city's data. The ```resource.iter()``` and ```resource.read()``` methods raise the same error as ```resource.check_relations()``` when there is an integrity issue, but only if the ```relations=True``` flag is passed.
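The dereferencing step can be pictured as a dictionary lookup. The sketch below is plain Python written for illustration only; `dereference` is a hypothetical helper, not the library's implementation, and unlike the library it does not cast values, so the ids stay strings. It raises an error for an unresolved reference, mirroring the behaviour described above.

```python
def dereference(rows, field, lookup_rows, lookup_field):
    """Replace each row's foreign key value with the matching lookup row."""
    header, *records = rows
    lookup_header, *lookup_records = lookup_rows
    key_column = lookup_header.index(lookup_field)
    # Index the lookup table by the referenced field for direct access
    lookup = {
        record[key_column]: dict(zip(lookup_header, record))
        for record in lookup_records
    }
    result = []
    for number, record in enumerate(records, start=2):
        row = dict(zip(header, record))
        if row[field] not in lookup:
            # Fail on a broken reference instead of silently keeping the raw value
            raise ValueError(
                'Foreign key %r violation in row %d' % (field, number))
        row[field] = lookup[row[field]]
        result.append(row)
    return result


teams_data = [
    ['id', 'name', 'city'],
    ['1', 'Arsenal', 'London'],
    ['2', 'Real', 'Madrid'],
    ['3', 'Bayern', 'Munich'],
]
cities_data = [
    ['name', 'country'],
    ['London', 'England'],
    ['Madrid', 'Spain'],
    ['Munich', 'Germany'],
]

resolved = dereference(teams_data, 'city', cities_data, 'name')
print(resolved[2])
# {'id': '3', 'name': 'Bayern', 'city': {'name': 'Munich', 'country': 'Germany'}}
```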

## Related Reference Documentation
- https://github.com/frictionlessdata/datapackage-py