Foreign Key attribute in JSON schema #23

rufuspollock · 2013-02-15T18:32:29Z

Suggest in a type field:

foreignkey: {
  // points to the datapackage.json of the relevant data package
  url: ...
  // id / name of the dataset
  file: ...
  # id of the field in the referenced table
  field: ...
}

The text was updated successfully, but these errors were encountered:

rufuspollock · 2013-02-15T18:32:34Z

@mk270 comments please :-)

mk270 · 2013-03-07T19:24:30Z

As discussed, we should handle this outside the table schema.

trestletech · 2013-04-03T14:00:31Z

Can either of you point me to a doc explaining the proper way to do this, if not directly embed it in the Table Schema? It seems like a useful feature and I'm not sure of the best way to accomplish something like this.

Our use case is a bit different since we're using REST web services, so we're considering either embedding small factors directly into the Schema (a la #29) or providing a reference to the REST call which would provide such a mapping ({"id":"color", "rest":"/colors/"}). I realize that particular attribute would be outside of any spec, but if there were an encouraged way to reference another Schema, perhaps we could use that more heavily in the R package then customize from there.

rufuspollock · 2013-04-03T14:12:33Z

I think this and primary keys #21 should probably go in.

trestletech · 2013-04-03T15:32:09Z

I like it. The more I think about it, the REST-based approach I mentioned would actually fit within your proposed solution. If I were to host a datapackage.json file at the root of my webapp, all of my data/file references could just be relative URLs defining the GET/list functions for each type of object I'm interested in. Assuming my REST service produces strictly JSON Table Schema, we could end up with a pretty clean solution.

What's the best next step for this? We're trying to move pretty quickly on our client application, so I'd be happy to write up the docs and submit them in a pull request as we continue our development if we're in agreement that this should happen within the scope of this spec? If not, I'll just document it internally and save myself some time. Let me know...

trestletech · 2013-04-17T19:42:02Z

FYI: The project now supports foreign keys in the following format:

foreignkey: {
  // points to the datapackage.json of the relevant data package
  url: ...
  // id / name of the dataset
  resource: ...
  # id of the field in the referenced table
  field: ...
}

Note that we're now using resource instead of file. Since IDs are not guaranteed to be provided for a given resource, we use the following logic to map a resource attribute to another resource in the specified DataPackage:

If the resources in this DataPackage are a named hash, check in those names for the resource id.
Check the id field of each resource for a match.
Check the name field of each resource for a match.
Check the path and url field of each resource for a match.

As soon as a single match is found at any of these steps, no more searching is done. (As you can see, I think it would be easier for clients to implement this feature once this is pinned down in the spec).

In R, there's not an obvious way to preserve the underlying ID of an object and still map it to another abstract object ("hash" in JS or "list" in R) within the context of a data.frame, so we support this fairly naively at this point by only supporting mapping integer IDs to a single field (which get cast to a string, if its not already). The details (excerpt from https://github.com/QBRC/RODProt/blob/master/R/incorporate-foreign-keys.R) are copied below:

The behavior is as follows: if there is only one column in the referenced resource and that column
matches the \code{ID} given for the foreignkey, that column will be used as both the IDs and
the levels of the factor -- meaning that any values defined in this external table will map
to themselves, and any other values not in the table will map to \code{NA}. If the table has two
columns, one of which matches the \code{ID} value of the foreignkey, then the \code{ID} column
will be used for the IDs, and the second column, regardless of name, will be used as the levels
for the factor. If there are more than two columns, then the column to be used as the name must
be specified using the \code{name.column} parameter.

I created an issue to support more robust mappings of IDs to complex objects (https://github.com/QBRC/RODProt/issues/12), but it will require some engineering to get that functionality in place.

rufuspollock · 2013-04-18T18:54:05Z

@trestletech I really like this. Where does this go inside the JTS? Do you put it parallel to fields?

trestletech · 2013-04-18T19:11:14Z

We went within fields to more clearly identify which source field was associated with the foreign key. For instance (https://github.com/QBRC/RODProt/blob/master/inst/extdata/datapackage.json#L58)

[...
  "path": "data3.json",
  "id": "data3",
  "schema": {
    "fields":[
      {
        "id":"A",
         "label":"Column A",
         "type":"integer",
         "foreignkey": {
            "pkg": "../extdata/datapackage.json",
            "resource": "data",
            "id": "A"
         }
      },{
        "id":"B",
        "label":"Column B",
        "type":"string"
      }
    ]
  }
...
]

(in this case package is a relative path, but I imagine it would typically be a URL -- just harder to unit test a package on a remote URL).

rufuspollock · 2013-04-19T12:13:20Z

Interesting. How would you handle multiple foreign keys (is that possible)?

I'm in 2 minds whether we inline or separate into a separate foreignkeys attribute (more like SQL).

SQL approach: nicely partitionable from the fields list.
Inline: natural simplicity

Best thing here is you have running code which is what matters!

trestletech · 2013-04-19T13:22:10Z

I'm not sure which way you intend "multiple." To me, multiple foreign keys (single-columned FKs on multiple columns) could be handled well in this approach (indeed, we're using that pretty heavily in our application). To extend the previous example:

[...
  "path": "data3.json",
  "id": "data3",
  "schema": {
    "fields":[
      {
        "id":"A",
         "label":"Column A",
         "type":"integer",
         "foreignkey": {
            "pkg": "../extdata/datapackage.json",
            "resource": "data",
            "id": "A"
         }
      },{
        "id":"B",
        "label":"Column B",
        "type":"string",
         "foreignkey": {
            "pkg": "../extdata/datapackage.json",
            "resource": "anotherData",
            "id": "B"
         }
      }
    ]
  }
...
]

If you mean a single column referencing multiple other resources simultaneously, I suppose you could support a foreign key array?

[...
  "path": "data3.json",
  "id": "data3",
  "schema": {
    "fields":[
      {
        "id":"A",
         "label":"Column A",
         "type":"integer",
         "foreignkey": [{
            "pkg": "../extdata/datapackage.json",
            "resource": "data",
            "id": "A"
           },{
            "pkg": "../extdata/datapackage.json",
            "resource": "data",
            "id": "A"
           }
         ]
      }
  ]
]

(pardon any syntax errors). To be honest, though, I don't think I've ever seen that use case -- maybe it's more common in other fields.

Finally, if you meant composite foreign keys, then I agree that would be a limitation of this style. In my experience, it seems like the trend (for better or worse) has been to move away from composite keys where possible, though. I think at least part of this is motivated by the rise of web applications and the need to more easily de/serialize data. Most of my ORM experience has been with Hibernate, so I can only speak to that library, but I know support for Composite keys is pretty sloppy. Essentially, you typically need to embed your composite key as a single object so that HIbernate can treat it as a single reference to another table.

So instead of:

public class Person(){
  String firstName; //foreign key part1
  String lastName; //foreign key part2
  int height;
}

you'd have to use

public class NamedPerson(){
  String firstName; //foreign key part1
  String lastName; //foreign key part2
}

public class Person(){
  @EmbeddedId
  NamedPerson names; //now your foreign key is a single object
  int height;
}

I suppose you could take a similar approach and specify that the type for this field is actually another embedded JSON Table Schema (again, not thinking thoroughly through the precise syntax):

    "fields":[
      {
        "id":"A",
         "label":"Column A",
         "type":"embedded", // or "any" or some other type 
         "foreignkey": {
            "pkg": "../extdata/datapackage.json",
            "resource": "anotherData",
            "id": "nameObjCol"
         },
         "fields": [{
             "id":"first",
             "label": "FirstName",
             "type": "string"
           },{
             "id":"last",
             "label": "LastName",
             "type": "string"                 
           }
         }
       }           
     ]

I'm not sure how well that would work or if it would be worth the complexity, but it would be one approach.

I may just be biased by our application, but if I'm already serializing the data to JSON, I would have long given up the luxury of "fancy" relational mappings involving composite keys. Perhaps I'm in the minority there, though. I'm not sure about other common client languages like Python, but I know in R the support for a concept of a foreign key is so simplistic that it would require a pretty foundational re-write of some of the base data structures in the language to even get something like a composite foreign key into the language.

rufuspollock · 2013-04-21T10:15:20Z

I think we are making great progress here. I think there is agreement on:

At most one foreignkey per field
Inlining foreignkey on the field attribute
I think we should probably introduce a name field on resources - i will comment more in Files should have a required name (or id) attribute #32

In terms of the foreignkey hash I would suggest a minor tweak to rename id to field:

   {
      "pkg": "../extdata/datapackage.json",
      "resource": "anotherData",
      "field": "id-of-field"
   }

Questions:

pkg: should we rename to more explicit datapackage? (Probably no)
pkg: what can it be? Suggest either:
- a package name (resolvable by means we have not yet described!)
- a relative path in unix style
- a url
resource resolution is done by algorithm you describe above (though I suggest we simplify that via Files should have a required name (or id) attribute #32!)

trestletech · 2013-04-21T15:17:47Z

All sounds good to me. pkg seems reasonable to me, but I don't think it would be too inconvenient were it expanded.

davidmiller · 2013-04-30T17:38:06Z

My immediate use case for data packages was discoverability of other datasets, at which point multiple foreign key relationships becomes v.useful...

e.g. for drug relationships, the BNF code is a foreign key to

Prescribing data
Research papers on this drug
Recent health warnings/new side effects issued (Yellow card)
Patent expiry and grants
OpenBNF formulary dosage information
Individual hospital / commissioning group formulary notes

Being able to hit each of these in an explorer would allow me to get all the related metadata from different publishers...

In principle.

gthb · 2013-06-17T18:07:41Z

A foreign key into another resource in the same package could simply omit the pkg attribute, right? So it would look like:

{
   "id": "otherdata",
   "foreignkey": {
      "resource": "anotherData",
      "field": "id-of-field"
   }
}

sballesteros · 2013-10-16T21:13:51Z

Any news on this issue ?
Is what follows the current consensus ?


{
  "name": "resource_A",
  "schema":{
    "fields":[
      {
        "name": "X"
        "foreignkey": {
          "pkg": "datapackage_dep",
          "resource": "resource_X",
          "field": "field_X"
        }
      },
      {
        "name": "Y"
        "foreignkey": {
          "pkg": "datapackage_dep",
          "resource": "resource_X",
          "field": "field_Y"
        }
      }
    ]
  }
}

Thanks!

rufuspollock · 2013-10-16T23:04:24Z

OK, final recommendation is:

    "foreignkey": {
      "datapackage": "datapackage_dep",
      "resource": "resource_X",
      "field": "field_X"
    }

With:

datapackage: [optional] a url or a single name reference. If a single name it will be up to client to resolve that to a datapackage (we will not allow relative path references). If not supplied then we resolve for a resource within this datapackage.json
resource: [required] name of the resource within the datapackage
field: [required] the name of the field within the resource

Note rename of pkg attribute to datapackage

Example:

{
  "name": "resource_A",
  "schema":{
    "fields":[
      {
        "name": "X"
        "foreignkey": {
          "datapackage": "datapackage_dep",
          "resource": "resource_X",
          "field": "field_X"
        }
      }, ...
    ]
  }
}

rufuspollock · 2013-11-29T22:02:23Z

@paulfitz interested in your thoughts here especially given your comments on primary key and experience with coopy ...

I'd like to get this and the primary key stuff closed and into JTS asap ...

jpmckinney · 2013-11-29T23:16:30Z

Why not make datapackage always be a URL? If an implementation doesn't want to link to something accessible over HTTP, they can just use a custom scheme like custom://domain/path which is still a URL.

Also, why foreignkey not foreign_key?

As in #21, I think a separate section for constraints would be best, in case other constraints are implemented later.

rufuspollock · 2013-11-30T10:08:37Z

@jpmckinney all good points. For foreign keys which are already quite complex I guess this makes sense. I suppose for other items e.g. even unique or primary key it makes thinks more hasslesome to write.

Overall: I get the feeling that a good set of people want constraints outside of the fields in a separate section a la sql.

@paulfitz your thoughts either way would be very useful.

@sballesteros ditto ...

jpmckinney · 2013-11-30T22:02:48Z

I think a composite (multi-column) primary key is much easier to read if all the columns appear side-by-side, e.g.:

{
  "fields": [
    ...
  ],
  "indexes": [
    {
      "type": "primary",
      "fields": ["column1", "column10"]
    }
  ]
}

If you see a "primary_key" property on the first field in a datapackage.json file, you might assume that field is the primary key, when in fact it's composite with a field that appears later in the definition.

sballesteros · 2013-11-30T22:43:27Z

EDIT: This was based on a wrong understanding of foreign keys. What follows is largely irrelevant for this issue and is closer to materialized view.

Having foreignkeys in the fields array is convenient to do "joints" / merge.
For instance if there is a resource myresource in a data package called dpkg1 listing dpkg2 as dependencies:

"name": "dpkg1",
"version": "0.0.0",
"dependencies": {
    "dpkg2": "0.0.1"
},
"resources": [
    {
      "name": "myresource",
      "schema": {
        "fields":[
          {
            "name": "date",
            "foreignkey": {"datapackage": "dpkg2", "resource": "data", "field": "date"}
          },
          {
            "name": "seq",
            "foreignkey": {"datapackage": "dpkg2", "resource": "seq", "field": "seq"}
          },
          {
            "name": "x",
            "type": "integer"
          }
        ]
      },
      "data": [
        {"x": 10},
        {"x": 20},
        {"x": 30},
        {"x": 40}
      ]
    }

data referred by the foreignkeys are not available in dpkg1 (the resource data only contains x and not date and seq). To get date and seq, one need to go into dpkg2 and extract them from here (note that this is recursive as dpkg2 can also have foreignkeys...).
In this scenario it is convenient to know the order of the fields and have the foreignkeys right into schema.fields.

So to me foreignkeys allow to recreate composite JSON Table Schema from fields originating from different data packages.

Generalizing outside JSON Table Schema it would be good to have a discussion on how to use the top-level property dependencies or dataDependencies of a datapackage.
Maybe we should think of a more generic solution of "importing" resources (from x import y as z used in Python or var y = require('x') in node.js.).

If we stick with the current foreignkey hash and generalize it for non SDF resources we just need to omit the field property.
For instance we could have a resource y in dpkg1 coming from dpkg2:

"name": "dpkg1",
"version": "0.0.0",
"dependencies": {
    "dpkg2": "0.0.1"
},
"resources": [
  {
    "name": "y",
    "foreignkey": {"datapackage": "dpkg2", "resource": "x"}
  }

Here "foreignkey" is an extra possible property in addition to "data", "path" or "url".

If it is the intended use maybe "foreignkey" is not the best possible word.
What I really like about this usage of "foreignkey" is that it allows to repackage resources coming from different datapackages into one datapackage (tailored to someone specific data need) with full versioning support and without any data duplication.

jpmckinney · 2013-11-30T23:12:52Z

@sballesteros That's not how foreign keys work... the data for myresource would still have values for date and seq. A foreign key is just a constraint that can be enforced to ensure that the values in date and seq exist in the referenced columns of a resource in dpkg2. It's a way to check data consistency.

What you're describing is an entirely different feature, which may be worthwhile, but is not what a foreign key is. What you're describing is closer to a materialized view, that joins multiple tables together.

sballesteros · 2013-11-30T23:28:43Z

@jpmckinney thanks for clarifying. So what I suggest is more related to the way dependencies will be handled. Got it now... Foreign key is a constraint to ensure that the values taken by the foreignkey are elements of the resource field it points to.

jpmckinney · 2013-11-30T23:35:09Z

@sballesteros Duplication is useful if the third-party resource disappears. There are many other reasons. Update: My previous message was about a comment that was later edited. Yes, that's what foreign keys do.

sballesteros · 2013-12-01T17:18:24Z

@jpmckinney I was talking about an implementation. Using a hash with types values as keys to avoid some if statements.

paulfitz · 2013-12-02T19:38:28Z

+1 for parallel option. @jpmckinney's array suggestion seems reasonable. Agree with @sballesteros about avoiding spaces. Could also rename constraints to keys and give them type foreign.

jpmckinney · 2013-12-05T00:38:39Z

Another +1 from @besquared in #21 for keys.

besquared · 2013-12-05T01:27:57Z

Going to repost this here since apparently #21 isn't the right place anymore.

Howdy guys I just wanted to add my recommendation here based on a few things.

Keys are intrinsic descriptions of a dataset and not a field

I don't think adding any kind of key description at the field level is correct. I don't think It's a matter of shorthand either as structurally all keys are properties of a datasets and not a field.

Keys aren't indices and aren't necessarily constraints either

Indices are a database specific feature used for optimizing common operations. The terms are often used interchangeably in those systems because database vendors often default to (or require) building an index onto the primary key columns for their own internal operations or for the convenience of the user. Keys don't necessarily imply a constraint either although generally implementations constrain primary keys to be unique. One vendor specific example is that the InnoDB engine allows both non-unique indices and foreign keys to non-unique indices.

IMO If the specification's goal is to be producer and consumer agnostic then the it wouldn't include references to the concept of an index and leave that up to the system consuming the data package. I think there's probably some room here to include a unique flag as it might be seen as a description of the key's relationship to the dataset and not a constraint that the package is defining.

As far as how to implement a specification for keys I think the most generic form would try to only include things that are descriptive of the data itself. A key should signify that fields within or between datasets are linked to one another in some way or have some kind of special meaning in the dataset itself (such as uniqueness). Something like the following (which even I consider somewhat verbose) might be a good first shot:

{
  "schema: {
    "fields": [...]
  },

  "keys": [
    {
      "type": "primary",
      "unique": true, // for the convenience of the consumer
      "fields": ['field1', 'field2', ...]
    },
    {
      "type": "foreign",
      "fields": ['field3', ...],
      "references": {
         "url": <datapackage.json>, // url of datapackage.json
         "resource": <resource>, // name of the resource in the referenced package
         "fields": ['field1', ...] // ordered respective to the ancestor's fields
       }
    }
  ]

rufuspollock · 2013-12-05T08:17:08Z

My one objection to the array approach for keys/constraints is that it makes it more painful for consumers - rather than just doing:

resources[0].schema.constraints.primarykey.fields

i have to iterate through the list of constraints.

@besquared re constraints / keys in naming I feel a little ambivalent. I think it would be nice to include unique requirements in their at some point and they definitely aren't keys ... (however this isn't a biggie)

I do think generally that keys/constraints should live inside the schema attribute and be part of json table schema (@besquared your example suggested otherwise but i was not sure if that was intentional ...)

jpmckinney · 2013-12-05T16:03:11Z

@rgrp For all except the primary key, you'll still need to iterate, as only the primary key will have a single key object as its value. The others will have arrays. If your app loads all keys, you'll have a double-nested iteration if the constraints are an object, instead of a simple iteration if the constraints are an array. It depends on what use cases you want to prioritize.

Also, can we have separation between words, e.g. primary_key?

Also, unique keys are keys (it's in the name). How are they "definitely not keys"? @besquared is correct in saying that constraints in the language of DBs are something different, e.g. you can put a constraint at the field-level on a price field to say that it must have a value greater than zero. NOT NULL is maybe the most common constraint. "Keys" is the more universal DB word for what we're talking about.

besquared · 2013-12-05T19:23:02Z

+1 for everything @jpmckinney is saying. Also @rgrp I originally did think keys belongs to the resource and not the schema but I take it back now they obviously belong in the schema. To draw some inspiration from Codd's relational model something like the format I proposed above allows us to correctly express the three types of relational constraints.

@rgrp other options include a keys section keyed by type

{
  "schema: {

    "fields": [...]

    "keys": {

      "primary": {
        "fields": ['field1', 'field2', ...]
      },

      "foreign": [
        {
          "fields": ['field3', ...],
          "references": {
            "url": <datapackage.json>, // url of datapackage.json
            "resource": <resource>, // name of the resource in the referenced package
            "fields": ['field1', ...] // ordered respective to the ancestor's fields
        }
      ]

    }

  }
}

This seems fine too and allows the kind of object access one would expect (resource.schema.keys.primary) while still allowing multiple foreign keys and allows the keys section to remain pretty robust to extension later.

besquared · 2013-12-05T19:32:34Z

This might also be a good time to talk about foreign keys vs. references. In the relational model foreign keys imply a constraint in which the foreign relation must have a corresponding value for the keyed fields in order for a new relation to be added. This allows referential integrity between relations to be maintained.

If what the specification is trying to achieve is a way to link data together (but not necessarily imply a constraint) then we might want to have a section called "references" instead of a set of foreign keys. Consumers who would like to resolve reference packages and apply foreign key constraints in their data management system may choose to do so but wouldn't be required to.

This leads to another possible schema layout:

{
  "schema: {

    "fields": [...]

    "primary_key": {
      "fields": ['field1', 'field2', ...]
    }

    "references": [
      {
        "url": <datapackage.json>, // url of datapackage.json
        "resource": <resource>, // name of the resource in the referenced package
        "required": true, // require the referenced row to exist in the foreign resource?
        "fields": {
          'field1': 'field4',
          'field3': 'field5'
        } // map between this resource and the foreign resource
      }
    ]
  }
}

I like this as well for various reasons. It leaves the concept of constraints for a later time and still allows consumers to optionally infer foreign key constraints.

Thoughts?

rufuspollock · 2013-12-07T20:50:52Z

So along with @jpmckinney's proposal we're going with out of line items. Here's the updated proposal:

{
  "name": "resource-a",
  "schema":{
    "fields":[
      {
        "name": "X"
      }, ...
    ],
    "foreignKeys": [
        {
          "fields": string or array of fields on local resource
          "reference": {
            "datapackage": "datapackage_dep",
            "resource": "resource_X",
            "fields": ["field_X"]
          }
        }
      ]
    ]
  }
}

Questions:

foreignKeys vs references vs relations vs foreign as name for attribute - votes please (+1,+0,0, -0,-1)
camelcase vs underscores (foreignKey vs foreign_key)

rfc @paulfitz @jpmckinney @besquared @davidmiller @sballesteros

jpmckinney · 2013-12-08T16:05:40Z

I prefer foreign keys to "relations", etc. because it is a label that precisely refers to the existing concept.
datapackage_version and last_modified use underscores. Do any current properties use camelcase? The case should be consistent across all properties, which for now would mean sticking to underscores. I don't mind switching everything to camelcase, but then we'd have to change at least datapackage_version and last_modified.

besquared · 2013-12-08T19:37:26Z

There's some mixing here between #21 and #23.

Everything should just standardize on camelCase.

Given that it's likely that primaryKey will become a thing it seems reasonable that foreignKeys should become the other thing. I think it's important to tell people that this might not have the same meaning that it does in their rdbms. This seems ok.

paulfitz · 2013-12-09T16:37:38Z

+1 for foreignKeys and lowerCamelCase.

sballesteros · 2013-12-09T16:39:35Z

+1 for foreignKeys and lowerCamelCase too.

trickvi · 2014-01-22T22:34:31Z

This has become a pretty big feature for me to get into the specification. Any ideas about ETA @rgrp?

trickvi · 2014-01-23T10:04:29Z

One idea I have which would be a nice to have is to create some sort of a fallback sequence. It might make everything more complicated but what I'm thinking is that resource could be an array of resources (most significant resource on the left) and when looking up the key it would just go through the resources in that order until it finds the key being searched for.

This means I can have multiple resources for datasets published in different phases. I'm thinking about a case for budget data where countries have different budget phases. The problematic one is when we have an approved budget which can be adjusted at a later point in time. Adjustments are not necessarily a completely new dataset but changes to different rows of the approved budget.

Let's say we represent this with two resources (marking them with dates for clarity): approved-2013-12-25.csv and adjusted-2014-01-22.csv. This problem could be solved by just making adjusted-2014-01-22.csv contain the whole budget with changes (so it also contains rows which haven't changed). For this the foreign key resource would just be the most recent budget resource. This means that if there are a few adjustments made we'll be copying a lot of the same rows between resources. To compare changes you would have to do a diff.

Another option would be to make adjusted-2014-01-22.csv only contain lines that have changed, with the same schema (fields, primary keys, and foreign keys) and define the resources for the foreign key with an array. Then when trying to reference an item in the budget it would first try to find it in adjusted-2014-01-22.csv and if not found go through approved-2013-12-25.csv. This reduces amount of redundancy and makes it clearer what the changes are but this makes the implementation of foreign keys more difficult.

Just an idea. I'm glad to drop it and use up more disk space if that means we can get foreign keys in as soon as possible.

rufuspollock · 2014-01-23T10:11:15Z

@tryggvib at this point I doubt we're going to expand for this use-case. Right now we have a finalized spec and I think it will go in.

trickvi · 2014-01-23T10:12:16Z

Great. Happy to see this go in as long as I have foreign keys :)

trickvi · 2014-01-23T10:15:27Z

Another use case I came up with though just now was that this would enable you to break up resources to more manageable sizes. For example one resource per year and some foreign key in another resource could include the year which could fallback through the resources broken down by years.

rufuspollock · 2014-02-25T21:58:53Z

FIXED. Foreign Key support is now in - see http://dataprotocols.org/json-table-schema/#foreign-keys

Huge thank-you to everyone who contributed here and please double check for the final implemented version for any errors that need to be corrected or emendations that should be made.

* Updated date/time definitions * Added examples * Improved formatting * Update content/docs/specifications/table-schema.md Co-authored-by: Peter Desmet <peter.desmet.work@gmail.com> * Updated datetime pattern example * Added `any` format note --------- Co-authored-by: Peter Desmet <peter.desmet.work@gmail.com>

trestletech mentioned this issue Apr 3, 2013

Support Categorical Variables in JSON Schema #29

Closed

trestletech mentioned this issue Apr 5, 2013

Clarify DataPackage File Info Section #30

Closed

rufuspollock mentioned this issue Apr 18, 2013

Added data transmission info #28

Merged

rufuspollock mentioned this issue Apr 21, 2013

Files should have a required name (or id) attribute #32

Closed

sballesteros mentioned this issue Oct 16, 2013

sharing schema across resources #71

Closed

jpmckinney mentioned this issue Nov 30, 2013

[JTS] Primary Key / ID attribute support in JSON table schema #21

Closed

sballesteros mentioned this issue Dec 18, 2013

update foreignKeys support to latest spec sballesteros/jts-validator#1

Open

trickvi mentioned this issue Jan 22, 2014

I18N and Metadata Translations for Data Package #42

Closed

jpmckinney mentioned this issue Jan 28, 2014

[JTS] Extend JSON Table Schema to support simple field validation #95

Closed

rufuspollock mentioned this issue Feb 14, 2014

Fields in the resource dict #102

Closed

rufuspollock closed this as completed in a3d3097 Feb 25, 2014

konektaz mentioned this issue Mar 20, 2014

Foreign Key attribute in JSON schema konektaz/where-is-when-is#171

Open

paulfitz mentioned this issue May 21, 2014

[JTS] Support for indexes #121

Closed

roll added this to Open Knowledge Jun 27, 2023

peterdesmet mentioned this issue Jul 3, 2024

Investigate effects of updated date/time definitions frictionlessdata/frictionless-r#171

Open

Foreign Key attribute in JSON schema #23

Foreign Key attribute in JSON schema #23

Comments

rufuspollock commented Feb 15, 2013

rufuspollock commented Feb 15, 2013

mk270 commented Mar 7, 2013

trestletech commented Apr 3, 2013

rufuspollock commented Apr 3, 2013

trestletech commented Apr 3, 2013

trestletech commented Apr 17, 2013

rufuspollock commented Apr 18, 2013

trestletech commented Apr 18, 2013

rufuspollock commented Apr 19, 2013

trestletech commented Apr 19, 2013

rufuspollock commented Apr 21, 2013

trestletech commented Apr 21, 2013

davidmiller commented Apr 30, 2013

gthb commented Jun 17, 2013

sballesteros commented Oct 16, 2013

rufuspollock commented Oct 16, 2013

rufuspollock commented Nov 29, 2013

jpmckinney commented Nov 29, 2013

rufuspollock commented Nov 30, 2013

jpmckinney commented Nov 30, 2013

sballesteros commented Nov 30, 2013

jpmckinney commented Nov 30, 2013

sballesteros commented Nov 30, 2013

jpmckinney commented Nov 30, 2013

sballesteros commented Dec 1, 2013

paulfitz commented Dec 2, 2013

jpmckinney commented Dec 5, 2013

besquared commented Dec 5, 2013

rufuspollock commented Dec 5, 2013

jpmckinney commented Dec 5, 2013

besquared commented Dec 5, 2013

besquared commented Dec 5, 2013

rufuspollock commented Dec 7, 2013

jpmckinney commented Dec 8, 2013

besquared commented Dec 8, 2013

paulfitz commented Dec 9, 2013

sballesteros commented Dec 9, 2013

trickvi commented Jan 22, 2014

trickvi commented Jan 23, 2014

rufuspollock commented Jan 23, 2014

trickvi commented Jan 23, 2014

trickvi commented Jan 23, 2014

rufuspollock commented Feb 25, 2014