Adding Dataset Metadata
When you publish a dataset on the BathHacked Socrata site you are asked to fill in metadata about the dataset.
There are a number of fields, and you may choose to skip over them when you are first populating your data. But before you make your dataset public for others to use, it's really important to complete as much of this metadata as possible.
This metadata provides some basic documentation that helps people to understand your data, describing what it is, where it comes from and how it is licensed.
This guide provides an overview of the key fields you'll be asked to fill in, with some guidance on what to add where.
Tip: as an absolute minimum try to complete:
Licence URL. This will at least describe your dataset and associate it with an open data licence.
|Title||The title of the dataset.|
|Brief Description||A short description of the dataset. Try to include some useful background information on the dataset, e.g. how it was collected, why it might be useful, and significant limitations, etc. While the entry field is a little small, this is actually the main field for adding developer documentation (although a short description can be added to individual columns). So try and add as much background detail here as possible.|
|Category||The catalog has a pre-defined set of categories used to help organise the datasets. You can add your dataset to a single category.|
|Tags/Keywords||Use this field to add any extra keywords that help to describe the dataset. These are added as "topics" in the catalog view, providing another way for users to find datasets.|
Licensing & Attribution
|License Type||This field is supposed to identify the licence for the data. However, Socrata doesn't currently include all the licences we may need to use, so ignore this field (for now) and use the Additional Licence Detail section below.|
|Data Provided By||This field should give the name of the data publisher. If you are publishing your own dataset (e.g. something you've compiled or collected yourself), then ignore this field. If you're re-publishing a third-party dataset (e.g. adding open data from B&NES or some other organisation to the catalog), then use the full name of the organisation (e.g. "Bath & North East Somerset Council"). If you're publishing data on behalf of your own organisation using your own Socrata account, then you may still want to include the organisation name here.|
|Source Link||If you're re-publishing a third-party dataset then include a link to where you originally downloaded the data. If there's no download page for the original dataset, then just link to the original file. If you've imported multiple data files then try to link to a useful page that has pointers to the data.|
|Update Frequency||It's useful for developers to know how often the dataset might change, especially if it is downloaded and integrated into an application rather than used via an API. Use this field to give an idea of how often the data is updated. If you are importing a third-party dataset and are unsure about how often it changes, you might want to ask the original publisher. Failing that, we suggest you set the frequency to how often you re-import the data. If the data will never get updated (e.g. it was a one-time export, or is an archived dataset) then choose "Never". If the schedule varies then choose "On Change". For all other cases choose "Other".|
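The Update Frequency guidance above boils down to a small decision rule. The Python sketch below is only an illustration of that rule: the function name and its parameters are hypothetical, and only the labels "Never", "On Change" and "Other" come from this guide. It assumes the form may also offer more specific options, which the caller passes in as `matching_option` if one fits.

```python
from typing import Optional


def choose_update_frequency(
    never_updated: bool,
    schedule_varies: bool,
    matching_option: Optional[str] = None,
) -> str:
    """Pick an Update Frequency value following the guidance in this guide.

    matching_option is an assumption: a more specific option from the form
    that matches how often the data is updated (or re-imported), if any.
    """
    if never_updated:
        # One-time exports and archived datasets.
        return "Never"
    if schedule_varies:
        # Updates happen, but not on a fixed schedule.
        return "On Change"
    if matching_option:
        # A specific option on the form fits the known schedule.
        return matching_option
    # Everything else.
    return "Other"
```

For a one-time export you would call `choose_update_frequency(never_updated=True, schedule_varies=False)`, which returns "Never".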
Additional Licence Detail
This is a custom section added to provide more detail on how data is licensed.
Tip: Why are licences important? For a dataset to be open data it must be published under an open data licence. Without that, no one can be really sure whether the dataset is legally available for re-use. The more open you can make the data, the easier it is for people to re-use. If you have any questions about licensing, then please ask us! (see below).
|Additional Licence Information||Choose the name of the open data licence. If the licence isn't listed, then it's not an open data licence. Only use the Ordnance Survey licence if you are re-publishing their data. The majority of UK government and local government data is likely to be available under the UK Open Government Licence, although there are restrictions.|
|Licence URL||Choose the URL for the licence. Make sure it matches what you've selected in the previous field.|
|Additional Attribution Statement||Some licences require that users attribute the source of the dataset. Often all that is needed is the name of the data publisher and a link to their website or to the dataset. That information is covered by the Data Provided By and Source Link fields. But if you want or need to specify a longer attribution statement then include it here. For example, the UK Open Government Licence recommends a default statement of "Contains public sector information licensed under the Open Government Licence v2.0." You'd need to include that here if the publisher hasn't specified an alternative.|
|Upload Image||You can associate a thumbnail image with your dataset. If you're re-publishing a third-party dataset, and you have permission to re-use their logo, then you could use the organisation's logo as a thumbnail. However, some organisations have restrictions on how their logo can be used.|
Logo for B&NES datasets: if you're adding a dataset that is published by B&NES then please add their logo as the thumbnail. Please use the image without editing it, e.g. to change aspect ratio.
The rest of the fields on the edit metadata form cover other aspects of how your dataset is published. You're free to ignore them.
However, it's also possible to add documentation to individual columns in your dataset. When viewing the dataset, click on the column menu and choose "Edit Column Properties". This will give you a form you can use to add a title and description to each column. This is useful for tying more specific documentation to each part of your dataset.
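If you like to prepare your metadata before opening the form, a short script can flag anything you've left blank. The sketch below is a hypothetical helper, not part of Socrata. The dictionary keys simply reuse field names from this guide, the example values are placeholders, and the choice of which fields to check is an assumption you should adjust to suit your dataset.

```python
# Field names reused from this guide; which ones to check is an assumption.
FIELDS_TO_CHECK = [
    "Title",
    "Brief Description",
    "Additional Licence Information",
    "Licence URL",
]


def missing_fields(metadata, fields=FIELDS_TO_CHECK):
    """Return the checked field names that are absent or blank in the draft."""
    return [name for name in fields if not (metadata.get(name) or "").strip()]


def undocumented_columns(column_descriptions):
    """Return the column names whose description is empty or missing."""
    return [
        column
        for column, description in column_descriptions.items()
        if not (description or "").strip()
    ]


# Placeholder draft values, for illustration only.
draft = {
    "Title": "Example dataset",
    "Brief Description": "",
    "Licence URL": "https://example.org/licence",
}
columns = {"name": "Name of the venue", "postcode": ""}

print(missing_fields(draft))
print(undocumented_columns(columns))
```

Anything the script lists is a field or column still worth filling in on the edit metadata form or in "Edit Column Properties".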