DATA ACCURACY/RELIABILITY: Features and discussion #96

Open
8 of 22 tasks
polly64 opened this issue Oct 27, 2021 · 2 comments

polly64 commented Oct 27, 2021

1. Verification Button

  • Add and test verification feature allowing users to add single verification to each data entry
  • Add confetti burst to thank contributors for verifying data (as they are not rewarded by a footprint colouring)
  • Move confetti burst thank you back to burst across whole map
  • Apply verification feature across all categories
  • Verifications to clear if data entry is replaced (but remain in edit history)
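
The verification rules above (one verification per user per entry, cleared on replacement but retained in the edit history) could be sketched roughly as follows. This is an illustrative model only: `DataEntry`, `verify` and `replace` are hypothetical names, not the Colouring Cities implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntry:
    """One attribute value on a building, e.g. its construction year (hypothetical model)."""
    value: object
    verified_by: set = field(default_factory=set)  # users currently verifying `value`
    history: list = field(default_factory=list)    # append-only log of edits and verifications

    def verify(self, user_id: str) -> bool:
        """Record a verification; each user may verify a given value only once."""
        if user_id in self.verified_by:
            return False
        self.verified_by.add(user_id)
        self.history.append(("verify", user_id, self.value))
        return True

    def replace(self, user_id: str, new_value: object) -> None:
        """Replace the value: live verifications are cleared, the history keeps them."""
        self.history.append(("edit", user_id, new_value))
        self.value = new_value
        self.verified_by.clear()
```

Keeping the history append-only means a cleared verification can still be audited later, which matches the "remain in edit history" requirement.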

2. Edit History

  • Include edit history viewer in editing mode
  • [x] Make edit history more visible
  • Include more information on user groups (not individuals) at sign-up, with clearer signposting to show that, although anonymity is recommended for contributors (e.g. initials/nickname only, to protect their privacy), adding a suffix for organisations they belong to is encouraged, as this allows users to assess the reliability of data sources,
    e.g. CamdenSecondarySchoolGeography_BC.

3. Source type dropdown

  • Add/test source dropdown using 'Age' category
  • Remove free text box 'Source details' at https://colouringlondon.org/view/age/ to improve security
  • Apply above source features to all categories working with PH
  • Add new dropdown options if necessary. At the moment the options combine media types (e.g. book, website) with specific databases; e.g. under 'Age', a range of trusted sources, such as the National Heritage List for England, is included.
  • Ask contributors on 'Contributor agreement' Menu page to add sources wherever possible
  • Add features to incentivise adding a source, e.g. data cannot be added unless the source dropdown is filled in (to discuss)
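
If the "no data without a source" option is adopted (it is still marked for discussion), the check could look something like this minimal sketch. `SOURCE_TYPES` is an illustrative subset and `validate_edit` is a hypothetical helper, not existing platform code.

```python
# Illustrative subset of dropdown options; the real list is per category.
SOURCE_TYPES = {"Book", "Website", "National Heritage List for England"}


def validate_edit(value, source_type):
    """Return a list of problems; an empty list means the edit can be saved."""
    problems = []
    if value is None:
        problems.append("no value supplied")
    if source_type not in SOURCE_TYPES:
        problems.append("a source type must be chosen from the dropdown")
    return problems
```

Returning a list of problems rather than raising lets the interface show every issue at once next to the relevant field.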

4. Source link

  • Add/test web link to source to confirm source type and provide precise link, and to credit source
  • Identify methods of checking security of weblinks
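
One possible first layer for checking web links is purely syntactic: accept only `https` URLs with a host and no embedded credentials. The sketch below uses Python's standard `urllib.parse`; the function name and the specific rules are assumptions for discussion, and passing these checks does not show that the target site is safe.

```python
from urllib.parse import urlsplit


def is_acceptable_source_link(url: str) -> bool:
    """Basic syntactic checks only; this does not prove the target site is safe."""
    try:
        parts = urlsplit(url.strip())
        if parts.scheme != "https":
            return False  # rejects http:, javascript:, data:, etc.
        if not parts.hostname:
            return False  # no host to link to
        if parts.username or parts.password:
            return False  # credentials in a URL are a common phishing trick
    except ValueError:
        return False      # malformed URL
    return True
```

Further layers (for example an allowlist of trusted domains or a reputation lookup) would sit on top of a check like this.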

5. Uncertainty - Question phrasing

  • Provide example of how uncertainty measures can be included (see 'Age' e.g. earliest or latest possible date?)
  • Apply uncertainty features where applicable, working with PH
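
As a rough illustration of the 'Age' example, an uncertain date can be stored as a best guess plus optional earliest and latest possible years. `YearEstimate` and its field names are hypothetical and only sketch the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class YearEstimate:
    """A construction year with optional earliest/latest bounds (hypothetical model)."""
    best_guess: int
    earliest: Optional[int] = None
    latest: Optional[int] = None

    def __post_init__(self) -> None:
        if not self.lower <= self.best_guess <= self.upper:
            raise ValueError("expected earliest <= best_guess <= latest")

    @property
    def lower(self) -> int:
        return self.best_guess if self.earliest is None else self.earliest

    @property
    def upper(self) -> int:
        return self.best_guess if self.latest is None else self.latest

    @property
    def span_years(self) -> int:
        """Width of the uncertainty range; 0 when no bounds were given."""
        return self.upper - self.lower
```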

6. Specifying method of data capture

  • Create icons to indicate method of data capture i.e. whether
    i) in-house bulk upload of existing open datasets produced, monitored and checked by government, professional institutions, academic institutions (and other?)
    ii) crowdsourcing on a building-by-building basis
    iii) use of computational approaches to generate volume data
    iv) live streaming
    v) other

7. Cross checking/feedback loops

  • Test cross-checking of data by combining computational and crowdsourcing methods, e.g. generating age data by inference from historical road network data, then having it checked by experts at local level using crowdsourcing (see 'Age' data, where 750K data points generated computationally are available for verification by local building experts).
  • Feed back the results of manual crowdsourced checking to improve automated methods.
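
A very simple form of this cross-check compares computationally inferred years with crowdsourced ones for the same buildings and flags large differences for review. The function below is a sketch under assumed inputs (dictionaries keyed by a building id); the 10-year tolerance is an arbitrary placeholder, not a project decision.

```python
def cross_check(inferred, crowdsourced, tolerance_years=10):
    """Compare inferred years with crowdsourced ones, keyed by building id.

    Returns (agreements, disagreements), where disagreements maps
    building id -> (inferred year, crowdsourced year).
    """
    agreements, disagreements = [], {}
    for building_id, human_year in crowdsourced.items():
        if building_id not in inferred:
            continue  # nothing to compare against
        machine_year = inferred[building_id]
        if abs(machine_year - human_year) <= tolerance_years:
            agreements.append(building_id)
        else:
            disagreements[building_id] = (machine_year, human_year)
    return agreements, disagreements
```

The disagreements are the natural input to the feedback loop: they show where the automated method differs from expert input, without overwriting either value.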

8. Data Accuracy text information

  • Add information on data accuracy on a dedicated 'Data accuracy' Menu page, to include a legal disclaimer to protect the host body (checked by the host's legal advisors).
  • Add information on Data Accuracy page to state that the platform:
    i) cannot vouch for data accuracy
    ii) works to provide as many mechanisms as possible to enable users to assess data reliability (see above)
    iii) relies on user to assess reliability of data using mechanisms and to assess suitability for intended use
    iv) relies on users to help improve accuracy

Notes:
To improve data accuracy and quality, and to reduce malicious or inaccurate input:
a) bulk uploads are moderated/recommended by academic partners at regional level, with the CCRP international research partner taking the final decision on release
b) manual entries can only be made building by building. This can be speeded up using the copy and paste tool. We have specifically chosen not to use the ArcGIS-style option of highlighting a large area and pasting, to prevent malicious behaviour and to make it as boring as possible for people to trash data.
c) the following are included to increase reliability of data
name of editor/edit history and date (we need to make last entry more visible)
type of source
source link
d) the verification button tells you how many other users agree with the date
e) We are building a network of specialist users (see CLHEAG) to check and enrich data. Local planning groups and local civic societies are set up specifically to oversee change in local areas. It is therefore in their interest to verify and monitor data as well as to enrich it.
f) for age we are looking to cross-check data generated using a number of methods. These include:
upload from unknown user
upload from known expert group
upload using historical street network inference
upload using UK gov energy performance certificate data
upload (if ever released by uk gov) of property tax age data
g) we might have a feature that allows all dates ever entered for a building to be viewed at once, with a link to the editor name
i) we may include an image of the facade, but would need to do this in a way that keeps storage light and does not link to commercial products, e.g. Google Street View, where they could just change terms and conditions at any stage (as they have for analysing the Street View data)
k) we will probably include typology dropdown diagrams in the 'Type' section, so this will also act as an additional verification
l) we are interested in feedback loops between the automated processes and manual checking, and in how not to override the specialist input. We are trying to move towards a system which allows you to download, say, age data and assess the reliability yourself using all the above info.
m) we are preventing people from deleting a value and then saving a blank edit box. We need to change this so you can only delete if you enter an alternative date.

Statement on data accuracy
'The data are provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, accuracy, fitness for a particular purpose and non-infringement. In no event shall ... add academic Colouring Cities host name... be liable for any reliance that you place on or how you use the data nor any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the data or the use or other dealings in the data.
As Colouring London data are crowdsourced from multiple sources and may contain errors, your help in adding sources and in verifying data entries is greatly appreciated. Visible edit histories and the phrasing of questions are also used to help you assess the accuracy and reliability of data and its suitability for your intended use, be this an academic paper, a school project or a government policy document.
We will also be introducing other measures, including icons to indicate the way in which the data have been captured, for example through bulk upload from a monitored source, crowdsourcing at building level, computational generation, or live streaming. If you have suggestions for additional ways to improve data accuracy features please do comment at....'

@polly64 polly64 changed the title DATA ACCURACY: Features and discussion DATA ACCURACY/RELIABILITY: Features and discussion Nov 2, 2021

polly64 commented Oct 27, 2022

@mattnkm @h-petersen shall we start working here on verification tools for Australia (as @mdsimpson42 will be moving Colouring London issues to the core code repository)?
You'll see code already available relating to the above, which you can apply immediately, and stuff we haven't done yet but want to. Matt, what was that automated process you thought of for cross-checking data? See also colouring-cities#882

@polly64 polly64 transferred this issue from colouring-cities/colouring-core Dec 27, 2023

polly64 commented Dec 27, 2023

@polly64 to check

@polly64 polly64 self-assigned this Jan 6, 2024