Reviewing labelled and predicted changesets from feature classifier #58

Closed
bkowshik opened this issue Jun 14, 2017 · 5 comments

Comments

@bkowshik
Contributor

bkowshik commented Jun 14, 2017

With results from #43, I plan to review 15 changesets in each of the following categories:

  • Labelled problematic and predicted problematic
  • Labelled problematic and predicted good
  • Labelled good and predicted good
  • Labelled good and predicted problematic
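
The four categories above are the cells of a confusion matrix. A minimal sketch of drawing the review sample, assuming each changeset is a dict with hypothetical `labelled` and `predicted` keys holding `"good"` or `"problematic"` (these names are illustrative and not taken from the results in #43):

```python
import random
from collections import defaultdict


def sample_for_review(changesets, per_bucket=15, seed=0):
    """Group changesets by (labelled, predicted) and sample from each group."""
    buckets = defaultdict(list)
    for changeset in changesets:
        # Hypothetical keys; adapt to however the results are actually stored.
        buckets[(changeset["labelled"], changeset["predicted"])].append(changeset)

    rng = random.Random(seed)  # fixed seed keeps the review sample reproducible
    return {
        key: rng.sample(items, min(per_bucket, len(items)))
        for key, items in buckets.items()
    }
```

A bucket with fewer than 15 changesets is returned whole rather than raising an error.
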
@bkowshik
Contributor Author

bkowshik commented Jun 14, 2017

Labelled problematic and predicted problematic

Duplicate looking tags

screen shot 2017-06-14 at 11 06 03 am

  • One of the attributes used for training was the number of duplicate-looking tags on the feature.
  • The idea was to flag features that carry numbered copies of a tag, such as landuse, landuse_1, etc.
  • Not every duplicate-looking tag is a 👎 though.
  • Good tags to have
    • building, building:level, etc
    • addr, addr:city, etc
  • Not so good to have
    • landuse, landuse_1, ...
    • surface, surface_1, ...
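
One way to separate the two groups above is to flag only numbered copies (`key_N`) of a key that is also present, and leave colon-namespaced keys alone. This is a sketch under that assumption, not the attribute the classifier was actually trained on; `suspicious_duplicate_keys` is a hypothetical helper name.

```python
import re

# Keys ending in "_<number>", e.g. "landuse_1".
NUMBERED_SUFFIX = re.compile(r"(?P<base>.+)_[0-9]+")


def suspicious_duplicate_keys(tags):
    """Return keys like "landuse_1" whose base key ("landuse") is also present."""
    return sorted(
        key
        for key in tags
        if (match := NUMBERED_SUFFIX.fullmatch(key)) and match.group("base") in tags
    )
```

Keys such as `building:level` or `addr:city` are never flagged because they use a colon namespace rather than a numeric suffix.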

Users with blocks

Adding leisure=park

screen shot 2017-06-14 at 11 16 31 am

  • We have seen lots of activity around converting existing features to parks.
  • This case should be part of our regression test suite.

Personal information

@bkowshik
Contributor Author

Labelled problematic and predicted good

Additional context

screen shot 2017-06-14 at 11 32 21 am

Impossible tags

  • Some features should never carry certain tags.
  • Ex: The node for Paris gets shop=bicycle in https://osmcha.mapbox.com/46978170/
  • Note: The value name:en=France is inappropriate for this node too.
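
A rule table is one simple way to express this kind of check. The sketch below assumes the Paris node already carries a `place` key and treats adding `shop` to such a feature as impossible; both the rule and the helper name `impossible_additions` are illustrative assumptions, not part of the classifier.

```python
# Hypothetical rules: (key already on the feature, key that should not be added).
INCOMPATIBLE_KEYS = [("place", "shop")]


def impossible_additions(old_tags, new_tags, rules=INCOMPATIBLE_KEYS):
    """Return (existing_key, added_key) pairs that violate a rule."""
    added = set(new_tags) - set(old_tags)
    return [
        (existing, new_key)
        for existing, new_key in rules
        if existing in old_tags and new_key in added
    ]
```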

@bkowshik
Contributor Author

Labelled good and predicted good

Name translations

screen shot 2017-06-14 at 12 05 28 pm

  • The model is currently guessing when a name translation is added.
  • Ex: In https://osmcha.mapbox.com/48119284/, name:en=Chalcis was added.
  • The only related attributes the model gets are feature_name_translation_new_version=8 and feature_name_translation_old_version=7. These say that a translation was added, but not which one or what its value is.
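
One plausible reading of these two attributes is a count of `name:<suffix>` keys on each version; the issue does not define them, so that is an assumption. Under it, the following sketch shows the count and a slightly richer signal, namely which translation keys were added.

```python
def count_name_translations(tags):
    """Count keys of the form "name:<suffix>", e.g. "name:en"."""
    return sum(1 for key in tags if key.startswith("name:"))


def added_name_translations(old_tags, new_tags):
    """Return the "name:<suffix>" keys present in the new version only."""
    return sorted(
        key for key in new_tags if key.startswith("name:") and key not in old_tags
    )
```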

Wikidata

  • Every tag has inherent constraints on its values. Ex: Every Wikidata tag value should start with a Q, as in Q6529766.
  • In https://osmcha.mapbox.com/48448760/, the Wikidata tag was given a Wikipedia-style value instead: ru:Церковь_Святого_Филиппа_(Ташкент)
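
A format check like this is cheap to express. The sketch below assumes a single-valued tag and accepts only `Q` followed by a positive integer; `is_valid_wikidata_value` is a hypothetical helper name.

```python
import re

# Wikidata item identifiers are "Q" followed by a positive integer.
WIKIDATA_ITEM = re.compile(r"Q[1-9][0-9]*")


def is_valid_wikidata_value(value):
    """Check that a wikidata tag value looks like a Wikidata item ID."""
    return WIKIDATA_ITEM.fullmatch(value) is not None
```

The value from the changeset above fails this check, while `Q6529766` passes. Semicolon-separated multi-values are not handled here.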

General to specific

  • When a tag goes from a general value to a more specific one, it is usually a good edit.
  • Ex: In https://osmcha.mapbox.com/48044675/, we go from building=yes -> building=school
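
This pattern could be turned into a model attribute. A minimal sketch, assuming `yes` is the only generic value and that a change to `no` does not count as more specific (both are assumptions, and `is_general_to_specific` is a hypothetical name):

```python
# Values treated as generic placeholders. This set is an assumption.
GENERIC_VALUES = {"yes"}


def is_general_to_specific(key, old_tags, new_tags):
    """True if `key` changed from a generic value to a more specific one."""
    old_value = old_tags.get(key)
    new_value = new_tags.get(key)
    return (
        old_value in GENERIC_VALUES
        and new_value is not None
        and new_value not in GENERIC_VALUES
        and new_value != "no"  # "no" negates the tag rather than refining it
    )
```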

@bkowshik
Contributor Author

Labelled good and predicted problematic

Pure geometry modifications

screen shot 2017-06-14 at 12 45 39 pm

  • It looks like the model does not have enough information to make this decision.
  • The majority of attributes we currently give the model are property based.
  • The attributes that come close to being relevant are:
    • feature_area and feature_area_old: Both are 0 because the feature is a node
    • The values for leisure and sport remain unchanged, so they are unlikely to help
  • We definitely need more geometry based attributes. Ex:
    • Distance between new version and old version of feature
    • Number of nodes in the feature
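
The first of these attributes could be computed for a node roughly as follows. This is a sketch that assumes positions are available as `(lon, lat)` pairs in degrees and uses the haversine formula on a spherical Earth; the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def node_displacement_m(old_position, new_position):
    """Distance in metres a node moved between two versions, given (lon, lat) pairs."""
    return haversine_m(*old_position, *new_position)
```

For ways and relations, a comparable attribute would need a representative point such as a centroid, which this sketch does not cover.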

Feature name

Redundant tags

  • Some tags are reference-only and do not carry the same weight as other properties.
  • In https://osmcha.mapbox.com/47919135/, the feature gets a description.
  • Skipping checks on such tags could be a wise thing to do at the current state of the project.
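
Skipping such tags could be as simple as filtering them out before comparing versions. In this sketch only `description` comes from the example above; the other keys in the ignore set, and the helper name `changed_keys`, are assumptions.

```python
# Keys treated as reference-only. Only "description" comes from the example
# in this comment; the rest of the set is an assumption.
REFERENCE_ONLY_KEYS = {"description", "note", "source"}


def changed_keys(old_tags, new_tags, ignored=REFERENCE_ONLY_KEYS):
    """Return keys added, removed, or modified, skipping ignored keys."""
    keys = set(old_tags) | set(new_tags)
    return sorted(
        key
        for key in keys
        if key not in ignored and old_tags.get(key) != new_tags.get(key)
    )
```

A changeset whose only change is an added description then yields an empty list and can be left out of the property-based checks.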

@bkowshik
Contributor Author

No next actions here. Closing.
