QA/QC Street Data #8186

Closed
johnclary opened this issue Jan 17, 2022 · 5 comments
Assignees
Labels
Project: Traffic Register 2.0 Knack-based application for managing traffic regulations with GIS integration Service: Apps Application support Service: Dev Infrastructure and engineering Type: Data Generating and delivering datasets and reports Workgroup: ATSD Active Transportation and Street Design Workgroup: OOD Office of the Director

Comments

@johnclary
Member

johnclary commented Jan 17, 2022

QA/QC the street data in the traffic registry.

The task is to match every street name in the traffic registry with a canonical street name in the GIS street centerline dataset.

I put together this workbook which is a list of all unique unmatched street name values in the registry, as well as potential matching candidates scored with a matching algorithm.
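The matching algorithm behind the workbook is not described here. As a rough illustration of how candidate matches could be scored, here is a minimal sketch that uses Python's standard-library `difflib`. The function name `top_candidates` and the sample street names are hypothetical, and the similarity measure is an assumption, not necessarily the one used to build the workbook.

```python
from difflib import SequenceMatcher


def top_candidates(input_name, canonical_names, n=3):
    """Return the n canonical names most similar to input_name, best first.

    Each result is a (name, score) tuple with a score between 0 and 1.
    """
    scored = [
        (SequenceMatcher(None, input_name.upper(), name.upper()).ratio(), name)
        for name in canonical_names
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, round(score, 2)) for score, name in scored[:n]]


# Hypothetical canonical names standing in for the GIS centerline dataset
canonical = ["CONGRESS AVE", "CONGRESS CT", "E 6TH ST", "LAMAR BLVD"]
candidates = top_candidates("CONGRES AVE", canonical)
```

The top results would fill the `match_1`, `match_2`, etc. columns, with the score indicating how much manual scrutiny a row needs.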

How it works

  • For each row in the workbook, compare the input column against the potential matches in the match_1, match_2, etc. column.
  • If a match is found, copy and paste the match into the selected_match column
  • If no match is found, select an option from the dropdown in the issue column
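The workflow above could be checked automatically once the workbook is exported. This is only a sketch: it assumes each row is a dict keyed by the workbook column names, that there are three match columns, and that the dropdown lives in a column named `issue`. The status labels are made up for illustration.

```python
def review_status(row, match_columns=("match_1", "match_2", "match_3")):
    """Classify a workbook row by how far along it is in review.

    Assumes the row is a dict keyed by column name (hypothetical export format).
    """
    selected = (row.get("selected_match") or "").strip()
    issue = (row.get("issue") or "").strip()
    candidates = {row[col] for col in match_columns if row.get(col)}

    if selected:
        # A selected match is only valid if it was copied from a match column
        return "matched" if selected in candidates else "invalid"
    if issue:
        return "flagged"
    return "pending"


rows = [
    {"input": "CONGRES AVE", "match_1": "CONGRESS AVE", "match_2": "CONGRESS CT",
     "selected_match": "CONGRESS AVE", "issue": ""},
    {"input": "CONGRES AVE", "match_1": "CONGRESS AVE", "match_2": "CONGRESS CT",
     "selected_match": "CONGRESS AVENUE", "issue": ""},
    {"input": "S CONGRESS AVE", "match_1": "CONGRESS AVE",
     "selected_match": "", "issue": "More extraction needed"},
    {"input": "LAMAR", "match_1": "LAMAR BLVD", "selected_match": "", "issue": ""},
]
statuses = [review_status(row) for row in rows]
```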

How to identify a match

  • Do not modify any of the match values or the input text.
  • Do not create a custom match. The selected_match must be copied from one of the match columns
  • Some of the input values need further automated cleaning before they can be matched.
    • If an input has a direction suffix or prefix, mark it with the issue "More extraction needed". For example, do not match S CONGRESS AVE; mark it with an issue and move on.
    • If an input has a prefix or suffix with additional detail, e.g. 50FT S OF CONGRESS AVE or 6TH ST (NB CURB), mark it with an issue and move on
  • Service roads and highways (e.g. IH 35 SVRD SB) will present problems. Unless you're very confident in the match, mark it with an issue and move on
  • The matching will get more difficult as the match score decreases. If you're unsure, move on. The goal right now is to reduce the list of unmatched streets as quickly as possible, then revisit the matching approach with automation.
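Some of the rules above are mechanical enough to pre-flag before manual review. The sketch below is one possible approach, not the scripting actually used: the regular expressions are assumptions derived from the examples in the list (`S CONGRESS AVE`, `50FT S OF CONGRESS AVE`, `6TH ST (NB CURB)`), and the function name is hypothetical.

```python
import re

# Direction tokens, as a standalone prefix or suffix (assumed set)
DIRECTION = r"(?:N|S|E|W|NB|SB|EB|WB)"
DIRECTION_PREFIX = re.compile(rf"^{DIRECTION}\s+\S")
DIRECTION_SUFFIX = re.compile(rf"\S\s+{DIRECTION}$")
# Leading distance detail such as "50FT" or "50 FEET"
OFFSET_DETAIL = re.compile(r"^\d+\s*(?:FT|FEET)\b")
# Any parenthetical detail such as "(NB CURB)"
PARENTHETICAL = re.compile(r"\(.*\)")

PATTERNS = (DIRECTION_PREFIX, DIRECTION_SUFFIX, OFFSET_DETAIL, PARENTHETICAL)


def suggest_issue(name):
    """Return "More extraction needed" if the input needs cleaning, else None."""
    text = name.strip().upper()
    if any(pattern.search(text) for pattern in PATTERNS):
        return "More extraction needed"
    return None


examples = ["S CONGRESS AVE", "50FT S OF CONGRESS AVE", "6TH ST (NB CURB)", "CONGRESS AVE"]
suggestions = {name: suggest_issue(name) for name in examples}
```

A pre-flag like this only suggests an issue value. The reviewer still decides whether to accept it.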
@johnclary johnclary added Service: Apps Application support Service: Dev Infrastructure and engineering Type: Data Generating and delivering datasets and reports Workgroup: ATSD Active Transportation and Street Design Workgroup: OOD Office of the Director labels Jan 17, 2022
@ChrispinP
Member

ChrispinP commented Jan 19, 2022

These are patterns that occur frequently in the results and could help with the second round of matching.

  • MLK not recognized as abbreviation
  • SR/SVR/SVRD/FR used interchangeably for Service Road
  • TERRACE is shortened to TERR or TER
  • 2 different streets for an intersection are separated by /
  • VALLEY is shortened to V and may not be spaced from street name
  • DRIVEWAY/DRVWY/DRWY used interchangeably
  • PARKWAY/PARK/PKWY used interchangeably
  • MOPAC is not recognized
  • RAILROAD/RR used interchangeably (not a street)
  • ROW can be filtered
  • Inputs in parentheses ( ) can be filtered
  • SH not recognized
  • TERMINUS not recognized (not a street)
  • PARKING LOT/PKNG LOT not recognized (not a street)
  • CROSSING/CROSS/XING may be used interchangeably
  • CURB/CB can be filtered
  • MEDIAN can be filtered
  • CENTERLINE/CENTER LINE/CL can be filtered
  • APPROACH can be filtered
  • DEAD END can be filtered
  • Punctuation such as dashes (-) and apostrophes (') could be filtered
  • INTERSECTION/ITS can be filtered
  • FEET/FT can be filtered
  • XPWY not recognized as EXPWY/EXPY
  • E or EB can be filtered
  • W or WB can be filtered
  • N or NB can be filtered
  • S or SB can be filtered
  • CITY LIMIT can be filtered
  • TRAVEL LN or TRAFFIC LN can be filtered
  • PROPERTY LINE or PROPERTY LN can be filtered
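Several items in this list could be turned into a normalization pass that runs before the second round of matching. The following is a minimal sketch under stated assumptions: the canonical forms chosen in `EXPANSIONS` (for example `SVRD` for service road) are guesses, not values confirmed against the centerline dataset, and ambiguous abbreviations such as `V`, `PARK`, and `CROSS` are deliberately left out. All names are hypothetical.

```python
import re

# Abbreviation -> assumed canonical form (not verified against the GIS data)
EXPANSIONS = {
    "MLK": "MARTIN LUTHER KING",
    "SR": "SVRD", "SVR": "SVRD", "FR": "SVRD",
    "TERR": "TERRACE", "TER": "TERRACE",
    "DRVWY": "DRIVEWAY", "DRWY": "DRIVEWAY",
    "PKWY": "PARKWAY",
    "XING": "CROSSING",
    "XPWY": "EXPWY", "EXPY": "EXPWY",
}
# Multi-word phrases to drop before tokenizing
FILTER_PHRASES = [
    "CENTER LINE", "DEAD END", "CITY LIMIT", "TRAVEL LN",
    "TRAFFIC LN", "PROPERTY LINE", "PROPERTY LN",
]
# Single tokens to drop
FILTER_TOKENS = {
    "ROW", "CURB", "CB", "MEDIAN", "CENTERLINE", "CL", "APPROACH",
    "INTERSECTION", "ITS", "FEET", "FT",
    "N", "S", "E", "W", "NB", "SB", "EB", "WB",
}


def normalize(name):
    """Apply the filters and expansions to one street name."""
    text = name.upper()
    text = re.sub(r"\([^)]*\)", " ", text)  # drop anything in parentheses
    text = text.replace("'", "").replace("-", " ")  # strip punctuation
    for phrase in FILTER_PHRASES:
        text = re.sub(rf"\b{phrase}\b", " ", text)
    tokens = [EXPANSIONS.get(token, token) for token in text.split()]
    return " ".join(token for token in tokens if token not in FILTER_TOKENS)


def split_intersection(name):
    """Split an intersection written as STREET/STREET and normalize each part."""
    return [normalize(part) for part in name.split("/") if part.strip()]


cleaned = normalize("E MLK BLVD")
parts = split_intersection("LAMAR BLVD/W 5TH ST")
```

Dropping direction tokens outright loses information, so in practice they might be extracted into a separate field rather than discarded.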

@ChrispinP
Member

first QA complete, ready for review

@johnclary
Member Author

amazing work @ChrispinP! i reached out to Nathan W. for feedback:

i ran all of the street data through a matching algorithm, and chrispin went through the remaining 3k values and tried to match them to canonical street names.

i have the QA workbook (link) filtered to give you a sense of the kind of issues that remain.

there's some light cleaning to be done on my side with scripting. and a few more we can probably knock off this list.

but at the end of the day, i think we're going to have to import a lot of these regs into the system as-is.

something we could do (and will probably need anyway), is have fields like "street detail", "from street detail", and "to street detail". these would be freeform text fields and can be concatenated with the other fields to form the reg text.

i'm proposing we import the unmatched street name values into these "detail" fields. they can be cleaned up later, but that will unblock us from moving on to building the actual knack app
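to make the "detail" field idea concrete, here is a rough sketch of how the fields might be concatenated into reg text. the "STREET from X to Y" template, the ordering of name before detail, and the function name are all assumptions for illustration; the real reg text format would need to be confirmed.

```python
def build_reg_text(street="", street_detail="",
                   from_street="", from_street_detail="",
                   to_street="", to_street_detail=""):
    """Concatenate matched street names and freeform detail into reg text.

    Any field may be empty; an unmatched raw value can live in a detail
    field on its own. The output template is a hypothetical example.
    """
    def join(*parts):
        return " ".join(p.strip() for p in parts if p and p.strip())

    text = join(street, street_detail)
    start = join(from_street, from_street_detail)
    end = join(to_street, to_street_detail)
    if start:
        text = join(text, "from", start)
    if end:
        text = join(text, "to", end)
    return text


# The "to" street was never matched, so its raw value sits in the detail field
reg_text = build_reg_text(
    street="6TH ST",
    street_detail="(NB CURB)",
    from_street="CONGRESS AVE",
    to_street_detail="IH 35 SVRD SB",
)
```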

@johnclary johnclary added the Project: Traffic Register 2.0 Knack-based application for managing traffic regulations with GIS integration label Jan 28, 2022
@dianamartin
Contributor

Synced with Chrispin on 5/19. He confirmed that he completed this task but is unsure whether the street data QC is 100% accurate.

@ChrispinP
Member

ChrispinP commented Jun 8, 2022

no further action needed, closing as complete. John has provided both spreadsheets of data for the Regulations & Reg Docs with street data.
