Incorrect deceased/name parsing from twitter fields #204

Closed
rgreinho opened this issue Sep 30, 2019 · 0 comments · Fixed by #206

@rgreinho rgreinho commented Sep 30, 2019

Issue Type

  • Bug report

Current Behavior

The Arrested field in the article at http://austintexas.gov/news/traffic-fatality-60-4 throws the parser off:

Output:

{
    "case": "19-2661710",
    "crash": 60,
    "date": "09/23/2019",
    "fatalities": [
      {
        "age": 69,
        "dob": null,
        "ethnicity": "Undefined",
        "first": "Christian",
        "gender": "Undefined",
        "generation": "",
        "last": "White",
        "middle": "Livingston | White male | 01/09/1993 Debrah Callison |"
      }
    ],
    "latitude": 0.0,
    "link": "http://austintexas.gov/news/traffic-fatality-60-4",
    "location": "9000 block of S. Congress Avenue",
    "longitude": 0.0,
    "notes": "The preliminary investigation shows that the white, 2018 Toyota RAV-4 driven by Callison was traveling southbound in the outside lane in the 9000 block of S. Congress when she struck Christian Livingson. Mr. Livingston was walking in the roadway rather than on the east curb line sidewalk available to him. He was pronounced deceased at the scene.\n\n\tDebrah Callison remained on scene. She was found to be intoxicated and was arrested for DWI. There are no additional charges anticipated at this time.\n Debrah Callison",
    "time": "08:23 PM"
}

Expected Behavior

The deceased field contains valid values that we should be able to parse.

Possible Solution

  • Ignore the arrested field.
  • The value in the twitter field is correct as well. There may be a bug in the logic that extracts the values from this field.
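
A minimal sketch of the first option, not the actual ScrAPD implementation: split the description on known field labels first, then parse only the Deceased value, so that whatever follows the Arrested label never reaches the name parser. The label list, the function names `tokenize_description` and `parse_deceased`, and the sample input string are all assumptions made for illustration; the sample is assembled from the values shown in the output above and may not match the real page text.

```python
import re

# Hypothetical label list; the labels used by the real page may differ.
FIELD_LABELS = ("Case", "Date", "Time", "Location", "Deceased", "Arrested")


def tokenize_description(description):
    """Split a 'Label: value Label: value' string into a dict keyed by label."""
    pattern = r"\b(" + "|".join(FIELD_LABELS) + r"):"
    # With a capturing group, re.split returns
    # [prefix, label1, value1, label2, value2, ...].
    parts = re.split(pattern, description)
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}


def parse_deceased(deceased):
    """Parse 'First [Middle] Last | Ethnicity gender | MM/DD/YYYY' into a dict."""
    tokens = [t.strip() for t in deceased.split("|") if t.strip()]
    if not tokens:
        return {}
    name = tokens[0].split()
    person = {"first": name[0], "middle": " ".join(name[1:-1]), "last": name[-1]}
    if len(tokens) > 1:
        # Assumes the ethnicity is a single word followed by the gender.
        ethnicity, _, gender = tokens[1].partition(" ")
        person["ethnicity"] = ethnicity
        person["gender"] = gender
    if len(tokens) > 2:
        person["dob"] = tokens[2]
    return person


# Illustrative input assembled from the values in this issue (an assumption).
sample = "Deceased: Christian Livingston | White male | 01/09/1993 Arrested: Debrah Callison |"
fields = tokenize_description(sample)
person = parse_deceased(fields["Deceased"])  # the Arrested value is simply ignored
```

Tokenizing by label before parsing keeps each field's value isolated, which is also what would make crashes with multiple fatalities easier to handle: each Deceased value can be parsed independently.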
@rgreinho rgreinho added the kind/bug label Sep 30, 2019
@rgreinho rgreinho changed the title Incorrect name parsing Incorrect deceased/name parsing from twitter fields Sep 30, 2019
@rgreinho rgreinho self-assigned this Sep 30, 2019
@rgreinho rgreinho added this to To do in ScrAPD 3.0.0 Sep 30, 2019
@rgreinho rgreinho moved this from To do to In progress in ScrAPD 3.0.0 Sep 30, 2019
rgreinho added a commit to rgreinho/scrapd that referenced this issue Sep 30, 2019
Reverts the logic tokenizing the twitter description to its previous
version. This allows us to simplify processing crashes with:

* multiple fatalities
* arrested fields

Fixes scrapd#204
@rgreinho rgreinho mentioned this issue Sep 30, 2019
2 of 2 tasks complete
@rgreinho rgreinho added this to the ScrAPD 3 milestone Oct 2, 2019
rgreinho added a commit to rgreinho/scrapd that referenced this issue Oct 4, 2019
@rgreinho rgreinho closed this in #206 Oct 4, 2019
ScrAPD 3.0.0 automation moved this from In progress to Done Oct 4, 2019
rgreinho added a commit that referenced this issue Oct 4, 2019
Reverts the logic tokenizing the twitter description to its previous
version. This allows us to simplify processing crashes with:

* multiple fatalities
* arrested fields

Fixes #204