# work log - ground truth - evaluate disagreements

# Table of Contents

- [Setup](#Setup)

    - [Setup - Imports](#Setup---Imports)
    - [Setup - Initialize Django](#Setup---Initialize-Django)
    - [Setup - Tools](#Setup---Tools)

        - [Tool - copy `Article_Data` to user `ground_truth`](#Tool---copy-Article_Data-to-user-ground_truth)
        - [Tool - delete `Article_Data`](#Tool---delete-Article_Data)
        - [Tool - rebuild `Reliability_Names` for an article](#Tool---rebuild-Reliability_Names-for-an-article)

            - [Delete existing `Reliability_Names` for article](#Delete-existing-Reliability_Names-for-article)
            - [Make new `Reliability_Names`](#Make-new-Reliability_Names)

- [Evaluate disagreements](#Evaluate-disagreements)

    - [Tag disagreements as TODO](#Tag-disagreements-as-TODO)
    - [View disagreements](#View-disagreements)
    
        - [Disagreement evaluation](#Disagreement-evaluation)
        - [Disagreement resolution](#Disagreement-resolution)
        - [Resolution logs](#Resolution-logs)
        
            - [Evaluation log](#Evaluation-log)
            - [Ground-truth coding fixed](#Ground-truth-coding-fixed)
            - [`Reliability_Names` records merged](#Reliability_Names-records-merged)
            - [Deleted `Reliability_Names` records](#Deleted-Reliability_Names-records)

- [Notes](#Notes)

    - [Notes and questions](#Notes-and-questions)
    - [Errors](#Errors)

- [TODO](#TODO)

    - [Coding to look into](#Coding-to-look-into)
    - [Debugging](#Debugging)

- [DONE](#DONE)

    - [quotes that contain paragraph break](#quotes-that-contain-paragraph-break)

- [NEXT](#NEXT)

# Setup

- Back to [Table of Contents](#Table-of-Contents)

## Setup - Imports

- Back to [Table of Contents](#Table-of-Contents)

In [1]:
import datetime
import json
import six

print( "packages imported at " + str( datetime.datetime.now() ) )

packages imported at 2017-06-26 23:15:09.566103


In [2]:
%pwd

'/home/jonathanmorgan/work/sourcenet/django/research/work/msu_phd_work'

## Setup - Initialize Django

- Back to [Table of Contents](#Table-of-Contents)

First, initialize my dev django project, so I can run code in this notebook that references my django models and can talk to the database using my project's settings.

You need to have installed your virtualenv with django as a kernel, then select that kernel for this notebook.

In [3]:
%run django_init.py

django initialized at 2017-06-27 03:15:14.280982


Import any `sourcenet` or `sourcenet_analysis` models or classes.

In [5]:
# django imports
from django.contrib.auth.models import User

# sourcenet shared
from sourcenet.shared.person_details import PersonDetails

# sourcenet models.
from sourcenet.models import Article
from sourcenet.models import Article_Data
from sourcenet.models import Article_Subject
from sourcenet.models import Person
from sourcenet.shared.sourcenet_base import SourcenetBase
from sourcenet.tests.models.test_Article_Data_model import Article_Data_Copy_Tester

# sourcenet article_coding
from sourcenet.article_coding.article_coding import ArticleCoder
from sourcenet.article_coding.manual_coding.manual_article_coder import ManualArticleCoder

# sourcenet_analysis models.
from sourcenet_analysis.models import Reliability_Names
from sourcenet_analysis.reliability.reliability_names_builder import ReliabilityNamesBuilder

print( "sourcenet and sourcenet_analysis packages imported at " + str( datetime.datetime.now() ) )

sourcenet and sourcenet_analysis packages imported at 2017-06-27 03:31:10.313403


## Setup - Tools

- Back to [Table of Contents](#Table-of-Contents)

### Tool - copy Article_Data to user ground_truth

- Back to [Table of Contents](#Table-of-Contents)

Retrieve the ground truth user, then make a deep copy of an Article_Data record, assigning it to the ground truth user.

In [6]:
def copy_to_ground_truth_user( source_article_data_id_IN ):

    '''
    Accepts ID of Article_Data instance to copy to ground_truth user,
        for correcting coding error made by human coder.  Performs a deep
        copy of Article_Data instance, then assigns it to the ground_truth
        user.  Prints any validation errors, returns the new Article_Data.
    '''
    
    # return reference
    new_article_data_instance_OUT = -1
    
    # declare variables
    ground_truth_user = None
    ground_truth_user_id = -1
    id_of_article_data_to_copy = -1
    new_article_data = None
    new_article_data_id = -1
    validation_error_list = None
    validation_error_count = -1
    validation_error = None

    # set ID of article data we want to copy.
    id_of_article_data_to_copy = source_article_data_id_IN

    # get the ground_truth user's ID.
    ground_truth_user = SourcenetBase.get_ground_truth_coding_user()
    ground_truth_user_id = ground_truth_user.id

    # make the copy
    new_article_data = Article_Data.make_deep_copy( id_of_article_data_to_copy,
                                                    new_coder_user_id_IN = ground_truth_user_id )
    new_article_data_id = new_article_data.id

    # validate it.
    validation_error_list = Article_Data_Copy_Tester.validate_article_data_deep_copy( original_article_data_id_IN = id_of_article_data_to_copy,
                                                                                      copy_article_data_id_IN = new_article_data_id,
                                                                                      copy_coder_user_id_IN = ground_truth_user_id )

    # get error count:
    validation_error_count = len( validation_error_list )
    if ( validation_error_count > 0 ):

        # loop and output messages
        for validation_error in validation_error_list:

            print( "- Validation error: " + str( validation_error ) )

        #-- END loop over validation errors. --#

    else:

        # no errors - success!
        print( "Record copy a success (as far as we know)!" )

    #-- END check to see if validation errors --#

    print( "copied Article_Data id " + str( id_of_article_data_to_copy ) + " INTO Article_Data id " + str( new_article_data_id ) + " at " + str( datetime.datetime.now() ) )
    
    new_article_data_instance_OUT = new_article_data
    
    return new_article_data_instance_OUT

#-- END function copy_to_ground_truth_user() --#

print( "function copy_to_ground_truth_user() defined at " + str( datetime.datetime.now() ) )

function copy_to_ground_truth_user() defined at 2017-06-27 04:44:31.077522


In [None]:
# Example: set ID of article data we want to copy.
#copy_to_ground_truth_user( 2342 )

### Tool - delete Article_Data

- Back to [Table of Contents](#Table-of-Contents)

Delete the Article_Data whose ID you specify (intended only for cases where you accidentally create a "`ground_truth`" copy).

In [7]:
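def delete_article_data( article_data_id_IN, do_delete_IN = False ):

    '''
    Looks up the Article_Data with the ID passed in.  Only deletes it if
        do_delete_IN is True; otherwise just prints what it found.
    '''

    # declare variables
    article_data_id = -1
    article_data = None
    do_delete = False

    # set ID and delete flag.
    article_data_id = article_data_id_IN
    do_delete = do_delete_IN

    # get model instance ( get() raises DoesNotExist when there is no match ).
    try:

        article_data = Article_Data.objects.get( id = article_data_id )

    except Article_Data.DoesNotExist:

        print( "No Article_Data found for ID " + str( article_data_id ) )
        article_data = None

    #-- END try/except around lookup --#

    # got something?
    if ( article_data is not None ):

        # yes.  Delete?
        if ( do_delete == True ):

            # delete.
            print( "Deleting Article_Data: " + str( article_data ) )
            article_data.delete()

        else:

            # no delete.
            print( "Found Article_Data: " + str( article_data ) + ", but not deleting." )

        #-- END check to see if we delete --#

    #-- END check to see if Article_Data match. --#
    
#-- END function delete_article_data() --#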

print( "function delete_article_data() defined at " + str( datetime.datetime.now() ) )

function delete_article_data() defined at 2017-06-27 04:44:49.740607


### Tool - rebuild Reliability_Names for an article

- Back to [Table of Contents](#Table-of-Contents)

Steps:

- retrieve the Reliability_Names row(s) for article with a particular ID, and filter on label if one provided.
- delete the selected Reliability_Names row(s).
- set up a call to the Reliability_Names program that just generates data for:

    - the article in question
    - users in a desired order.
    - etc.

#### Delete existing Reliability_Names for article

- Back to [Table of Contents](#Table-of-Contents)

In [8]:
def delete_reliability_names_for_article( article_id_IN ):

    # declare variables
    article_id = -1
    label = ""
    do_delete = False
    row_string_list = None

    # first, get existing Reliability_Names rows for article and label.
    article_id = article_id_IN
    label = "prelim_month"
    do_delete = True

    # Do the delete
    row_string_list = Reliability_Names.delete_reliabilty_names_for_article( article_id,
                                                                             label_IN = label,
                                                                             do_delete_IN = do_delete )

    # print the strings.
    for row_string in row_string_list:

        # print it.
        print( row_string )

    #-- END loop over row strings --#

#-- END function delete_reliability_names_for_article() --#

print( "function delete_reliability_names_for_article() defined at " + str( datetime.datetime.now() ) )

function delete_reliability_names_for_article() defined at 2017-06-27 04:44:53.506888


#### Make new Reliability_Names

- Back to [Table of Contents](#Table-of-Contents)

In [9]:
def rebuild_reliability_names_for_article( article_id_IN, delete_existing_first_IN = True ):
    
    '''
    Remove existing Reliability_Names records for article, then rebuild them
        from related Article_Data that matches any specified criteria.
        
    Detailed logic:
    - remove old Reliability_Names for that article ( [Delete existing `Reliability_Names` for article](#Delete-existing-Reliability_Names-for-article) ).  Make sure to specify both label and Article ID, so you don't delete more than you intend.
    - re-run Reliability_Names creation for the article ( [Make new `Reliability_Names`](#Make-new-Reliability_Names) ).  Specify:

        - Article ID list (just put the ID of the article you want to reprocess in the list).
        - label: make sure this is the same as the label of the rest of your Reliability_Names records ("prelim_month").
        - Tag list: If you want to make even more certain that you don't do something unexpected, also specify the article tags that make up your current data set, so if you accidentally specify the ID of an article not in your data set, it won't process.  Current tag is "grp_month".
        - Coders to assign to which index in the Reliability_Names record, and in what priority.  You can assign multiple coders to a given index, for example, when multiple coders coded subsets of a data set, and you want their combined coding to be used as "coder 1" or "coder 2", for example.  See the cell for an example.
        - Automated coder type: You can specify the particular automated coding type you want for automated coder, to filter out coding done by other automated methods.  See the cell for an example for "OpenCalais v2".
    '''
    
    # django imports
    #from django.contrib.auth.models import User

    # sourcenet imports
    #from sourcenet.shared.sourcenet_base import SourcenetBase

    # sourcenet_analysis imports
    #from sourcenet_analysis.reliability.reliability_names_builder import ReliabilityNamesBuilder

    # declare variables
    my_reliability_instance = None
    tag_in_list = []
    article_id_in_list = []
    label = ""

    # declare variables - user setup
    current_coder = None
    current_coder_id = -1
    current_index = -1

    # declare variables - Article_Data filtering.
    coder_type = ""

    # delete old Reliability_Names?
    if ( delete_existing_first_IN == True ):
        
        # delete first
        delete_reliability_names_for_article( article_id_IN )
        
    #-- END check to see if we delete first --#
    
    # make reliability instance
    my_reliability_instance = ReliabilityNamesBuilder()

    #===============================================================================
    # configure
    #===============================================================================

    # list of tags of articles we want to process.
    tag_in_list = [ "grp_month", ]

    # list of IDs of articles we want to process:
    article_id_in_list = [ article_id_IN, ]

    # label to associate with results, for subsequent lookup.
    label = "prelim_month"

    # ! ====> map coders to indices

    # set it up so that...

    # ...the ground truth user has highest priority (4) for index 1...
    current_coder = SourcenetBase.get_ground_truth_coding_user()
    current_coder_id = current_coder.id
    current_index = 1
    current_priority = 4
    my_reliability_instance.add_coder_at_index( current_coder_id, current_index, priority_IN = current_priority )

    # ...coder ID 8 is priority 3 for index 1...
    current_coder_id = 8
    current_index = 1
    current_priority = 3
    my_reliability_instance.add_coder_at_index( current_coder_id, current_index, priority_IN = current_priority )

    # ...coder ID 9 is priority 2 for index 1...
    current_coder_id = 9
    current_index = 1
    current_priority = 2
    my_reliability_instance.add_coder_at_index( current_coder_id, current_index, priority_IN = current_priority )

    # ...coder ID 10 is priority 1 for index 1...
    current_coder_id = 10
    current_index = 1
    current_priority = 1
    my_reliability_instance.add_coder_at_index( current_coder_id, current_index, priority_IN = current_priority )

    # ...and automated coder (2) is index 2
    current_coder = SourcenetBase.get_automated_coding_user()
    current_coder_id = current_coder.id
    current_index = 2
    current_priority = 1
    my_reliability_instance.add_coder_at_index( current_coder_id, current_index, priority_IN = current_priority )

    # and only look at coding by those users.  And...

    # configure so that it limits to automated coder_type of OpenCalais_REST_API_v2.
    coder_type = "OpenCalais_REST_API_v2"
    #my_reliability_instance.limit_to_automated_coder_type = "OpenCalais_REST_API_v2"
    my_reliability_instance.automated_coder_type_include_list.append( coder_type )

    # output debug JSON to file
    #my_reliability_instance.debug_output_json_file_path = "/home/jonathanmorgan/" + label + ".json"

    #===============================================================================
    # process
    #===============================================================================

    # process articles
    my_reliability_instance.process_articles( tag_in_list,
                                              article_id_in_list_IN = article_id_in_list )

    # output to database.
    my_reliability_instance.output_reliability_data( label )

#-- END function rebuild_reliability_names_for_article() --#

print( "function rebuild_reliability_names_for_article() defined at " + str( datetime.datetime.now() ) )

function rebuild_reliability_names_for_article() defined at 2017-06-27 04:44:55.412160
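
The priority comments in the cell above suggest that, for a given index, the coder with the highest priority number who actually coded the article is the one whose coding gets used.  A minimal sketch of that selection rule, as an assumption about how `ReliabilityNamesBuilder` behaves rather than its actual implementation.  The function `pick_coder_for_index()` and the user ID 13 for `ground_truth` are hypothetical, just for illustration:

```python
def pick_coder_for_index(coder_to_priority, coders_who_coded_article):
    """
    coder_to_priority: dict of coder user ID -> priority for one index.
    coders_who_coded_article: set of coder user IDs with coding for the article.
    Returns the ID of the highest-priority coder who coded the article,
        or None if none of them did.  Assumes a higher number wins.
    """
    candidates = [
        (priority, coder_id)
        for coder_id, priority in coder_to_priority.items()
        if coder_id in coders_who_coded_article
    ]
    if not candidates:
        return None
    # max() compares priority first, so the highest priority wins.
    return max(candidates)[1]

# index 1 setup mirroring the cell above ( 13 is a placeholder ground_truth ID ).
index_1_coders = {13: 4, 8: 3, 9: 2, 10: 1}

# article coded by coder 8 and the automated coder ( 2 ): coder 8 fills index 1.
print(pick_coder_for_index(index_1_coders, {8, 2}))

# once a ground_truth copy exists, it takes over index 1.
print(pick_coder_for_index(index_1_coders, {13, 8, 2}))
```

If the real builder resolves priority differently (for example, lower number wins), only the `max()` call would change.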


# Evaluate disagreements

- Back to [Table of Contents](#Table-of-Contents)

Need to go through each disagreement and make sure that the ground truth is correct.  To calculate accuracy, precision, and recall, my human coding serves as the ground truth that the computer's coding is compared against.  So, will look at all the disagreements and make sure that the human coding is right.  This isn't perfect.  Cases where human and computer agree but are both wrong are still unaddressed, and catching them would effectively require me to re-code all the articles (which I could do...).  But, better than not checking.
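
To make the role of ground truth concrete, here is a rough sketch of precision and recall computed from two sets of person IDs for one article.  This is not the analysis code used in this project.  The function name and the IDs in the example are made up for illustration:

```python
def precision_recall(ground_truth_ids, computer_ids):
    """
    ground_truth_ids: person IDs the human ( ground truth ) coding detected.
    computer_ids: person IDs the automated coder detected.
    Returns ( precision, recall ) as floats.
    """
    truth = set(ground_truth_ids)
    found = set(computer_ids)

    # people both the human and the computer detected.
    true_positives = len(truth & found)

    # precision: share of computer detections that are in ground truth.
    precision = float(true_positives) / len(found) if found else 0.0

    # recall: share of ground truth people the computer detected.
    recall = float(true_positives) / len(truth) if truth else 0.0

    return precision, recall

# illustrative IDs only: human found 4 people, computer found 3, 2 in common.
precision, recall = precision_recall({1, 2, 3, 4}, {1, 2, 5})
print("precision = " + str(precision) + "; recall = " + str(recall))
```

Any human error left in the ground truth set shifts both numbers, which is why each disagreement gets checked by hand.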

## Tag disagreements as TODO

- Back to [Table of Contents](#Table-of-Contents)

First, assign "TODO" tag to all disagreements using the "View reliability name information" screen:

- [http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view](http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view).

To do this:

- First, enter the following in the fields there:

    - **Label:** "prelim_month"
    - **Coders to compare (1 through ==>):** 2
    - **Reliability names filter type:** Select "Disagree (only rows with disagreement between coders)"
    
- Click the "**Submit Query**" button.  This should load all the disagreement rows (424 after removing single-word names).
- Click the "**(all)**" link in the "**select**" column header to check the checkbox next to all of the records.
- In the "**Reliability names action:**" field, select "Add tag(s) to selected".
- In the "**Tag(s) - (comma-delimited):**" field, enter "`TODO`" (without the quotes).
- Click the "**Do Action**" button.

## View disagreements

- Back to [Table of Contents](#Table-of-Contents)

Evaluate disagreements using the "View reliability name information" screen:

- [http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view](http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view)

To start, enter the following in fields there:

- **Label:** "prelim_month"
- **Coders to compare (1 through ==>):** 2
- **Reliability names filter type:** Select "Lookup"
- **[Lookup] - Reliability_Names tags (comma-delimited):** Enter "`TODO`" (without the quotes).

Then click the "**Submit Query**" button.

You should see all the records with disagreements that still need to be evaluated (we remove "TODO" from records as we go to keep track of which we have evaluated).  To start, the same 424 records that had disagreements after removing single-word names should have the "TODO" tag.

### Disagreement evaluation

- Back to [Table of Contents](#Table-of-Contents)

Need to look at each instance where there is a disagreement and make sure the human coding is correct.

Most are probably instances where the computer screwed up, but since we are calling this human coding "ground truth", want to winnow out as much human error as possible.

For each disagreement, to check for coder error (like just capturing a name part for a person whose full name was in the story), click the linked article ID in the "Article ID" column.  It will take you to a view of the article that includes everyone who coded the article, with each detection of a mention or quotation displayed next to the paragraph where the person was first detected.

If the disagreement deals with mentions only (the person should not have been coded as quoted), it is OK to skip fixing it even if the human coder was in error, since mentions are not included in this work.  It is also OK to fix it if you want.

### Disagreement resolution

For each disagreement, click on the article ID link in the row to go to the article and check to see if the human coding for the disagreement in question is correct ( [http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/) ).

#### Human coder error

If human coder did not detect person or made some other kind of error:

- Setup (set variable values, then run the cell):

In [10]:
# Setup variables of interest.
resolve_article_id = 21120
human_article_data_id = 2348

- use the function "`copy_to_ground_truth_user()`" defined in section [Tool - copy Article_Data to user ground_truth](#Tool---copy-Article_Data-to-user-ground_truth) to create a copy of the person's `Article_Data` and assign it to coder "`ground_truth`".  Make a code cell and set up a call to "`copy_to_ground_truth_user()`", passing it the ID of the `Article_Data` you want to copy to `ground_truth`.  Example:
    
        # copy Article_Data 12345 to ground_truth user.
        copy_to_ground_truth_user( 12345 )

In [11]:
# copy Article_Data to ground_truth user.
copy_to_ground_truth_user( human_article_data_id )

Original Article_Data ID = 2348; Copy Article_Data ID = 3332


Article_Data_Notes ( count = 1 ):
- 2101 - Person Store JSON (likely from manual coding via article-code view). of type "json" for article_data: 2348 - minnesota1 - no coder_type -- Article: 21120 - Dec 05, 2009, Religion ( C3 ), UID: 12C757A1EDED04E0 - Founders aim to make retreat center a place to meet with God - Wingshadow sits on 60 acres on the Flat River near Greenville ( Grand Rapids Press, The )

Article_Author ( count = 1 ):

- 2423 (AA) - Ogg, Aaron ( id = 425; type = staff; capture_method = OpenCalais_REST_API_v2 )

----> Alternate_Author_Match ( count = 0 ):

Article_Subject ( count = 4 ):

- 8144 (AS) - Lewis, John ( id = 1761; capture_method = OpenCalais_REST_API_v2 ) (quoted; individual)

----> Alternate_Subject_Match ( count = 0 ):

----> Article_Subject_Mention ( count = 1 ):
----> - 21667 -  ( graf: 3; from word: 54; to word: 55; index: 342 ) - 8144 (AS) - Lewis, John ( id = 1761; capture_method = OpenCala

<Article_Data: 3332 - ground_truth - no coder_type -- Article: 21120 - Dec 05, 2009, Religion ( C3 ), UID: 12C757A1EDED04E0 - Founders aim to make retreat center a place to meet with God - Wingshadow sits on 60 acres on the Flat River near Greenville ( Grand Rapids Press, The )>

- fix the coding in `ground_truth`'s coding record:

    - If you want to stay logged in as your normal user while processing an error, do the following in a separate browser (I like Opera).
    - if this is the first time you've used the "`ground_truth`" user, log into the django admin ( [http://research.local/sourcenet/admin/](http://research.local/sourcenet/admin/) ) and:

        - set or reset the "`ground_truth`" user's password.
        - give it "staff status".

    - log in to the coding tool ( [http://research.local/sourcenet/sourcenet/article/code/](http://research.local/sourcenet/sourcenet/article/code/) ) as the "`ground_truth`" user and fix the coding for the article in question, then save.

- In the Reliability_Names disagreement view ( [http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view](http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view) ), remove the "`TODO`" tag from any items related to this disagreement and save.  Place this information in the [Evaluation log](#Evaluation-log) and the [Ground truth coding fixed](#Ground-truth-coding-fixed) log.

    - Click the checkbox in the "**select**" column next to the record whose evaluation is complete.
    - In the "**Reliability names action:**" field, select "_Remove tag(s) from selected_".
    - In the "**Tag(s) - (comma-delimited):**" field, enter "_`TODO`_" (without the quotes).
    - Click the "**Do Action**" button.

- rebuild Reliability_Names for just that article.

    - make a code cell and call "`rebuild_reliability_names_for_article()`", passing it the ID of the article whose Reliability_Names records you want to rebuild.  It will automatically delete existing and then rebuild, using all the right parameters.  Example:

            # rebuild Reliability_Names for article 12345
            rebuild_reliability_names_for_article( 12345 )

In [12]:
# rebuild Reliability_Names for article
rebuild_reliability_names_for_article( resolve_article_id )

Found 7 records.
- delete()-ing: 9698 - label: prelim_month - article ID: 21120 - Aaron Ogg ( 425 ) - coders: 12 ====> 1 - 8; 1; 425 ====> 2 - 2; 1; 425
- delete()-ing: 9704 - label: prelim_month - article ID: 21120 - C. John Lewis ( 2820 ) - coders: 12 ====> 1 - 8; 0; 0 ====> 2 - 2; 1; 2820
- delete()-ing: 9703 - label: prelim_month - article ID: 21120 - Dawn Lewis ( 2821 ) - coders: 12 ====> 1 - 8; 0; 0 ====> 2 - 2; 1; 2821
- delete()-ing: 9699 - label: prelim_month - article ID: 21120 - John Lewis ( 1761 ) - coders: 12 ====> 1 - 8; 1; 1761 ====> 2 - 2; 1; 1761
- delete()-ing: 9700 - label: prelim_month - article ID: 21120 - Stephen Lewis ( 1762 ) - coders: 12 ====> 1 - 8; 1; 1762 ====> 2 - 2; 1; 1762
- delete()-ing: 9702 - label: prelim_month - article ID: 21120 - A.W. Tozer ( 1764 ) - coders: 12 ====> 1 - 8; 1; 1764 ====> 2 - 2; 1; 1764
- delete()-ing: 9701 - label: prelim_month - article ID: 21120 - Dallas Willard ( 1763 ) - coders: 12 ====> 1 - 8; 1; 1763 ====> 2 - 2; 0; 0
- Arti

- Then, you'll need to re-fix any other problems with the article. Specifically:

    - load just the Reliability_Names for this article - [http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view](http://research.local/sourcenet/sourcenet/analysis/reliability/names/disagreement/view):

        - **Label:** "prelim_month"
        - **Coders to compare (1 through ==>):** 2
        - **Reliability names filter type:** Select "Lookup"
        - **[Lookup] - Associated Article IDs (comma-delimited):** Enter "`<article_id>`," (without the quotes).

    - check for single names, either to remove, or to tie an erroneously parsed name to the correct person (forgot to capture first name, for example).

        - If two people that should be tied together are not, you'll need to merge the two rows.  To merge two rows:

            - In the "**select**" checkbox, click the checkbox next to the erroneous entry that you want to merge into the correct entry.
            - In the "**merge INTO**" checkbox, click the checkbox next to the entry INTO WHICH you want to merge.
            - In the "**Reliability names action:**" field, select "Merge Coding --> FROM 'select' TO 'merge INTO'".
            - Click the "**Do Action**" button.

    - add again the "TODO" tag to any rows with disagreement, or if no disagreements, to the row that initiated this work.

        - Click the checkbox in the "**select**" column next to any records that are either disagreements or the person who initiated this work.
        - In the "**Reliability names action:**" field, select "_Add tag(s) to selected_".
        - In the "**Tag(s) - (comma-delimited):**" field, enter "_`TODO`_" (without the quotes).
        - Click the "**Do Action**" button.

If human and computer coding of the same person are so different that they split into different rows, merge the computer row into the human row, then remove the computer row.

- TK

Once you've evaluated and verified the human coding, remove the "`TODO`" tag from the current record (either from the single-article view above if you've removed all disagreements, or from the disagreement view if not):

- Click the checkbox in the "**select**" column next to the record whose evaluation is complete.
- In the "**Reliability names action:**" field, select "_Remove tag(s) from selected_".
- In the "**Tag(s) - (comma-delimited):**" field, enter "_`TODO`_" (without the quotes).
- Click the "**Do Action**" button.
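
When re-checking an article after a rebuild, the "delete()-ing" lines printed by `rebuild_reliability_names_for_article()` can be scanned for disagreements.  A rough sketch follows.  It assumes each "====>" chunk reads as "index - coder user ID; detected flag; person ID", which is my reading of the output above, not a documented format.  The function names are hypothetical:

```python
import re

# one coder chunk, e.g. "1 - 8; 1; 1761" ( assumed field order, see above ).
CODER_PATTERN = re.compile(r"^(\d+) - (\d+); (\d+); (\d+)$")

def parse_row_summary(line):
    """Return dict of index -> ( coder_id, detected, person_id ) for a line."""
    coders = {}
    # first piece is the row description, the rest are per-index chunks.
    for chunk in line.strip().split(" ====> ")[1:]:
        match = CODER_PATTERN.match(chunk.strip())
        if match:
            index, coder_id, detected, person_id = (int(v) for v in match.groups())
            coders[index] = (coder_id, detected, person_id)
    return coders

def is_disagreement(coders):
    """True if the indexes do not all share one ( detected, person_id ) pair."""
    outcomes = set((detected, person_id) for (_, detected, person_id) in coders.values())
    return len(outcomes) > 1

# two lines copied from the rebuild output for article 21120.
agree_line = "- delete()-ing: 9699 - label: prelim_month - article ID: 21120 - John Lewis ( 1761 ) - coders: 12 ====> 1 - 8; 1; 1761 ====> 2 - 2; 1; 1761"
differ_line = "- delete()-ing: 9701 - label: prelim_month - article ID: 21120 - Dallas Willard ( 1763 ) - coders: 12 ====> 1 - 8; 1; 1763 ====> 2 - 2; 1; 1763".replace("2 - 2; 1; 1763", "2 - 2; 0; 0")

for line in (agree_line, differ_line):
    print(str(is_disagreement(parse_row_summary(line))) + " <== " + line[:40])
```

This only compares detection and person ID.  It does not look at quoted versus mentioned, so it will miss that class of disagreement.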

### Resolution logs

- Back to [Table of Contents](#Table-of-Contents)

Table of Reliability_Names records with disagreements, then separate tables of those where:

- human coding had to be fixed.
- records for the same person needed to be merged together.
- coding had to be deleted.
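
Once the log tables below have rows in them, a quick tally of statuses and error types can help spot patterns.  A minimal sketch, assuming the table is pasted in as markdown text and no cell contains a "|" character.  The function name is made up, and the sample rows are abbreviated from the Evaluation log (links and long notes shortened):

```python
from collections import Counter

def tally_log(markdown_table):
    """Count values in the Status and Error? columns of a markdown table."""
    rows = []
    for line in markdown_table.strip().splitlines():
        # drop the outer pipes, then split into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        rows.append(cells)

    # first row is the header, second is the |------| separator row.
    header = rows[0]
    body = rows[2:]

    status_index = header.index("Status")
    error_index = next(i for i, name in enumerate(header) if name.startswith("Error?"))

    status_counts = Counter(row[status_index] for row in body)
    error_counts = Counter(row[error_index] for row in body)
    return status_counts, error_counts

sample_log = """
| ID | Name | Article | Article_Data_List | Status | Error? (SHB = Should Have Been)| Notes |
|------|------|------|------|------|------|------|
| 8362 | Jack Hagedorn | 20645 | 2322; 2984 | CORRECT | Mentioned, SHB Quoted | None |
| 8367 | Bo Damstra | 20647 | 2355 | CORRECT | MISSED | None |
| 10625 | John Chapin | 20749 | 3318; 3001 | INCOMPLETE | Coder missed a person | Limitation of coding application |
| 8843 | Martin Luther King Jr. | 20813 | 2408 | ERROR | Not a person | Removed from ground truth. |
"""

status_counts, error_counts = tally_log(sample_log)
print(status_counts.most_common())
print(error_counts.most_common())
```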

#### Evaluation log

- Back to [Table of Contents](#Table-of-Contents)

Track each Reliability_Names that we evaluate:

| ID | Name | Article | Article_Data_List | Status | Error? (SHB = Should Have Been)| Notes |
|------|------|------|------|------|------|------|
| 8362 | Jack Hagedorn | Article [20645](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20645) | Article_Data: [2322 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20645&article_data_id_select=2322); [2984 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20645&article_data_id_select=2984) | CORRECT | Mentioned, SHB Quoted | None |
| 8367 | Bo Damstra | Article [20647](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20647) | Article_Data: [2355 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20647&article_data_id_select=2355) | CORRECT | MISSED | None |
| 8369 | Kyle Moody | Article [20647](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20647) | Article_Data: [2355 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20647&article_data_id_select=2355); [2978 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20647&article_data_id_select=2978) | CORRECT | Mentioned, SHB Quoted | None |
| 8385 | Tony Baker | Article [20653](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20653) | Article_Data: [2323 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20653&article_data_id_select=2323); [2985 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20653&article_data_id_select=2985) | CORRECT | Mentioned, SHB Quoted | Odd grammar - no said verb |
| 8420 | Ahlanna Holmes | Article [20663](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20663) | Article_Data: [2321 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20663&article_data_id_select=2321); [2986 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20663&article_data_id_select=2986) | CORRECT | Quoted, SHB Mentioned | No clue - 4 mo. old murder victim |
| 8502 | Dean Agee | Article [20695](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20695) | Article_Data: [2358 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20695&article_data_id_select=2358) | CORRECT | MISSED | Missed entirely |
| 8576 | Dan Strikwerda | Article [20722](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20722) | Article_Data: [2325 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20722&article_data_id_select=2325); [2981 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20722&article_data_id_select=2981) | CORRECT | Mentioned, SHB Quoted | None |
| 8591 | Jennifer Granholm | Article [20724](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20724) | Article_Data: [2394 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20724&article_data_id_select=2394); [2988 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20724&article_data_id_select=2988) | CORRECT | Quoted, SHB Mentioned | "say" used, compound subject, not a direct quote.  |
| 8625 | Calvin Bosman | Article [20739](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20739) | Article_Data: [2326 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20739&article_data_id_select=2326); [2980 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20739&article_data_id_select=2980) | CORRECT | Quoted, SHB Mentioned | None |
| 8621 | Curtis Jacobs | Article [20739](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20739) | Article_Data: [2326 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20739&article_data_id_select=2326); [2980 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20739&article_data_id_select=2980) | CORRECT | Quoted, SHB Mentioned | Might be related to 3-gram - "As the parents of Curtis Jacobs openly described..." |
| 10625 | John Chapin | Article [20749](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20749) | Article_Data: [3318 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20749&article_data_id_select=3318); [3001 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20749&article_data_id_select=3001) | INCOMPLETE | Coder missed a person | Coder missed a person because of limitation of coding application ("John and Laura Chapin"), since fixed so I could fix in ground truth. |
| 8844 | Creative Arts Martin Luther King Jr. | Article [20813](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20813) | Article_Data: [2991 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20813&article_data_id_select=2991) | CORRECT | Not a person | Human got it right (well, sort of - they got the actual name, but they shouldn't have captured it at all).|
| 8845 | Harrison Park | Article [20813](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20813) | Article_Data: [2991 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20813&article_data_id_select=2991) | CORRECT | Not a person | None |
| 8843 | Martin Luther King Jr. | Article [20813](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20813) | Article_Data: [2408 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20813&article_data_id_select=2408) | ERROR | Not a person | Protocol states only capture name when referencing person, not as part of name of building, etc.  Removed from ground truth. |
| 8853 | George Heartwell | Article [20815](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20815) | Article_Data: [2409 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=2409); [2992 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=2992) | CORRECT | MISSED | meeting story - "voiced by Mayor George Heartwell", then "he said" was in next paragraph. |
| 10635 | George Heartwell | Article [20815](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20815) | Article_Data: [3319 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=3319); [2992 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=2992) | CORRECT | Mentioned, SHB Quoted | None |
| 10637 | James White | Article [20815](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20815) | Article_Data: [3319 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=3319); [2992 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=2992) | ERROR | Coder missed a person | None |
| 8858 | Carl Levin | Article [20818](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20818) | Article_Data: [2473 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20818&article_data_id_select=2473) | CORRECT | MISSED | mention |
| 10650 | Peter Hoekstra | Article [20843](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20843) | Article_Data: [3321 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20843&article_data_id_select=3321); [3000 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20843&article_data_id_select=3000) | CORRECT | Mentioned, SHB Quoted | "he said" in next paragraph. |
| 10648 | Armand Robinson | Article [20843](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20843) | Article_Data: [3321 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20843&article_data_id_select=3321); [3000 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20843&article_data_id_select=3000) | CORRECT | Mentioned, SHB Quoted | '"I'm not sure this is a winnable conflict," said the 80-year-old East Grand Rapids resident, an Obama supporter.' in next paragraph. |
| 8991 | Fred Meijer | Article [20854](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20854) | Article_Data: [3012 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20854&article_data_id_select=3012) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 8990 | Lena Meijer | Article [20854](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20854) | Article_Data: [3012 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20854&article_data_id_select=3012) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 9129 | Dick Ball | Article [20902](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20902) | Article_Data: [2477 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=2477); [3020 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=3020) | CORRECT | Mentioned, SHB Quoted | Rep. Dick Ball, R-Bennington Township, worried lawmakers were rushing into changes that could dilute the quality of education simply to chase "an unknown chance for unknown hundreds of millions of dollars." |
| 9125 | Wayne Kuipers | Article [20902](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20902) | Article_Data: [2477 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=2477); [3020 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=3020) | CORRECT | Mentioned, SHB Quoted | Sen. Wayne Kuipers, R-Holland, chairman of the Senate Education Committee, said... |
| 10660 | Arne Duncan | Article [20919](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20919) | Article_Data: [3322 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20919&article_data_id_select=3322); [3010 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20919&article_data_id_select=3010) | ERROR | Quoted, SHB Mentioned | "...U.S. Secretary of Education Arne Duncan has said he hopes to see..." not a quote. |
| 9225 | Godfrey Lee | Article [20929](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20929) | Article_Data: [3009 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20929&article_data_id_select=3009) | CORRECT | Not a person | "...the Godfrey Lee school district..." |
| 9227 | SuAnn Bruggink | Article [20929](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20929) | Article_Data: [2335 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20929&article_data_id_select=2335) | CORRECT | MISSED | None |
| 10665 | Brandon Dillon | Article [20930](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20930) | Article_Data: [3323 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20930&article_data_id_select=3323); [3006 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20930&article_data_id_select=3006) | ERROR | MISSED | Coder missed this person. |
| 10261 | Gary Scholten | Article [20968](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20968) | Article_Data: [2479 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20968&article_data_id_select=2479) | CORRECT | MISSED | "...the system, Register of Deeds Gary Scholten said." |
| 9349 | Fred Meijer | Article [20981](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20981) | Article_Data: [3027 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20981&article_data_id_select=3027) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 9353 | Stephen Neumer | Article [20981](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20981) | Article_Data: [2337 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20981&article_data_id_select=2337); [3027 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20981&article_data_id_select=3027) | CORRECT | Mentioned, SHB Quoted | A representative for McClendon, attorney Stephen Neumer, confirmed the development. - not "said"... |
| 9403 | Dick Morris | Article [21001](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21001) | Article_Data: [3032 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21001&article_data_id_select=3032) | ERROR | Human coder MISSED | None |
| 9408 | Jeff Hawkins | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [2440 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=2440) | ERROR | Miscategorized as Author, not Subject | None |
| 9414 | Jeff Hawkins | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [3024 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=3024) | ERROR | Miscategorized as Author, not Subject | None |
| 9412 | Robert Homan | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [3024 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=3024) | ERROR | MISSED | None |
| 9411 | Jan Lastocy | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [2440 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=2440); [3024 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=3024) | CORRECT | Mentioned, SHB Quoted | None |
| 9435 | Port Sheldon | Article [21017](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21017) | Article_Data: [3021 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21017&article_data_id_select=3021) | CORRECT | ERROR, Place names - List of Township names | "...spread across Holland, Park, Olive and Port Sheldon townships." |
| 9436 | Olive Sheldon | Article [21017](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21017) | Article_Data: [3021 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21017&article_data_id_select=3021) | CORRECT | ERROR, Place names - List of Township names | "...spread across Holland, Park, Olive and Port Sheldon townships." |
| 9443 | Burton Quick | Article [21023](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21023) | Article_Data: [3026 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21023&article_data_id_select=3026) | CORRECT | Place name | "...killing of Burton Quick Stop owner Mohammed Ghannam,..." |
| 9442 | Chris Cameron | Article [21023](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21023) | Article_Data: [2481 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21023&article_data_id_select=2481); [3026 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21023&article_data_id_select=3026) | CORRECT | Mentioned, SHB Quoted | "he said" followed name of murder victim, two paragraphs after Cameron was introduced. |
| 9440 | Robert Fryling | Article [21023](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21023) | Article_Data: [2481 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21023&article_data_id_select=2481); [3026 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21023&article_data_id_select=3026) | CORRECT | Quoted, SHB Mentioned | "he said" followed name of murder victim, but referred to Cameron, introduced two paragraphs earlier. |
| 9484 | Brent Vander Kolk | Article [21043](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21043) | Article_Data: [2342 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21043&article_data_id_select=2342); [3040 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21043&article_data_id_select=3040) | ERROR | Not really a quote. | "Vander Kolk earlier had described Beene as a well-liked and respected probation officer with a clean work record." |
| 9502 | Meg Ryan | Article [21047](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21047) | Article_Data: [2442 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21047&article_data_id_select=2442) | CORRECT | MISSED | There was a "/" after the name, and maybe because of how it was phrased? - "...and the Meg Ryan/Ashley Tisdale film "Sleepless Beauty" is tentative for February." |
| 9503 | Ashley Tisdale | Article [21047](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21047) | Article_Data: [2442 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21047&article_data_id_select=2442) | CORRECT | MISSED | There was a "/" before the name: "...and the Meg Ryan/Ashley Tisdale film "Sleepless Beauty" is tentative for February." |
| 9512 | Jennifer Granholm | Article [21051](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21051) | Article_Data: [2484 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21051&article_data_id_select=2484); [3039 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21051&article_data_id_select=3039) | CORRECT | Quoted, SHB Mentioned | verb "announced": Gov. Jennifer Granholm announced the appointment Friday... |
| 9593 | Patti Vab Syzkle | Article [21080](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21080) | Article_Data: [3037 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21080&article_data_id_select=3037) | CORRECT | misspelling in article | Actually Patti Van Syzkle |
| 9588 | Patti Van Syzkle | Article [21080](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21080) | Article_Data: [2343 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21080&article_data_id_select=2343); [3037 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21080&article_data_id_select=3037) | CORRECT | Mentioned, SHB Quoted | Direct quote was attributed to misspelled name (the item above this in this table).  Tough one.|
| 9659 | Justin Amash | Article [21108](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21108) | Article_Data: [2345 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21108&article_data_id_select=2345); [3046 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21108&article_data_id_select=3046) | CORRECT | Mentioned, SHB quoted | Dumb one - two direct quotes in the two sentences subsequent to this person's introduction. |
| 9669 | Sculpture Park | Article [21109](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21109) | Article_Data: [3045 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21109&article_data_id_select=3045) | CORRECT | Location Name | Separated by ampersand from rest of "a final tour of Frederik Meijer Gardens & Sculpture Park with his family at his side."  |
| 9692 | Ellsworth Kelly | Article [21116](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21116) | Article_Data: [2447 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21116&article_data_id_select=2447) | CORRECT | MISSED | "...most notably Ellsworth Kelly's "Blue White," a 25-foot- tall wall sculpture..." |
| 9693 | Rembrandt van Rijn | Article [21116](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21116) | Article_Data: [2447 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21116&article_data_id_select=2447) | CORRECT | MISSED | "...is Rembrandt van Rijn's "The Three Crosses," an early impression from the fourth state..." |
| 9701 | Dallas Willard | Article [21120](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21120) | Article_Data: [2348 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21120&article_data_id_select=2348) | CORRECT | MISSED | None |
| 9703 | Dawn Lewis | Article [21120](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21120) | Article_Data: [3050 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21120&article_data_id_select=3050) | ERROR | MISSED | "Lewis, 56, and his wife, Dawn, 54, live in the nine-bedroom center. Dawn Lewis cooks meals with volunteers' help." |

#### Ground truth coding fixed

- Back to [Table of Contents](#Table-of-Contents)

In some disagreements, the error is the human coder's.  For human error, we create a new "`ground_truth`" record and correct that record, which preserves the original coding (and evidence of errors) in case we want or need that information later.  The table below lists the rows where we had to fix ground truth.  To find the original coding, click the Article link.

| ID | Name | Article | Article_Data_List | Status | Error? (SHB = Should Have Been)| Notes |
|------|------|------|------|------|------|------|
| 10625 | John Chapin | Article [20749](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20749) | Article_Data: [3318 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20749&article_data_id_select=3318); [3001 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20749&article_data_id_select=3001) | INCOMPLETE | Coder missed a person | Coder missed a person because of limitation of coding protocol ("John and Laura Chapin") |
| 8843 | Martin Luther King Jr. | Article [20813](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20813) | Article_Data: [2408 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20813&article_data_id_select=2408) | ERROR | Not a person | Protocol states only capture name when referencing person, not as part of name of building, etc.  Removed from ground truth. |
| 10637 | James White | Article [20815](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20815) | Article_Data: [3319 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=3319); [2992 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20815&article_data_id_select=2992) | ERROR | Coder missed a person | None |
| 8991 | Fred Meijer | Article [20854](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20854) | Article_Data: [3012 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20854&article_data_id_select=3012) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 8990 | Lena Meijer | Article [20854](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20854) | Article_Data: [3012 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20854&article_data_id_select=3012) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 10660 | Arne Duncan | Article [20919](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20919) | Article_Data: [3322 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20919&article_data_id_select=3322); [3010 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20919&article_data_id_select=3010) | ERROR | Quoted, SHB Mentioned | "...U.S. Secretary of Education Arne Duncan has said he hopes to see..." not a quote. |
| 10665 | Brandon Dillon | Article [20930](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20930) | Article_Data: [3323 (coder=13)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20930&article_data_id_select=3323); [3006 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20930&article_data_id_select=3006) | ERROR | MISSED | Coder missed this person. |
| 9349 | Fred Meijer | Article [20981](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20981) | Article_Data: [3027 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20981&article_data_id_select=3027) | ERROR | MISSED | "..., and Fred and Lena Meijer and J.A. Woollam Foundation..." - not part of foundation name - comma fail |
| 9403 | Dick Morris | Article [21001](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21001) | Article_Data: [3032 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21001&article_data_id_select=3032) | ERROR | Human coder MISSED | None |
| 9408 | Jeff Hawkins | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [2440 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=2440) | ERROR | Miscategorized as Author, not Subject | None |
| 9414 | Jeff Hawkins | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [3024 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=3024) | ERROR | Miscategorized as Author, not Subject | None |
| 9412 | Robert Homan | Article [21007](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21007) | Article_Data: [3024 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21007&article_data_id_select=3024) | ERROR | MISSED | None |
| 9484 | Brent Vander Kolk | Article [21043](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21043) | Article_Data: [2342 (coder=8)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21043&article_data_id_select=2342); [3040 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21043&article_data_id_select=3040) | ERROR | Not really a quote. | "Vander Kolk earlier had described Beene as a well-liked and respected probation officer with a clean work record." |
| 9507 | Ivette Reyes | Article [21049](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21049) | Article_Data: [2443 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21049&article_data_id_select=2443); [3034 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21049&article_data_id_select=3034) | CORRECT | OpenCalais missed first name... | "This time of year is when the lines usually swell at Ivette Reyes' money transfer business" |


#### Reliability_Names records merged

- Back to [Table of Contents](#Table-of-Contents)

In some cases, we need to merge a single-name detection by OpenCalais with the full-name detection in `ground_truth`.  This happens when an OpenCalais error (it did not detect the full name) combines with a lookup error (it did not look up the right person because it missed part of his or her name).  Each merge is still followed by deleting one or more duplicate rows.

| ID FROM | ID INTO | Article | Article_Data | Article_Subject |
|------|------|------|------|------|
| 9506 | 9507 | Article [21049](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21049) | FROM [3034](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21049&article_data_id_select=3034)<br />TO [2443](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21049&article_data_id_select=2443) | 8494 (AS) - Reyes, Ivette ( id = 1899; capture_method = None ) (quoted; individual) ( quotes: 1; mentions: 1 ) ==> Name: Ivette Reyes |
| 9992 | 9993 | Article [22281](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=22281) | FROM [3083](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=22281&article_data_id_select=3083) TO [2635](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=22281&article_data_id_select=2635) | 9369 (AS) - Tassell, Leslie ( id = 2328; capture_method = None ) (mentioned; individual) ==> name: Leslie E. Tassell |
| 9989 | 9988 | Article [23169](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=23169) | FROM [3195](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=23169&article_data_id_select=3195) TO [2719](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=23169&article_data_id_select=2719) | 12020 (AS) - Keller ( id = 2903; capture_method = OpenCalais_REST_API_v2 ) (quoted; individual) ==> name: Keller |


In [None]:
reliability_names_id_from = "9989"
reliability_names_id_to = "9988"
article_id = "23169"
article_data_id_from = "3195"
article_data_id_to = "2719"
article_subject = "12020 (AS) - Keller ( id = 2903; capture_method = OpenCalais_REST_API_v2 ) (quoted; individual) ==> name: Keller"

markdown_string = "| "
markdown_string += reliability_names_id_from
markdown_string += " | "
markdown_string += reliability_names_id_to
markdown_string += " | Article ["
markdown_string += article_id
markdown_string += "](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id="
markdown_string += article_id
markdown_string += ") | FROM ["
markdown_string += article_data_id_from
markdown_string += "](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id="
markdown_string += article_id
markdown_string += "&article_data_id_select="
markdown_string += article_data_id_from
markdown_string += ") TO ["
markdown_string += article_data_id_to
markdown_string += "](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id="
markdown_string += article_id
markdown_string += "&article_data_id_select="
markdown_string += article_data_id_to
markdown_string += ") | "
markdown_string += article_subject
markdown_string += " |"

print( "Reliability_Names merge Markdown:\n" + markdown_string )
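
The cell above builds one table row by repeated string concatenation. As a minimal sketch, the same row could come from a small reusable function. The names `build_merge_row` and `BASE_URL` are hypothetical (they are not part of sourcenet), and the URL patterns are assumed to match the links already used in the tables in this log.

```python
# Hypothetical helper; BASE_URL is assumed from the link pattern used in this log.
BASE_URL = "http://research.local/sourcenet/sourcenet/article/article_data"


def build_merge_row( id_from, id_to, article_id, data_id_from, data_id_to, article_subject ):

    '''Return one Markdown row for the Reliability_Names merge table.'''

    # link to the article's view-with-text page
    article_link = "[{article_id}]({base}/view_with_text/?article_id={article_id})".format(
        article_id = article_id,
        base = BASE_URL
    )

    # same link pattern for the FROM and TO Article_Data records
    data_link = "[{data_id}]({base}/view/?article_id={article_id}&article_data_id_select={data_id})"
    link_from = data_link.format( data_id = data_id_from, base = BASE_URL, article_id = article_id )
    link_to = data_link.format( data_id = data_id_to, base = BASE_URL, article_id = article_id )

    # assemble the five table cells, then wrap them in pipes
    cell_list = [
        id_from,
        id_to,
        "Article " + article_link,
        "FROM " + link_from + " TO " + link_to,
        article_subject
    ]
    return "| " + " | ".join( cell_list ) + " |"


# same values as the cell above
markdown_string = build_merge_row(
    "9989",
    "9988",
    "23169",
    "3195",
    "2719",
    "12020 (AS) - Keller ( id = 2903; capture_method = OpenCalais_REST_API_v2 ) (quoted; individual) ==> name: Keller"
)
print( "Reliability_Names merge Markdown:\n" + markdown_string )
```

All arguments are passed as strings, as in the original cell, so the function does no type conversion.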

#### Deleted Reliability_Names records

- Back to [Table of Contents](#Table-of-Contents)

| ID | Article | Article_Data | Article_Subject | Type |
|------|------|------|------|------|
| 8618 | Article [20739](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20739) | Article_Data [2980](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20739&article_data_id_select=2980) | 11006 (AS) - Christopher ( id = 2776; capture_method = OpenCalais_REST_API_v2 ) (mentioned; individual) ==> name: Christopher | CORRECT |


In [None]:
# folded this code into the Reliability_Names delete screen ( sourcenet_analysis/views.py --> reliability_names_disagreement_view() ).
'''
reliability_names_id = "7956"
article_id = "21509"
article_data_id = "1660"
article_subject = "5498 (AS) - Jaidon ( id = 875; capture_method = OpenCalais_REST_API_v1 ) (mentioned; individual) ==> name: Jaidon"
    
markdown_string = "| "
markdown_string += reliability_names_id
markdown_string += " | Article ["
markdown_string += article_id
markdown_string += "](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id="
markdown_string += article_id
markdown_string += ") | Article_Data ["
markdown_string += article_data_id
markdown_string += "](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id="
markdown_string += article_id
markdown_string += "&article_data_id_select="
markdown_string += article_data_id
markdown_string += ") | "
markdown_string += article_subject
markdown_string += " |"

print( "Reliability_Names removal Markdown:\n" + markdown_string )
'''

# Notes

- Back to [Table of Contents](#Table-of-Contents)

## Notes and questions

- Back to [Table of Contents](#Table-of-Contents)

Notes and questions:

- TK

## Errors

- Back to [Table of Contents](#Table-of-Contents)

Errors:

- | 9129 | Dick Ball | Article [20902](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20902) | Article_Data: [2477 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=2477); [3020 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=3020) | CORRECT | Mentioned, SHB Quoted | Rep. Dick Ball, R-Bennington Township, worried lawmakers were rushing into changes that could dilute the quality of education simply to chase "an unknown chance for unknown hundreds of millions of dollars." |
- | 9125 | Wayne Kuipers | Article [20902](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=20902) | Article_Data: [2477 (coder=9)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=2477); [3020 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=20902&article_data_id_select=3020) | CORRECT | Mentioned, SHB Quoted | Sen. Wayne Kuipers, R-Holland, chairman of the Senate Education Committee, said... |
- | 9502 | Meg Ryan | Article [21047](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21047) | Article_Data: [2442 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21047&article_data_id_select=2442) | CORRECT | MISSED | There was a "/" after the name, and maybe because of how it was phrased? - "...and the Meg Ryan/Ashley Tisdale film "Sleepless Beauty" is tentative for February." |
- | 9669 | Sculpture Park | Article [21109](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21109) | Article_Data: [3045 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21109&article_data_id_select=3045) | CORRECT | Location Name | Separated by ampersand from rest of "a final tour of Frederik Meijer Gardens & Sculpture Park with his family at his side."  |
- | 9692 | Ellsworth Kelly | Article [21116](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21116) | Article_Data: [2447 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21116&article_data_id_select=2447) | CORRECT | MISSED | "...most notably Ellsworth Kelly's "Blue White," a 25-foot- tall wall sculpture..." |
- | 9693 | Rembrandt van Rijn | Article [21116](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21116) | Article_Data: [2447 (coder=10)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21116&article_data_id_select=2447) | CORRECT | MISSED | "...is Rembrandt van Rijn's "The Three Crosses," an early impression from the fourth state..." |
- | 9704 | C. John Lewis | Article [21120](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21120) | Article_Data: [3050 (coder=2)](http://research.local/sourcenet/sourcenet/article/article_data/view/?article_id=21120&article_data_id_select=3050) | CORRECT | ERROR | Created this name string from last part of one paragraph and first part of next: `...in Wake Forest, N.C.</p><p>John Lewis graduated from...` |

# TODO

- Back to [Table of Contents](#Table-of-Contents)

TODO:

- Want a way to limit to disagreements where quoted?  Might not - this is a start to assessing erroneous agreement.  If yes, 1 < coding time < 4 hours.

    - problem - `Reliability_Names.person_type` only has three values - "author", "subject", "source" - might need a row-level measure of "`has_mention`", "`has_quote`" to more readily capture rows where disagreement is over quoted-or-not.
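A rough sketch of how such row-level flags could be used, assuming each coder's result for a row is reduced to two booleans. The `has_mention` and `has_quote` keys and the function name here are hypothetical; they are not existing `Reliability_Names` fields.

```python
def is_quoted_or_not_disagreement( coder_rows_IN ):

    '''
    Accepts a list of dicts, one per coder, each with hypothetical
        boolean keys "has_mention" and "has_quote".  Returns True only
        when every coder detected the person but the coders differ on
        whether the person was quoted.
    '''

    # need at least two coders to have a disagreement.
    if len( coder_rows_IN ) < 2:
        return False

    # did every coder detect the person at all?
    everyone_mentions = all( row[ "has_mention" ] for row in coder_rows_IN )

    # more than one distinct has_quote value means the coders differ.
    quote_values = set( row[ "has_quote" ] for row in coder_rows_IN )

    return everyone_mentions and len( quote_values ) > 1


# made-up example rows - coder 1 says quoted, coder 2 says mentioned only.
quote_mismatch = [
    { "has_mention": True, "has_quote": True },
    { "has_mention": True, "has_quote": False },
]

# made-up example rows - coder 2 missed the person entirely.
missed_person = [
    { "has_mention": True, "has_quote": True },
    { "has_mention": False, "has_quote": False },
]

print( is_quoted_or_not_disagreement( quote_mismatch ) )
print( is_quoted_or_not_disagreement( missed_person ) )
```

A row where one coder missed the person entirely is deliberately excluded, since that is a detection disagreement rather than a quoted-or-not disagreement.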

## Coding to look into

- Back to [Table of Contents](#Table-of-Contents)

Coding decisions to look at more closely:

- TK

## Debugging

- Back to [Table of Contents](#Table-of-Contents)

Issues to debug:

- TK

# DONE

- Back to [Table of Contents](#Table-of-Contents)

## quotes that contain paragraph break

- Back to [Table of Contents](#Table-of-Contents)

Quotes with newlines in them (not sure how that is captured on the way to the server, in the database, etc.) break the article coder: [http://research.local/sourcenet/sourcenet/article/code/](http://research.local/sourcenet/sourcenet/article/code/).

When you load JSON that contains quote text that spans lines, the newlines within the text cause the JSON parsing to break.  Looks like it is read and parsed correctly when submitted to the server, except for the graf number, which evaluates to -1.  That is a bug, too, since there are no newlines in any of the text we are looking at, just paragraph breaks.
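A minimal Python illustration of the general failure, assuming the JSON ends up with a raw (unescaped) newline inside a string value. The payload and the `quote_text` key are made up, and the article coder page itself may well do its parsing in JavaScript, where a raw newline inside a JSON string is also invalid.

```python
import json

# made-up quote text that spans a paragraph break.
quote_text = "first paragraph of the quote.\nsecond paragraph of the quote."

# building the JSON by hand leaves a raw newline inside the string value.
raw_json = '{"quote_text": "' + quote_text + '"}'

try:

    json.loads( raw_json )
    raw_parse_ok = True

except ValueError:

    # raw control characters are not allowed inside a JSON string.
    raw_parse_ok = False

# json.dumps escapes the newline as \n, so the round trip works.
escaped_json = json.dumps( { "quote_text": quote_text } )
round_trip = json.loads( escaped_json )[ "quote_text" ]

print( "raw parse OK?: " + str( raw_parse_ok ) )
print( "round trip matches?: " + str( round_trip == quote_text ) )
```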

How to fix?:

- First try stripping out any stretches of multiple white space characters and substituting a space.  This should work with all of the rest of the code on the server.  Can implement in javascript, and for sanity check also in Python that processes received JSON.
- If rest of code doesn't play nice with reformatting, then maybe figure out how to escape the carriage returns and line feeds, and might need to update the "find in text" functions, too.
- Turns out that fixing this for quotations that span paragraphs might then break things when there are extra spaces within a paragraph.  So, leaving the code as is for now; need to fix that paragraph in the article instead.
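The first option in the list above can be sketched in Python as follows. This is only an illustration with made-up text, and `collapse_whitespace` is a hypothetical helper, not a function in the existing code. It also shows the snag noted in the last bullet: once the selected text is collapsed, it no longer matches article text that contains extra spaces unless the article text is collapsed the same way.

```python
import re

def collapse_whitespace( text_IN ):

    '''
    Replaces every run of whitespace (spaces, tabs, carriage returns,
        line feeds) with a single space, then strips the ends.
    '''

    return re.sub( r"\s+", " ", text_IN ).strip()


# made-up quote text that spans a paragraph break.
quote = "worried lawmakers were rushing.\r\n\r\n  He said more review was needed."
cleaned_quote = collapse_whitespace( quote )

# made-up article paragraph with a double space inside it.
article_text = "He said  more review was needed."
selected = collapse_whitespace( "said  more review" )

# collapsed selection no longer matches the raw article text...
found_in_raw = selected in article_text

# ...but does match if the article text is collapsed the same way.
found_in_normalized = selected in collapse_whitespace( article_text )

print( cleaned_quote )
print( "found in raw text?: " + str( found_in_raw ) )
print( "found in normalized text?: " + str( found_in_normalized ) )
```

Normalizing both sides would change character offsets relative to the stored article text, so any "find in text" code that relies on positions would need the same treatment.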

Examples:

- Article 21001: [http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21001](http://research.local/sourcenet/sourcenet/article/article_data/view_with_text/?article_id=21001)
    
    - user minnesota1, article 21001
    - user ground_truth, article 21001 (copied from minnesota1).

# NEXT

- Back to [Table of Contents](#Table-of-Contents)