GitHub - appeler/clean-names: Deduplicate and parse list of `dirty names'

Clean Names

The script takes a CSV file with a column 'Name' containing 'dirty names', that is, names in varying formats: lastname firstname, firstname lastname, middlename lastname firstname, etc. (see the sample input file). It produces a CSV file that has all the columns of the original file plus the following columns: 'uniqid', 'FirstName', 'MiddleInitial/Name', 'LastName', 'RomanNumeral', 'Title', 'Suffix'. By default, the script removes duplicate names (see the sample output file).
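As a rough illustration of the input/output contract described above, the following sketch computes the header an output file would have and which input rows survive de-duplication. It is not the script's code: the function name `describe_output`, the sample rows, and the rule that duplicates are exact repeats of the raw name string are all assumptions made for this example. The actual script may match duplicates differently, for instance after parsing.

```python
import csv
import io

# Columns the README says are appended to the original columns.
ADDED_COLUMNS = ["uniqid", "FirstName", "MiddleInitial/Name", "LastName",
                 "RomanNumeral", "Title", "Suffix"]


def describe_output(input_csv_text, name_column="Name", keep_duplicates=False):
    """Return (output header, input rows kept) for a CSV given as text.

    Hypothetical helper: it only models the column layout and a naive
    de-duplication step, not the actual name parsing.
    """
    reader = csv.DictReader(io.StringIO(input_csv_text))
    fieldnames = list(reader.fieldnames or [])
    if name_column not in fieldnames:
        raise ValueError("missing column: " + name_column)
    header = fieldnames + ADDED_COLUMNS

    seen = set()
    kept = []
    for row in reader:
        name = row[name_column]
        # Assumption: a duplicate is an exact repeat of the raw name string.
        if not keep_duplicates and name in seen:
            continue
        seen.add(name)
        kept.append(row)
    return header, kept


# Made-up rows for illustration only; see sample_input.csv for real input.
sample = "id,Name\n1,SMITH JOHN A\n2,John A. Smith\n3,SMITH JOHN A\n"
header, rows = describe_output(sample)
print(header)
print([row["id"] for row in rows])
```

Passing `keep_duplicates=True` in this sketch keeps all three rows, which corresponds to the script's `-a` option.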

Application

The script was used to fix names in CF-Scores from the Database on Ideology, Money in Politics, and Elections. The processed database with clean names is posted on Harvard DVN.

Installation

Clone this repository

git clone https://github.com/soodoku/clean-names.git

Navigate to clean-names
Run python setup.py install

Using Clean Names

Usage: process_names.py [options] <input file>

Command Line Options

  -h, --help            show this help message and exit
  -o OUTFILE, --out=OUTFILE
                        Output file in CSV (default: sample_output.csv)
  -c COLUMN, --column=COLUMN
                        Column name in CSV that contains Names (default: Name)
  -a, --all             Export all names (do not take duplicate names out)
                        (default: False)
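The options above can be modelled with a small parser. This is a sketch only: the layout of the help text resembles Python's `optparse` output, but whether `process_names.py` actually uses `optparse`, and the attribute names `outfile`, `column`, and `all`, are assumptions made for this example.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser mirroring the documented options (not the real script)."""
    parser = OptionParser(usage="usage: %prog [options] <input file>")
    parser.add_option("-o", "--out", dest="outfile",
                      default="sample_output.csv",
                      help="Output file in CSV (default: %default)")
    parser.add_option("-c", "--column", dest="column", default="Name",
                      help="Column name in CSV that contains Names "
                           "(default: %default)")
    parser.add_option("-a", "--all", dest="all", action="store_true",
                      default=False,
                      help="Export all names (do not take duplicate names "
                           "out) (default: %default)")
    return parser


# Keep duplicates (-a) and read sample_input.csv as the positional argument.
options, args = build_parser().parse_args(["-a", "sample_input.csv"])
print(options.outfile, options.column, options.all, args)
```

Options that are not given fall back to the documented defaults, so here the output file is `sample_output.csv` and the name column is `Name`.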

Example

 python process_names.py -a sample_input.csv

License

Scripts are released under the MIT License

