A maximum-strength name parser for record linkage.

Nominally Logo

nominally: a maximum-strength name parser for record linkage

License: AGPL 3.0+ | Distributed via PyPI | Maintainability rated at Code Climate | Builds at CircleCI | Test coverage at Coveralls | Documentation at Read the Docs | Latest commit at GitHub

🔗 Names

Nominally simplifies and parses a personal name written in Western name order into six core fields: title, first, middle, last, suffix, and nickname.

Typically, nominally is used to parse entire lists or pd.Series of names en masse. This package includes a command line tool to parse a single name for convenient one-off testing and examples.

Nominally produces fields primarily suitable for comparisons across or within datasets. As such, names come out formatted for data without regard to human syntactic preference: "de von ausfern, mr johann g" rather than "Mr. Johann G. de von Ausfern".
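
To illustrate why normalized fields help with comparison, here is a minimal sketch that builds a comparison key from a parsed name. The function linkage_key and its choice of fields are hypothetical and not part of nominally. The two records are hand-written dicts in the six-field shape, not actual parser output.

```python
from typing import Dict, Tuple


def linkage_key(fields: Dict[str, str]) -> Tuple[str, str, str]:
    """Hypothetical comparison key: last name, first name, middle initial.

    Title, suffix, and nickname are ignored here; whether that is
    appropriate depends on the datasets being linked.
    """
    middle_initial = fields.get("middle", "")[:1]
    return (fields.get("last", ""), fields.get("first", ""), middle_initial)


# Hand-written records in the six-field shape (assumed, for illustration).
record_a = {
    "title": "mr", "first": "james", "middle": "",
    "last": "blankinsop", "suffix": "jr", "nickname": "jimmy",
}
record_b = {
    "title": "", "first": "james", "middle": "",
    "last": "blankinsop", "suffix": "jr", "nickname": "",
}

# The records differ in title and nickname but share the same key.
print(linkage_key(record_a) == linkage_key(record_b))  # True
```

Because the fields are already lowercase and consistently ordered, a plain tuple comparison is enough for this kind of exact match.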

📓 Getting Started

Call parse_name() to parse out the six core fields:

$ python -q
>>> from nominally import parse_name
>>> parse_name("Blankinsop, Jr., Mr. James 'Jimmy'")
{
  'title': 'mr',
  'first': 'james',
  'middle': '',
  'last': 'blankinsop',
  'suffix': 'jr',
  'nickname': 'jimmy'
}
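
For parsing a whole list of names, one possible approach is to parse each distinct string once and reuse the result for duplicates. The sketch below assumes only that the parser takes a string and returns a dict, as parse_name() does above. The helper parse_unique is hypothetical and not part of nominally.

```python
from typing import Callable, Dict, Iterable, List


def parse_unique(
    names: Iterable[str],
    parser: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Parse each distinct name once and return results in input order."""
    cache: Dict[str, Dict[str, str]] = {}
    results: List[Dict[str, str]] = []
    for name in names:
        if name not in cache:
            cache[name] = parser(name)
        # Copy so callers can modify one row without affecting its duplicates.
        results.append(dict(cache[name]))
    return results


# Usage, assuming nominally is installed:
#   from nominally import parse_name
#   rows = parse_unique(["DR. PEACHES BARTKOWICZ", "DR. PEACHES BARTKOWICZ"], parse_name)
```

With pandas, passing parse_name to Series.map or Series.apply should work along similar lines, though it would not deduplicate on its own.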

Dive into the Name class to parse and recreate a string...

>>> from nominally import Name
>>> n = Name("DR. PEACHES BARTKOWICZ")
>>> n
Name({'title': 'dr', 'first': 'peaches', 'middle': '', 'last': 'bartkowicz', 'suffix': '', 'nickname': ''})
>>> str(n)
'bartkowicz, dr peaches'

...or use the dict...

>>> dict(n)
{
  'title': 'dr',
  'first': 'peaches',
  'middle': '',
  'last': 'bartkowicz',
  'suffix': '',
  'nickname': ''
}
>>> list(n.values())
['dr', 'peaches', '', 'bartkowicz', '', '']

...or retrieve a more elaborate set of attributes...

>>> n.report()
{
  'raw': 'DR. PEACHES BARTKOWICZ',
  'cleaned': {'dr peaches bartkowicz'},
  'parsed': 'bartkowicz, dr peaches',
  'list': ['dr', 'peaches', '', 'bartkowicz', '', ''],
  'title': 'dr',
  'first': 'peaches',
  'middle': '',
  'last': 'bartkowicz',
  'suffix': '',
  'nickname': ''
}

...or capture individual attributes.

>>> n.first
'peaches'
>>> n['last']
'bartkowicz'
>>> n.get('title')
'dr'
>>> n.raw
'DR. PEACHES BARTKOWICZ'

🖥️ Command Line

For a quick report, invoke the nominally command line tool:

$ nominally "DR. PEACHES BARTKOWICZ"
       raw: DR. PEACHES BARTKOWICZ
   cleaned: dr. peaches bartkowicz
    parsed: bartkowicz, dr peaches
      list: ['dr', 'peaches', '', 'bartkowicz', '', '']
     title: dr
     first: peaches
    middle:
      last: bartkowicz
    suffix:
  nickname:

🔬 Worked Examples

Binder hosts live Jupyter notebooks walking through examples of nominally.

     csv.ipynb on mybinder.org

     pandas_simple.ipynb on mybinder.org

These notebooks and additional examples reside in the Nominally Examples repository.
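
As a rough idea of what a CSV workflow might look like (this is a sketch, not the content of the notebooks), parsed names can be written out with the standard library's csv.DictWriter. The helper write_parsed and the FIELDS constant are hypothetical names. The sample row reuses the six-field output shown in Getting Started.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# The six core fields, in the order nominally lists them.
FIELDS = ["title", "first", "middle", "last", "suffix", "nickname"]


def write_parsed(rows: Iterable[Dict[str, str]], handle: TextIO) -> None:
    """Write six-field dicts as CSV with a header row."""
    # extrasaction="ignore" drops any keys beyond the six core fields.
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


parsed = [
    {"title": "dr", "first": "peaches", "middle": "",
     "last": "bartkowicz", "suffix": "", "nickname": ""},
]

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
write_parsed(parsed, buffer)
print(buffer.getvalue())
```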

🧙‍ Author

Matt VanEseltine

https://pypi.org/user/matvan/

matvan@umich.edu

https://github.com/vaneseltine

https://twitter.com/vaneseltine

https://stackoverflow.com/users/7846185/matt-vaneseltine

💡 Acknowledgements

Nominally started as a fork of the python-nameparser package and has benefited considerably from this origin, especially the wealth of examples and tests developed for python-nameparser.
