Dart best-effort parsing for names with unknown formats (like unstructured user input)

jack-r-warren/best_effort_parser

Best Effort Parser

Author: Jack Warren

Parse unstructured user input, with customizable behavior and output types. Parsing is currently available for names and dates.

Name Parsing

best_effort_parser/name.dart

Provides parsing of names by categorizing different parts:

  • family: A person's last name(s)
  • given: A person's first and middle name(s)
  • dropping particle: Particle(s) before a person's last name that are omitted when only the last name is shown
  • non-dropping particle: Particle(s) before a person's last name that are kept when only the last name is shown
  • suffix: Abbreviations after a person's last name
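
The difference between the two kinds of particle is easiest to see when a name is shortened to its last name. The following is a minimal sketch of that rule, not the package's own types: `NameParts`, `familyOnly`, and `full` are hypothetical names made up for illustration.

```dart
/// Hypothetical container for the five categories listed above.
/// This is an illustration only, not a class exported by the package.
class NameParts {
  final List<String> given;
  final List<String> droppingParticle;
  final List<String> nonDroppingParticle;
  final List<String> family;
  final List<String> suffix;

  const NameParts({
    this.given = const [],
    this.droppingParticle = const [],
    this.nonDroppingParticle = const [],
    this.family = const [],
    this.suffix = const [],
  });

  /// Last-name-only form: the dropping particle is left out,
  /// the non-dropping particle stays attached to the family name.
  String get familyOnly => [...nonDroppingParticle, ...family].join(' ');

  /// Full form, in "given, particles, family, suffix" order.
  String get full => [
        ...given,
        ...droppingParticle,
        ...nonDroppingParticle,
        ...family,
        ...suffix,
      ].join(' ');
}

void main() {
  const fontaine = NameParts(
    given: ['Jean'],
    droppingParticle: ['de'],
    nonDroppingParticle: ['La'],
    family: ['Fontaine'],
  );
  print(fontaine.familyOnly); // La Fontaine
  print(fontaine.full); // Jean de La Fontaine
}
```

Here "de" disappears in the short form because it is a dropping particle, while "La" remains because it is non-dropping.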

Handles a wide range of formats: beyond just "first last" and "last, first", particles and suffixes are parsed from any reasonably correct position where a user may place them.

Example:

name_example.dart

import 'package:best_effort_parser/name.dart';

void main(List<String> arguments) =>
    print(NameParser.basic().parse(arguments.join(' ')).diagnosticString());

Demo:

λ  dart name_example.dart 'Jack Warren'
[Given]: Jack [Family]: Warren

λ  dart name_example.dart 'La Fontaine, Jean de'
[Given]: Jean [Dropping Particle]: de [Non-dropping Particle]: La [Family]: Fontaine

λ  dart name_example.dart 'Gates, Bill III'
[Given]: Bill [Family]: Gates [Suffix]: III

λ  dart name_example.dart 'Willem de Kooning'
[Given]: Willem [Dropping Particle]: de [Family]: Kooning

Customization of both parsing and output type is available.

Date Parsing

best_effort_parser/date.dart

Provides parsing of dates by collecting years, months, and days and assembling those parts into a list. Each entry in that output list represents a single date, so a string containing multiple dates or a range produces multiple entries.

Example:

date_example.dart

import 'package:best_effort_parser/date.dart';

void main(List<String> arguments) => 
    DateParser.basic().parse(arguments.join(' ')).forEach(print);

Demo:

λ  dart date_example.dart 'January 1st, 2019'
[Day]: 1 [Month]: 1 [Year]: 2019

λ  dart date_example.dart '1/2/3'
[Day]: 2 [Month]: 1 [Year]: 2003

λ  dart date_example.dart '10/10/90 - 3/13/18'
[Day]: 10 [Month]: 10 [Year]: 1990
[Day]: 13 [Month]: 3 [Year]: 2018

λ  dart date_example.dart 'Spring-Summer 2010'
[Month]: 3 [Year]: 2010
[Month]: 6 [Year]: 2010

λ  dart date_example.dart '1999-6-15'
[Day]: 15 [Month]: 6 [Year]: 1999

λ  dart date_example.dart '40 20 10'
[Day]: 20 [Month]: 10 [Year]: 1940

As the last example shows, the parser does its best even with very odd input. There, 40 can't be a day or a month, and 20 can't be a month, so a year-day-month format is used for that date only.
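
The elimination reasoning above can be sketched as a search over candidate field orders. This is only an illustration, not the package's implementation: `Field`, `candidateOrders`, `fits`, and `assign` are hypothetical names, the preference order is an assumption chosen to agree with the demos above, and two-digit years are left unexpanded.

```dart
/// The role a numeric part can play in a date.
enum Field { year, month, day }

// Assumed preference order (hypothetical); the first order in which
// every part is plausible wins.
const candidateOrders = [
  [Field.month, Field.day, Field.year],
  [Field.day, Field.month, Field.year],
  [Field.year, Field.month, Field.day],
  [Field.year, Field.day, Field.month],
];

/// Whether [value] is plausible for [field].
bool fits(Field field, int value) {
  if (field == Field.month) return value >= 1 && value <= 12;
  if (field == Field.day) return value >= 1 && value <= 31;
  return value >= 0; // year: any non-negative number
}

/// Returns the first plausible assignment for three parts, or null.
Map<Field, int>? assign(List<int> parts) {
  if (parts.length != 3) return null;
  for (final order in candidateOrders) {
    var ok = true;
    for (var i = 0; i < 3; i++) {
      if (!fits(order[i], parts[i])) ok = false;
    }
    if (ok) {
      return {for (var i = 0; i < 3; i++) order[i]: parts[i]};
    }
  }
  return null;
}

void main() {
  // 40 cannot be a month or a day and 20 cannot be a month,
  // so only year-day-month fits.
  print(assign([40, 20, 10]));
  // 1999 can only be a year, so year-month-day is chosen.
  print(assign([1999, 6, 15]));
}
```

Trying orders one at a time keeps the choice local to each date, which matches the "for that date only" behavior described above.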

Customization of both parsing and output type is available.

Feature requests and bugs

Please file feature requests and bugs at the issue tracker.
