usaddress-java

Javadoc

usaddress-java is a Java port of the usaddress Python library for parsing unstructured United States address strings into address components, using NLP methods.

What this can do: Using a probabilistic model, it makes educated guesses in identifying address components, even in tricky cases where rule-based parsers typically break down.

What this cannot do: It cannot identify address components with perfect accuracy, nor can it verify that a given address is correct/valid.

How to use the usaddress-java library

  1. Add the usaddress Maven dependency to your pom.xml:

    <dependency>
      <groupId>io.github.dgileadi.usaddress</groupId>
      <artifactId>usaddress</artifactId>
      <version>1.0.0</version>
    </dependency>

  2. Parse some addresses!

    Note that parse and parseAndClean behave differently:

    import io.github.dgileadi.usaddress.Address;
    import io.github.dgileadi.usaddress.AddressParser;
    
    // ...
    
    String address = "123 Main St. Suite 100 Chicago, IL";
    
    // The parse method splits the address string into components and labels each one.
    Address parsed = AddressParser.parse(address);
    
    // The parseAndClean method tries to be a little smarter:
    // it merges consecutive components and strips commas.
    Address cleaned = AddressParser.parseAndClean(address);

For more details, see the API documentation.
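
To illustrate what "merge consecutive components and strip commas" means in practice, here is a minimal, self-contained sketch. It does not call usaddress-java and is not the library's implementation: the `LabeledToken` record, the `mergeAndClean` helper, and the hand-assigned labels are all hypothetical, and it assumes that "consecutive components" means adjacent tokens sharing the same label. It requires Java 16 or later for the record syntax.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeAndCleanSketch {

    /** A token paired with a label. Hypothetical type, not part of usaddress-java. */
    record LabeledToken(String text, String label) {}

    /**
     * Joins runs of adjacent tokens that share a label and strips trailing commas.
     * Returns label -> merged text, in order of first appearance.
     */
    static Map<String, String> mergeAndClean(List<LabeledToken> tokens) {
        Map<String, String> merged = new LinkedHashMap<>();
        String previousLabel = null;
        for (LabeledToken token : tokens) {
            // Remove any commas at the end of the token ("York," -> "York").
            String text = token.text().replaceAll(",+$", "");
            if (text.isEmpty()) {
                continue; // the token was nothing but commas
            }
            if (token.label().equals(previousLabel)) {
                // Same label as the previous token: append to the existing component.
                merged.merge(token.label(), text, (a, b) -> a + " " + b);
            } else if (merged.containsKey(token.label())) {
                // The same label reappearing later is ambiguous; this sketch rejects it.
                throw new IllegalArgumentException(
                        "Label repeated non-consecutively: " + token.label());
            } else {
                merged.put(token.label(), text);
            }
            previousLabel = token.label();
        }
        return merged;
    }

    public static void main(String[] args) {
        // Labels are assigned by hand for illustration; they are not model output.
        List<LabeledToken> tokens = List.of(
                new LabeledToken("123", "AddressNumber"),
                new LabeledToken("Main", "StreetName"),
                new LabeledToken("St.", "StreetNamePostType"),
                new LabeledToken("Suite", "OccupancyType"),
                new LabeledToken("100", "OccupancyIdentifier"),
                new LabeledToken("New", "PlaceName"),
                new LabeledToken("York,", "PlaceName"),
                new LabeledToken("NY", "StateName"));

        mergeAndClean(tokens)
                .forEach((label, text) -> System.out.println(label + ": " + text));
    }
}
```

In this sketch the two `PlaceName` tokens collapse into a single "New York" entry and the trailing comma is dropped, while every other token keeps its own entry. Whether the library handles repeated, non-adjacent labels the same way is not stated here, so check the API documentation before relying on that behaviour.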

Building & testing the code in this repo

To build a development version of usaddress-java on your machine, run the following commands:

    git clone https://github.com/dgileadi/usaddress-java.git
    cd usaddress-java
    mvn clean install

Copyright

Copyright (c) 2023 David Gileadi.

Original code copyright (c) 2014 Atlanta Journal Constitution.

Released under the MIT License.
