mdonigian/Crocodilopolis

Crocodilopolis

A library for parsing city data out of legal entity names

Legal entity names often contain tokens that reflect the city in which the entity operates, for example "Microsoft New York", "Goldman Sachs Los Angeles", "Bank of Switzerland Geneve", and "Max's New York Towing Company". Crocodilopolis (named after the longest continuously inhabited city) parses this city data out of such names. The current version (v0.1) requires the country to be provided as an ISO 3166-1 alpha-2 code, and for entities located in the United States the two-letter state code must be provided as well. The next version (v0.2) will remove this requirement and use a detection method to parse country and state data when it is available.
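To make the idea of "city tokens inside a name" concrete, here is a minimal sketch of one way such matching could work: scan runs of consecutive tokens and prefer the longest run found in a set of known cities. This is an illustration only, not the matching logic Crocodilopolis uses; the class `CityTokenSketch`, the method `findCity`, and the tiny city set in the usage note are all hypothetical.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Illustrative only: NOT the library's implementation.
final class CityTokenSketch {

    /**
     * Returns the longest run of consecutive whitespace-separated tokens
     * in entityName that appears in knownCities (expected in lowercase).
     */
    static Optional<String> findCity(String entityName, Set<String> knownCities) {
        String[] tokens = entityName.toLowerCase(Locale.ROOT).trim().split("\\s+");
        // Try longer windows first so that "new york" wins over "york".
        for (int len = tokens.length; len >= 1; len--) {
            for (int start = 0; start + len <= tokens.length; start++) {
                String candidate =
                        String.join(" ", Arrays.copyOfRange(tokens, start, start + len));
                if (knownCities.contains(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }
}
```

With a hypothetical city set such as `Set.of("new york", "york", "los angeles")`, calling `CityTokenSketch.findCity("Max's New York Towing Company", cities)` yields `Optional.of("new york")`. A real city list is far larger and ambiguous across countries, which is presumably why v0.1 asks for the country and state up front.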

Example
```java
RegionIdentifier regionIdentifier = new RegionIdentifier();
String entityName = "Max's New York Towing Company";
String stateProvince = "ny";
String country = "us";
Assert.assertEquals("new york", regionIdentifier.getCityRegionUS(entityName, stateProvince, country));
```
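The example passes the state and country in lowercase ("ny", "us"). Whether the library also accepts other casings is not documented here, so callers may want to normalize and validate these arguments before the call. The helper below is a sketch under that assumption; `RegionInputSketch` and its methods are hypothetical and not part of Crocodilopolis. It relies only on the JDK's `Locale.getISOCountries()` list of two-letter country codes.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical caller-side helper; not part of the library.
final class RegionInputSketch {

    /** Trims, validates against the JDK's ISO 3166 two-letter codes, and lowercases. */
    static String normalizeCountry(String country) {
        if (country == null) {
            throw new IllegalArgumentException("country is required");
        }
        String upper = country.trim().toUpperCase(Locale.ROOT);
        if (!Arrays.asList(Locale.getISOCountries()).contains(upper)) {
            throw new IllegalArgumentException("not an ISO 3166-1 alpha-2 code: " + country);
        }
        // Assumption: lowercase, matching the casing used in the README example.
        return upper.toLowerCase(Locale.ROOT);
    }

    /** Requires a two-letter state for US entities; passes other states through lowercased. */
    static String normalizeState(String country, String state) {
        boolean isUs = "us".equals(normalizeCountry(country));
        if (state == null || state.trim().isEmpty()) {
            if (isUs) {
                throw new IllegalArgumentException("state is required for US entities");
            }
            return null;
        }
        String normalized = state.trim().toLowerCase(Locale.ROOT);
        if (isUs && !normalized.matches("[a-z]{2}")) {
            throw new IllegalArgumentException("expected a two-letter US state: " + state);
        }
        return normalized;
    }
}
```

This check only confirms the shape of a US state code (two letters), not that it names a real state.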
Notes
The RegionIdentifier class should be instantiated only once and injected into the classes that use it. An instance requires ~1.2 GB of memory because it loads a global list of cities into memory. The next version (v0.2) will reduce this memory footprint.
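If no dependency injection framework is in use, one way to guarantee a single instance is a small lazy holder that builds the object on first use and hands the same reference to every caller. The `SharedInstance` class below is a hypothetical sketch of that pattern, not something the library provides; it is generic so it compiles without Crocodilopolis on the classpath.

```java
import java.util.function.Supplier;

// Hypothetical helper: creates the wrapped object at most once, on first use.
final class SharedInstance<T> implements Supplier<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T local = instance;
        if (local == null) {
            // Double-checked locking: only one thread runs the expensive factory.
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Usage would look like `Supplier<RegionIdentifier> regions = new SharedInstance<>(RegionIdentifier::new);`, with `regions` passed to the constructors of the classes that need it. The holder only guards construction; whether a single RegionIdentifier can safely be called from several threads at once is not stated here and should be checked separately.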
