Parser for string values according to the OSM opening hours specification

simonpoole/OpeningHoursParser

OpeningHoursParser

This is a very simplistic parser for string values according to the OSM opening hours specification. It is used in a number of OpenStreetMap projects, for example in Vespucci. As the opening hours specification is currently reasonably stable, you shouldn't expect lots of activity in this repository.

It parses 147'209 (91%) of 161'265 unique test strings in non-strict mode. The remaining 14'056 are most likely genuine errors; spot checking shows that they have obvious issues. In strict mode a further 15'993 fail (total 30'072).

"In the wild" there are currently 3.4 million opening hour values in OSM. Of these we successfully parse 99.1%, leaving 31'000 that fail (in non-strict mode). Parsing a single value successfully on a 15 year old PC (slower than a modern mobile phone for this kind of workload) takes on average 0.05 ms; put differently, we can easily parse 20'000 values per second in a single thread. Parsing a non-compliant value takes 0.1 ms on average. This is somewhat slower because we restart parsing, potentially multiple times, to determine as many issues as possible in one go. This could certainly be sped up by hand-coding the parser instead of using a parser generator. However, given that it takes just 3 minutes to parse all of the 3.4 million values mentioned above, and that in real life scenarios many other factors will dominate the run time, this would be a substantial waste of time.
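As a rough cross-check of the figures above, the following sketch redoes the arithmetic in whole microseconds. The class and method names are made up for illustration and are not part of the library. It assumes every value parses at the 0.05 ms average and ignores the slower, failing values.

```java
// Hypothetical helper, not part of OpeningHoursParser: re-derives the
// throughput figures quoted above using integer arithmetic only.
final class ThroughputEstimate {
    private ThroughputEstimate() {
    }

    /** Values parsed per second, given the average cost per value in microseconds. */
    static long valuesPerSecond(long microsPerValue) {
        return 1_000_000L / microsPerValue;
    }

    /** Whole seconds needed to parse the given number of values. */
    static long totalSeconds(long values, long microsPerValue) {
        return values * microsPerValue / 1_000_000L;
    }

    public static void main(String[] args) {
        long microsPerValue = 50L; // 0.05 ms per successfully parsed value
        System.out.println(valuesPerSecond(microsPerValue));          // prints 20000
        System.out.println(totalSeconds(3_400_000L, microsPerValue)); // prints 170
    }
}
```

170 seconds is just under the 3 minutes quoted for the full set of values.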

Deviations from the grammar as of this version of the opening hours specification in all modes:

  • case-insensitive
  • leading 0s in times optional
  • Unicode EN DASH (U+2013) and EM DASH (U+2014) characters are accepted in place of a hyphen
  • various Unicode whitespace characters are ignored
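To illustrate what the dash and whitespace deviations mean for an input string, here is a minimal sketch of an equivalent pre-normalisation step. It is not how the library does it (the parser handles these characters itself), and the class name, the method name and the exact set of whitespace characters are assumptions made for the example.

```java
// Hypothetical illustration only: maps the tolerated dash characters to '-'
// and any Unicode whitespace or space separator to a plain ASCII space.
final class OpeningHoursNormalizer {
    private OpeningHoursNormalizer() {
    }

    static String normalize(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\u2013' || c == '\u2014') {
                // EN DASH and EM DASH are treated like a hyphen
                sb.append('-');
            } else if (Character.isWhitespace(c) || Character.isSpaceChar(c)) {
                // isSpaceChar also covers NO-BREAK SPACE, which isWhitespace excludes
                sb.append(' ');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // EN DASH between the weekdays, NO-BREAK SPACE before the time, EM DASH in the time span
        System.out.println(normalize("Mo\u2013Fr\u00A009:00\u201417:00")); // prints Mo-Fr 09:00-17:00
    }
}
```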

In strict mode, time ranges whose end time is numerically earlier than their start time and that do not use the extended time format will cause a parse exception to be thrown. This can be overridden when calling the parser.
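The condition described above can be sketched as follows. This is an illustration and not the library's code: the class and method names are hypothetical, and it assumes that "extended time format" means writing an end time past midnight with an hour value above 24 (for example 26:00 for 02:00 on the following day).

```java
// Hypothetical illustration of the strict-mode time range check.
final class TimeRangeCheck {
    private TimeRangeCheck() {
    }

    /** Converts "HH:MM" to minutes; hours above 24 (extended time) are allowed. */
    static int toMinutes(String hhmm) {
        String[] parts = hhmm.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** True if the end time is numerically earlier than the start time. */
    static boolean endsBeforeStart(String start, String end) {
        return toMinutes(end) < toMinutes(start);
    }

    public static void main(String[] args) {
        System.out.println(endsBeforeStart("22:00", "02:00")); // prints true: would be flagged
        System.out.println(endsBeforeStart("22:00", "26:00")); // prints false: extended time
        System.out.println(endsBeforeStart("09:00", "17:00")); // prints false: ordinary range
    }
}
```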

In non-strict mode the following further differences are allowed:

  • three-character weekday abbreviations
  • German two-letter weekday abbreviations
  • single 0 for minutes
  • minutes in times optional
  • spaces and more than one leading zero in minutes are ignored
  • "." and "h" as minutes separators
  • AM and PM time specifications are allowed (plus A.M. and P.M.)
  • holidays in weekday range
  • superfluous ":" after weekday range
  • 24/7 rules with preceding year, week and date selectors are corrected to 00:00-24:00 time spans
  • single digit month day numbers (Jan 1 instead of Jan 01)
  • list of month days after months ( Jan 1,4,5 )
  • " to " and some translations of it in lieu of a hyphen for ranges
  • date ranges that do not have the month day specified are corrected (Jan - Feb 15 -> Jan 1 - Feb 15)
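As an example of the kind of correction this list describes, the sketch below converts an AM/PM hour to the 24-hour form that the specification uses. It follows the conventional reading of 12-hour times (12 AM is midnight, 12 PM is noon). Whether the library treats every edge case the same way is an assumption, and the class and method names are hypothetical.

```java
// Hypothetical illustration: conventional 12-hour to 24-hour conversion.
final class AmPmConversion {
    private AmPmConversion() {
    }

    /**
     * @param hour12 hour on a 12-hour clock, 1 to 12
     * @param pm     true for PM, false for AM
     * @return the hour on a 24-hour clock, 0 to 23
     */
    static int to24Hour(int hour12, boolean pm) {
        if (hour12 < 1 || hour12 > 12) {
            throw new IllegalArgumentException("hour must be between 1 and 12: " + hour12);
        }
        int hour = hour12 % 12; // 12 maps to 0, so 12 AM becomes 00 and 12 PM becomes 12
        return pm ? hour + 12 : hour;
    }

    public static void main(String[] args) {
        System.out.println(to24Hour(9, false));  // prints 9
        System.out.println(to24Hour(5, true));   // prints 17
        System.out.println(to24Hour(12, false)); // prints 0
        System.out.println(to24Hour(12, true));  // prints 12
    }
}
```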

Converting the data structures generated by parsing back to strings will result in correct data according to the specification.

Usage

try {
	OpeningHoursParser parser = new OpeningHoursParser(
		new ByteArrayInputStream(line.getBytes()));
	List<Rule> rules = parser.rules(strict, strictTime);
	// ...
} catch(OpeningHoursParseException e) {
	// ...
	// e.getExceptions() will return a List<OpeningHoursParseException> 
	// containing more than one Exception if more than one issue was found 
}

Detailed documentation can be found in the JavaDoc.

Including in your project

We publish releases to Maven Central. The following snippets for build.gradle will make OpeningHoursParser available in your Gradle project:

repositories {
    mavenCentral()
}
dependencies {
    // "implementation" replaces the "compile" configuration, which was removed in Gradle 7
    implementation "ch.poole:OpeningHoursParser:0.28.2"
}

Tests

Besides unit tests, we run the parser on ~160'000 opening hours values from the OSM database and compare the result with previous output to detect any changes in behaviour. If you are running these on Windows, you may need to add org.gradle.jvmargs=-Dfile.encoding=UTF-8 to your gradle.properties to force correct use of the UTF-8 encoded test input.

Building

The project uses Gradle for building. The standard Gradle tasks for the Java plugin are documented at https://docs.gradle.org/current/userguide/java_plugin.html. They can be invoked on the command line by running gradlew or gradlew.bat with the name of the task, for example gradlew jar to create the jar archive.

Note: the project has no runtime dependencies, and the jar file resulting from the build process is self-sufficient.

Contributing

Pull requests are welcome.

As this library is used in a number of Android projects, please restrict the use of Java 8 features and APIs to the Android supported subset; see https://developer.android.com/studio/write/java8-support. Check that any APIs used are supported on all Android platform API versions 10 and later. This is mainly an issue for APIs that were introduced with Java 7. This restriction will likely be relaxed for future versions.

Currently, the library is self-contained and doesn't have any runtime dependencies outside of basic Java support. It would be nice if we could keep it like that.

Translation

The error messages produced by the library are translated on Transifex, see https://www.transifex.com/openstreetmap/openinghoursparser/.
