bensentropy/currencydetect

Currency Detect - natural language currency code extraction

A library that detects the currency of a price written in natural language, using simple heuristics in a rule-based system.

The rules are as follows:

  1. Check whether the first three characters of the string are a currency code.

  2. Check whether the currency symbol has a one-to-one mapping from symbol to currency code (e.g. ฿ to THB).

  3. Check whether the TLD of the URL where the text was found has a one-to-one mapping with a currency symbol (e.g. .th to THB). If more than one country uses the respective currency symbol, select the currency of the country with the largest GDP. TLDs are excluded if they fall outside the 90% threshold band of a log-transformed linear model of two variables, GDP and TLD count (e.g. .me, .tk).

  4. Default to USD if the text uses a decimal point, or to EUR if the text uses a decimal comma.

  5. If all else fails, fall back to USD.
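The rule order above can be sketched as a simple fall-through. This is only an illustration, not the library's implementation: the namespace, the `guess-code` function, and the three tiny lookup tables are hypothetical, and rule 3 is reduced to a direct TLD-to-code lookup (the GDP tie-break and the TLD exclusion model are left out).

```clojure
(ns currencydetect.rules-sketch
  (:require [clojure.string :as str]))

;; Hypothetical, deliberately tiny tables for illustration only.
(def known-codes #{"USD" "EUR" "GBP" "NZD" "THB"})
(def unambiguous-symbols {"฿" "THB"})
(def tld->code {"th" "THB", "uk" "GBP", "nz" "NZD"})

(defn guess-code
  "Applies rules 1-5 in order and returns the first currency code found.
  `tld` may be nil when no URL is available."
  [text tld]
  (let [text   (str/trim text)
        prefix (subs text 0 (min 3 (count text)))]
    (or
      ;; Rule 1: the first three characters are a currency code.
      (known-codes prefix)
      ;; Rule 2: the text contains a symbol that maps to exactly one code.
      (some (fn [[sym code]] (when (str/includes? text sym) code))
            unambiguous-symbols)
      ;; Rule 3 (simplified): the TLD maps to a code.
      (tld->code tld)
      ;; Rule 4: decimal comma suggests EUR, decimal point suggests USD.
      (cond
        (re-find #"\d,\d{1,2}$" text)  "EUR"
        (re-find #"\d\.\d{1,2}$" text) "USD")
      ;; Rule 5: fall back to USD.
      "USD")))
```

With these assumed tables, `(guess-code "NZD 10" nil)` stops at rule 1, while `(guess-code "£20.00" "uk")` falls through to the TLD lookup because £ is not in the unambiguous-symbol table.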

These rules are currently a speculative approximation with much room for improvement. Suggestions are welcome.

Resources

Installation

Leiningen

Clojars Project

Usage

(ns testproject.core
  (:require [currencydetect.core :refer [parse-price]]))

(parse-price "£20.00" "http://www.r.co.uk")
;; => {:amount 20.0M, :code "GBP", :tld "uk"}

(parse-price "NZD 10")
;; => {:amount 10M, :code "NZD", :tld nil}
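The `:tld` value in the results above is the last label of the URL's host name. How the library extracts it is not shown here; the following is a minimal sketch of one way to do it, with a hypothetical `url->tld` helper built on `java.net.URI`. It assumes a syntactically valid URL (the `URI` constructor throws on malformed input).

```clojure
(ns currencydetect.tld-sketch
  (:require [clojure.string :as str])
  (:import (java.net URI)))

(defn url->tld
  "Returns the last dot-separated label of the URL's host, lower-cased.
  Returns nil when url is nil or has no host component."
  [url]
  (when url
    (some-> (.getHost (URI. url))
            (str/split #"\.")
            last
            str/lower-case)))
```

Under this sketch, `(url->tld "http://www.r.co.uk")` yields `"uk"` and `(url->tld nil)` yields `nil`, which lines up with the two `:tld` values in the usage examples.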

License

Copyright © 2015 Ben Olsen. Distributed under the Apache License 2.0.
