Brought everything up to 2013 standards, and bye Hoe, hello Bundler

1 parent 3fc7106 commit 4e2fe86744909c9e7b89901c3597a26459cab42f @peterc committed Mar 7, 2013
17 .gitignore
@@ -0,0 +1,17 @@
+*.gem
+*.rbc
+.bundle
+.config
+.yardoc
+Gemfile.lock
+InstalledFiles
+_yardoc
+coverage
+doc/
+lib/bundler/man
+pkg
+rdoc
+spec/reports
+test/tmp
+test/version_tmp
+tmp
4 Gemfile
@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+
+# Specify your gem's dependencies in whatlanguage2.gemspec
+gemspec
22 LICENSE.txt
@@ -0,0 +1,22 @@
+Copyright (c) 2008-2013 Peter Cooper
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
0 README
No changes.
57 README.txt → README.md
@@ -1,23 +1,21 @@
-whatlanguage
- by Peter Cooper
- http://www.petercooper.co.uk/
- http://www.rubyinside.com/
+# whatlanguage
+
+by Peter Cooper
-== DESCRIPTION:
-
Text language detection. Quick, fast, memory efficient, and all in pure Ruby. Uses Bloom filters for aforementioned speed and memory benefits.
-Works with Dutch, English, Farsi, French, German, Swedish, Portuguese, Russian and Spanish out of the box.
+Works with Dutch, English, Farsi, French, German, Italian, Pinyin, Swedish, Portuguese, Russian and Spanish out of the box.
-== FEATURES/PROBLEMS:
+## Important note
-* It can be made far more efficient at the comparison stage, but all in good time..! It still beats literal dictionary approaches.
-* No filter selection yet, you get 'em all loaded.
-* Tests are reasonably light.
+This library was first built in 2007 and has received a few minor updates over the years. There are now more efficient and effective algorithms for doing language detection which I am investigating for a WhatLanguage 2.0.
+
+This library has been updated to be distributed and to work on modern Ruby implementations but other than that, has had no improvements.
-== SYNOPSIS:
+## Synopsis
+
+Full Example
- Full Example
require 'whatlanguage'
texts = []
@@ -30,37 +28,40 @@ Works with Dutch, English, Farsi, French, German, Swedish, Portuguese, Russian a
texts.each { |text| puts "#{text[0..18]}... is in #{text.language.to_s.capitalize}" }
- Initialize WhatLanguage with all filters
+Initialize WhatLanguage with all filters
+
wl = WhatLanguage.new(:all)
- Return language with best score
+Return language with best score
+
wl.language(text)
- Return hash with scores for all relevant languages
+Return hash with scores for all relevant languages
+
wl.process_text(text)
- Convenience method on String
+Convenience method on String
+
"This is a test".language # => "English"
-== REQUIREMENTS:
+## Requirements
-* None, minor libraries (BloominSimple and BitField) included with this release.
+None, minor libraries (BloominSimple and BitField) included with this release.
-== INSTALLATION:
+## Installation
- gem sources -a http://gems.github.com
- sudo gem install peterc-whatlanguage
+ gem install whatlanguage
- To test, go into irb, then:
+To test, go into irb, then:
- require 'whatlanguage'
- "Je suis un homme".language
+ require 'whatlanguage'
+ "Je suis un homme".language
-== LICENSE:
+## License
-(The MIT License)
+MIT License
-Copyright (c) 2007-2008 Peter Cooper
+Copyright (c) 2007-2013 Peter Cooper
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
19 Rakefile
@@ -1,17 +1,6 @@
-# -*- ruby -*-
+require "bundler/gem_tasks"
+require 'rake/testtask'
-require 'rubygems'
-require 'hoe'
-require './lib/whatlanguage.rb'
+Rake::TestTask.new
-Hoe.new('whatlanguage', WhatLanguage::VERSION) do |p|
- p.rubyforge_name = 'whatlanguage'
- p.author = 'Peter Cooper'
- p.email = 'whatlanguage@peterc.org'
- p.summary = 'Fast, quick, textual language detection'
- p.description = p.paragraphs_of('README.txt', 2..5).join("\n\n")
- p.url = "http://rubyforge.org/projects/whatlanguage/"
- p.changes = p.paragraphs_of('History.txt', 0..1).join("\n\n")
-end
-
-# vim: syntax=Ruby
+task :default => :test
7 lib/whatlanguage.rb
@@ -1,9 +1,8 @@
-require File.join(File.dirname(__FILE__), 'bloominsimple')
+require 'whatlanguage/bloominsimple'
+require 'whatlanguage/bitfield'
require 'digest/sha1'
-class WhatLanguage
- VERSION = '1.0.3'
-
+class WhatLanguage
HASHER = lambda { |item| Digest::SHA1.digest(item.downcase.strip).unpack("VV") }
BITFIELD_WIDTH = 2_000_000
0 lib/bitfield.rb → lib/whatlanguage/bitfield.rb
File renamed without changes.
2 lib/bloominsimple.rb → lib/whatlanguage/bloominsimple.rb
@@ -36,7 +36,7 @@
# # => nooooooo: false
# # => newyorkcity: false
-require File.join(File.dirname(__FILE__), 'bitfield')
+require 'whatlanguage/bitfield'
class BloominSimple
attr_accessor :bitfield, :hasher
3 lib/whatlanguage/version.rb
@@ -0,0 +1,3 @@
+class WhatLanguage
+ VERSION = '1.0.4'
+end
4 test/test_whatlanguage.rb
@@ -1,7 +1,7 @@
-# -*- coding: utf-8 -*-
+# encoding: utf-8
require "test/unit"
-require File.join(File.dirname(__FILE__), "..", "lib", "whatlanguage")
+require 'whatlanguage'
class TestWhatLanguage < Test::Unit::TestCase
def setup
55 whatlanguage.gemspec
@@ -1,40 +1,19 @@
-Gem::Specification.new do |s|
- s.name = "whatlanguage"
- s.version = "1.0.3"
- s.date = "2008-09-29"
- s.summary = "Natural language detection for text samples"
- s.email = "whatlanguage@peterc.org"
- s.homepage = "http://github.com/peterc/whatlanguage"
- s.description = "WhatLanguage rapidly detects the language of a sample of text"
- s.has_rdoc = true
- s.authors = ["Peter Cooper"]
- s.files = [
-"build_filter.rb",
-"build_lang_from_wordlists.rb",
-"example.rb",
-"History.txt",
-"lang/dutch.lang",
-"lang/english.lang",
-"lang/farsi.lang",
-"lang/french.lang",
-"lang/german.lang",
-"lang/italian.lang",
-"lang/pinyin.lang",
-"lang/portuguese.lang",
-"lang/russian.lang",
-"lang/spanish.lang",
-"lang/swedish.lang",
-"lib/bitfield.rb",
-"lib/bloominsimple.rb",
-"lib/whatlanguage.rb",
-"Manifest.txt",
-"Rakefile",
-"README",
-"README.txt",
-"test/test_whatlanguage.rb",
-"whatlanguage.gemspec"]
+# -*- encoding: utf-8 -*-
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'whatlanguage/version'
- s.rdoc_options = ["--main", "README.txt"]
- s.extra_rdoc_files = ["History.txt", "Manifest.txt", "README.txt"]
-end
+Gem::Specification.new do |gem|
+ gem.name = "whatlanguage"
+ gem.version = WhatLanguage::VERSION
+ gem.authors = ["Peter Cooper"]
+ gem.email = ["git@peterc.org"]
+ gem.description = %q{WhatLanguage rapidly detects the language of a sample of text}
+ gem.summary = %q{Natural language detection for text samples}
+ gem.homepage = "https://github.com/peterc/whatlanguage"
+ gem.files = `git ls-files`.split($/)
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+ gem.require_paths = ["lib"]
+end
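
The README in this commit says the library uses Bloom filters, and `lib/whatlanguage.rb` defines a `HASHER` lambda that unpacks a SHA1 digest into two 32-bit integers. As a rough illustration of that idea only, here is a minimal sketch. The `TinyBloom` class, its method names, and the array-backed bit storage are hypothetical and are not the gem's actual `BloominSimple` or `BitField` code; only the hashing expression is taken from the diff.

```ruby
require 'digest/sha1'

# Hypothetical, simplified stand-in for a Bloom filter; not the gem's
# BloominSimple/BitField implementation.
class TinyBloom
  def initialize(width)
    @width = width
    # A plain array of booleans stands in for a compact bit field.
    @bits = Array.new(width, false)
  end

  # Same hashing expression as the HASHER lambda in the diff: the SHA1
  # digest of the normalised word, unpacked into two 32-bit little-endian
  # integers, each reduced to a position inside the filter.
  def positions(item)
    Digest::SHA1.digest(item.downcase.strip).unpack("VV").map { |n| n % @width }
  end

  # Mark every position for this word as set.
  def add(item)
    positions(item).each { |i| @bits[i] = true }
  end

  # True if every position is set. A Bloom filter can report false
  # positives, but never a false negative for a word that was added.
  def include?(item)
    positions(item).all? { |i| @bits[i] }
  end
end

filter = TinyBloom.new(10_000)
%w[bonjour homme suis].each { |word| filter.add(word) }
```

Because words are downcased and stripped before hashing, `filter.include?("  HOMME ")` finds the same positions as the stored `"homme"`. Words that were never added usually report `false`, though an occasional false positive is expected with this data structure.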
