Parsing Rules Output from SpamAssassin Report #3

Merged: 4 commits

2 participants

@sleroux

Hey!

I needed a way to better structure the data returned by the report command, so I added some regex parsing that extracts the score, rule name, and rule text from the SpamAssassin report into a list of hashes. That should help when analyzing the SpamAssassin output.

I also modularized the code a bit to clean it up, if that's alright.
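For anyone wondering what the list of hashes is good for, here is a small sketch of downstream analysis. The `:pts`, `:rule`, and `:text` keys match the parser in this PR; the sample values are copied from the test fixture, and the variable names are just for illustration.

```ruby
# Illustrative data in the shape the parser produces: one hash per rule.
rules = [
  { :pts => 0.5, :rule => "DATE_IN_PAST_24_48", :text => "Date: is 24 to 48 hours before Received: date" },
  { :pts => 1.1, :rule => "MIME_HTML_ONLY",     :text => "BODY: Message only has text/html MIME parts" },
  { :pts => 0.0, :rule => "HTML_MESSAGE",       :text => "BODY: HTML included in message" }
]

# Keep only rules that actually contributed points, highest score first.
scoring = rules.select { |r| r[:pts] > 0 }.sort_by { |r| -r[:pts] }
top_rule_names = scoring.map { |r| r[:rule] }

# Sum of the listed points (floating point, so compare with a tolerance).
total = rules.inject(0.0) { |sum, r| sum + r[:pts] }
```

Having the points as floats rather than text is what makes the filtering and sorting above a one-liner.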

@noeticpenguin merged commit 67c901e
1  Gemfile
@@ -7,7 +7,6 @@ source "http://rubygems.org"
# Include everything needed to run rake, tests, features, etc.
group :development do
gem "rspec", "~> 2.3.0"
- gem "cucumber", ">= 0"
gem "bundler" # , "~> 1.0.0"
gem "jeweler" # , "~> 1.5.2"
end
9 Gemfile.lock
@@ -1,15 +1,7 @@
GEM
remote: http://rubygems.org/
specs:
- builder (3.1.4)
- cucumber (1.2.1)
- builder (>= 2.1.2)
- diff-lcs (>= 1.1.3)
- gherkin (~> 2.11.0)
- json (>= 1.4.6)
diff-lcs (1.1.3)
- gherkin (2.11.5)
- json (>= 1.4.6)
git (1.2.5)
jeweler (1.8.4)
bundler (~> 1.0)
@@ -34,6 +26,5 @@ PLATFORMS
DEPENDENCIES
bundler
- cucumber
jeweler
rspec (~> 2.3.0)
6 Rakefile
@@ -18,7 +18,8 @@ Jeweler::Tasks.new do |gem|
gem.summary = %Q{Gem provides a direct ruby interface to spamd running on localhost or remotely}
gem.description = %Q{This gem makes it easy for developers to hand a body of text to spam assassin and get its spam score, spam report, etc. Supports the full Spamc protocol.}
gem.email = "kjp@brightleafsoftware.com"
- gem.authors = ["Kevin Poorman"]
+ gem.authors = ["Kevin Poorman", "Stephan Leroux"]
+ gem.files.include 'lib/RubySpamAssassin/**.rb'
# Include your dependencies below. Runtime dependencies are required when using your gem,
# and development dependencies are only needed for development (ie running rake tasks, tests, etc)
# gem.add_runtime_dependency 'jabber4r', '> 0.1'
@@ -37,7 +38,4 @@ RSpec::Core::RakeTask.new(:rcov) do |spec|
spec.rcov = true
end
-require 'cucumber/rake/task'
-Cucumber::Rake::Task.new(:features)
-
task :default => :spec
21 RubySpamAssassin.gemspec
@@ -5,11 +5,11 @@
Gem::Specification.new do |s|
s.name = "RubySpamAssassin"
- s.version = "1.0.2"
+ s.version = "1.0.3"
s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
- s.authors = ["Kevin Poorman"]
- s.date = "2013-01-22"
+ s.authors = ["Kevin Poorman", "Stephan Leroux"]
+ s.date = "2013-03-11"
s.description = "This gem makes it easy for developers to hand a body of text to spam assassin and get its spam score, spam report, etc. Supports the full Spamc protocol."
s.email = "kjp@brightleafsoftware.com"
s.extra_rdoc_files = [
@@ -26,17 +26,21 @@ Gem::Specification.new do |s|
"Rakefile",
"RubySpamAssassin.gemspec",
"VERSION",
- "features/RubySpamAssassin.feature",
- "features/step_definitions/RubySpamAssassin_steps.rb",
- "features/support/env.rb",
"lib/RubySpamAssassin.rb",
+ "lib/RubySpamAssassin/report_parser.rb",
+ "lib/RubySpamAssassin/spam_client.rb",
+ "lib/RubySpamAssassin/spam_result.rb",
+ "spec/RubySpamAssassin/report_parser_spec.rb",
+ "spec/RubySpamAssassin/spam_client_spec.rb",
+ "spec/RubySpamAssassin/spam_result_spec.rb",
"spec/RubySpamAssassin_spec.rb",
+ "spec/data/spam_test1.txt",
"spec/spec_helper.rb"
]
s.homepage = "http://noeticpenguin.github.com/RubySpamAssassin/"
s.licenses = ["MIT"]
s.require_paths = ["lib"]
- s.rubygems_version = "1.8.24"
+ s.rubygems_version = "1.8.23"
s.summary = "Gem provides a direct ruby interface to spamd running on localhost or remotely"
if s.respond_to? :specification_version then
@@ -44,18 +48,15 @@ Gem::Specification.new do |s|
if Gem::Version.new(Gem::VERSION) >= Gem::Version.new('1.2.0') then
s.add_development_dependency(%q<rspec>, ["~> 2.3.0"])
- s.add_development_dependency(%q<cucumber>, [">= 0"])
s.add_development_dependency(%q<bundler>, [">= 0"])
s.add_development_dependency(%q<jeweler>, [">= 0"])
else
s.add_dependency(%q<rspec>, ["~> 2.3.0"])
- s.add_dependency(%q<cucumber>, [">= 0"])
s.add_dependency(%q<bundler>, [">= 0"])
s.add_dependency(%q<jeweler>, [">= 0"])
end
else
s.add_dependency(%q<rspec>, ["~> 2.3.0"])
- s.add_dependency(%q<cucumber>, [">= 0"])
s.add_dependency(%q<bundler>, [">= 0"])
s.add_dependency(%q<jeweler>, [">= 0"])
end
2  VERSION
@@ -1 +1 @@
-1.0.2
+1.0.3
9 features/RubySpamAssassin.feature
@@ -1,9 +0,0 @@
-Feature: something something
- In order to something something
- A user something something
- something something something
-
- Scenario: something something
- Given inspiration
- When I create a sweet new gem
- Then everyone should see how awesome I am
0  features/step_definitions/RubySpamAssassin_steps.rb
No changes.
13 features/support/env.rb
@@ -1,13 +0,0 @@
-require 'bundler'
-begin
- Bundler.setup(:default, :development)
-rescue Bundler::BundlerError => e
- $stderr.puts e.message
- $stderr.puts "Run `bundle install` to install missing gems"
- exit e.status_code
-end
-
-$LOAD_PATH.unshift(File.dirname(__FILE__) + '/../../lib')
-require 'RubySpamAssassin'
-
-require 'rspec/expectations'
101 lib/RubySpamAssassin.rb
@@ -1,100 +1,5 @@
module RubySpamAssassin
- class SpamResult
-
- attr_accessor :response_version,
- :response_code,
- :response_message,
- :spam,
- :score,
- :threshold,
- :tags,
- :report,
- :content_length
-
- #returns true if the message was spam, otherwise false
- def spam?
- (@spam == "True" || @spam == "Yes") ? true : false
- end
- end
-
- class SpamClient
-
- require 'socket'
- require 'timeout'
-
- def initialize(host="localhost", port=783, timeout=5)
- @port = port
- @host = host
- @timeout =timeout
- @socket = TCPSocket.open(@host, @port)
- end
-
- def reconnect
- @socket = @socket || TCPSocket.open(@host, @port)
- end
-
- def send_symbol(message)
- protocol_response = send_message("SYMBOLS", message)
- result = process_headers protocol_response[0...2]
- result.tags = protocol_response[3...-1].join(" ").split(',')
- end
-
- def check(message)
- protocol_response = send_message("CHECK", message)
- result = process_headers protocol_response[0...2]
- end
-
- def report(message)
- protocol_response = send_message("REPORT", message)
- result = process_headers protocol_response[0...2]
- result.report = protocol_response[3..-1].join
- end
-
- def report_ifspam(message)
- result = report(message).spam?
- end
-
- def skip
- protocol_response = send_message("SKIP", message)
- end
-
- def ping
- protocol_response = send_message("PING", message)
- result = process_headers protocol_response[0]
- end
-
- alias :process :report
-
- private
- def send_message(command, message)
- length = message.length
- @socket.write(command + " SPAMC/1.2\r\n")
- @socket.write("Content-length: " + length.to_s + "\r\n\r\n")
- @socket.write(message)
- @socket.shutdown(1) #have to shutdown sending side to get response
- response = @socket.readlines
- @socket.close #might as well close it now
-
- response
- end
-
- def process_headers(headers)
- result = SpamResult.new
- headers.each do |line|
- case line.chomp
- when /(.+)\/(.+) (.+) (.+)/ then
- result.response_version = $2
- result.response_code = $3
- result.response_message = $4
- when /^Spam: (.+) ; (.+) . (.+)$/ then
- result.score = $2
- result.spam = $1
- result.threshold = $3
- when /Content-length: (.+)/ then
- result.content_length = $1
- end
- end
- result
- end
- end
+ autoload(:SpamClient, "RubySpamAssassin/spam_client")
+ autoload(:SpamResult, "RubySpamAssassin/spam_result")
+ autoload(:ReportParser, "RubySpamAssassin/report_parser")
end
21 lib/RubySpamAssassin/report_parser.rb
@@ -0,0 +1,21 @@
+class RubySpamAssassin::ReportParser
+ LINE_REGEXP = /-$/
+ RULE_REGEXP = /[0-9]*[.][0-9]\s\w*\s/
+
+ def self.parse(report_text)
+ last_part = report_text.split(LINE_REGEXP)[1].sub(/^[\n\r]./,'').chomp.chomp
+ pts_rules = last_part.gsub(RULE_REGEXP).collect { |sub| sub.chomp(' ') }
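+ # squeeze(' ') collapses only runs of spaces; a bare squeeze would also collapse doubled letters ("message" -> "mesage")
+ rule_texts = last_part.split(RULE_REGEXP).collect { |text| text.delete("\n").squeeze(' ').chomp(' ').sub(/^\s/, '') }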
+
+ rules = []
+ pts_rules.each_with_index do |pts_rule, i|
+ rules << {
+ :pts => pts_rule.split(' ')[0].to_f,
+ :rule => pts_rule.split(' ')[1],
+ :text => rule_texts[i + 1]
+ }
+ end
+
+ rules
+ end
+end
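As a point of comparison with the split/gsub approach in this file, here is a rough sketch that walks the rules table one line at a time. It is not the code from this PR: `RULE_LINE` and `parse_rules` are hypothetical names, and it assumes the table starts after the dashed separator line and that wrapped descriptions continue on indented lines.

```ruby
# Sketch only: "<points> <RULE_NAME> <description>" on a single line.
RULE_LINE = /\A\s*(-?\d+(?:\.\d+)?)\s+(\w+)\s+(.*)\z/

def parse_rules(report_text)
  rules = []
  in_table = false
  report_text.each_line do |raw|
    line = raw.chomp
    if !in_table
      # The table body begins after the dashed separator line.
      in_table = true if line =~ /\A\s*-{2,}[\s-]*\z/
    elsif (m = RULE_LINE.match(line))
      rules << { :pts => m[1].to_f, :rule => m[2], :text => m[3].strip }
    elsif !rules.empty? && line =~ /\A\s+\S/
      # Indented line that is not a new rule: treat it as a wrapped description.
      rules.last[:text] << " " << line.strip
    end
  end
  rules
end

# Sample input modeled on the fixture, with one wrapped description.
sample = [
  "Content analysis details: (1.6 points, 5.0 required)",
  "",
  " pts rule name              description",
  "---- ---------------------- ------------------------------",
  " 0.5 DATE_IN_PAST_24_48     Date: is 24 to 48 hours before",
  "                            Received: date",
  " 1.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts"
].join("\n")

parsed = parse_rules(sample)
```

The trade-off is that this version depends on the table layout staying line-oriented, while it avoids having to line up two separately derived arrays by index.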
81 lib/RubySpamAssassin/spam_client.rb
@@ -0,0 +1,81 @@
+class RubySpamAssassin::SpamClient
+ require 'socket'
+ require 'timeout'
+
+ def initialize(host="localhost", port=783, timeout=5)
+ @port = port
+ @host = host
+ @timeout =timeout
+ @socket = TCPSocket.open(@host, @port)
+ end
+
+ def reconnect
+ @socket = @socket || TCPSocket.open(@host, @port)
+ end
+
+ def send_symbol(message)
+ protocol_response = send_message("SYMBOLS", message)
+ result = process_headers protocol_response[0...2]
+ result.tags = protocol_response[3...-1].join(" ").split(',')
+ end
+
+ def check(message)
+ protocol_response = send_message("CHECK", message)
+ result = process_headers protocol_response[0...2]
+ end
+
+ def report(message)
+ protocol_response = send_message("REPORT", message)
+ result = process_headers protocol_response[0...2]
+ result.report = protocol_response[3..-1].join
+ result.rules = RubySpamAssassin::ReportParser.parse(result.report)
+ result
+ end
+
+ def report_ifspam(message)
+ result = report(message).spam?
+ end
+
+ def skip(message = "")
+ protocol_response = send_message("SKIP", message)
+ end
+
+ def ping(message = "")
+ protocol_response = send_message("PING", message)
+ result = process_headers protocol_response[0...1]
+ end
+
+ alias :process :report
+
+ private
+ def send_message(command, message)
+ length = message.length
+ @socket.write(command + " SPAMC/1.2\r\n")
+ @socket.write("Content-length: " + length.to_s + "\r\n\r\n")
+ @socket.write(message)
+ @socket.shutdown(1) #have to shutdown sending side to get response
+ response = @socket.readlines
+ @socket.close #might as well close it now
+
+ response
+ end
+
+ def process_headers(headers)
+ result = RubySpamAssassin::SpamResult.new
+ headers.each do |line|
+ case line.chomp
+ when /(.+)\/(.+) (.+) (.+)/ then
+ result.response_version = $2
+ result.response_code = $3
+ result.response_message = $4
+ when /^Spam: (.+) ; (.+) . (.+)$/ then
+ result.score = $2
+ result.spam = $1
+ result.threshold = $3
+ when /Content-length: (.+)/ then
+ result.content_length = $1
+ end
+ end
+ result
+ end
+end
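To make the framing that `send_message` writes to the socket easier to see, here is a minimal sketch that builds the same request as a plain string. `build_spamc_request` is a hypothetical helper, not part of the gem. It uses `bytesize` instead of `length`, on the assumption that spamd expects the content length in bytes, which matters for multibyte bodies.

```ruby
# Hypothetical helper (not in the gem): assemble the raw SPAMC request text.
def build_spamc_request(command, message, version = "1.2")
  body = message.to_s
  # bytesize counts bytes; length counts characters, and the two differ
  # for multibyte (for example UTF-8) content.
  "#{command} SPAMC/#{version}\r\n" \
    "Content-length: #{body.bytesize}\r\n" \
    "\r\n" \
    "#{body}"
end

# "h\u00e9llo" is 5 characters but 6 bytes in UTF-8.
request = build_spamc_request("REPORT", "h\u00e9llo")
```

Building the string separately from the socket write also makes the framing testable without a running spamd.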
17 lib/RubySpamAssassin/spam_result.rb
@@ -0,0 +1,17 @@
+class RubySpamAssassin::SpamResult
+ attr_accessor :response_version,
+ :response_code,
+ :response_message,
+ :spam,
+ :score,
+ :threshold,
+ :tags,
+ :report,
+ :content_length,
+ :rules
+
+ #returns true if the message was spam, otherwise false
+ def spam?
+ (@spam == "True" || @spam == "Yes") ? true : false
+ end
+end
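For context on where `spam`, `score`, and `threshold` come from, here is a small sketch of parsing the `Spam:` response header into typed values. It assumes the header looks like `Spam: True ; 15.0 / 5.0`, which is what the regex in `process_headers` suggests. `SPAM_HEADER`, `SpamVerdict`, and `parse_spam_header` are made-up names for illustration and are not part of this PR.

```ruby
# Assumed header shape: "Spam: <True|False|Yes|No> ; <score> / <threshold>"
SPAM_HEADER = /\ASpam:\s*(True|False|Yes|No)\s*;\s*(-?\d+(?:\.\d+)?)\s*\/\s*(-?\d+(?:\.\d+)?)\s*\z/i

# Hypothetical value object holding a boolean flag and numeric scores.
SpamVerdict = Struct.new(:spam, :score, :threshold) do
  def spam?
    spam
  end
end

def parse_spam_header(line)
  m = SPAM_HEADER.match(line.chomp)
  return nil unless m
  SpamVerdict.new(%w[true yes].include?(m[1].downcase), m[2].to_f, m[3].to_f)
end

verdict = parse_spam_header("Spam: False ; 3.4 / 5.0\r\n")
```

Converting to a boolean and floats at parse time means callers can compare `score` against `threshold` directly instead of comparing strings.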
14 spec/RubySpamAssassin/report_parser_spec.rb
@@ -0,0 +1,14 @@
+require_relative '../spec_helper'
+
+describe "ReportParser" do
+ it "should parse the report text into an informative hash" do
+ spam = File.read('spec/data/spam_test1.txt')
+ result = RubySpamAssassin::ReportParser.parse(spam)
+ result.length.should == 6
+
+ # Check contents of first one to make sure text/points are formatted correctly
+ result[0][:pts].should == 0.5
+ result[0][:rule].should == 'DATE_IN_PAST_24_48'
+ result[0][:text].should == 'Date: is 24 to 48 hours before Received: date'
+ end
+end
1  spec/RubySpamAssassin/spam_client_spec.rb
@@ -0,0 +1 @@
+require_relative '../spec_helper'
1  spec/RubySpamAssassin/spam_result_spec.rb
@@ -0,0 +1 @@
+require_relative '../spec_helper'
4 spec/RubySpamAssassin_spec.rb
@@ -1,7 +1,5 @@
require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
describe "Rubyspamassassin" do
- it "fails" do
- fail "hey buddy, you should probably rename this file and start specing for real"
- end
+
end
39 spec/data/spam_test1.txt
@@ -0,0 +1,39 @@
+X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
+ scanner.com
+X-Spam-Level: ***
+X-Spam-Status: No, score=3.4 required=5.0 tests=DATE_IN_PAST_24_48,
+ HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,INVALID_MSGID,MIME_HTML_ONLY,
+ UNPARSEABLE_RELAY autolearn=no version=3.3.2
+Received: from dummyurl.com for <dummy@dummy.com>; Fri, 8 Mar 2013 14:53:15 -0500
+Date: Thu, 7 Mar 2013 00:02:33 -0500
+From: Dummy <dummy@dummy.com>
+Reply-To: Dummy <dummy@dummy.com>
+To: dummy@dummy.com
+Message-Id: testdummy@dummy.com
+Subject: Work Report
+Mime-Version: 1.0
+Content-Type: text/html; charset=utf-8
+Auto-Submitted: auto-generated
+
+Hey,
+
+Was wondering if I could get a copy of the work report you made yesterday. Thanks!
+Spam detection software, running on the system "scan1.blue.postageapp.com", has
+identified this incoming email as possible spam. The original message
+has been attached to this so you can view it (if it isn't spam) or label
+similar future email. If you have any questions, see
+the administrator of that system for details.
+
+Content preview: Hey, Was wondering if I could get a copy of the work report
+ you made yesterday. Thanks! [...]
+
+Content analysis details: (3.4 points, 5.0 required)
+
+ pts rule name description
+---- ---------------------- --------------------------------------------------
+ 0.5 DATE_IN_PAST_24_48 Date: is 24 to 48 hours before Received: date
+ 0.0 HTML_MESSAGE BODY: HTML included in message
+ 1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
+ 0.6 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML tag
+ 1.2 INVALID_MSGID Message-Id is not valid, according to RFC 2822
+ 0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
1  spec/spec_helper.rb
@@ -8,5 +8,4 @@
Dir["#{File.dirname(__FILE__)}/support/**/*.rb"].each {|f| require f}
RSpec.configure do |config|
-
end