rake snorby:soft_reset task is broken #102

Closed

rvineyard opened this issue Jul 30, 2011 · 10 comments

Comments

@rvineyard

So I logged in to Snorby this morning and noticed that yet again my dashboard was "stuck" - events were flowing in, but the numbers and graphs were not getting updated, this time not even after manually restarting the delayed_job and CacheJob tasks. I did a bit of digging and ran across the :soft_reset task in lib/tasks/snorby.rake; however, in its present form it is broken.

On line 63, the code currently reads:
Snorby::Worker.reset_cache(:all, true)

Considering that the :reset_cache method is defined in the Jobs class, this is probably more appropriate:
Snorby::Jobs.reset_cache(:all, true)
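
For context, here's roughly how that one-line change would sit in the task. The surrounding task body below is only a sketch to show placement - it is not the actual contents of snorby.rake, and only the reset_cache call itself comes from the file:

# lib/tasks/snorby.rake (sketch, not the real task body)
namespace :snorby do

  desc 'Rebuild the Snorby caches without touching event data'
  task :soft_reset => :environment do
    # was: Snorby::Worker.reset_cache(:all, true)
    Snorby::Jobs.reset_cache(:all, true)
  end

end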

Now, after making that change, things initially look as I would expect, but then it errors out with the dreaded "undefined method `ip_src' for nil:NilClass" error that I haven't otherwise seen since the upgrade to 2.3.2.

I say upgrade, but really it was a pave and rebuild - I dropped my snort database, deleted my old snorby directory, pulled a fresh copy from the git master and started clean with a "rake snorby:setup".

Here's a trace of my modified :soft_reset task:

bash-3.2$ rake --trace snorby:soft_reset

<--- snip --->

Processing Batch 1 of 1...
Sensor 12: - fetching sensor metrics
Sensor 12: - building proto counts
Sensor 12: - fetch_event_count
Sensor 12: - fetching tcp count
Sensor 12: - fetching udp count
Sensor 12: - fetching icmp count
Sensor 12: - fetching severity metrics
Sensor 12: - fetching src ip metrics
rake aborted!
undefined method `ip_src' for nil:NilClass
/var/www/snorby/ezprint/ruby/1.9.1/gems/activesupport-3.0.5/lib/active_support/whiny_nil.rb:48:in `method_missing'
/var/www/snorby/lib/snorby/jobs/cache_helper.rb:144:in `block in fetch_src_ip_metrics'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/collection.rb:507:in `block in each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `block in each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/collection.rb:504:in `each'
/var/www/snorby/lib/snorby/jobs/cache_helper.rb:144:in `group_by'
/var/www/snorby/lib/snorby/jobs/cache_helper.rb:144:in `fetch_src_ip_metrics'
/var/www/snorby/lib/snorby/jobs/daily_cache_job.rb:140:in `block in build_cache'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-chunked_query-0.3.0/lib/dm-chunked_query/chunks.rb:99:in `block in each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-chunked_query-0.3.0/lib/dm-chunked_query/chunks.rb:98:in `times'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-chunked_query-0.3.0/lib/dm-chunked_query/chunks.rb:98:in `each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-chunked_query-0.3.0/lib/dm-chunked_query/mixin.rb:46:in `each_chunk'
/var/www/snorby/lib/snorby/jobs/daily_cache_job.rb:125:in `build_cache'
/var/www/snorby/lib/snorby/jobs/daily_cache_job.rb:41:in `block in perform'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/collection.rb:507:in `block in each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `block in each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/support/lazy_array.rb:411:in `each'
/var/www/snorby/ezprint/ruby/1.9.1/gems/dm-core-1.1.0/lib/dm-core/collection.rb:504:in `each'
/var/www/snorby/lib/snorby/jobs/daily_cache_job.rb:39:in `perform'
/var/www/snorby/lib/snorby/jobs.rb:103:in `reset_cache'
/var/www/snorby/lib/tasks/snorby.rake:63:in `block (2 levels) in <top (required)>'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:205:in `call'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:205:in `block in execute'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:200:in `each'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:200:in `execute'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:158:in `block in invoke_with_call_chain'
/usr/local/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:151:in `invoke_with_call_chain'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:144:in `invoke'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:112:in `invoke_task'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `block (2 levels) in top_level'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `each'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `block in top_level'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:129:in `standard_exception_handling'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:84:in `top_level'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:62:in `block in run'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:129:in `standard_exception_handling'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:59:in `run'
/usr/local/lib/ruby/gems/1.9.1/gems/rake-0.9.2/bin/rake:32:in `<top (required)>'
/usr/local/bin/rake:19:in `load'
/usr/local/bin/rake:19:in `<main>'
Tasks: TOP => snorby:soft_reset
@rvineyard (Author)

Oh, and I technically have 32 sensors (really one box with 32 copies of snort) and about 8000 alerts in the database now, for what that's worth. Most of the problems I was experiencing previously may have been due to having several million alerts, but I've tuned my signatures substantially since then. Notice the cache rebuild dies on Sensor 12...

@djcas9 (Contributor) commented Jul 31, 2011

@Vineyard It looks like you have an event with no source_ip - I'll need to figure out a way to handle events stored without all the proper information. I can't seem to figure out why some people's setups store events like that. What was the signature for that event?
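
Something along these lines is what I have in mind for the cache jobs - just a sketch for now. The method name comes from the trace above, but the ip association and the counting logic here are assumptions on my part, not the current cache_helper.rb code:

# Skip events whose iphdr row is missing or zeroed instead of calling ip_src on nil
def fetch_src_ip_metrics
  metrics = Hash.new(0)

  events.each do |event|
    ip = event.ip
    next if ip.nil? || ip.ip_src.nil? # no usable iphdr record for this event

    metrics[ip.ip_src.to_s] += 1
  end

  metrics
end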

@rvineyard (Author)

@mephux - I did a bit of digging and there are quite a few impossible events in my database. I'm guessing it's something wonky in barnyard2 - either the fact that I've got IPv6 turned on (and the schema's not ready for it yet) or that my sensors don't always shut down gracefully, especially while I'm developing / debugging new code. I don't see any v6 addresses in my alert logs though, so I'm leaning toward the latter. This may take some doing to track down - you mentioned others are having this problem too?

@rvineyard (Author)

@mephux, since you asked, here are what I believe are the 43 offending events (assuming my SQL is correct here):

mysql> select event.sid, event.cid, signature.sig_sid, signature.sig_name, inet_ntoa(iphdr.ip_src) as ip_src, inet_ntoa(iphdr.ip_dst) as ip_dst, event.timestamp from iphdr natural join event join signature on (event.signature=signature.sig_id) where (event.sid, event.cid) in (select iphdr.sid, iphdr.cid from iphdr where ip_src = 0 or ip_dst = 0) order by event.timestamp asc;
+-----+------+---------+--------------------------------------------------------+---------+---------+---------------------+
| sid | cid  | sig_sid | sig_name                                               | ip_src  | ip_dst  | timestamp           |
+-----+------+---------+--------------------------------------------------------+---------+---------+---------------------+
|   1 |  156 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 05:32:33 |
|  15 |  224 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:23 |
|  15 |  225 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:23 |
|  17 |  265 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:23 |
|  17 |  266 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:23 |
|  15 |  226 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:26 |
|  15 |  227 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:26 |
|  17 |  267 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:26 |
|  17 |  268 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:26 |
|  17 |  269 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:28 |
|  15 |  228 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:29 |
|  15 |  229 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:29 |
|  17 |  270 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:29 |
|  15 |  230 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:31 |
|  15 |  231 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:31 |
|  17 |  271 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:31 |
|  17 |  272 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-30 15:14:31 |
|  22 |  830 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 16:42:07 |
|  29 |  243 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 16:42:07 |
|  22 |  831 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 16:42:08 |
|  22 |  832 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 16:42:35 |
|  29 |  244 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-30 16:42:36 |
|   6 | 1173 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 4.0.0.0 | 0.0.0.0 | 2011-07-31 00:33:26 |
|   7 |  364 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 00:58:31 |
|   7 |  365 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 00:58:31 |
|  25 |  328 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 02:58:36 |
|  25 |  329 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 02:58:36 |
|  17 |  434 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:29 |
|  15 |  361 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:30 |
|  15 |  362 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:30 |
|  17 |  435 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:30 |
|  15 |  363 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:32 |
|  15 |  364 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:32 |
|  17 |  436 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:32 |
|  17 |  437 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:33 |
|  17 |  438 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:35 |
|  17 |  439 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:35 |
|  15 |  365 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:36 |
|  15 |  366 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:36 |
|  15 |  367 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:38 |
|  15 |  368 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:38 |
|  17 |  440 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:38 |
|  17 |  441 |      10 | frag3: Bogus fragmentation packet. Possible BSD attack | 0.0.0.0 | 0.0.0.0 | 2011-07-31 13:09:39 |
+-----+------+---------+--------------------------------------------------------+---------+---------+---------------------+
43 rows in set (0.12 sec)

@rvineyard (Author)

Ok, that first one can't be me messing around... there's no way I was up at 5:32am on Saturday :-p

@rvineyard (Author)

Another dirty hack (I still haven't figured out why I'm getting the bogus records in my database) which seems to make the cache-building workers happy... I'm sure there's a better way to do this, but I got tired of working through all the JOINs you have to do in order to accomplish this with a single query. Since I'm using InnoDB on the backend, the next best thing is to run it all as a single transaction. I've got a cronjob feeding this into mysql every ten minutes:

begin;
delete from iphdr where ip_src = 0 or ip_dst = 0;
delete from caches where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from daily_caches where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from data where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from event where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from favorites where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from notes where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from opt where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from icmphdr where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from tcphdr where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
delete from udphdr where (sid, cid) not in (select iphdr.sid as sid, iphdr.cid as cid from iphdr);
commit;
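
If it ever needs to run from inside Snorby rather than a raw cron/mysql pipe, the same cleanup could probably be wrapped in a rake task using DataMapper's raw-SQL interface. This is only a sketch - the task name is made up, and it skips the explicit transaction wrapper my cron version relies on:

# lib/tasks/purge_bogus_events.rake (hypothetical)
namespace :snorby do

  desc 'Delete events with zeroed src/dst IPs plus any rows orphaned by them'
  task :purge_bogus_events => :environment do
    adapter  = DataMapper.repository(:default).adapter
    orphaned = '(sid, cid) NOT IN (SELECT iphdr.sid, iphdr.cid FROM iphdr)'

    adapter.execute('DELETE FROM iphdr WHERE ip_src = 0 OR ip_dst = 0')

    %w[caches daily_caches data event favorites
       notes opt icmphdr tcphdr udphdr].each do |table|
      adapter.execute("DELETE FROM #{table} WHERE #{orphaned}")
    end
  end

end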

@djcas9 (Contributor) commented Aug 1, 2011

@Vineyard I am also going to build proper handling of this into the cache jobs, just to make sure.

@rvineyard (Author)

@mephux when you say "proper handling" - do you mean deleting the bogus records like I am doing (which is probably the right way to handle them, since none of their other attributes make sense), or just skipping over them when generating the caches?

I used to experience the NilClass crash issue much more often, and before I realized the connection between this error and the bogus events in my snort database, one of the fixes I implemented was to check required fields for nil before populating them, ensuring that an invalid event object was never created.

Unfortunately, this had severe performance implications (at least the way I implemented it) and did not completely alleviate the problem.

On the plus side, after running the SQL commands above and modifying the code for the soft_reset task, it appears that I am now able to rebuild my caches and charts at will whenever something gets fubared.

Crossing my fingers I won't have to do that very often now that most of the other issues that were affecting me have been resolved. :-)

@vrou commented Aug 4, 2011

I had the same problem with my install, and after using vineyard's query it all works again. Thank you, vineyard, for sharing this with us :)

Let's hope that somebody locates the problem and fixes it the right way.

@cventers

I have the same problem with Snorby 2.3.11. The SQL query above seems to help, but it looks like I'm also going to need a cron job to re-engage the sensor cache job after it fails.
