Dashboard / Caching #54
Comments
@corelanc0d3r the cache is built in 30-minute increments. Can you check the caches and daily_caches tables and make sure the timestamps are correct and in the proper timezone? |
The timestamps in caches are correct and in the proper timezone. Sample database entry: 2011-02-28 22:30:00. The dashboard, however, shows the date & time differently: |
Are the run_at times being calculated properly in the delayed_jobs table? When a job runs and completes, it will increment its completion time by 30 minutes for the caches table and 24 hours for the daily_caches table. |
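To make the schedule described above concrete, here is a minimal sketch of the arithmetic. It is not Snorby's code: `INTERVALS` and `next_run_at` are hypothetical names, and the only facts taken from the thread are the 30-minute and 24-hour increments.

```ruby
# Hypothetical helper, not Snorby's actual implementation.
INTERVALS = {
  caches: 30 * 60,            # 30 minutes, in seconds
  daily_caches: 24 * 60 * 60  # 24 hours, in seconds
}.freeze

# Returns the run_at expected for the next job, given when the last one completed.
def next_run_at(completed_at, table)
  completed_at + INTERVALS.fetch(table)
end

completed = Time.utc(2011, 2, 28, 22, 30, 0)
puts next_run_at(completed, :caches)
puts next_run_at(completed, :daily_caches)
```

Comparing values like these against the run_at column in delayed_jobs is one way to check whether the increments are being applied as expected.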
yeah, it looks like they are incremented properly, and the record shows the date & time in the correct format. |
I also noticed that the Last updated time is ahead of the current time... so looks like there is some kind of date/time formatting/interpretation issue |
I had a similar problem, just try this: and then it should work. Good luck |
I wrote about this the other day on IRC... not sure if you got a chance to read it. I actually tracked the problem down to the query that the job runs to update the cache. It looks for events that happen in the future, instead of between the present time and 30 minutes in the past, which I assume is the intended behavior. Clearing the cache and manually invoking the job will make the worker look through your entire event list and update the cache table and subsequently the dashboard; however, any future events will still be plagued with the problem. You can do a tail on your log to see this behavior. However, you will only notice the erroneous query when the worker runs its job on the 30-minute schedule, not when it is manually invoked. Hope this helps. |
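The effect described here can be illustrated with a small sketch. It is an assumption-laden model, not the query Snorby runs: `events_in_window` and `CACHE_INTERVAL` are hypothetical, and the one-hour shift is taken from the symptoms reported elsewhere in this thread.

```ruby
CACHE_INTERVAL = 30 * 60 # seconds

# Select the event timestamps that fall in [window_start, window_start + interval).
def events_in_window(event_times, window_start, interval = CACHE_INTERVAL)
  window_end = window_start + interval
  event_times.select { |t| t >= window_start && t < window_end }
end

now = Time.utc(2011, 2, 28, 22, 30, 0)
events = [now - 25 * 60, now - 10 * 60, now - 60] # all within the last half hour

correct_start = now - CACHE_INTERVAL # window covers the last 30 minutes
shifted_start = correct_start + 3600 # window start read back one hour ahead

puts events_in_window(events, correct_start).size # window ends at the present
puts events_in_window(events, shifted_start).size # window lies entirely in the future
```

With the shifted start, the window contains no events that have already happened, which would match a dashboard that stops picking up new events.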
what would be the fix for looking for events in the future ? |
any ideas on how to fix the issue & make the dashboard work properly ? |
I tried to fix this all weekend. I think I just came across a post that could explain what's happening. Apparently there is a bug in DataMapper and how it reads dates from MySQL. So, it can write the dates correctly, but when it reads the dates back out (i.e. when it tries to use the value @sensor.cache.last.ran_at) it is off by one. I'm on the East Coast, so when the sensor_cache_job.rb script gets the time from Time.now it shows -5:00 since it's EST. However, the value read from the ran_at field shows -4:00. I believe it's because of this bug that was just recently discovered and fixed: datamapper/do@9e369b7#commitcomment-271516 |
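A quick way to see what a -5:00 versus -4:00 mismatch means in practice is to attach both offsets to the same wall-clock reading. This sketch uses only Ruby's core `Time` class; `same_wall_clock` is a hypothetical helper and the sample timestamp is the database entry quoted earlier in the thread.

```ruby
# Build the same wall-clock reading with a given UTC offset.
def same_wall_clock(offset)
  Time.new(2011, 2, 28, 22, 30, 0, offset)
end

as_est = same_wall_clock("-05:00") # offset Time.now reports in EST
as_edt = same_wall_clock("-04:00") # offset reported when ran_at is read back

# Difference in seconds between the two instants.
drift = as_est - as_edt
puts drift
puts as_est.utc
puts as_edt.utc
```

The two values look identical on the clock face but are one hour apart as instants, which is the size of the discrepancy people report on the dashboard.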
Well, now I'm confused. It looks like Snorby is using do_mysql-0.10.3.gem. That's the latest with the DST fix. |
I went ahead and tried to adjust it by one anyway even though the DST bug was supposedly fixed. I can share this sensor_cache_job.rb file with anyone that's interested, but I need to clean it up first. I tried a lot of stuff before I got to this point. But, basically, doing this seems to have fixed it. I'm still waiting for the SensorCacheJob to run on its own (in 6 minutes) but it worked in my manual test. I think this will enable the dashboard to refresh as it is supposed to. start_time = Time.parse("#{@sensor.cache.last.ran_at}") + 1.hour |
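For anyone who wants to see what that one-liner does outside of Rails, here is a plain-Ruby sketch. `1.hour` comes from ActiveSupport, so 3600 seconds stands in for it here; `shifted_start_time` is a hypothetical name and the input string is a made-up sample with an explicit -0400 offset.

```ruby
require "time" # provides Time.parse

ONE_HOUR = 3600 # seconds; stands in for ActiveSupport's 1.hour

# Parse a stored timestamp string and shift it forward by one hour.
def shifted_start_time(ran_at_string)
  Time.parse(ran_at_string) + ONE_HOUR
end

start_time = shifted_start_time("2011-02-28 22:30:00 -0400")
puts start_time     # 23:30 at -04:00
puts start_time.utc # the same instant as 22:30 at -05:00
```

Note that this is a fixed shift. If the offset mismatch turns out to be tied to DST, as suggested later in the thread, the extra hour would presumably have to come out again once the offsets agree.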
I just wanted to follow up on the comments I posted last night. I made the changes as noted above (for the dst thing) as well as some small changes to some of the logic of the sensor_cache_job.rb. My dashboard has been working consistently since those changes were put into place. After I clean up the code and do a diff on the original file, I can recommend these minor changes to the Snorby devs. |
that's good news - and as soon as the devs have reviewed it, I would be more than happy to test it |
@dcarrith good work.. I look forward to seeing your changes. |
@dcarrith not sure on the +1.hour issue.. this just sounds like your system time/sensor times are not on the correct timezones. Example, if i added |
Snort and Snorby are on the same box. And, the timezone is set correctly to EST. I did notice that my router (running dd-wrt) is set to UTC. So, I'll need to fix that, but I don't think that's what was causing the issues I was seeing. I'm not sure what would happen if I just made that one change. It's likely that you would also need to make the other changes I made. How should I go about sending this updated file? I guess I could create a git repository and upload it to that. I'll look into it tonight and let you know. |
In my setup everything is on one box. The timestamps for the system as well as the DB are correct. When I logged in today it was showing last updated at 2 hours in the future and no results for today although there were events listed. I ran: Which forced it to update correctly however it now shows the last updated time 1 hour in the future just as dcarrith describes. |
30 minutes rolled around and the last updated time (Though still 1 hour in the future) incremented by 30 minutes and I can see that the job ran. The dashboard however does not contain new events that occurred since I last forced an update. So I cleared the cache again and ran the job as described above and the dashboard updated and still shows 1 hour in the future as the last updated time. Hopefully some of this info helps. I think Snorby 2 is awesome and just want to help out. |
I determined earlier that my dd-wrt based router was set to the wrong DST setting. So, I fixed it, and it didn't have an impact on the "fix" I put into place on my Snorby box. I still get events updated every 30 minutes as I should. So, I don't think that was the cause of the problem. I really think it was the datamapper issue I mentioned earlier. Perhaps they thought they fixed it, but only fixed it for whatever test case they were using. I'm still going to do some more digging and I seem to have broken the event counts on the top signatures list...so I need to fix that too. |
I am really confused from this discussion. So, how can we fix this problem? |
I've been meaning to reply to this post, but haven't had time to fix the counts on the "Last 5 Unique Events" that I seem to have broken while trying to fix the dashboard. My dashboard is regularly updating though. I'll try to track down the counts issue and reply back within the week. |
@dcarrith thank you for all your hard work on this issue. I'm still a bit confused about the fix. I have been in production since I released Snorby and all of my times have been accurate (numerous installs on pretty much every OS besides Windows). I am not disputing that there is an issue, but I'm confused by what causes it and why I have never experienced it. Let's work together on this once you post your fix. Thanks a lot dcarrith and great work man. |
I wonder if this is a glitch that only appears for certain users based on their timezone or time configurations. I'm using NTP and have my time zone set appropriately. I wonder if this bug will disappear if I just use UTC. I will try that tonight and report back. |
I'm running Ubuntu Desktop 10.04 64-bit.
$ cat /etc/timezone
America/New_York
$ date +"%:z"
-05:00
$ dpkg-reconfigure tzdata
Local time is now: Mon Mar 7 11:16:33 EST 2011.
Does anyone see anything wrong with that output? I live in VA. Here is some more info about my install:
$ apache2ctl -v
Server version: Apache/2.2.14 (Ubuntu)
$ passenger -v
Phusion Passenger version 3.0.3
$ mysql --version
mysql Ver 14.14 Distrib 5.1.41, for debian-linux-gnu (x86_64) using readline 6.1
$ ruby -v
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]
$ gem -v
1.3.7
$ rails -v
Rails 3.0.3
DataMapper MySQL adapter: |
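The shell output above shows what the operating system thinks the offset is. It can also help to check what the Ruby process itself sees. A minimal sketch using only core `Time` methods follows; `offset_report` is a hypothetical helper name.

```ruby
# Summarise how Ruby sees a given Time value.
def offset_report(time)
  {
    offset: time.strftime("%:z"),         # same format as `date +"%:z"`
    utc_offset_seconds: time.utc_offset,  # offset from UTC in seconds
    dst: time.dst?                        # whether Ruby considers DST active
  }
end

# Fixed sample matching the output quoted above.
p offset_report(Time.new(2011, 3, 7, 11, 16, 33, "-05:00"))

# What the running process actually uses right now.
p offset_report(Time.now)
```

Running the last line from the same environment the workers use (for example a Rails console) would show whether Ruby and the system agree on the offset.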
Doubt I can be much help, but I've been experiencing the same issue for some time now. My dashboard is currently 30 minutes into the future and will increase by 30 min every time I run the jobs manually. It resets back to only 30 minutes in the future overnight. I don't have much of an understanding of the Snorby backend, but I'll be glad to provide any information to help track down the cause. Fedora 12 |
I was playing with the timezones on the machine where Snorby runs, and I noticed that the dashboard works correctly when the timezone is set to anything except CET. When the timezone is set to CET, the LastUpdate information on the dashboard is 1 hour in the future. So I think the bug must be in the timezones. |
I reset my timezone to UTC with sudo dpkg-reconfigure tzdata, dropped the snorbyDB and rebuilt with sudo rake snorby:setup RAILS_ENV=production. It seems to be working properly now. I would still prefer to have things display in the appropriate time zone, so I will play around some more to see if I can narrow the problem down. Maybe it has something to do with DST. @matherej: I was using CST and dcarrith was using EST, so I don't think it's necessarily just one time zone that is having the issue. |
@cyberconsole: take a look at my post above: https://github.com/Snorby/snorby/issues/#issue/54/comment/828861 I also think it's a DST issue. If that's the case, then this Sunday when the DST switches back (or forth...whatever) the issue should go away. At that point, I may have to revert the changes I made. We'll find out this weekend I guess. |
My timezone is CET. When I set the timezone to CET in Ubuntu, the Snorby dashboard doesn't work correctly. I always have lastupdate 1 hour in the future. But when I change the timezone to GMT-1, which is the same time, the dashboard works. So, there must be something wrong with the timezone settings. |
@matherej: This does fix (GMT-2 for me) the update issue on the dashboard, but the alerts are still in the future. The graph shows alerts 2 hours ahead (for me). |
Just wanted to jump in here and say that now that I've moved from CST to CDT, my dashboard seems to be working correctly and has not jumped to future timestamps since. |
what is the fix ? I don't want to mess up timestamps or timezones and I am still left with a broken system |
Yeah, won't the Snorby team release a fix for this soon? I can change the time on my server manually, but that's not what I really want to do since this seems more of a Snorby issue. |
Never mind. Due to the time change here in Europe overnight, the problem solved itself. |
This problem still exists for me. I set my timezone to UTC, dropped the snorby database and rebuilt with rake snorby:setup RAILS_ENV=production in /var/www/snorby. I then rebooted the box. Still wasn't working, so I removed the caches and updated them via the method above (post #6 in this thread). I waited for them to update, then even restarted the worker for good measure. I still have nothing in the dashboard, but everything else seems to work fine. |
So I started over and rolled my own - had been using Insta-Snorby before. I did the bundle install of the gems it wanted, so I'm using do_mysql 0.10.3. I'm in EST, and my time shows correctly on the dashboard (last updated is the immediately previous 30 minute time tick - correct). However, I have nothing in the dashboard. My jobs are running properly, as far as I can tell, and I have events and severities working properly, they just don't show up in the dashboard. My caches table gets a new entry every half hour, but my daily_caches table never gets anything. I tried the Snorby::Jobs.clear_cache(true) |
I think I'm having an entirely different problem. My dashboard times are now correct, but when I try to manually start the sensor cache job, I get: Loading development environment (Rails 3.0.5) Any ideas as to why? |
I've discussed my progress fighting with that particular issue ^^^ here: Now my workers don't crash, but they also don't seem to work properly (the graphs are broken) unless I'm supervising them, it's bizarre. I have noticed a lot of NULL src_ips and dst_ips records in the caches table in my database, and when I delete them things seem to improve a bit but I feel like I'm shooting in the dark. |
Thanks! At least that lets me know that it's NULL src_ips and dst_ips in the database that cause the problem. It seems to me that this is a bug. Unfortunately I don't know Ruby, but I would think it would be trivial to add code that wouldn't barf on them. I suppose I could make a cron job that automatically ran a mysql query like "use snorby; delete from caches where src_ips IS NULL;" but it sure seems like there would be a better fix! Maybe just change the database schema so that the default value for src_ips and dst_ips is no longer null? However, I think that there is actually a different problem, because those entries where src_ips and dst_ips are NULL are clearly junk entries. They have no data in them, and both are always NULL together. So really the problem is that those entries are somehow junk, and whatever is producing them has a bug in it. Thanks for the tip. |
That doesn't work anyway, as the junk data just gets regenerated by the SensorCache job as soon as I delete it from the database. I guess it's just corruption elsewhere. But why? Doing a hard reset with rake snorby:hard_reset in the snorby root dir fixes it temporarily, but then I lose all my old data! I'm starting to think about moving to a different front-end, but this one is so nice in so many ways! |
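As an illustration of "code that wouldn't barf on them", here is a small sketch that separates the junk rows from the usable ones before anything reads them. It is a model only: `CacheRow` and `split_junk` are hypothetical, the sample values are made up, and the real caches table has more columns than shown.

```ruby
# Hypothetical stand-in for a row of the caches table.
CacheRow = Struct.new(:id, :src_ips, :dst_ips)

# Split rows into junk (both fields nil) and usable ones.
def split_junk(rows)
  rows.partition { |row| row.src_ips.nil? && row.dst_ips.nil? }
end

rows = [
  CacheRow.new(1, { "10.0.0.1" => 4 }, { "10.0.0.2" => 4 }),
  CacheRow.new(2, nil, nil), # the kind of empty entry described above
  CacheRow.new(3, { "10.0.0.3" => 1 }, { "10.0.0.9" => 1 })
]

junk, usable = split_junk(rows)
puts "junk ids: #{junk.map(&:id).inspect}"
puts "usable ids: #{usable.map(&:id).inspect}"
```

Filtering like this only hides the symptom. It does not explain why the empty rows are produced in the first place.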
Actually, that's exactly what I'm doing right now. I have scripts set to run every minute to restart any of the worker jobs if they're not running, and a mysql delete nulls query running every minute as well... so far, so good. I'm also running the SensorCacheJob with (true) instead of (false) and everything's dandy - it's like the watched pot that never boils. It's a horrendous hack, but it seems to work for me to some extent. |
also to wmjosiah: I think the database NULLs are a side effect of the real problem. The objects that the database records correspond to should not be getting created with these properties set to nil (the ruby equivalent to NULL). |
Yes, I agree - this is indeed an awesome frontend. I've already used it to find many, many problems on our network and fix them. It would be nice if the graphs worked, but not essential - it's still better than anything else. @Vineyard: when I delete the NULL records from the caches table, then run the SensorCacheJob, it comes right back and crashes the SensorCacheJob again, so I might have a slightly different problem than you, though clearly quite related. If you fix it in the ruby code (I agree with you about the root of the problem), then I'd love to hear about it! Until then, I'll just live without the dashboard. |
For me the graphs are pretty much essential to manage my volume of alerts. I've made many modifications to the code in hopes of fixing this, but I can't seem to get anything to stay working for more than a few days. It's to the point that I'm doing my best to get up to speed on the current version of Rails so that I can hopefully have a better idea what I'm looking at - this is such a critical feature for me that trying to keep these graphs generating has become almost a full-time job for me :-( |
Since I haven't had any more time to work on this, I thought I would just make the file I modified (sensor_cache_job.rb) available to people. That way, you can give it a shot and see if it gets your dashboard to a working state. My dashboard is currently working and has been for a few weeks now. Here's the git repo I set up to contribute this file. It should be public. I think the following steps will be necessary to get your dashboard back up and running after copying this file into place:
cd /var/www/snorby (or wherever you have put the snorby directory)
Before you do anything, you might want to stop snort and barnyard2 if you have those guys running.
sudo rails c
If you run into the ip_src = nil issue during processing and the job terminates, you will have to connect to your database and find the most recent caches table entry and delete it. Then, re-run either the SensorCacheJob or DailyCacheJob, whichever one terminated prematurely. That's the only way I've found to get past the ip_src = nil thing. |
Even if I delete the entire caches table, I still run into the "undefined method `ip_src' for nil:NilClass" error. |
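That error means something called `ip_src` on an object that was nil. A general way to avoid it is to check for nil before the call. The sketch below shows the pattern with hypothetical `Event` and `IpHeader` structs and a hypothetical `count_sources` method. It is not taken from sensor_cache_job.rb and is not the fix that shipped in Snorby.

```ruby
# Hypothetical stand-ins for an event and its IP header record.
IpHeader = Struct.new(:ip_src)
Event = Struct.new(:id, :ip)

# Count events per source address, skipping events that have no IP record
# instead of calling ip_src on nil.
def count_sources(events)
  counts = Hash.new(0)
  events.each do |event|
    next if event.ip.nil? # event.ip.ip_src would raise NoMethodError here
    counts[event.ip.ip_src] += 1
  end
  counts
end

events = [
  Event.new(1, IpHeader.new("10.0.0.1")),
  Event.new(2, nil), # event without an IP record
  Event.new(3, IpHeader.new("10.0.0.1"))
]
p count_sources(events)
```

Skipping such events keeps the job from terminating, at the cost of leaving them out of the counts.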
This issue has been fixed in Snorby 2.3.1 |
It looks like the dashboard / caching is not working reliably for me.
Events are delivered to the database (table events), and I can see them in the events view in snorby.
The worker processes are running... but the dashboard does not get updated.
When I remove the jobs & add them again, nothing changes.
When I clear the caches table, remove the jobs & add them again, the cache is rebuilt.
After the cache is rebuilt, it does not seem to update anymore.
How can I troubleshoot this ?
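One simple check is to compare the newest timestamp in the caches table with the current time. The sketch below assumes the 30-minute interval mentioned in the comments; `cache_status` and `EXPECTED_INTERVAL` are hypothetical names, and the timestamps are made-up samples you would replace with values from your own database.

```ruby
EXPECTED_INTERVAL = 30 * 60 # seconds between cache runs, per the comments

# Classify the newest cache timestamp relative to the current time.
def cache_status(last_ran_at, now, interval = EXPECTED_INTERVAL)
  return :in_future if last_ran_at > now
  return :stale if now - last_ran_at > interval
  :fresh
end

now = Time.utc(2011, 2, 28, 22, 40, 0)
puts cache_status(Time.utc(2011, 2, 28, 22, 30, 0), now) # updated 10 minutes ago
puts cache_status(Time.utc(2011, 2, 28, 20, 0, 0), now)  # no update for over 2 hours
puts cache_status(Time.utc(2011, 2, 28, 23, 30, 0), now) # timestamp ahead of the clock
```

A result of `:in_future` points toward a time or timezone interpretation problem, while `:stale` suggests the worker is not running its scheduled job.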