Crashes after 2k (or 3k) devices, before update no limit (30k+ devices) #191
Comments
Unfortunately, so much has changed since then I couldn't even begin to guess.
Having the *exact* messages it gives when crashing would be a start; additionally, try running it via the debug instructions at
https://www.kismetwireless.net/docs/readme/debugging/
Otherwise you'll need to start going through and trying to isolate when the change for your setup occurred; this most likely involves manually building the releases in between then and now and finding out if the same behavior happens.
I'll try to do some tests as well and see if I can get it to ramp up CPU use.
So far the only high-CPU states I've monitored are when it's updating the database log for all the devices that have changed; I'll keep monitoring, however.
The biggest changes I can think of that might impact this since early summer are the detection that the SD card can't keep up, and the shortening of the write period to try to compensate for slow SD cards. If you're on a Pi 3 logging to SD, knowing the exact error it exits with would be quite helpful, since it would immediately show whether your SD card is taking too long to flush the db.
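A watchdog of this kind (detecting that the main loop has stalled, e.g. on slow storage, because a subsystem stops getting serviced in time) can be sketched as follows. This is an illustrative Python model, not Kismet's C++ implementation, and the class name is made up; the 15-second figure matches the PING timeout in the fatal error quoted later in this thread.

```python
import time

PING_TIMEOUT = 15.0  # seconds; matches the timeout in the FATAL message

class HeartbeatWatchdog:
    """Tracks the last time a subsystem (e.g. a capture source) was
    serviced with a PING; if the main loop stalls on slow disk I/O,
    the gap since the last PING grows past the timeout."""

    def __init__(self, timeout=PING_TIMEOUT):
        self.timeout = timeout
        self.last_ping = time.monotonic()

    def ping(self):
        # Called each time the main loop services the source.
        self.last_ping = time.monotonic()

    def expired(self):
        # True once no PING has been serviced within the timeout window.
        return (time.monotonic() - self.last_ping) > self.timeout

# Tiny demonstration with a shortened timeout:
wd = HeartbeatWatchdog(timeout=0.05)
wd.ping()
assert not wd.expired()
time.sleep(0.06)       # simulate a stalled main loop (slow SD flush)
assert wd.expired()    # the watchdog now reports the stall
```

In the real daemon, tripping this condition is what produces the "did not get PING ... shutting down" shutdown rather than a silent hang.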
I might have found the (or a) culprit - it wasn't updating the last-logged time, so as more and more devices were added, the amount of work it took to save them increased, because it kept re-saving previously logged devices. The latest git has a fix in it, and it'll go into tonight's nightlies, if you get a chance to try them tomorrow or build it yourself.
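The effect of that bug, and of the fix, can be modeled with a minimal sketch; the class and function here are hypothetical illustrations, not Kismet's actual code:

```python
class Device:
    def __init__(self, key):
        self.key = key
        self.mod_time = 0   # last time the device record changed
        self.log_time = -1  # last time it was written to the log

def log_dirty_devices(devices, now, update_log_time=True):
    """Write only devices modified since their last log write.

    With update_log_time=False (modeling the bug), every device stays
    'dirty' forever, so each log pass rewrites all N devices and the
    work per pass grows linearly with the device count."""
    written = []
    for dev in devices:
        if dev.mod_time > dev.log_time:
            written.append(dev.key)   # stand-in for the actual DB write
            if update_log_time:
                dev.log_time = now    # the fix: mark the device as logged
    return written
```

With the fix, a second pass with no new modifications writes nothing; with the bug, every pass rewrites the entire device list, which is exactly the kind of load that scales with total devices seen.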
Your idea looks right: the more devices I have, the higher the CPU load gets, at about every 1k devices. At 1k the CPU just goes up a bit, at 2k quite a bit, and at 3k it always goes to 100%. The output on stderr is then this:

Stack trace (most recent call last) in thread 3333:
#8 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#7 Object "/lib/aarch64-linux-gnu/libc.so.6", at 0x7f956aa70b, in
#6 Object "/lib/aarch64-linux-gnu/libpthread.so.0", at 0x7f95d0b887, in
#5 Object "/lib/aarch64-linux-gnu/libgomp.so.1", at 0x7f957820b3, in
#4 Object "kismet", at 0x556777e9df, in
#3 Object "/lib/aarch64-linux-gnu/libstdc++.so.6", at 0x7f958f52cb, in __cxa_throw
#2 Object "/lib/aarch64-linux-gnu/libstdc++.so.6", at 0x7f958f4fff, in std::terminate()
#1 Object "/lib/aarch64-linux-gnu/libstdc++.so.6", at 0x7f958f4fab, in
#0 Object "kismet", at 0x5567909633, in TerminationHandler()
FATAL - Capture source did not get PING from Kismet for over 15 seconds; shutting down (repeated several times, interleaved with the trace)
debug - seen multiple basicrates?

I made a script fetching the top 5 CPU items during that time:

root 7099 12.2 20.1 1212908 189552 ? Sl 18:42 2:19 kismet
root 370 10.9 0.2 4948 2100 ? Ds 18:06 6:02 /sbin/mount.ntfs /dev/sda1 /media/usb -o rw,sync,noexec,nosuid,nodev,users
root 133 10.9 0.0 0 0 ? R 18:06 6:04 [usb-storage]
root 562 7.8 3.4 124592 32144 ? Ssl 18:06 4:17 /usr/bin/python3 /home/pi/warpigui.py
gpsd 7100 3.1 0.3 7220 3320 ? S<s 18:42 0:35 gpsd /dev/serial0

So yes, Kismet is writing to the USB stick, and that pushes it to the timeout. I will switch to the nightly build and give it a test in the next days!
|
I'd also suggest not using NTFS; that's going to add a significant slowdown to all your disk IO as well, since it has to go through FUSE.
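A monitoring script of the kind described above (sampling the top CPU consumers while the spike happens) could be built around parsing `ps aux` output. The helper below is an illustrative sketch, assuming the standard eleven-column `ps aux` layout; it is not the script the reporter actually used.

```python
def top_cpu(ps_aux_output, n=5):
    """Return (command, %cpu) for the n busiest processes in `ps aux` text."""
    rows = []
    for line in ps_aux_output.strip().splitlines()[1:]:  # skip header row
        cols = line.split(None, 10)  # 11th field is the full command line
        if len(cols) < 11:
            continue
        rows.append((cols[10], float(cols[2])))          # (COMMAND, %CPU)
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:n]

# Abbreviated sample in the same shape as the listing above:
sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 7099 12.2 20.1 1212908 189552 ? Sl 18:42 2:19 kismet
root 370 10.9 0.2 4948 2100 ? Ds 18:06 6:02 /sbin/mount.ntfs /dev/sda1 /media/usb
gpsd 7100 3.1 0.3 7220 3320 ? S<s 18:42 0:35 gpsd /dev/serial0
"""
```

Fed periodically from something like `subprocess.run(["ps", "aux"], capture_output=True, text=True)`, this reproduces the kind of listing shown in the comment, with `kismet` and the NTFS mount helper at the top during a spike.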
I switched to the git version 2019-11-21-21r36edada5-1, and to a vfat USB stick. The issue remains. As I've seen this warning in the logs, I made a small pull request to fix a text bug: #192. Also, I've increased the write delay from 30 to 60; no change in the result.
I'll look at the diff when I get a chance this weekend.

That implies your USB, or your storage, can't handle the IO load. On a Pi 3 this doesn't surprise me, and using NTFS over FUSE is going to *seriously* impact your filesystem performance as well - as much as 50% or more performance lost.

So far with the latest nightlies I can only replicate modest CPU growth, nothing runaway. 3500 devices takes about 8% more than 100 devices, which is roughly to be expected.

The runaway logging was definitely going to cause a performance impact, but that should be addressed now.
Yes, the NTFS USB stick was no good idea of mine, but it was an attempt to solve the issue, as I thought my SD card might be the problem. With the "old" Kismet version from July I did not have this problem. To clarify: the rpi is running at around 3-5% CPU load, but at about every 1.1k devices the CPU load spikes. At ~1.1k the spike is to maybe 40%, then it returns to 3-5%; at 2.2k it already hits 100% for a short time, then returns to 3-5% (if no timeout happens); and at the latest around 3.3k devices it goes to 100% with the timeout. Is there something I can turn off to reduce the load on the file system?
If your logging system can't keep up, it's really an indicator that your hardware isn't sufficient for the environment you're trying to use it in. A Pi 3 is pretty terrible at SD, USB, and ethernet, which doesn't make it a particularly well-suited device if you've got thousands of devices or more that you're trying to log.

You can turn off logging options, yes; look at kismet_logging.conf and
https://www.kismetwireless.net/docs/readme/performance_and_memory/
OK, I take it that the rpi is not strong enough. Thanks for helping me! I will then make a script that auto-restarts Kismet and merges the db files for me.
This looks like it could be related to the channel summary code. Why it's happening now and not before, I have no idea. Why it's linked to 1000 devices, also no idea - that makes no sense, but I can replicate SOME CPU jump at 1000 devices, so it looks like something is going on. Manually forcing the channel calculations to 10 seconds drops the load, so it's definitely something related to probing the channels in the device list. I'll work on optimizing it more.
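The mitigation described - forcing the expensive channel calculations onto a fixed interval instead of recomputing them on demand - can be sketched like this. The 10-second interval comes from the comment above; the class itself is a hypothetical illustration, not Kismet's code.

```python
import time

class ThrottledSummary:
    """Recompute an expensive per-device summary at most once per
    interval; between recomputations, serve the cached result so the
    O(n) pass over the device list runs rarely."""

    def __init__(self, compute, interval=10.0):
        self.compute = compute           # expensive function over devices
        self.interval = interval         # minimum seconds between runs
        self.cached = None
        self.last_run = float("-inf")    # force a compute on first call

    def get(self, devices, now=None):
        now = time.monotonic() if now is None else now
        if now - self.last_run >= self.interval:
            self.cached = self.compute(devices)  # the costly O(n) pass
            self.last_run = now
        return self.cached
```

As the device count grows, the cost of each pass still grows, but capping how often the pass runs turns the repeated 100% spikes into an occasional bounded burst.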
OK, I was running with kis_log_channel_history=false, but a look at the code shows no use of that parameter: https://github.com/kismetwireless/kismet/search?q=kis_log_channel_history&unscoped_q=kis_log_channel_history Also kis_log_channel_history_rate is not used: https://github.com/kismetwireless/kismet/search?q=kis_log_channel_history_rate&type=Code Is there a way for me to try this?
It looks like at some point in the past months, something may have changed in the gnu::parallel C++ code; I just pushed some changes that remove the use of the parallelization and saw some dramatic improvements in my test data. I'll have to wait to run it through natural data collection to see if it still holds true. The missing logging option is a bug, but it won't impact anything but logging size. All it would control is whether it dumps the channel status to the kismetdb - which it doesn't do currently anyhow.
I've given it an update today and was able to run 13k devices with no issue. I stopped as my battery was running low... so this is now fully working again for me 👍
Hello,

I'm running the Kismet release from the repo on a rpi3 with Kali.

In July this year I set up the whole thing, and after some fiddling with the config I was able to run up to 30k devices before the memory of the rpi3 was full. The device was running like this over several sessions with no issue.

Then, a couple of days ago, I did "apt upgrade" and got a newer Kismet version. Since that day, Kismet stops after either 2000 (from 2000 to 2200) or 3000 (3000 to 3300) devices with no visible issue. One time I was able to see the CPU load: it went to 100% for a couple of seconds, then Kismet stops. Memory at that point is at 25%. Also, the Kismet databases are mostly around 30mb, some of them even at exactly 30mb.

I have now changed the SD card and placed the logging location on a USB stick; no changes. Also, I cannot find any logs or entries, it just "disappears"; syslog shows nothing.

How can I help to debug this? I have no idea where to start looking.