Lots of unmatched time when using a simple rule that should tag everything #129

nomeata · 2021-04-23T00:08:08Z

Original report by Konzertheld (Bitbucket: Konzertheld, GitHub: Konzertheld).

Hello there,

I have a simple rule in my configuration: tag Program:$current.program that should tag every entry. When I run statistics: arbtt-stats -c Program --for-each Month -m 0.1 I get a lot of unmatched time though. What could be happening?

I use arbtt-stats version 0.10.4 on Manjaro Linux. Following is the entire categorizing file:

‌

-- -*- mode: haskell; -*-
-- Comments in this file use the Haskell syntax:
-- A "--" comments the rest of the line.
-- A set of {- ... -} comments out a group of lines.

-- This defines some aliases, to make the reports look nicer:
aliases (
	"sun-awt-X11-XFramePeer"  -> "java",
	"sun-awt-X11-XDialogPeer" -> "java",
	"sun-awt-X11-XWindowPeer" -> "java",
	"gramps.py"               -> "gramps",
	"___nforschung"           -> "ahnenforschung",
	"Pidgin"                  -> "pidgin"
	)

-- A rule that probably everybody wants. Being inactive for over a minute
-- causes this sample to be ignored by default.
$idle > 60 ==> tag inactive,

-- A rule that matches on a list of strings
current window $program == ["Navigator","galeon"] ==> tag Web,

current window $program == "sun-awt-X11-XFramePeer" &&
current window $title == "I3P"
  ==> tag Program:I3P,

current window $program == "sun-awt-X11-XDialogPeer" &&
current window $title == " " &&
any window $title == "I3P"
  ==> tag Program:I3P,

-- Simple rule that just tags the current program
tag Program:$current.program,

-- Another simple rule, just tags the current desktop (a.k.a. workspace)
tag Desktop:$desktop,

tag Title:$current.title,

-- I'd like to know what evolution folders I'm working in. But when sending a
-- mail, the window title only contains the (not very helpful) subject. So I do
-- not tag necessarily by the active window title, but the title that contains
-- the folder
current window $program == "evolution" &&
any window ($program == "evolution" && $title =~ /^(.*) \([0-9]+/)
  ==> tag Evo-Folder:$1,

-- A general rule that works well with gvim and gnome-terminal and tells me
-- what project I'm currently working on
current window $title =~ m!(?:~|home/jojo)/projekte/(?:programming/(?:haskell/)?)?([^/)]*)!
  ==> tag Project:$1,
current window $title =~ m!(?:~|home/jojo)/debian!
  ==> tag Project:Debian,

-- This was a frequently looked-at pdf-File
current window $title =~ m!output.pdf! &&
any window ($title =~ /nforschung/)
  ==> tag Project:ahnenforschung,


-- My diploma thesis is in a different directory
current window $title =~ [ m!(?:~|home/jojo)/dokumente/Uni/DA!
                         , m!Diplomarbeit.pdf!
                         , m!LoopSubgroupPaper.pdf! ]
  ==> tag Project:DA,

current window $title =~ m!TDM!
  ==> tag Project:TDM,

( $date >= 2010-08-01 &&
  $date <= 2010-12-01 &&
  ( current window $program == "sun-awt-X11-XFramePeer" &&
      current window $title == "I3P" ||
    current window $program == "sun-awt-X11-XDialogPeer" &&
      current window $title == " " &&
      any window $title == "I3P" ||
    current window $title =~ m!(?:~|home/jojo)/dokumente/Uni/SA! ||
    current window $title =~ m!Isabelle200! ||
    current window $title =~ m!isar-ref.pdf! ||
    current window $title =~ m!document.pdf! ||
    current window $title =~ m!outline.pdf! ||
    current window $title =~ m!Studienarbeit.pdf! )
) ==> tag Project:SA,


-- Out of curiosity: what percentage of my time am I actually coding Haskell?
current window ($program == "gvim" && $title =~ /^[^ ]+\.hs \(/ )
  ==> tag Editing-Haskell,

{-
-- Example of time-related rules. I do not use these myself.

-- To be able to match on the time of day, I introduce tags for that as well.
-- $time evaluates to local time.
$time >=  2:00 && $time <  8:00 ==> tag time-of-day:night,
$time >=  8:00 && $time < 12:00 ==> tag time-of-day:morning,
$time >= 12:00 && $time < 14:00 ==> tag time-of-day:lunchtime,
$time >= 14:00 && $time < 18:00 ==> tag time-of-day:afternoon,
$time >= 18:00 && $time < 22:00 ==> tag time-of-day:evening,
$time >= 22:00 || $time <  2:00 ==> tag time-of-day:late-evening,

-- This tag always refers to the last 24h
$sampleage <= 24:00 ==> tag last-day,

-- To categorize by calendar periods (months, weeks, or arbitrary periods),
-- I use $date variable, and some auxiliary functions. All these functions
-- evaluate dates in local time. Set TZ environment variable if you need
-- statistics in a different time zone.

-- You can compare dates:
$date >= 2001-01-01 ==> tag this_century,
-- You have to write them in YYYY-MM-DD format, else they will not be recognized.

-- “format $date” produces a string with the date in ISO 8601 format
-- (YYYY-MM-DD), it may be compared with strings. For example, to match
-- everything on and after a particular date I can use
format $date =~ ".*-03-19"  ==> tag period:on_a_special_day,
-- but note that this is a rather expensive operation and will slow down your
-- data processing considerably.

-- “day of month $date” gives the day of month (1..31),
-- “day of week $date” gives a sequence number of the day of week
-- (1..7, Monday is 1):
(day of month $date == 13) && (day of week $date == 5) ==> tag day:friday_13,

year $date == 2010 ==> tag year:2010,

-- “$now” evaluates to the current time
day of month $now == day of month $date ==> tag current-day,
month $now == month $date ==> tag current-month,
year $now == year $date ==> tag current-year,
-}
-- “month $date” gives a month number (1..12), “year $date” gives a year:
month $date == 1 ==> tag month:Januar,
month $date == 2 ==> tag month:Februar,
month $date == 3 ==> tag month:März,
month $date == 4 ==> tag month:April,
month $date == 5 ==> tag month:Mai,
month $date == 6 ==> tag month:Juni,
month $date == 7 ==> tag month:Juli,
month $date == 8 ==> tag month:August,
month $date == 9 ==> tag month:September,
month $date == 10 ==> tag month:Oktober,
month $date == 11 ==> tag month:November,
month $date == 12 ==> tag month:Dezember,

(day of month $date == 10) && (month $date == 11) ==> tag DoM:7th,
(day of month $date == 14) && (month $date == 11) ==> tag DoM:14th,
(day of month $date == 21) && (month $date == 11) ==> tag DoM:21th,
(day of month $date == 28) && (month $date == 11) ==> tag DoM:28th,

--------------- ADDED 2020 -----------------
--------------------------------------------

current window $title =~ m!de\.wikipedia\.org! ==> tag Service:Wikipedia,

Thanks for reading and hopefully someone has an idea. Or maybe I found some bug, who knows. :slight_smile:

The text was updated successfully, but these errors were encountered:

nomeata · 2021-04-23T19:29:23Z

Original comment by nomeata (Bitbucket: nomeata, GitHub: nomeata).

Odd indeed on first glance. Maybe you can minimize the configuration and check if it also occurs if you have just that rule?

Also, arbtt-stats --dump-samples is very useful, as it shows the raw data next to the tags.

nomeata · 2021-04-27T01:20:21Z

Original comment by Konzertheld (Bitbucket: Konzertheld, GitHub: Konzertheld).

Thanks for your answer. I tried a configuration file that contains nothing but the rule mentioned in the first sentence and used the same command mentioned above again, nothing changed. So, it’s not related to anything else in my config.

Then I learned I should not use --dump-samples without less. 😆

I then ran arbtt-stats --dump-samples --categorizefile=/home/christian/tmp/testcategorizearbtt | grep "^\W+Program:$" -B 8 | grep "(*)". A number of recurring windows appeared (sample):

() : franklin.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() : franklin.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() : franklin.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() : glxgears
() : glxgears
() : glxgears
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() : VirtualBox - Error In suplibOsInit
() processing-app-Base: 11-Arduino-I2C-Gangschalthebel-live | Arduino 1.8.13
() : fitzgerald.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() google-chrome: BigBlueButton - NaGruSe-Christian - https://fitzgerald.hrz.tu-chemnitz.de/ - Google Chrome
() : fitzgerald.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() processing-app-Base: 11-Arduino-I2C-Gangschalthebel-live | Arduino 1.8.13
() : fitzgerald.hrz.tu-chemnitz.de hat ein Fenster freigegeben.
() processing-app-Base: 11-Arduino-I2C-Gangschalthebel-live | Arduino 1.8.13
(*) : fitzgerald.hrz.tu-chemnitz.de hat ein Fenster freigegeben.

I find it interesting that some do have a program associated and some do not. The first lines are from BigBlueButton, a video conference system, when I shared my screen (propably). I had hope that would explain a lot of the time as I worked as a lecturer and spent quite some time in that system. So I updated my test config to match most of the lines from my search. It reduced the amount of unmatched time by amazing 24 seconds, leaving 4 days and almost 10 hours. So, I still have no clue. :slight_smile:

The lines that my grep search turned out are not too many, so I also have doubt those lines explain the problem at all.

Edit: I missed the inactive tag clause in my test config. I added it. It only accounts for 2 hours of those 4+ days. (That way, I also confirmed that filtering out inactive time works.)

nomeata · 2021-04-27T10:40:53Z

Original comment by Konzertheld (Bitbucket: Konzertheld, GitHub: Konzertheld).

Additional info from things I came up with today. An even simpler rule tag bla:asdf that tags everything with something static works. It matches 100% of the entries. current window $program == "" ==> tag lol:lol, matched nothing.

nomeata · 2021-04-27T17:30:25Z

Original comment by nomeata (Bitbucket: nomeata, GitHub: nomeata).

So the problem is that current window $program is empty for many windows?
And then additionally it is not possible to match on current window $program?
I also don’t really have a clue 😕
Could you maybe do a fresh recording (so that we have only a few samples to look at) and send the the capture.log and categorize.cfg ? Then I can maybe debug this.

nomeata · 2021-08-10T10:21:40Z

Thanks for the test data! When I look at

arbtt-dump -f capture.log

I see that quite a few samples don't have an active window (the (*) in the first column). But some do. This would explain why current window-based rules fail, when there is no current window.

So the question is why that happens. Maybe there are modal dialogs open that are not recognized by arbtt? Or maybe one of the used applications is somehow confusing arbtt?

Konzertheld · 2022-01-10T22:31:52Z

New approach: Are there known issues with Wayland? I saw that there is a wayland compatible port of arbtt-capture. My system uses Wayland. (Asking because today I again have 6h 50min uptime but only 6h 1min captured including inactive times)

nomeata · 2022-01-11T08:42:02Z

Or maybe arbtt-capture's wake up interval isn't precise enough? If arbtt-capture takes 1s, and you set the interval to 10s, if I did the sleep naively, it might explain 10% lost time.

What's your interval time?

nomeata · 2022-01-11T08:43:22Z

(I don't use Wayland yet, and have no idea how arbtt will behave there)

Konzertheld · 2022-01-11T08:50:56Z

Hmm, my interval is 3 seconds. I can change it to 10 and see if it makes a difference.

nomeata · 2022-01-11T09:41:37Z

Ah, that might be an explanation. arbtt-capture is a bit naive in it’s interval calculation, and will always sleep the configured 3s between taking captures; not taking the time it takes to take the capture into account. At the default capture interval of one minute this might be ok, but with very short capture times, the error accumulates.

It shouldn’t be too hard to fix that, though.

Konzertheld · 2022-01-11T09:44:29Z

Ah well, if a fix is possible, that's even better. I use short intervals because I got quite a few captures of windows I accidentally alt-tabbed to which were then logged as whatever interval I had set instead of the 200 ms I actually focused them. With short intervals, those captures will disappear in irrelevance.

nomeata · 2022-01-11T11:28:58Z

My theory is that even with 60s interval, this will even out (you’ll catch them less often), but yes, depending on your needs and use cases this might be too imprecise. At the cost of a greatly increased log file, of course.

Konzertheld · 2022-01-11T16:13:37Z

Hmm, I'm trying to come up with a mathematical explanation but from my first thoughts, I think you're right.

Konzertheld · 2022-01-12T02:09:49Z

Update: It seems to decrease with increased intervals. I had 12 minutes leakage on 6 hours today with an interval of 10 seconds.

nomeata · 2022-01-12T12:24:45Z

#145 might improve the situation. Do you want to try it out? Can you build from a branch or need help with that?

Konzertheld · 2022-01-13T21:22:47Z

#145 might improve the situation. Do you want to try it out? Can you build from a branch or need help with that?

I can build from a branch. I thought I already did but I built master. Interestingly, it is extra bad today, almost a third of the time is lost. But I will try the branch now for the next few hours.

Konzertheld · 2022-01-13T21:39:28Z

Is this command correct? I get a weird output trying to check if the manually compiled version from the branch works at all.

dist/build/arbtt-stats/arbtt-stats -c Project --filter '$sampleage <= 00:05' --also-inactive                                                          ✔  33s   ghc-9.0.1  
Processing data [=======================================================================================================================================================================================================================] 101%
Processing data [=======================================================================================================================================================================================================================] 101%
Statistics for category "Project"
=================================
______________Tag_|_________Time_|_Percentage_
Project:MailsNews | 43d06h48m50s |      68.10
    Project:arbtt | 20d06h33m10s |      31.90
     (total time) | 63d13h22m00s |     100.00

Not only did it ignore the filter, the results are also completely wrong.

nomeata · 2022-01-13T22:03:11Z

This might be an old file, look at the time-stamp. Modern cabal puts its files in dist-newstyle.

Konzertheld · 2022-01-13T22:11:05Z

Nope, the directory did not exist before and the timestamp matches what I expected it to be.

Konzertheld · 2022-01-13T22:11:53Z

It was probably a bad idea that I just continued to use the old log file with no migration, right?

Konzertheld · 2022-01-13T22:29:54Z

Well, this is interesting. I tried a new logfile. It is already 27 MB large after not even five minutes. Processing takes way too long. While below, there are supposedly 4 days recorded, there are now 31 days recorded already. So either I messed up the build completely or something does not work at all.

ps aux | grep arbtt
[...] /AUR/arbtt-git/dist/build/arbtt-capture/arbtt-capture --sample-rate=10 --logfile=...experimental-logfile

dist/build/arbtt-stats/arbtt-stats -c Project --filter '$sampleage <= 00:05' --also-inactive --logfile=...experimental-logfile
Processing data [=======================================================================================================================================================================================================================] 101%
Processing data [=======================================================================================================================================================================================================================] 102%
Processing data [=======================================================================================================================================================================================================================] 103%
Processing data [=======================================================================================================================================================================================================================] 104%
Processing data [=======================================================================================================================================================================================================================] 105%
Statistics for category "Project"
=================================
__________Tag_|_________Time_|_Percentage_
Project:arbtt |  4d23h10m00s |     100.00
 (total time) |  4d23h10m00s |     100.00

Build commands were taken from the AUR package, I just removed those that write to system directories:

runhaskell Setup configure -O --enable-shared --enable-executable-dynamic --disable-library-vanilla \
        --dynlibdir=/usr/lib --libsubdir=\$compiler/site-local/\$pkgid
	
runhaskell Setup build

nomeata · 2022-01-14T08:00:40Z

Hmm, maybe my changes on the branch are just broken. :-/. Will have to careful look at them again.

nomeata · 2022-01-14T08:04:32Z

Did you run multiple arbtt-capture instances at the same time maybe?

Konzertheld · 2022-01-14T08:05:01Z

Did you run multiple arbtt-capture instances at the same time maybe?

no :)

nomeata added major and removed bug labels Jun 3, 2021

Konzertheld mentioned this issue Jan 4, 2022

Dependency problems: attoparsec 0.13 requested but newer version is already installed for aeson #141

Closed

nomeata added a commit that referenced this issue Jan 12, 2022

More precise delay timing (fixes #129)

0456245

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lots of unmatched time when using a simple rule that should tag everything #129

Lots of unmatched time when using a simple rule that should tag everything #129

nomeata commented Apr 23, 2021

nomeata commented Apr 23, 2021

nomeata commented Apr 27, 2021

nomeata commented Apr 27, 2021

nomeata commented Apr 27, 2021

nomeata commented Aug 10, 2021

Konzertheld commented Jan 10, 2022 •

edited

Loading

nomeata commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

Konzertheld commented Jan 12, 2022

nomeata commented Jan 12, 2022 •

edited

Loading

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

nomeata commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

nomeata commented Jan 14, 2022

nomeata commented Jan 14, 2022

Konzertheld commented Jan 14, 2022

Lots of unmatched time when using a simple rule that should tag everything #129

Lots of unmatched time when using a simple rule that should tag everything #129

Comments

nomeata commented Apr 23, 2021

nomeata commented Apr 23, 2021

nomeata commented Apr 27, 2021

nomeata commented Apr 27, 2021

nomeata commented Apr 27, 2021

nomeata commented Aug 10, 2021

Konzertheld commented Jan 10, 2022 • edited Loading

nomeata commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

nomeata commented Jan 11, 2022

Konzertheld commented Jan 11, 2022

Konzertheld commented Jan 12, 2022

nomeata commented Jan 12, 2022 • edited Loading

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

nomeata commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

Konzertheld commented Jan 13, 2022

nomeata commented Jan 14, 2022

nomeata commented Jan 14, 2022

Konzertheld commented Jan 14, 2022

Konzertheld commented Jan 10, 2022 •

edited

Loading

nomeata commented Jan 12, 2022 •

edited

Loading