Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating a new watcher #131

Closed
nikanar opened this issue Nov 9, 2017 · 25 comments
Closed

Creating a new watcher #131

nikanar opened this issue Nov 9, 2017 · 25 comments

Comments

@nikanar
Copy link
Contributor

nikanar commented Nov 9, 2017

I would like to watch one integer variable from my window-manager (XY context : the virtual desktop I am in, as they are a good classification of the activities I do on the computer + makes it trivial to label current activity, by just moving to the corresponding space).

What would be a bare-bones watcher ?

Only has to run on linux. I can have the watched-thing run a shell command (messaging the watcher) every time its value changes, so it doesn't even have to continuously listen (though if it must listen, it can watch the /tmp/current-value file, easy).

@ErikBjare
Copy link
Member

There's a small code-example in the docs: https://activitywatch.readthedocs.io/en/latest/writing-watchers.html

Just ask if you have any questions or issues adapting that code to your needs 🙂

@nikanar
Copy link
Contributor Author

nikanar commented Nov 10, 2017

Thanks much ! That raised a bunch of points :

Slapping an import section on top of it would help.

from aw_client import ActivityWatchClient
from aw_core.models import Event
from datetime import datetime, timezone
from time import sleep

Still, get_current_window remains undefined (defined in .lib of aw-window) so the code is broken as is. I'd change the doc to use a def observe_your_thing() with a trivial demo example like watching /proc/uptime or whatever.

What I used personally is

import subprocess
def get_the_watched_thing():
    filename = "/path/to/a/log/file"
    # This logfile contains lines like "Current value is now 3" and "Current value is now 9".
    f = subprocess.Popen(['tail','-c','2',filename],
                         stdout=subprocess.PIPE,stderr=subprocess.PIPE)
    value = int(f.stdout.read(1))
    # print("Value is {}".format(value))
    return value

The testing=True parameter refers to the “Running” section of the Installing from source doc. I would personally remove it from the minimal example as I don't use --testing nor expect many users to, and the ones who want it will find their way to its doc. But at any rate, I suggest to at least mention it in comments so that if the code doesn't run because of that, people know where to look / what to do.

event_type = "currentwindow" is there a shortlist of accepted values / some other logic using this, or is it an arbitrary label ? (a "what is the unit of this value"-type question). If it's an arbitrary string label, why not write it properly with a space ?

Why does the queued parameter default to False in ActivityWatchClient.create_bucket() ? I have no idea but it seems most use cases and n00b users like me would want it True ?

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Nov 10, 2017

Great questions, the docs certainly needs further explanation. Will update shortly!

Slapping an import section on top of it would help.

Still, get_current_window remains undefined (defined in .lib of aw-window) so the code is broken as is. I'd change the doc to use a def observe_your_thing() with a trivial demo example like watching /proc/uptime or whatever.

Will fix!

The testing=True parameter refers to the “Running” section of the Installing from source doc. I would personally remove it from the minimal example as I don't use --testing nor expect many users to, and the ones who want it will find their way to its doc. But at any rate, I suggest to at least mention it in comments so that if the code doesn't run because of that, people know where to look / what to do.

The reason we have --testing is because while we are developing we are not sure that the data is correct until properly tested and verified, and you can run a stable and development version simultaneously on the non-testing and testing instances of aw-server simultaneously. Also if you for example would want to rewrite some part of your client but you're not sure that the code would be correct, it is easy to just connect with --testing instead so you don't need to worry about inserting some corrupt events into your bucket. We want people to be aware of this feature when developing so they don't accidentally put corrupt data in their buckets. I personally have months of data but only do backups approx. once a month, and I would become rather sad if i inserted corrupt data and had to revert weeks of data. We should probably make it easier later to remove select events from buckets from the aw-core library, but it is currently only possible to delete a whole bucket.

event_type = "currentwindow" is there a shortlist of accepted values / some other logic using this, or is it an arbitrary label ? (a "what is the unit of this value"-type question). If it's an arbitrary string label, why not write it properly with a space ?

It is just an arbitrary label, but we use it so clients which do visualizations (e.g. aw-webui) and are looking for a specific type of events know what type of events each bucket contains. You can call it what you want as long as it is descriptive. If you collect multiple types of events in one client I would recommend making multiple buckets (I would guess that you don't though).

We should probably enforce (or atleast recommend) lowercase and replace spaces with underscores or something just to keep consistency for all types. We should make it clear that it is not a description but an actual value which others use as a reference so people don't use whitespaces.

Why does the queued parameter default to False in ActivityWatchClient.create_bucket() ? I have no idea but it seems most use cases and n00b users like me would want it True ?

The reason why create_bucket is not queued is in case the create_bucket request fails to aw-server a exception should occur. The queue is not blocking so it doesn't raise any exceptions if some request fails, it just queues up the events and re-tries every X seconds.

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Nov 10, 2017

It also seems like the client.connect() call is missing in that example client, so no queued events will be saved it seems. There's apparently also a second example in ActivityWatch/docs/examples/client.py which looks better though so you should probably have a look at that instead. Will fix this code duplication.

@nikanar
Copy link
Contributor Author

nikanar commented Nov 10, 2017

we have --testing [...] We want people to be aware of this feature when developing [...]

Fair enough ! :) (then all I'm saying is I (n=1) find it easier to use the main server as testing)

We should probably enforce/recommend lowercase [...] just to keep consistency

Sounds to me like more rules than are necessary. A convention for knowledgeable devs, sure, but please please keep it simple = accessible for us newcomers.

You can always .lower() and normalize it in the visualisation modules if that's preferable. Maybe add a comment like # keep this short as it will show up as a label in the timeline/UI. Up to 15 characters works best or something.

We should make it clear that it is not a description but an actual value which others use as a reference so people don't use whitespaces.

Doesn't matter much, but I don't get this part. You just said I could call it what I wanted, so why not allow "quantity of available foobars" ? What makes it hard to check if the value is "current window" with a space instead of an underscore ?

tl;dr : I suggest making the doc

event_type = "type_of_event" # e.g. "uptime" or "current_window"

It's pretty suggestive and shows the kind of values that are accepted.

in case the create_bucket request fails to aw-server a exception should occur

Hrm, okay, then why is it True in the doc example ? Anyway this seems like it'll get nuked/rewritten so I'll wait until it's better.

@nikanar
Copy link
Contributor Author

nikanar commented Nov 11, 2017

Re the second example, I find the first one gets a few points better :

  • bucket_id = "{}_{}".format(...) keeps names standardised without requiring any user input.
  • # The duration between them will be less than pulsetime, so they will get merged. is not really clear. I'd say, either add a link to some more doc (explaining the AW event data model), or just show me what to do but don't try to explain.
  • the first has a while True: loop that makes it clear how it watches. I'd like to use the "synchronous example" but I'd need to see how it interacts with the while True loop (need to sleep() ? update the data ? etc.).
  • Same in the asynchronous example, it would be nice to see what happens when the data changes. There are several repetitions of the example_data variable, so it's not too clear which must be the same, and which shall be updated with variable data. The first example gets that right.

The end of the example (client.get_events() and client.delete_bucket()) seems unrelated to building a watcher, more like a comprehensive AW tutorial. FTR I previously asked about using stored AW data.

@nikanar
Copy link
Contributor Author

nikanar commented Nov 11, 2017

Two specific questions about the code and current status of my solution :

  • I see we client.create_bucket() every time we launch the watcher : what does it do if the bucket already exists, open it instead ("append-mode") ?
  • In the second example, the asynchronous way uses client.heartbeat(queued = True), whereas the synchronous way uses client.insert_event(). Why not use heartbeat(queued = False), to show a parallel ? (would that work ?)

Now my watcher works with insert_event (but identical consecutive events are not merged). Using queue = True and heartbeats didn't do anything, i.e. the bucket was not created.

Using client.heartbeat(queued = False) fails with :

$ python aw-wmtag.py 
Traceback (most recent call last):
  File "aw-wmtag.py", line 47, in <module>
    main()
  File "aw-wmtag.py", line 17, in main
    client.create_bucket(bucket_id, event_type, queued=use_retry_queue)
  File "/home/will/.local/lib/python3.6/site-packages/aw_client-0.2-py3.6.egg/aw_client/client.py", line 149, in create_bucket
  File "/home/will/.local/lib/python3.6/site-packages/aw_client-0.2-py3.6.egg/aw_client/client.py", line 70, in _post
  File "/home/will/.local/lib/python3.6/site-packages/requests/models.py", line 937, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: BAD REQUEST for url: http://localhost:5600/api/0/buckets/aw-wmtag_ker

(EDIT : I know it says it's the create_bucket line that raises the error but the same line worked when using insert_event instead of heartbeats, so I don't know if it's related or just a coincidence, but I report what I see and I meant exactly what I wrote :p)

While the aw-qt console says :

2017-11-11 00:21:05,183 [INFO ]: 400 (127.0.0.1): [this is red:]POST /api/0/buckets/aw-wmtag_ker HTTP/1.1[/red]  (flask:25)

@johan-bjareholt
Copy link
Member

It isn't client.heartbeat which fails but client.create_bucket, look at the last line here

$ python aw-wmtag.py 
Traceback (most recent call last):
   File "aw-wmtag.py", line 47, in <module>
     main()
   File "aw-wmtag.py", line 17, in main
     client.create_bucket(bucket_id, event_type, queued=use_retry_queue)

run aw-server with --verbose and you will see more info and it will likely say what's wrong with the request

Will update the docs soon with a new example with more comments

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Nov 11, 2017

Was able to reproduce your create_bucket failing, aw-server was responding with a 400 status code instead of a 304, fixed here ActivityWatch/aw-server@7f424fb

It apparently worked because we run create_bucket asyncronously in all of our watchers so the exception wasn't raised there. Tested with the example watcher, works syncronously now aswell :)

@johan-bjareholt
Copy link
Member

The example in the docs is now update, will take a few hours until it's updated but you can look directly at the source. I think i adressed most of your concerns, tell me if I missed something.

@ErikBjare
Copy link
Member

ErikBjare commented Nov 11, 2017

@nikanar

event_type = "currentwindow" is there a shortlist of accepted values / some other logic using this, or is it an arbitrary label ? (a "what is the unit of this value"-type question). If it's an arbitrary string label, why not write it properly with a space ?

@johan-bjareholt

It is just an arbitrary label, but we use it so clients which do visualizations (e.g. aw-webui) and are looking for a specific type of events know what type of events each bucket contains. You can call it what you want as long as it is descriptive. If you collect multiple types of events in one client I would recommend making multiple buckets (I would guess that you don't though).

I've had the idea of using a namespace for these types. For example would "currentwindow" become "org.activitywatch.window.active". I've written about it here in the past: #97


@nikanar Thanks for all the feedback, if you have the time you can create a PR or if not I'll keep the things you've said in mind when editing.

  • Document the testing mode better.
  • Include imports in code examples, make them valid code.

Nice bug catching @nikanar and nice fix @johan-bjareholt.

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Nov 12, 2017

  • Include imports in code examples, make them valid code.

I already fixed this and added both a minimal 20-line example as well as a very extensively commented ~80-line client. There are a few more things which needs documentation though.

  • Document heartbeats better
  • Explain bucket bucket name convention
  • Explain eventtype name convention

@nikanar
Copy link
Contributor Author

nikanar commented Nov 20, 2017

Thanks @johan-bjareholt the new examples are great. Minor nitpick, the long example also doesn't use the event_type variable.

Sorry for taking so long to update, I'm still having a hard time with git-maintaining --user in all the submodules' Makefiles. Actually since the main seems to have changed to much, Makefile, I re-cloned from scratch (and am running into #87 again).

# TODO: Make a section with an illustration on how heartbeats work and insert a link here

Most definitely. It's an important feature you got that most users aren't familiar with, and it gets in the way every time we want to dive in the internals, as iiuc you expect some part of the community to do.

  • I see you update heartbeat_event.timestamp at every heartbeat, and I suppose .data is to be updated in the same fashion. Is there something against creating a new Event for each heartbeat ?

@nikanar
Copy link
Contributor Author

nikanar commented Nov 21, 2017

Currently all I can get at is:

will@ker:~$ aw-wmtag 
400 Client Error: BAD REQUEST for url: http://localhost:5600/api/0/buckets/aw-wmtag_ker
POST request response had status code 400
Message: 400
Traceback (most recent call last):
  File "/usr/local/bin/aw-wmtag", line 36, in <module>
    main()
  File "/usr/local/bin/aw-wmtag", line 12, in main
    client.create_bucket(bucket_id, event_type)
  File "/home/will/.local/lib/python3.6/site-packages/aw_client/client.py", line 170, in create_bucket
    self._post(endpoint, data)
  File "/home/will/.local/lib/python3.6/site-packages/aw_client/client.py", line 86, in _post
    raise e
  File "/home/will/.local/lib/python3.6/site-packages/aw_client/client.py", line 83, in _post
    r.raise_for_status()
  File "/home/will/.local/lib/python3.6/site-packages/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: BAD REQUEST for url: http://localhost:5600/api/0/buckets/aw-wmtag_ker

http://localhost:5600/#/buckets confirms the aw-wmtag_ker bucket exists.

aw-qt's terminal says 2017-11-20 18:57:33 [INFO ]: 400 (127.0.0.1): POST /api/0/buckets/aw-wmtag_ker HTTP/1.1 (flask:25).

My code does:

    client = ActivityWatchClient("aw-wmtag")
    bucket_id = "{}_{}".format(client.name, client.hostname)  # Give your bucket a unique id.
    event_type = "activedesktop"  # Used to annotate what kind of data the events in a given bucket will contain.
    client.create_bucket(bucket_id, event_type)

I don't know if I failed to get the latest version of aw-server (I reinstalled/make uninstall/make clean_all/manually purged some files many times, and finally recloned the whole thing) or if it's more or less expected behavior for some other reason, what do you think ?

@ErikBjare
Copy link
Member

ErikBjare commented Nov 21, 2017

Yup, looks like you aren't running the latest aw-server. The reason you don't is likely because the bundle is referencing an older aw-server commit. If you run git checkout master && git pull origin master in the aw-server folder then you should get the latest version of aw-server independent of which version is used in the bundle.

Regardless, I will update the aw-server used in the bundle.

  • Update aw-server submodule

I'm still having a hard time with git-maintaining --user in all the submodules' Makefiles.

I can add a flag to the Makefiles to make life easier for you 🙂

  • Add a flag to the Makefiles for using the --user argument when installing with pip.

running into #87 again

@johan-bjareholt
Copy link
Member

I don't know if I failed to get the latest version

To be certain, do git log to see the most recent commits with timestamps

@nikanar
Copy link
Contributor Author

nikanar commented Nov 21, 2017

Indeed @ErikBjare that was it.

Omg, PIP_USER=trueis gold, many thanks for finding it.
Doesn't it remove the requirement on running in a virtualenv ? I think this doc could be updated with it.

The watcher works now ! It stores data which is visible in the webui. Identical consecutive events are merged together. The poll_time (2s) is coarse-grained enough that it doesn't pick up the instantaneous useless events.

aw-wmtag-alive
I kinda wonder if the timeline is relevant on this page / if+how it should apply to custom buckets, but that is yours to decide.

Time for me to move on to visualising that bucket !

@ErikBjare
Copy link
Member

@nikanar Awesome! Great to see you got it working 😄

Huge thanks for helping us make the process a bit less cumbersome for people in the future. I updated the doc to mention the PIP_USER trick as you suggested.

Closing this for now, but don't hesitate to let us know how things turn out 🚀

@nikanar
Copy link
Contributor Author

nikanar commented Dec 21, 2017

Update : Turns out, of course, the watcher watches even when I'm afk. So stats about raw watched data aren't too meaningful, and I have to throw afk-watcher observations into the mix to do something useful. It's about done now, but my conclusion is that I should have included the afk-watcher ['status'] attribute to this watcher's ['data'], from the start. (Especially so afk and not-afk heartbeats don't get merged.)

Hopefully this can help others to not run in the same issue.

Gory data details I also have 30 'not-afk' events with 0 duration, out of 1200 afk-watcher events (53% not-afk so they must not be perfectly alternating), which I'm ignoring. In fact I filter out all events of 0 duration.

I also ran into issues like

2017-12-20 21:58:30,423 : WARNING : Discarding one tag event for some reason.
2017-12-20 21:58:30,424 : DEBUG : For some reason there doesn't seem to be a AW not-afk event starting with this tag change. I don't know how that is possible, it's like AW didn't detect the keypress.
2017-12-20 21:58:30,424 : DEBUG : tag event at 2017-11-22 22:28:31.658000+00:00 duration 0:05:40.610000
2017-12-20 21:58:30,424 : DEBUG : Is surrounded by
2017-12-20 21:58:30,424 : DEBUG : ak event at 2017-11-22 22:22:18.334000+00:00 duration 0:06:04.886000
2017-12-20 21:58:30,424 : DEBUG : and
2017-12-20 21:58:30,424 : DEBUG : ak event at 2017-11-22 22:33:49.626000+00:00 duration 0:07:35.579000
2017-12-20 21:58:30,424 : DEBUG : so it's like user was never ak but changed tag. what happened ?

where "ak" stands for "not-afk". But I think it's on my side because my watcher was running but not aw-watcher-afk because aw-qt was crashed at that time but not aw-server because ActivityWatch/aw-server#31. There aren't too many of these, and in particular none is recent.

@johan-bjareholt
Copy link
Member

but my conclusion is that I should have included the afk-watcher ['status'] attribute to this watcher's ['data'], from the start.

This is not the way we hope people will implement watchers. As you might know the window watcher solves this without including that in its 'data' field and does so by doing transformations in aw-server which cuts portion of events where there is no 'not-afk' intersecting it. Currently in AW 0.7 we do that with a kind of query language in JSON format, but that will soon be deprecated since we were not satisfied with the API. However it will be replaced by my dev/transform-next pull requests which will specify an actual nice looking query language for clients to request transforms on data from multiple buckets from the aw-server.

In the end we hope that the watchers will just push their single data point in lossless format to a aw-server bucket and then when displaying it we can aggregate and transform data from multiple buckets with this query language in a format which fits how we want to visualize the data.

I also have 30 'not-afk' events with 0 duration, out of 1200 afk-watcher events (53% not-afk so they must not be perfectly alternating), which I'm ignoring. In fact I filter out all events of 0 duration.

We have an issue on this and it's high priority, I am on holidays now so hopefully I'll have some time to get around and reproduce+fix this soon.

@nikanar
Copy link
Contributor Author

nikanar commented Dec 21, 2017

Hrm, oh wow, I had no idea. From here, this all sounds like documentation waiting to exist ;)
(also "I can't do this in the proper way yet")

So iiuc you want pure raw data in the buckets, and preprocessing to be done "by users" (at least the watcher-writing ones) in your query language ? If so, ok, why not, I only hope you'll implement nice doc and default values for this part.

This sounds a bit at odds with hibernating events not being reported anymore since ActivityWatch/aw-watcher-afk#26 ? I would expect this view to imply the afk-watcher must push all the datapoints it can.

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Dec 22, 2017

This should fix at least some of the aw-watcher-afk issues I believe ActivityWatch/aw-watcher-afk#31

Hrm, oh wow, I had no idea. From here, this all sounds like documentation waiting to exist ;)

In progress :) ActivityWatch/docs#2

(also "I can't do this in the proper way yet")

Well you can with the old query API, but since it will be deprecated soon and it's not documented i recommend not wasting your time on it!

If you still want to get it done soon I would appreciate if you tested the dev/transform-next branches (which also includes some other goodies like aw-watcher-web domain summaries).

So iiuc you want pure raw data in the buckets, and preprocessing to be done "by users" (at least the watcher-writing ones) in your query language ? If so, ok, why not, I only hope you'll implement nice doc and default values for this part.

Well not "by users", but by each program which visualizes activitywatch data (primarily aw-webui).

This sounds a bit at odds with hibernating events not being reported anymore since ActivityWatch/aw-watcher-afk#26 ? I would expect this view to imply the afk-watcher must push all the datapoints it can.

I actually still run the old aw-watcher-afk commit with hibernation support on one of my computers and it's all backwards compatible so it still works. The only reason we removed it was since there seemed to be some corner case where it overlapped other events so we removed it. We will likely add it back later, but that's currently not a priority.

@rhabbachi
Copy link

@nikanar Hey! I'm looking for something similar to what your are working/worked on. Did you publish your code in a public repository?

@nikanar
Copy link
Contributor Author

nikanar commented Sep 12, 2018

@rhabbachi Not that I can remember. I think it boils down to the following

will@ker:~$ cat .../AW/aw-wmtag.py

#!/usr/bin/python3

import subprocess
import time, datetime
from aw_client import ActivityWatchClient
from aw_core.models import Event

def main():
    client = ActivityWatchClient("aw-wmtag")
    bucket_id = "{}_{}".format(client.name, client.hostname)  # Give your bucket a unique id.
    event_type = "activedesktop"  # Used to annotate what kind of data the events in a given bucket will contain.
    client.create_bucket(bucket_id, event_type)

    poll_time = 5  # Send an event every 5 seconds
    with client:
        while True:
            data = {"desktop": get_the_watched_thing()}
            now = datetime.datetime.now(datetime.timezone.utc)
            awesome_event = Event(timestamp=now, data=data)
            client.heartbeat(bucket_id, awesome_event, pulsetime=poll_time+1, queued=True)
            time.sleep(poll_time)
        
def get_the_watched_thing():
    filename = "/path/to/awesome-logging.log"
    f = subprocess.Popen(['tail','-c','2',filename],
                         stdout=subprocess.PIPE,stderr=subprocess.PIPE)
    value = int(f.stdout.read(1))
    # print("Looking at {} says desktop is currently {}".format(filename, value))
    return value
    
if __name__ == "__main__":
    main()

... given that the window manager (awesome-wm) writes in /path/to/awesome-logging.log, with the following (excerpt from ~/.config/awesome/rc.lua):

local function log_current_tag()
   local i = awful.screen.focused().selected_tag.index
   file = io.open("/path/to/awesome-logging.log", "a")
   file:write(string.format("%s Switching to awesome tag %i\n", os.date("%Y/%m/%d %X"), i))
   file:close()
end

and then calling that function whenever I change tags (by whatever keybinding or clicking sequence) = user input changes the observed value - e.g.

   awful.key({modkey},	"Tab", function () awful.tag.viewnext() log_current_tag() end),

There, now it's done :')

@rhabbachi
Copy link

rhabbachi commented Sep 17, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants