This repository has been archived by the owner on May 28, 2021. It is now read-only.

Analytics Tasks #424

Closed
DMWMBot opened this issue Sep 28, 2010 · 14 comments

@DMWMBot

DMWMBot commented Sep 28, 2010

Better test the existing analytics tasks and add some new ones.

@vkuznet
Contributor

vkuznet commented Oct 6, 2010

valya: We need to schedule a fully working version of analytics (robots) for the upcoming DAS release.

@DMWMBot
Author

DMWMBot commented Oct 26, 2010

gfball: Tested queryspammer to submit to a DAS web server, so that a weighted producer can be used to test analytics.

./das_queryspammer -w 4 -c 100 -p WeightedDatasetProducer -s DASWebSubmitter

(creates 4 threads, each making 100 dataset queries using a weighted distribution to http://localhost:8212/das/jsonview?input=...)

(HTTPSubmitter can be used to customise the URL)
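
To illustrate what the command above is described as doing, here is a minimal sketch, not the queryspammer code itself. It only builds the request URLs and never contacts a server. The dataset names, their weights, and the `dataset=<name>` query form are hypothetical, as are the function names; only the base URL comes from the comment above.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

BASE_URL = "http://localhost:8212/das/jsonview"  # URL quoted in the comment above

# Hypothetical dataset names and weights, for illustration only.
WEIGHTED_DATASETS = {"/a/b/c": 5, "/d/e/f": 3, "/g/h/i": 1}


def build_urls(count, seed=None):
    """Return `count` query URLs, picking datasets by weight."""
    rng = random.Random(seed)
    names = list(WEIGHTED_DATASETS)
    weights = [WEIGHTED_DATASETS[name] for name in names]
    picks = rng.choices(names, weights=weights, k=count)
    # The "dataset=<name>" query form is an assumption, not taken from DAS docs.
    return [BASE_URL + "?" + urlencode({"input": "dataset=" + name}) for name in picks]


def run(workers=4, count=100):
    """Mimic `-w 4 -c 100`: each worker thread prepares `count` URLs."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: build_urls(count, seed=i), range(workers)))
```

A real submitter would issue an HTTP GET for each URL; that step is left out so the sketch runs without a DAS server.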

@vkuznet
Contributor

vkuznet commented Oct 27, 2010

valya: Gordon, I successfully applied your patch and it is merged into the main tree. What else is left for this ticket?

@DMWMBot
Author

DMWMBot commented Oct 29, 2010

gfball: Definitely more to come. The patch so far doesn't really match the ticket description anyway, I just didn't have a more appropriate ticket for it.

Coming patch should cover:

  • fix hotspot
  • add new task querymaintainer
  • fixes to analytics controller/scheduler/tasks
  • fixes to analyticsdb
  • add some functions to mongocache
  • standalone analytics task runner

Should be coming in the next couple of days.

@vkuznet
Contributor

vkuznet commented Oct 29, 2010

valya: Cool.

@DMWMBot
Author

DMWMBot commented Nov 4, 2010

gfball: Big analytics update (sorry, it did indeed take a few days). Applies against SVN head as of about an hour before upload. This also resolves ticket #425 and parts of #426.

@vkuznet
Contributor

vkuznet commented Nov 4, 2010

valya: Hi,
I applied the patch and fixed the das_analytics_t.py unit test, where I included the new analyticsdb parameter, history. I also changed the stock das.cfg to have this parameter in place.

@vkuznet
Contributor

vkuznet commented Nov 4, 2010

valya: Gordon,
would you mind updating doc/sphinx/changelog.rst and adding new documentation for DAS analytics, e.g. add doc/sphinx/das_analytics.rst and update doc/sphinx/index.rst accordingly?

@DMWMBot
Author

DMWMBot commented Nov 4, 2010

gfball: Will do. Quite a big block of code with no documentation otherwise...

@DMWMBot
Author

DMWMBot commented Nov 5, 2010

gfball: Added documentation and changelog entry.

@vkuznet
Contributor

vkuznet commented Nov 12, 2010

valya: Gordon, do you plan to issue more patches for this ticket? So far, I have applied all the patches. I tested analytics and it works, so we can close the ticket.

@vkuznet
Contributor

vkuznet commented Nov 12, 2010

valya: Gordon,
after playing with your analytics server, I decided to keep this ticket open and ask you to implement two more requests. Please let me know whether you can do them within a dozen days, so that I can decide whether to include them in the upcoming release.

So far you have built a web interface. I would like a similar CLI for the DAS analytics server. Here are the changes I propose:

  1. Change the web server to return JSON for schedule/results when the request's HTTP header asks for application/json.
  2. Create a CLI script, e.g. das_analytics_cli.py, and its shell companion, which should do the following:

das_analytics_cli --list and get JSON doc with all scheduled tasks
das_analytics_cli --results and get JSON doc with all results
das_analytics_cli --name "SiteDB Task" --class ValueHotspot --key="site.name"
das_analytics_cli --remove UUID returns JSON with ok/fail status

The obvious parameters are --host, e.g. http://a.b.c:8213, and maybe --query, which can be used instead of --key. You can look at das_cache_client.py for an example of how to make JSON requests. On the server side you need to check whether the client sent the application/json header and, if so, return JSON instead of HTML templates. I think it should be very easy to extend your current code to support that.
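
The server-side check described above could look roughly like the sketch below. It is framework-agnostic and is not the DAS analytics web code: `wants_json` and `render` are hypothetical names, the request headers are assumed to arrive as a plain dict with an `Accept` key, and the HTML branch is a stand-in for the real templates.

```python
import json
from html import escape


def wants_json(headers):
    """True if the client's Accept header lists application/json."""
    accept = headers.get("Accept", "")
    # Drop parameters such as ";q=0.9" and compare media types only.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "application/json" in media_types


def render(data, headers):
    """Return (content_type, body) for a list of schedule or result items."""
    if wants_json(headers):
        return "application/json", json.dumps(data)
    # Stand-in for the HTML templates used by the web interface.
    rows = "".join("<li>%s</li>" % escape(str(item)) for item in data)
    return "text/html", "<ul>%s</ul>" % rows
```

With this shape, the same exposed method can serve both the browser and a CLI client, and only the final rendering step differs.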

Actually, I can do task #2 quickly if you want, and let you work on the server side.

Regardless of this implementation, you should also add input parameter validation in the following way.
Please define your set in DAS/analytics/analytics_web.py as
{{{
DAS_ANALYTICS_INPUTS = ['name', 'classname', 'intervals', ....]
}}}
then add the required check for each input name in checkargs in DAS/web/utils.py,
and use the following decorator for all your exposed methods
{{{
@checkargs(DAS_ANALYTICS_INPUTS)
}}}

You may check how it's done in either web/das_web.py or web/das_cache.py.
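
For readers without the DAS sources at hand, a decorator of this kind might be sketched as follows. This is not the `checkargs` from DAS/web/utils.py, whose actual checks are not shown in this ticket. The list holds only the three names spelled out above (the original list is truncated), and the per-name check and the `add_task` body are hypothetical.

```python
from functools import wraps

# Only the names spelled out in the ticket; the original list is truncated.
EXAMPLE_INPUTS = ["name", "classname", "intervals"]

# Hypothetical per-parameter checks, one entry per input that needs validation.
CHECKS = {"intervals": lambda value: str(value).isdigit()}


def checkargs(supported):
    """Reject keyword arguments that are unknown or fail their check."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            unknown = sorted(set(kwargs) - set(supported))
            if unknown:
                raise ValueError("unsupported parameter(s): %s" % ", ".join(unknown))
            for key, value in kwargs.items():
                check = CHECKS.get(key)
                if check is not None and not check(value):
                    raise ValueError("invalid value for %s" % key)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@checkargs(EXAMPLE_INPUTS)
def add_task(**kwargs):
    """Stand-in for an exposed web method; returns the accepted names."""
    return sorted(kwargs)
```

Keeping the allowed names in one list next to the exposed methods means a new parameter has to be declared, and given a check, before any handler will accept it.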

@vkuznet
Contributor

vkuznet commented Nov 12, 2010

valya: So, here is a prototype of das_analytics_cli, where I print out the requested URL rather than making the actual call

{{{
Usage: das_analytics_cli.py [options]

Options:
  -h, --help           show this help message and exit
  -v, --verbose        verbose output
  --name=NAME          specify task name
  --class=KLASS        specify task class name, e.g. ValueHotspot, see DAS
                       analytics docs
  --query=QUERY        specify DAS query, site.name=T1_CH_CERN
  --key=KEY            specify DAS key, e.g. site.name
  --interval=INTERVAL  specify task interval
  --list               list existing tasks on server
  --results            list existing results on server
  --result=RESULT_ID   show task result for given task ID
  --remove=REMOVE_ID   remove given task ID
  --host=HOST          specify host name of DAS analytics server, default
                       http://localhost:8213

python tools/das_analytics_cli.py --list
Requested URL http://localhost:8213/analytics/schedule?

python tools/das_analytics_cli.py --results
Requested URL http://localhost:8213/analytics/results?

python tools/das_analytics_cli.py --name="SiteDB task" --class="ValueHotspot"
Requested URL http://localhost:8213/analytics/add_task?classname=ValueHotspot&interval=60&name=SiteDB+task&kwargs=%7B%7D

python tools/das_analytics_cli.py --name="SiteDB task" --class="ValueHotspot" --key=site.name
Requested URL http://localhost:8213/analytics/add_task?classname=ValueHotspot&interval=60&name=SiteDB+task&kwargs=%7B%22key%22%3A+%22site.name%22%7D
}}}

The kwargs parameter passed to add_task is produced by json.dumps(), so that your add_task can load it with json.loads().
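
As a rough illustration of that round trip, the sketch below rebuilds the add_task URL from the prototype output and decodes the kwargs again on the receiving side. It is written for Python 3 and is not the code in tools/das_analytics_cli.py. The function names are hypothetical, and the default interval of 60 is simply the value seen in the URLs above.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def add_task_url(host, name, classname, interval=60, **task_kwargs):
    """Build the add_task URL; task options travel as a JSON string."""
    params = [
        ("classname", classname),
        ("interval", interval),
        ("name", name),
        ("kwargs", json.dumps(task_kwargs)),
    ]
    return "%s/analytics/add_task?%s" % (host, urlencode(params))


def decode_kwargs(url):
    """What the server side would do: read kwargs back with json.loads."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query["kwargs"][0])


url = add_task_url("http://localhost:8213", "SiteDB task", "ValueHotspot", key="site.name")
print("Requested URL", url)
print("Decoded kwargs", decode_kwargs(url))
```

Sending the task options as one JSON-encoded field keeps the URL parameters fixed while each task class remains free to define its own options.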

@vkuznet
Contributor

vkuznet commented Nov 16, 2010

valya: (In 522ae06) Update on analytics server; analytics cli; analytics help; Add ability to learn data-service output keys. fixes #424

From: Valentin Kuznetsov vkuznet@gmail.com

This issue was closed.