Where does Shogun suck in benchmarks? #4097

Open
karlnapf opened this issue Jan 21, 2018 · 14 comments
@karlnapf
Member

This task is to find a systematic way to identify the cases in which a library in the automated benchmarking system performs badly:

  • very slow (ranked last, or far from median)
  • very memory intensive
  • fails
  • the focus here is (obviously) on Shogun, but a script that does this should be general (and contributed back to the benchmarking project)

Contact @zoq and @rcurtin, who have ideas on how to do that.

(18:15:51) rcurtin: hmm, you could do this by postprocessing the SQL results
(18:15:57) rcurtin: I think it would be a fairly easy query too
(18:16:21) Heiko: could I make an entrance task for that and send people to you?
(18:16:29) rcurtin: the system can run and output results into an SQL database or into SQLite too, so it's easy to just grab a database and go to town with the sqlite3 db
(18:16:49) rcurtin: that would be fine, but definitely we should talk with Marcus and see what he thinks the interface for that should be
(18:17:07) rcurtin: like right now we kind of have a command-line interface to actually run the benchmarks, and some HTML/JS to view the results
(18:17:18) rcurtin: but this seems like something separate, a simple script to analyze results
(18:17:39) rcurtin: my initial inclination is that maybe this would fit well into some kind of new results/scripts/ directory, which can contain simple scripts to work with the results
(18:18:51) rcurtin: but maybe Marcus has a better idea
(18:19:09) rcurtin: and how to test the code to be submitted will be a little bit of an issue, we might need to set up some infrastructure for that
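
A minimal sketch of the kind of post-processing script described above, assuming a local SQLite dump of the results database; the table and column names follow the query p16i posts further down in this thread and may need adjusting:

import sqlite3

DB_PATH = "benchmark.db"  # assumed path to a local SQLite dump of the results

# Runs that did not produce a timing appear to be stored with a non-positive
# time (see the query and JSON further down); list them per method and dataset
# for Shogun. ("libary_id" is the column name as used in that query.)
QUERY = """
SELECT methods.name, datasets.name, results.time
FROM results
JOIN libraries ON libraries.id = results.libary_id
JOIN methods   ON methods.id   = results.method_id
JOIN datasets  ON datasets.id  = results.dataset_id
WHERE libraries.name = 'shogun' AND results.time <= 0
ORDER BY methods.name, datasets.name;
"""

with sqlite3.connect(DB_PATH) as conn:
    for method, dataset, time in conn.execute(QUERY):
        print(f"shogun failed: {method} on {dataset} (time = {time})")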

@karlnapf
Member Author

#4046

@rcurtin

rcurtin commented Jan 21, 2018

I would say a good place to start is to check out the benchmarking system, run it on a small dataset, and then produce an SQL query on the results that gives the desired information. After that we can figure out the right way to put this together into something that can be merged.
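
As a rough illustration of the kind of query this could be (only a sketch; the schema is assumed to match the query p16i posts later in this thread), one could flag every method/dataset pair where Shogun is the slowest library:

import sqlite3

DB_PATH = "benchmark.db"  # assumed local SQLite dump of the results

# Report runs where Shogun's time is at least as large as every other
# library's time for the same method and dataset, i.e. Shogun is ranked last.
# (Restricting to a single build_id would be needed in practice; omitted here.)
QUERY = """
SELECT methods.name, datasets.name, r.time
FROM results r
JOIN libraries ON libraries.id = r.libary_id
JOIN methods   ON methods.id   = r.method_id
JOIN datasets  ON datasets.id  = r.dataset_id
WHERE libraries.name = 'shogun'
  AND r.time > 0
  AND r.time >= (SELECT MAX(o.time) FROM results o
                 WHERE o.method_id = r.method_id
                   AND o.dataset_id = r.dataset_id
                   AND o.time > 0)
ORDER BY methods.name, datasets.name;
"""

with sqlite3.connect(DB_PATH) as conn:
    for method, dataset, time in conn.execute(QUERY):
        print(f"shogun ranked last: {method} on {dataset} ({time}s)")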

@zoq
Contributor

zoq commented Jan 21, 2018

If somebody would like to work on the latest results, we can also provide an SQL dump.

@karlnapf
Member Author

(18:34:46) zoq: Agreed, a simple preprocessing script sounds like a good idea; in the end, it's just a simple SQL query.
(18:34:58) zoq: We could provide some simple scripts apart from the HTML/JS interface for the most common tasks.
(18:35:12) zoq: I guess you could also write a simple Jupyter notebook? I think you could provide some input on that?
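
For the notebook idea, a rough sketch of what a first cell could look like, assuming the SQL dump has been imported into a local SQLite file (pandas and the file name are assumptions, not existing benchmark infrastructure):

import sqlite3
import pandas as pd

with sqlite3.connect("benchmark.db") as conn:  # assumed local SQLite dump
    df = pd.read_sql_query(
        """
        SELECT libraries.name AS lib, methods.name AS method,
               datasets.name AS dataset, results.time AS time
        FROM results
        JOIN libraries ON libraries.id = results.libary_id
        JOIN methods   ON methods.id   = results.method_id
        JOIN datasets  ON datasets.id  = results.dataset_id
        """,
        conn,
    )

# Drop failed runs (non-positive times) and measure how far each library is
# from the median runtime of its (method, dataset) group.
df = df[df["time"] > 0].copy()
df["median_time"] = df.groupby(["method", "dataset"])["time"].transform("median")
df["slowdown_vs_median"] = df["time"] / df["median_time"]
print(df[df["lib"] == "shogun"]
      .sort_values("slowdown_vs_median", ascending=False)
      .head(20))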

@p16i
Contributor

p16i commented Feb 7, 2019

Hi,

I've figured out a way to get the results through mysql_wrapper.php.

curl -X POST \
  http://www.mlpack.org/php/mysql_wrapper.php \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Postman-Token: 175a469b-31bd-4921-bd23-170d8b9e171a' \
  -d 'request=<QUERY>'

And <QUERY> is

SELECT
    *
FROM (
    SELECT
        libraries.name as lib,
        methods.name as name,
        datasets.name as dataset,
        results.time as time,
        results.var as var,
        libraries.id,
        datasets.id as did,
        libraries.id as lid,
        results.build_id as bid,
        datasets.instances as di,
        datasets.attributes as da,
        datasets.size as ds
    FROM results, datasets, methods, libraries
    WHERE
        results.dataset_id = datasets.id AND
        results.method_id = methods.id AND
        methods.parameters = '' AND
        libraries.id = results.libary_id AND
        libraries.name = 'shogun' AND
        results.time <= 0 # this can be removed if we want to get all results.
    ORDER BY bid DESC
) tmp;

Returned JSON:

[
    {
        "lib": "shogun",
        "name": "LinearRegression",
        "dataset": "diabetes",
        "time": "0",
        "var": "0",
        "id": "6",
        "did": "8",
        "lid": "6",
        "bid": "33",
        "di": "442",
        "da": "10",
        "ds": "0"
    },
    {
        "lib": "shogun",
        "name": "LinearRegression",
        "dataset": "cosExp",
        "time": "-2",
        "var": "0",
        "id": "6",
        "did": "9",
        "lid": "6",
        "bid": "33",
        "di": "200",
        "da": "800",
        "ds": "1"
    },
...
]
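
The same call can be scripted rather than issued with curl; a minimal sketch using the Python requests library (the endpoint and the request parameter are as in the curl command above, the rest is just an assumed example of processing the response):

import requests

URL = "http://www.mlpack.org/php/mysql_wrapper.php"

QUERY = """
SELECT libraries.name AS lib, methods.name AS method,
       datasets.name AS dataset, results.time AS time
FROM results, datasets, methods, libraries
WHERE results.dataset_id = datasets.id
  AND results.method_id = methods.id
  AND libraries.id = results.libary_id
  AND libraries.name = 'shogun'
  AND results.time <= 0;
"""

response = requests.post(URL, data={"request": QUERY})
response.raise_for_status()
for row in response.json():
    print(f"{row['lib']}: {row['method']} on {row['dataset']} (time = {row['time']})")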

@zoq
Contributor

zoq commented Feb 7, 2019

Nice solution, thanks for the input.

@p16i
Contributor

p16i commented Feb 7, 2019

hi @zoq,

I'm a bit concerned about the API though. With this interface, one might directly send malicious queries to the DB.

I personally think we should find a better way to extract the results, as well as making the API more restrictive.

One way might be to generate the benchmark results offline and write them out as JSON files. The frontend would then just read those JSON files.
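
A rough sketch of what that offline export could look like (the file names and schema are assumptions; a real export would have to match whatever the frontend expects):

import json
import sqlite3

# Run the query once, offline, and write a static JSON file that the
# frontend can serve without ever touching the database directly.
with sqlite3.connect("benchmark.db") as conn:
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        """
        SELECT libraries.name AS lib, methods.name AS method,
               datasets.name AS dataset, results.time AS time
        FROM results
        JOIN libraries ON libraries.id = results.libary_id
        JOIN methods   ON methods.id   = results.method_id
        JOIN datasets  ON datasets.id  = results.dataset_id
        """
    ).fetchall()

with open("results.json", "w") as f:
    json.dump([dict(r) for r in rows], f, indent=2)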

@zoq
Contributor

zoq commented Feb 8, 2019

Actually, I was surprised you could send a query; only a specific IP should be able to do that. Do you think that would be sufficient?

@p16i
Contributor

p16i commented Feb 8, 2019

I was surprised too!

Can you elaborate a bit on your approach? If I understood correctly, you would allow only certain IPs to send the POST request to the server. If that's the case, the benchmark page will only be functional for those IPs, right?

@zoq
Contributor

zoq commented Feb 8, 2019

Right, only the build/web server would be able to send the POST. I think this is already the case, but since a user is able to run the PHP script through the web server, the IP check passes, so I guess we could adapt the script.

@p16i
Contributor

p16i commented Feb 12, 2019

Hi @zoq & @karlnapf,

Given the command I provided previously and current coverage in #4046 (comment), I think we basically have everything we need for this issue.

What else should we do for this issue?

@karlnapf
Member Author

But some Shogun algorithms are not covered by the benchmarks, so how can we know how they perform?

@zoq
Contributor

zoq commented Feb 13, 2019

It would be interesting to see in which cases Shogun is leading and in which it is behind. The second case could be a good starting point for further analysis.
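
A sketch of that comparison, along the lines of the pandas snippet earlier in the thread (again, the file name and schema are assumptions):

import sqlite3
import pandas as pd

with sqlite3.connect("benchmark.db") as conn:  # assumed local SQLite dump
    df = pd.read_sql_query(
        """
        SELECT libraries.name AS lib, methods.name AS method,
               datasets.name AS dataset, results.time AS time
        FROM results
        JOIN libraries ON libraries.id = results.libary_id
        JOIN methods   ON methods.id   = results.method_id
        JOIN datasets  ON datasets.id  = results.dataset_id
        WHERE results.time > 0
        """,
        conn,
    )

# Fastest library per (method, dataset), then compare Shogun against it.
best = df.loc[df.groupby(["method", "dataset"])["time"].idxmin(),
              ["method", "dataset", "lib", "time"]]
shogun = df[df["lib"] == "shogun"].merge(
    best, on=["method", "dataset"], suffixes=("", "_best"))

print("cases where shogun leads:", (shogun["lib_best"] == "shogun").sum())
behind = shogun[shogun["lib_best"] != "shogun"].copy()
behind["slowdown"] = behind["time"] / behind["time_best"]
print(behind.sort_values("slowdown", ascending=False).head(20))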

@p16i
Contributor

p16i commented Feb 13, 2019

I've also created an issue at mlpack's benchmarks (mlpack/benchmarks#133). It seems Shogun has a DTC algorithm, but the benchmark report doesn't have the result.

@zoq do you have any idea why that happens?
