advanced querying #35

Closed
howardchung opened this issue Sep 30, 2014 · 13 comments

@howardchung
Member

Allow users to sort/filter their matches.

Filter matches by significant game modes only.

Detect no stats recorded matches and disregard.

@howardchung
Member Author

Sorting implemented client side, but I think we need to determine filtering on the server side and pass back only matches that are significant.

@howardchung howardchung added this to the v1.0 milestone Nov 20, 2014
@howardchung
Member Author

We can achieve this by implementing a server-side API that our frontend queries to get back filters/sorts.

@howardchung
Member Author

Our "frontend" can be integration of this API with datatable's support for server-side operations.
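
A minimal sketch of how such an endpoint could answer a DataTables server-side request, assuming the DataTables 1.10 parameter names (`draw`, `start`, `length` in the request; `draw`, `recordsTotal`, `recordsFiltered`, `data` in the response). The function name `handleTableRequest`, the `filterFn` argument, the page-size cap, and the match fields are hypothetical and not taken from the YASP code.

```javascript
// Hypothetical helper: answers one DataTables server-side request from an
// in-memory array of matches. An illustration, not YASP's implementation.
function handleTableRequest(params, allMatches, filterFn) {
  // DataTables sends these as strings; echo `draw` back as a number.
  const draw = parseInt(params.draw, 10) || 0;
  const start = Math.max(parseInt(params.start, 10) || 0, 0);
  // Clamp the page size to 1..100 (the cap of 100 is an assumption).
  const length = Math.min(Math.max(parseInt(params.length, 10) || 10, 1), 100);

  const filtered = filterFn ? allMatches.filter(filterFn) : allMatches;

  return {
    draw: draw,
    recordsTotal: allMatches.length,    // rows before filtering
    recordsFiltered: filtered.length,   // rows after filtering
    data: filtered.slice(start, start + length) // only the requested page
  };
}
```

Only the requested page of matches is sent back, so the client never receives the full result set.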

@howardchung
Member Author

Implement custom/advanced querying

@howardchung howardchung removed this from the v1.0 milestone Jan 13, 2015
@howardchung howardchung assigned howardchung and unassigned albertcui Jan 15, 2015
@howardchung howardchung modified the milestones: 1.1, 2.0 Jan 17, 2015
@howardchung howardchung changed the title from "improved match filtering/sorting" to "match filtering/sorting, advanced querying, player match tables use api" Feb 3, 2015
@howardchung howardchung changed the title from "match filtering/sorting, advanced querying, player match tables use api" to "match filtering/sorting, advanced querying" Feb 5, 2015
@howardchung howardchung assigned albertcui and unassigned howardchung Feb 26, 2015
@howardchung
Member Author

My opinion of this is that it's not going to be used a lot, and will require a lot of indexes and work to build an interface. Albert still wants to give it a shot, so reassigning.

@howardchung
Member Author

There are two parts to this:

  1. Getting the data. We can either construct some complicated MongoDB query for this, or use something like Elasticsearch, which Albert wants to play with.

  2. Building a workable UI for users to construct queries.

@howardchung
Member Author

Advanced querying also includes aggregating the results for these kinds of queries.

There are two types of queries: ones that are predefined by YASP, where we build something with the data (like a histogram, or Nick's ward map), and ones that users build using some custom UI.

A key difference between automatic and manual queries is that automatic queries can occur on the server side prior to rendering data back to the user. Custom queries require hitting an API endpoint and sending back results.

Examples of automatic queries (we already build histograms of match duration and GPM):
select user, GPM: get back an array of GPMs and build a histogram with it. Could also report numerical summaries.
select user, match duration: same as above.
select user, runes: get back an array of hashes of counts by rune type.
select user, kills/deaths/assists: get back an array of hashes of counts for each.
consumables: average/cumulative
item timings: for each item, per hero?
HD/TD/HH (histograms)
Chat (GGs called/messages, swearing/profanity analysis)
Grouping of heroes played (by Valve groupings/primary attribute)
@nickhh's idea: select user, obs; get back an array of hashes of x, y, value; iterate through all of them, sum the totals, and build a heatmap.
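
The ward heatmap idea could be sketched roughly as follows. This assumes each match carries an `obs` array of `{x, y, value}` hashes as described above; the function name `sumWardPositions` and the exact shape of a match object are hypothetical.

```javascript
// Hypothetical sketch: sum observer ward placements across matches so the
// totals can be fed to a heatmap renderer.
// Assumption: match.obs is an array of { x, y, value } objects.
function sumWardPositions(matches) {
  const totals = {};
  for (const match of matches) {
    for (const ward of match.obs || []) {
      const key = ward.x + ',' + ward.y;
      totals[key] = (totals[key] || 0) + ward.value;
    }
  }
  // Convert the keyed totals back into { x, y, value } points.
  return Object.keys(totals).map(function (key) {
    const parts = key.split(',');
    return { x: Number(parts[0]), y: Number(parts[1]), value: totals[key] };
  });
}
```

Matches without ward data are skipped, so they add nothing to the totals.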

Examples of custom queries:
User selects user, spectre, radiance, get back array of radiance timings.

An advanced query should return two things:
the full array of values (with this we can display a table, or build a histogram)
the min/max/avg/sum (with this we can display a numerical summary of that data set)
Optionally, we also return some metric such as win/loss so we can get a winrate for each particular query.
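
As a rough illustration of that result shape, the sketch below returns the full array of values, a numerical summary, and a winrate, plus a helper that bins values for a histogram. The names `summarize`, `histogram`, and `buildResult`, and the `win` flag on each row, are assumptions made for the example.

```javascript
// Numerical summary (min/max/avg/sum) of an array of numbers.
function summarize(values) {
  if (values.length === 0) {
    return { n: 0, min: null, max: null, sum: 0, avg: null };
  }
  let min = values[0];
  let max = values[0];
  let sum = 0;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
    sum += v;
  }
  return { n: values.length, min: min, max: max, sum: sum, avg: sum / values.length };
}

// Count values per fixed-width bin, keyed by the lower edge of the bin.
function histogram(values, binWidth) {
  const bins = {};
  for (const v of values) {
    const binStart = Math.floor(v / binWidth) * binWidth;
    bins[binStart] = (bins[binStart] || 0) + 1;
  }
  return bins;
}

// Assumption: each row looks like { <field>: number, win: boolean }.
function buildResult(rows, field) {
  const values = rows.map(function (row) { return row[field]; });
  const wins = rows.filter(function (row) { return row.win; }).length;
  return {
    values: values,
    summary: summarize(values),
    winrate: rows.length > 0 ? wins / rows.length : null
  };
}
```

For example, `buildResult(rows, 'gpm')` would give the array of GPMs for a table or histogram, and `histogram(result.values, 100)` would bin them in steps of 100.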

Part 1 is building the functions to return these desired results for some given query.
Part 2 involves building a UI to allow users to build custom queries, then use the same functions for data.

Sorting: As long as the result set stays under 16 MB (MongoDB's maximum BSON document size), we can sort/filter on the JS side, which is better since it doesn't require an index.
Filtering: detect no stats recorded (algorithmically), significant game modes only
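
A small sketch of doing the filter and sort on the JS side once the result set has been fetched. Which game modes count as "significant" is not specified in this thread, so the list is passed in as a parameter; the function names and the `game_mode` and `duration` field names are assumptions.

```javascript
// Keep only matches whose game mode is in the caller-supplied list.
// Assumption: the list of significant mode ids is decided elsewhere.
function filterSignificant(matches, significantModes) {
  return matches.filter(function (match) {
    return significantModes.indexOf(match.game_mode) !== -1;
  });
}

// Sort a copy of the matches by a numeric field, without needing a DB index.
function sortMatches(matches, field, descending) {
  const sign = descending ? -1 : 1;
  return matches.slice().sort(function (a, b) {
    return sign * (a[field] - b[field]);
  });
}
```

`sortMatches` works on a copy, so the original array keeps its order for other views.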

@howardchung howardchung changed the title from "match filtering/sorting, advanced querying" to "advanced querying" Mar 4, 2015
@howardchung howardchung assigned howardchung and unassigned albertcui Mar 4, 2015
@howardchung
Member Author

Taking a stab at this :)

@howardchung
Member Author

so I think this is actually two separate things:

  1. An advanced querying function that accepts conditions/return values, and filters an array of matches accordingly. It can also compute a winrate if a user was defined.

  2. An aggregator function that accepts an array of matches, and returns aggregated data across specified fields.
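
The two functions could look roughly like this. It is only a sketch: the names `advQuery` and `aggregate`, equality-only conditions, and the `players` array with `account_id` and `win` fields are all assumptions, not the actual YASP code.

```javascript
// 1. Filter matches by simple equality conditions, e.g. { game_mode: 1 }.
//    If accountId is given, also compute that player's winrate.
//    Assumption: match.players is an array of { account_id, win }.
function advQuery(matches, conditions, accountId) {
  const filtered = matches.filter(function (match) {
    return Object.keys(conditions).every(function (field) {
      return match[field] === conditions[field];
    });
  });
  const result = { matches: filtered };
  if (accountId !== undefined) {
    let games = 0;
    let wins = 0;
    filtered.forEach(function (match) {
      const player = (match.players || []).find(function (p) {
        return p.account_id === accountId;
      });
      if (player) {
        games += 1;
        if (player.win) wins += 1;
      }
    });
    result.games = games;
    result.wins = wins;
    result.winrate = games > 0 ? wins / games : null;
  }
  return result;
}

// 2. Aggregate numeric fields across an array of matches.
function aggregate(matches, fields) {
  const agg = {};
  fields.forEach(function (field) {
    const values = matches
      .map(function (match) { return match[field]; })
      .filter(function (v) { return typeof v === 'number'; });
    const sum = values.reduce(function (a, b) { return a + b; }, 0);
    agg[field] = {
      n: values.length,
      sum: sum,
      min: values.length ? Math.min.apply(null, values) : null,
      max: values.length ? Math.max.apply(null, values) : null,
      avg: values.length ? sum / values.length : null
    };
  });
  return agg;
}
```

Keeping the two separate means the aggregator can run on any array of matches, including the output of `advQuery`.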

@howardchung
Member Author

I think I've done pretty much all I want to do with this, by adding a hero filter on the player trends page. The result is basically aggregation by player, with an additional optional filter by hero. If we add too many more filters, the dataset will be too small to draw any meaningful conclusion.

If we want to provide some kind of aggregation API that works across players, I think we'll need to build a second API endpoint. We can't use the current /api/matches endpoint, as that has to be limited heavily (in terms of n) in order to prevent abuse, since it can return full matches (which are 200 KB each).

@howardchung howardchung assigned albertcui and unassigned howardchung Mar 8, 2015
@howardchung howardchung modified the milestones: 3.0, 2.0 Mar 21, 2015
@howardchung howardchung assigned howardchung and unassigned albertcui Mar 21, 2015
@howardchung
Member Author

guess I will give this another shot. Current plan of attack:
Enhance the /api/matches endpoint:
Restrict projection per match to a very small set (to prevent bandwidth abuse)
Set a limit on how many matches we will aggregate (max result set size)
Set a limit on results returned at once to prevent bandwidth abuse
Do aggregation on the results (Since aggData is relatively small)?
Or we could just return winrate, count by iterating through the result set
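
A sketch of how those limits might be enforced, plus the simpler winrate/count alternative. The field whitelist, the two caps, the function names, and the `player_win` flag are all placeholder assumptions chosen for the example.

```javascript
// Hypothetical limits; real values would need tuning.
const ALLOWED_FIELDS = ['match_id', 'start_time', 'duration', 'game_mode'];
const MAX_AGGREGATE = 1000; // max matches aggregated per query
const MAX_PAGE = 50;        // max matches returned per response

// Turn an untrusted request into safe query options.
function buildOptions(request) {
  const projection = {};
  (request.fields || []).forEach(function (field) {
    // Silently drop anything outside the whitelist (e.g. full player data).
    if (ALLOWED_FIELDS.indexOf(field) !== -1) projection[field] = 1;
  });
  if (Object.keys(projection).length === 0) projection.match_id = 1;

  const requested = parseInt(request.limit, 10) || MAX_PAGE;
  const limit = Math.min(Math.max(requested, 1), MAX_PAGE);

  return { projection: projection, aggregateLimit: MAX_AGGREGATE, limit: limit };
}

// Alternative to full aggregation: just count and winrate.
// Assumption: each row has a boolean `player_win`.
function countAndWinrate(resultSet) {
  let wins = 0;
  for (const row of resultSet) {
    if (row.player_win) wins += 1;
  }
  const count = resultSet.length;
  return { count: count, wins: wins, winrate: count > 0 ? wins / count : null };
}
```

The `projection` object uses the `{ field: 1 }` inclusion form that MongoDB projections accept, so it could be passed through to the query as-is.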

Player matches page can use this endpoint to populate matches?

Possibly trends can be adapted to use this as well.

The problem with using datatables to interface with the api is that the query string suddenly gets a lot harder to build. Maybe we can add jQuery helpers to construct the query and use it for both matches tab (lists matches, reports winrate fitting the advanced query conditions) and trends tabs (aggregates data fitting the advanced query conditions)
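
One possible shape for such a helper, sketched with the standard `URLSearchParams` API in place of jQuery so it runs anywhere. The function name `buildQuery` and the parameter names (`hero_id`, `limit`) are hypothetical.

```javascript
// Build an API URL from a plain object of query conditions.
// Empty, null, or undefined conditions are left out of the query string.
function buildQuery(base, conditions) {
  const params = new URLSearchParams();
  Object.keys(conditions).forEach(function (key) {
    const value = conditions[key];
    if (value === undefined || value === null || value === '') return;
    params.append(key, String(value));
  });
  const qs = params.toString();
  return qs ? base + '?' + qs : base;
}
```

Both the matches tab and the trends tab could call the same helper with the same conditions object, so the two views always query with identical filters.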

@howardchung
Member Author

I'm happy with the basic implementation we have now. We can add more filters incrementally if users request them.

Over to Albert for styling.

@howardchung howardchung assigned albertcui and unassigned howardchung Mar 30, 2015
@albertcui albertcui mentioned this issue Apr 1, 2015
@howardchung
Member Author

framework is there for additional options to be added.
