Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional statistical functions like median #1660

Open
camsaul opened this issue Dec 9, 2015 · 22 comments

Comments

@camsaul
Copy link
Member

commented Dec 9, 2015

median would be extremely useful for financial calculations and the like. We have other fancy aggregations like Stanard Deviation already... This would be fairly easy to implement, at least for Oracle. We can just make it a feature so we don't offer it for DBs that don't support it

⬇️ Please click the 👍 reaction instead of leaving a +1 or 👍 comment

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Dec 15, 2015

could also add min + max

@phalgun

This comment has been minimized.

Copy link

commented Dec 29, 2015

👍

@MartinArens

This comment has been minimized.

Copy link

commented Jan 21, 2016

+1

1 similar comment
@adamthehutt

This comment has been minimized.

Copy link

commented Feb 7, 2016

+1

@agilliland

This comment has been minimized.

Copy link
Contributor

commented Feb 24, 2016

+1 .. just came across a couple scenarios where max() would be really useful

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Feb 24, 2016

@agilliland Can't you just order by that field and limit it to 1

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Feb 24, 2016

But anyways as @salsakran would say, +1000

@agilliland

This comment has been minimized.

Copy link
Contributor

commented Feb 24, 2016

@camsaul yeah, if you are just trying to find a single value that definitely works. if you apply a grouping then you need the aggregation function though. e.g. if i want the most expensive order fulfilled each day

@tlrobinson

This comment has been minimized.

Copy link
Member

commented Feb 24, 2016

In particular with the new multiseries stuff you could overlay min/max/avg (though it would be nice to be able to pick your own colors)

@camsaul camsaul self-assigned this Feb 26, 2016

@camsaul camsaul added this to the 0.16.0 milestone Feb 26, 2016

@salsakran

This comment has been minimized.

Copy link
Contributor

commented Mar 2, 2016

@agilliland + @tlrobinson will this need any front end design?

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Mar 2, 2016

It's just an aggregation type so I was thinking just stick them in the advanced options with cumulative sum

@tlrobinson

This comment has been minimized.

Copy link
Member

commented Mar 2, 2016

@salsakran Not if we just want them to show up just like the existing aggregations.

Here:

var Aggregators = [{

case "cum_sum": return ["Cumulative sum of ", Query.getFieldName(tableMetadata, aggregation[1], options)];

(We should actually consolidate those so there's just one place we need to change)

Also at some point when we switched the aggregation widget we lost the "advanced" distinction. We might want to add that back in if we add a lot more options.

camsaul added a commit that referenced this issue Mar 16, 2016

@camsaul camsaul changed the title Add additional statistical functions like median Add additional statistical functions like min / max Mar 17, 2016

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Mar 17, 2016

So unfortunately it looks like Oracle is the only DB that we [potentially will] support that ships median aggregations out of the box. We can revisit exposing that if/when we ship our Oracle driver.

Luckily however min and max are supported by everybody and are enabled in PR #2171

@camsaul camsaul changed the title Add additional statistical functions like min / max Add additional statistical functions like min, max, median Mar 17, 2016

@camsaul camsaul changed the title Add additional statistical functions like min, max, median Add additional statistical functions like min / max Mar 17, 2016

camsaul added a commit that referenced this issue Mar 17, 2016

camsaul added a commit that referenced this issue Mar 17, 2016

camsaul added a commit that referenced this issue Mar 17, 2016

@camsaul camsaul closed this Mar 17, 2016

@grzaks

This comment has been minimized.

Copy link

commented Feb 12, 2017

@camsaul it's possible to calculate median in PostgreSQL like this:
select percentile_cont(.50) within group (order by value) as value_median from table

Would you consider adding median, and percentile to aggregation types?

@doodlebro

This comment has been minimized.

Copy link

commented Jun 15, 2017

How about support for median in redshift? Has been around awhile.

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Jun 15, 2017

Hi @grzaks / @doodlebro, if you have a sense of what needs to be done to support this and some basic familiarity with Clojure, we'd love to have you as a contributor and would welcome a PR to support this 👍

@camsaul camsaul removed this from the 0.16.0 milestone Jun 15, 2017

@camsaul

This comment has been minimized.

Copy link
Member Author

commented Jun 15, 2017

Reopening this since there's a few databases that support things like median so we can consider enabling it for those databases at some point in the future

@camsaul camsaul reopened this Jun 15, 2017

@camsaul camsaul changed the title Add additional statistical functions like min / max Add additional statistical functions like median Jun 15, 2017

@camsaul camsaul removed their assignment Jul 17, 2017

@yejianye

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2018

I implemented median for Postgres, Redshift and SQL Server. Here is the PR
#7234

@philip-n

This comment has been minimized.

Copy link

commented Apr 20, 2018

+1

I like to use min/max/median/avg to get an initial "feel" for data distribution, allowing to write more specific queries later on. Thanks to @yejianye for providing a pull request, I'd be happy if median functionality could be integrated (for the supporting databases)!

@eladeyal-intel

This comment has been minimized.

Copy link

commented Apr 30, 2018

median is super useful

@mazameli

This comment has been minimized.

Copy link
Contributor

commented Dec 12, 2018

These would make a ton of sense to add to our custom aggregation expression editor, e.g. let me type Median(Cost).

@soltanianalytics

This comment has been minimized.

Copy link

commented Aug 6, 2019

I'd like to point out that this is still a super relevant issue... Median ist a basic function that would be super useful in many applications. Otherwise, Metabase has pretty much all that's required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.