-
-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TIMESTAMP_TRUNCATED_TO_MONTH & TIMESTAMP_TRUNCATED_TO_YEAR #20
Comments
Here is an example of truncating weekly for a chart query:
monthly looks like: |
We demonstrate in the README how we added "truncated-to-hour": https://github.com/graphile/pg-aggregates#defining-your-own-grouping-derivatives and adding truncated to week/month/year is similar. Though now I come to look at it it assumes you already know the plugin system; could you try this and let me know if it works for you please? If so I'll update the docs with this more full example. const TIMESTAMP_OID = "1114";
const TIMESTAMPTZ_OID = "1184";
// Add this plugin to your `--append-plugins`/`appendPlugins` list:
const MyAggregateGroupSpecsPlugin = (builder) => {
builder.hook("build", (build) => {
const { pgSql: sql } = build;
// Add a "truncated to year" aggregrate group-by:
build.pgAggregateGroupBySpecs.push({
id: "truncated-to-year",
isSuitableType: (pgType) =>
pgType.id === TIMESTAMP_OID || pgType.id === TIMESTAMPTZ_OID,
sqlWrap: (sqlFrag) => sql.fragment`date_trunc('year', ${sqlFrag})`,
});
// Push more aggregate group-bys here if you want, e.g. week, month, etc.
return build;
});
};
// For CommonJS:
module.exports = MyAggregateGroupSpecsPlugin;
// Or for ESM:
// export default MyAggregateGroupSpecsPlugin; |
@benjie thanks so much for the quick reply. This project is still a prototype, so I've started with a simple rc file. for simplicity i've added this plugin to the rc file rather than importing it. Perhaps I am doing something wrong here as it doesn't seem to make any impact on the schema. Below is my entire .postgraphilerc file. I've added all the possible date truncations listed here for date_trunc. Yes I know how locking micro/milliseconds can be, but so can many other parts of this plugin, and I know how useless millennium and even century are but seemed like we might as well extend the entire PG API on this, rather than require adding them? Happy to help with a PR for that if we can get this working.... Am I gonna have to comment out year and hour? I did try that, but this plugin seems to do nothing, no matter what I try.
|
Stack notes, in case it helps:
|
You can’t add a plugin function via appendPlugins in the RC file; you have to pass a path to a file to be required (like you have for the other plugins). Make sure it’s an absolute path, e.g. use |
OIC that makes a lot of sense for CLI config file. Thanks for the hand holding through plugin development. here is my groupByPlugin.js file, it's top level (sibling files no subfolder) RN:
Here is my rc file:
|
If the plugin isn't found or causes an error, is something output? Because when I intentionally use an invalid path, everything runs without warnings. |
I tried this variation too, based on the source document:
|
I made this, I haven't tested it yet, but after I do, I'll submit a PR. There may be an opinionated reason not to accept it, but if that's the case, the docs do seem to be incorrect as they are... and tracking down append plugin errors (including fully missing files) currently seems fuzzy at best. I'm just trying to help make things better for everyone, not just my own client/employer. |
So it turns out you actually can pass the function directly! https://github.com/graphile/postgraphile/blob/e0eeb1a1f768f7c757d36a6fa91045add92f9169/src/postgraphile/cli.ts#L596 Sorry for the misinformation here, it's a long time since I've visited this code!
Should be!
Kinda indicates your config file is not actually being used? I wonder if you're editing the wrong file or running postgraphile in the wrong folder? Or maybe you're editing a
I almost never use I generally recommend that people using a
That one should definitely be throwing an error because there should be a conflict on the Also if you just add
There's a couple:
Hopefully we can make it sufficiently easy to add your own group by (maybe even with your code as an example in the README) that we don't need to add it to core. And of course you could always create a |
It's clearly using my url, and all the plugins I specified in .postgraphilerc.js so it's not missing the config file entirely. It's just not loading my plugin, or it doesn't work as expected. When I get the filename wrong, I get no errors, it just runs as if it's not even trying to load it, yet aggregates and my db connection work.
I guess that's my only option at this point. I don't need express/restify on all my services, so they usually start as minimal as possible. Rather than adding endpoints here, we have a primary rest API server... This is also still a prototype. That's my rationale: it still works and start simple. But I'm seeing, that now half of that is no longer true.
Yeah that's kinda what I was going for, just get any response.
First, I completely understand and respect your view. Second, I disagree. I feel like the goal of PostGraphile is to extend Postgres into GraphQL and modularity is key. However, plugins (the modular part) should be API complete, before being extendable. The purpose of a plugin for a library like this, doesn't seem to be making a plugin system for a plugin system. Although that's handy it doesn't seem like it should be required for typical usage.
yes, which is why i just extended the postgres api and if ppl want something else, they can extend the plugin as you suggest i do.
I agree, why did you add hour at all? If you had month, week, day and year (ideally quarter too) in there over hour & day, this thread would have never happened. Thanks again for your time sharing with me and swapping opinions. I'll forgo the rc file and report back. |
Try editing one of the things out of your config file that you know it's using, e.g. one of the plugins you rely on, and then restart PostGraphile and ensure that that functionality has disappeared. Because honestly, I think it's not using that particular config file with the bad path in it - if you can show me in the code how it's possible to not throw an error when the path is wrong I'd be very interested to see (and fix) it.
You're the only person who's requested this so far, so I'd be careful inferring what you're doing is necessarily typical usage 😉
I added the fields that I needed for my usecase at the time which was per-day and per-hour. I didn't need the others and I saw week and month as problematic (as previously mentioned) so I wanted to avoid the complexity of adding them and then having people complain they work the wrong way - instead add them yourself and have them work the way you design. I considered not even adding day/hour to core but figured it's necessary for feature discovery. People might not even need time-based groupings but groupings on some other metric altogether; adding more groupings "for completeness" makes it harder to discover the other values of the enum because they get lost in a longer list.
Always good to have me surface the reasoning behind my decisions 😉 |
this was exactly right, it has somehow cached the data in this file and isn't reading further changes.... I have no idea how to fix it other than not using an rc/config/ini file to optionally hold envs/flags, as many other cli tools do. For example, I was able to make 2 really long node scripts with all my flags (one for local, one for prod), as this wouldn't need to require any Node based RESTful libraries or any other bloat this project doesn't need. However, an absolute path inside a json file is cumbersome at best. I was hoping to see some sort of replacement for the RC file that didn't require being inserted into Express or Restify that I clearly don't need. From the sounds of it, there isn't, and middleware-only is where this tool is headed. Having to import & -configure express, just for one middleware, with 2 routes seems rather silly. When you need more it makes sense, but GraphQL and Microservices don't always have to be one giant schema (although that's all the rage). It also works well to have smaller focused GraphQL services too, like this example of an internal reporting server where the only auth is "are you on a whitelisted IP?" aka "are you in the office or on the VPN?" style private services.
This is issue number 20 for this repo I see. So, I would repeat back the same remark to you "so I'd be careful inferring" about your lack of useful data, at least in the context of this project 😉 Let me share more context for what I meant "typical aggregation/reporting usage in Postgres (or similar RDBs), based on my own firsthand experience" (like you, I only have the sample group I have been given). Many clients have asked me for reporting/charts with For reporting purposes, most users don't seem to care about the details of groupings too much or if daylight savings was considered in the equation (I've never once seen it used for aggregation calculation). In fact, many will say 13 * 28 = 364 is good enough, when each year is actually 364.25 days (the 1/4 day is added to February every 4th year because that's how we decided to mark time), and that's 1:4 of being under by a whole day and 3:4 chance of being "right" according to how we measure time. Asking someone for an exact average doesn't make sense, neither does getting too particular about time groupings. Even in your case with hour and day... 24 hours is not the length of every day. even if we claim it is exactly true on every day, some days have 23 and other have 25, at least in the US... this is a technicality and doesn't impact the data groupings unless we're comparing the same hour range, on every day of a year, even then it's only off twice for obvious reasons, and can be adjusted.
Yeah I think that's the main difference in our opinions. I share/default all my opinions with Postgres. If they made the date_trunc function work this way and we just extended it, how can we be wrong? We cannot be, only "incomplete" by someone else's opinion (which is what started this thread, and is clearly easy to say no to).
Yes developer UX is a concern here. I went overboard using all of them, kinda to make a point: there's a balance. I feel a handful of the most common (year, quarter, month, week, day... possibly even hour if used with paging) are better in a plugin like this, than "two I once needed for a project" (not that that's ever a bad reason to make it into open source).
You're a very talented and capable human. You don't rally have to justify any of your decisions to me. However, I'm glad to hear your rationale and possibly let it inform my future opinions. Hopefully this will be closed soon. Still working on converting this to a middleware project to I can include a local plugin. |
I was able to get this working:
....
Looks like this with 7 options in descending order, per date. Perhaps too much for some, but it works perfect for charting, and nothing more costly was added. If anyone else might need this... it's here. |
dang... ima leave this closed but that actually just listed queries that don't seem to work. they only work alone and not along with the "having" clauses =[ the way of extending this is a little more complex than is shown in the docs, and perhaps not everything was taken into account, my fork seems to work closer to what I'd expect |
Pretty sure it's reading a different file - there's no caching.
You don't have to use express/restify, you can just use the built in Node.js webserver which is exactly what the CLI uses and is also the second example code block on our library usage page. https://www.graphile.org/postgraphile/usage-library/ - zero bloat. In fact, because it
One of the main reasons that I built PostGraphile as a plugin-based system is so that users could customise it to their needs without having to affect all the other users of the library. This plugin gets 600 downloads per week and you're the only person that's requested this feature so far.
That's 7 options per timestamp field, per table though, right? This soon multiplies to making sizeable diffs to the
Excellent! I'd probably do something like: const isTimestamp = (pgType) =>
pgType.id === TIMESTAMP_OID || pgType.id === TIMESTAMPTZ_OID;
const tsTruncateSpec = (sql, interval) => ({
id: `truncated-to-${interval}`,
isSuitableType: isTimestamp,
sqlWrap: (sqlFrag) => sql.fragment`date_trunc(${sql.literal(interval)}, ${sqlFrag})`,
});
const SevenAggregateGroupSpecsPlugin = (builder) => {
builder.hook("build", (build) => {
const { pgSql: sql } = build;
build.pgAggregateGroupBySpecs = [
// Copy existing specs, but remove the ones we're replacing
...build.pgAggregateGroupBySpecs.filter(
spec => !["truncated-to-day", "truncated-to-hour"].includes(spec.id)
),
// Add our timestamp specs
tsTruncateSpec(sql, "year"),
tsTruncateSpec(sql, "month"),
tsTruncateSpec(sql, "week"),
tsTruncateSpec(sql, "day"),
tsTruncateSpec(sql, "hour"),
];
return build;
});
};
The group derivatives aren't added to Or maybe I've misinterpreted you and you mean there's an error? |
kinda, i just get empty results combining having with group by using the "extend the plugin method" like it wasn't adding the "having" part to the query, to which I confirmed with the explain mode on, that it was not. it's fine, my fork of the plugin works for me. I just need to keep track of upstream changes which should rarely conflict. I was most interested in simply paging data by year/week/month segments (sorta like you have above), but these filters need to work as well. here's the example query I got stuck on.
the above returns 4 chartable datasets, really 2 pairs sum/avg and min/max. they could be used like this for 2 views or we could add more field data, to make 4 views, with the same chart template. For more context, I'm using this with ChartsJS and Svelte ( and code generator) to make a simple/reusable internal reporting app, that we pretty much just need to format how the charts look and how many show on each page etc and swap from imports to read replica when it's ready.
How is 3 for each helpful now? my only point was hour and day are an incomplete list. if you're gonna complain about the list being too long, perhaps you don't actually need an aggregate plugin? if you need these, why would you want to extend a plugin that's already made to do aggregations. and further, as it is now, it seems more like schema bloat and attack surface than if it was API complete, because then even if it added more fields, they would have a purpose/use. why not either:
the code is done, it's basically just your code pasted from this thread. I have no emotional attachment to it, other than it makes my project work. because of that last reason, i'll try to keep my fork in sync, since its public, perhaps it helps someone else. I have nothing but respect for you. there was nothing personal or implied above. just some debate/discussion on how/why things are done. I understand your decisions/opinions and respect that you stand behind them. |
When you say "your fork" are you referring to this PR? https://github.com/graphile/pg-aggregates/pull/21/files That should be identical to using the plugin... Or do you have a different fork? |
yes, that's what i thought too but i kept getting empty arrays, only when combining with a having clause that returns results, from the same database with one folder using the extend-a-plugin method and the other using my fork worked fine. however, its possible I may be running into a different bug. If i can get my project working without a submodule from my fork, it makes my life easier. I'm working on getting this production ready and up on test server. If I can get it working, I will for sure update this thread. |
it actually looks more like a timeout issue (perhaps cache invalidation too)... i have some long running queries on a table with several million rows. they take 8-12ish seconds when I run my integration tests locally. I'm using a simple jest runner and moved the test timeouts to 30 seconds, but ideally in production they'll be 0-5 or 1-10, worst case. seems like occasionally the requests pass the 15 second barrier... perhaps something fails on a default timeout of 5 seconds with a 3x grace period or some such thing? my hope is that I won't have the same issues, using Linux instead of Windows. Sometimes a C++ or even js dependency isn't well tested on all platforms and causes odd behavior with certain combinations x64/x86, nodejs & dep version numbers even the occasional WSL2 bug/issue. This does indeed work, both ways will hit intermittent empties, and often when a query comes back empty it gets stuck that way until a service restart, even if removing the grouping part of the query, leaving the having part, still shows results. This is what I ended up using, which is 6 total, including the timestamp itself, 4 added: year, quarter, week, month & 1 removed: hour, 1 kept: day.
I left the other plugins to show context and compatibility but only the function marked |
here is another example, since you mentioned, this feature isn't used/discussed much.
|
Here is a fairly* sharable sample of what I was building, in case you still think this is a minority use case. I built a mock from PostGraphile and plugins, then randomize the data and change some small details.
https://svelte.dev/repl/96fff1ea92ec46d2b465a0830d958782?version=3.42.1 |
Thanks for clarification - I've taken the example plugin code from above and tested it, added comments, and incorporated it into the documentation here: #24 Regarding svelte.dev: nice example! 🙌 |
Feature description
Add more truncated timestamp group by options:
TIMESTAMP_TRUNCATED_TO_WEEK
TIMESTAMP_TRUNCATED_TO_MONTH
TIMESTAMP_TRUNCATED_TO_YEAR
Motivating example
Clearly missing from the list:
TIMESTAMP_TRUNCATED_TO_HOUR
TIMESTAMP_TRUNCATED_TO_DAY
This is required for my reporting project. Until this is in, I'll have to manually build queries for these.
This is an incredible tool to use for reporting services on an isolated reporting replica... it only makes sense to complete this list so that we can group our data in all the meaningful ways, using a terse built-in method that already exists.
Breaking changes
I don't see any reason this would cause breaking changes, simply extending current features.
Supporting development
I [tick all that apply]:
The text was updated successfully, but these errors were encountered: