Feature/spline 684 data retention 2#1113
…with current develop - TODO test
…b.com/Aditya-Sood/ecc07c9f296dbdf03d4946c5d1b4efce - naively tested with test data (multiple lineages at different times - purge with time between - correct outcome - older purged, newer kept)
% Conflicts: % admin/src/main/scala/za/co/absa/spline/arango/ArangoManager.scala % arangodb-foxx-services/src/main/routes/index.ts
import {aql, db} from '@arangodb'

export function pruneBefore(timestamp) {
Was there any specific reason why you preferred breaking one complex AQL query into a series of smaller queries?
Although such an approach is probably better from the ArangoDB memory perspective, I am worried about transferring all those intermediate results (IDs) from the AQL engine to V8 and back, which I'm sure do not share any memory. It could also result in a less optimal AQL execution plan, as the AQL optimizer cannot see into the Foxx function.
I would suggest trying to combine some of the queries into bigger blocks.
This is mainly because most of the code comes from the https://gist.github.com/Aditya-Sood/ecc07c9f296dbdf03d4946c5d1b4efce script, adapted only where our needs required it.
I thought about it some more. For example, the loop over the collections to purge in stage 2 could be done in AQL -- but in that case we would have to forgo the logging (I don't know of a way to write to the logs directly from AQL).
I would give it a try, as my logic tells me that it would be a more correct approach. Logging is not important here IMO.
Actually, I tend to think that logging here is crucial. For large prunings, the individual parts can take minutes or tens of minutes, and without logging one is blind to what is happening in the DB -- it could run for hours without any sign of the current state.
I would prefer to keep it as is for now.
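To illustrate the trade-off discussed above, here is a minimal sketch of the staged approach: each stage runs separately and reports its count and duration before the next one starts. The names `Stage` and `runStages` and the stubbed stages are hypothetical, not taken from the PR; in the real Foxx service each `run()` would execute one AQL query and the log callback would be the service's logger.

```typescript
// Hypothetical names throughout: Stage, runStages and the stub stages are
// illustrative only and are not part of the Spline code base.
type Stage = { name: string; run: () => number }
type Log = (msg: string) => void

// Runs the stages in order and logs the affected count and elapsed time
// after each one, so a long-running purge reports progress between queries.
function runStages(stages: Stage[], log: Log, now: () => number = Date.now): number {
    let total = 0
    for (const stage of stages) {
        const t0 = now()
        const affected = stage.run() // in a Foxx service: one AQL query per stage
        const t1 = now()
        log(`${stage.name}: ${affected} documents, ${t1 - t0} ms`)
        total += affected
    }
    return total
}

// Usage with stubbed stages and a fixed clock (assumptions for illustration).
const messages: string[] = []
const purged = runStages(
    [
        { name: 'Stage 1', run: () => 3 },
        { name: 'Stage 2', run: () => 12 },
    ],
    msg => messages.push(msg),
    () => 0
)
```

Merging the stages into a single AQL query would remove the round trips between the AQL engine and V8, but it would also remove the points at which a progress message can be emitted.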
…n-2' into feature/spline-684-data-retention-2
`).toArray()

const t1 = Date.now()
console.log(`Purged ${refPlanIds.length} plans, ${t1 - t0} ms`)
Let's use the Logger helper object instead of console for logging.
import * as Logger from '../utils/logger'
...
Logger.info(`Purged ${refPlanIds.length} plans, ${t1 - t0} ms`)
Thanks, redone as suggested. I have also test-run it with this logging and confirmed that the log entries show up correctly in the ArangoDB LOG UI.
val ZonedDateTimeRegexp(ldt, tzOffset, tzId) = s
val maybeTzIds = Seq(tzId, tzOffset).map(Option.apply)

require(!maybeTzIds.forall(_.isDefined), "Either timezone ID or offset should be specified, not both")
ZonedDateTime.parse allows both an offset and a zone name at the same time. I think it's a good idea to accept all inputs that are valid for ZonedDateTime.parse.
The issue with allowing only a geographical zone ID is that it is ambiguous: it does not always identify a unique instant.
2022-10-30T02:30:00[Europe/Prague] can mean either of:
2022-10-30T02:30+02:00[Europe/Prague]
2022-10-30T02:30+01:00[Europe/Prague]
This is the day and hour of the change from summer time to winter time:
val start = ZonedDateTime.parse("2022-10-30T02:30:00+02:00[Europe/Prague]")
val end = start.plusHours(1)
// start: java.time.ZonedDateTime = 2022-10-30T02:30+02:00[Europe/Prague]
// end: java.time.ZonedDateTime = 2022-10-30T02:30+01:00[Europe/Prague]
Another interesting behaviour (not so relevant for us though) is this:
val start = ZonedDateTime.parse("2022-10-30T02:30:00+02:00")
val end = start.plusHours(1)
// start: java.time.ZonedDateTime = 2022-10-30T02:30+02:00
// end: java.time.ZonedDateTime = 2022-10-30T03:30+02:00
When no geographical zone ID is provided, the ZonedDateTime stays at the same offset.
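The same ambiguity can be reproduced outside the JVM. The sketch below is written in TypeScript to match the Foxx service code rather than the Scala CLI, and uses the standard `Intl.DateTimeFormat` API; the helper name `pragueLocalTime` is hypothetical, and the runtime is assumed to ship time zone data for Europe/Prague. It shows two different UTC instants that map to the same wall-clock time in Prague on the day of the switch.

```typescript
// Formats an instant as wall-clock hour and minute in Europe/Prague.
// Assumes the runtime has ICU time zone data for Europe/Prague.
const pragueFormat = new Intl.DateTimeFormat('en-GB', {
    timeZone: 'Europe/Prague',
    hour: '2-digit',
    minute: '2-digit',
    hourCycle: 'h23',
})

// Hypothetical helper, for illustration only.
function pragueLocalTime(utcIso: string): string {
    return pragueFormat.format(new Date(utcIso))
}

// Two instants one hour apart, on either side of the switch to winter time.
const beforeSwitchUtc = '2022-10-30T00:30:00Z' // offset +02:00 in Prague
const afterSwitchUtc = '2022-10-30T01:30:00Z'  // offset +01:00 in Prague

const localBefore = pragueLocalTime(beforeSwitchUtc)
const localAfter = pragueLocalTime(afterSwitchUtc)

// Same wall-clock time, different instants: the zone ID alone cannot tell them apart.
const ambiguous = localBefore === localAfter
const instantsDifferMs = Date.parse(afterSwitchUtc) - Date.parse(beforeSwitchUtc)
```

Only an explicit offset distinguishes the two instants, which is why a local time with a zone ID but no offset is not enough during the overlap hour.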
We agreed to solve this as part of another ticket, #1139.








This PR brings back unmerged PR #762.
The original service wrapper has mostly been reused (and brought up to date with the latest develop). The internals of the Foxx service, where the DB prune actually happens, have been taken from https://gist.github.com/Aditya-Sood/ecc07c9f296dbdf03d4946c5d1b4efce (#684 (comment)).
Naively tested.
Based on measurements of the implementation options for Stage 1 and Stage 2, Option A was selected for both stages.