Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/api/timesheets response (TimesheetCollection) missing modified timestamp #4902

Open
fulldecent opened this issue Jun 6, 2024 · 4 comments

Comments

@fulldecent
Copy link

fulldecent commented Jun 6, 2024

Describe the problem

Executive summary

The current API spec does not allow you to properly do incremental data loads.

(I only care about the timesheets resource. But this concept may in fact apply to other resources as well.)

Details

How you query

In the /api/timesheets route, you can query incremental records since your last query using:

Excerpt from Swagger docs: (sorry I don't have a canonical link for this, but I access at https://pacific.kimai.cloud/api/doc.json)

"/api/timesheets": {
    "get": {
        ...
        "parameters": [
            ...
            {
                "name": "modified_after",
                "in": "query",
                "description": "Only records changed after this date will be included (format: HTML5 datetime-local, e.g. YYYY-MM-DDThh:mm:ss)",
                "required": false,
                "allowEmptyValue": true,
                "schema": {
                    "type": "string",
                    "default": null
                }
            }
        ],
        "responses": {
            "200": {
                ...
                "content": {
                    "application/json": {
                        "schema": {
                            "type": "array",
                            "items": {
                                "$ref": "#/components/schemas/TimesheetCollection"
                            }
                        }
                    }
                }
            }
        }
    },
},

What you get

However the records returned are:

"TimesheetCollection": {
    "required": [
        "begin",
        "rate",
        "exported",
        "billable"
    ],
    ...
}

The problem

The request query "Only records changed after this date will be included" does not align with any data available in the response. Therefore it is not possible using available information to perform this common scenario:

"Download all records modified since the last time I download records"

Describe the solution you'd like

Update the response to include a modified timestamp:

  "TimesheetCollection": {
      "required": [
          ...
+         "modified"
      ],
      ...
  }

And then in the /api/timesheets route, update specification to clearly say:

```diff
  "/api/timesheets": {
      "get": {
          ...
          "parameters": [
              ...
              {
                  "name": "modified_after",
                  "in": "query",
-                  "description": "Only records changed after this date will be included (format: HTML5 datetime-local, e.g. YYYY-MM-DDThh:mm:ss)",
+                  "description": "Only records with `modified` after this date will be included (format: HTML5 datetime-local, e.g. YYYY-MM-DDThh:mm:ss)",
                  "required": false,
                  "allowEmptyValue": true,
                  "schema": {
                      "type": "string",
                      "default": null
                  }
              }
          ],
          ...
      },

Describe alternatives you've considered

Wait for webhooks to be implemented.

But even this is insufficient. Because a proper sync will just wait for the webhook and then do the same sync process above. (This approach provides resiliency against failed/dropped webhooks.)

Screenshots

No response

Additional notes about deleting records

Performing a proper incremental data synchronization requires finding all modified (and deleted) records since the last load.

When data is deleted from the source system (Kimai) the best practice is:

  1. Set deleted = true
  2. Update modified time
  3. Do not remove record from the database

Then the API will see those records as being newly deleted since the last sync.

If Kimai does not act this way, it would be an additional blocker for doing proper incremental data sync.

GDPR note: if there is a desire to actually delete information from the system, here is how to do it. Update the modified column to NOW(), keep the id column, and remove/reset all other information in that row. Never use SQL DELETE. In fact, the application shouldn't even have SQL DELETE permission.

@kevinpapst
Copy link
Member

does not align with any data available in the response

I don't understand that. Can you explain further?

The modified flag is exactly there to provide all changed timesheet since X.
This does not include deleted items, as those are gone.
Kimai has no "trash" concept, deleting something means it is gone.

@fulldecent
Copy link
Author

fulldecent commented Jun 6, 2024

Here is a story.

  1. I create timesheet A, its modified time is now (according to Kimai, the current time is 20240601T000000)
  2. I access the API to get all timesheets modified since 2000
    3. The timesheet is returned
  3. Then I modify timesheet (according to Kimai, this happened at 20240601T120000)
  4. Now I access the API again and I would get that updated record which happened since the last API access

In step 4, which modified_since value should I pass to ensure I receive all records since I received in step 2?

  1. Should I save the time that the API client made its last request and use that?
    1. This fails because the Kimai server has a different clock than the API client's system. According to Einstein, it is impossible that the API client can know what the server thinks the current time is "now". And in addition to Einstein, there are many other practical reasons why system clocks will not be perfectly synchronized.

Also, sorry, one more requirement. The orderBy parameter should also please support sorting by the modified time.


I do understand that it is a design decision that Kimai will use SQL DELETE to remove data from the database. You can change that or not. But I just wanted to highlight that here because it is relevant to doing an incremental inhale of timesheet records. With DELETE, it is impossible to maintain a local copy of all Kimai data.*

(* A caveat, there IS an impractical way to maintain a local copy of all Kimai is to 1) wipe your local database, 2) repull ALL data from Kimai API, 3) do this every time you synchronize, like every 5 minutes.)

@kevinpapst
Copy link
Member

Please don't sync "everything" all 5 minutes, otherwise I have to add stricter rate limits on the API 😁

Here is a script that might be interesting for your sync: https://github.com/kimai/api-sync
It already contains a lot of the sync logic, including customers/projects ....

You don't have to know which time the server thinks it is, you query relatively to your current time.
If you want to sync every hour, you use "now - 3600 seconds" as modified parameter.

The server uses UTC by the way to store the modified timestamp.

@fulldecent
Copy link
Author

Thank you, I will study that implementation.

I consider that a workaround. For example, I will need to re-insert all records from the past hour, which is unnecessary database churn. And also it will break if my or the API server are ever more than an hour off the time sync (which might happen at daylight savings time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants