Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add last active admin functionality #214

Merged
merged 5 commits into from
Apr 28, 2024
Merged

Add last active admin functionality #214

merged 5 commits into from
Apr 28, 2024

Conversation

JJ-8
Copy link
Collaborator

@JJ-8 JJ-8 commented May 5, 2023

The implementation is mostly based on DU4L#16 but we update the timer in the database instead of the frontend. The update is done when the user fetches the profile, so for every first load done in a new tab in the browser.

Please note that this commit can NOT be merged before #213 is merged since the migration files are now not consecutive. The other PR contains a migration with ID 41, so this PR has a migration with ID 42 (and this answers everything of course). The migration will fail in this branch therefore. For testing, please rename the file and use a throw away database.

The implementation is mostly based on DU4L#16
but we update the timer in the database instead of the frontend.
The update is done when the user fetches the profile,
so for every first load done in a new tab in the browser.

Please note that this commit can NOT be merged before TFNS#213
is merged since the migration files are now not consecutive.
The other PR contains a migration with ID 41, so this PR has a migration
with ID 42 (and this answers everything of course).
The migration will fail in this branch therefore. For testing, please
rename the file and use a throw away database.
@JJ-8 JJ-8 requested a review from XeR May 5, 2023 19:47
@JJ-8 JJ-8 mentioned this pull request May 7, 2023
@XeR XeR self-assigned this May 8, 2023
@JJ-8 JJ-8 mentioned this pull request May 9, 2023
Copy link
Contributor

@XeR XeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting addition.

From my understanding, the last active date should only be visible by users with admin privileges. (and I agree)
Unfortunately this is not the case -- guests can access this information too by looking at the response of the getTeam operation.

% curl -s 'http://localhost:8080/graphql' -H 'content-type: application/json' -H "authorization: Bearer $token" --data-raw '[{"operationName":"getTeam","variables":{},"query":"query getTeam {\n  profiles {\n    nodes {\n      ...ProfileFragment\n      __typename\n    }\n    __typename\n  }\n}\n\nfragment ProfileFragment on Profile {\n  id\n  username\n  lastactive\n  color\n  description\n  role\n  nodeId\n  __typename\n}\n"}]' | jq '.[0].data.profiles.nodes[] | {username, lastactive}'

{
  "username": "a",
  "lastactive": "2023-05-09T23:50:39.634667+00:00"
}
{
  "username": "x",
  "lastactive": "2023-05-10T00:08:53.204503+00:00"
}

% echo $token | cut -f2 -d. | base64 -d           
{"user_id":2,"role":"user_guest","exp":1686268296,"iat":1683676296,"aud":"postgraphile","iss":"postgraphile"}

@JJ-8
Copy link
Collaborator Author

JJ-8 commented May 10, 2023

@XeR, thanks for reviewing the code! The PR of DU4L already contained this in their PR description:

Note that ctfnote.profile is SELECT-able by any logged in users. This means that using the API any user can view the last active date of any other users. I can change this if needed, but I since many forums also have a public "last online" date, I thought I'd make this internally public as well.

However, I agree with your view that it is more expected by users to have this admin-only since the UI is implemented this way too. I will fix this soon.

Related question: do you know any good way of viewing the current permission model of the postgresql database? As far as I know, the only way is to read all migrations and track the permissions yourself to get a clear overview of what is allowed for which role.

JJ-8 added 2 commits May 11, 2023 09:30
This implementation almost works, but fails in the websocket subscription.
Therefore, we change to column wise security.
Now the row `lastactive` is only accessible by the admin.
The public profile is accessed everywhere and in the admin view
the whole profile is queried.

For the me() function, we have to output a valid ctfnote.profile
but we can't access the `lastactive` column, so we replace it with
NULL by default. This fixes the problem we faced in the row wise
security model.
@JJ-8
Copy link
Collaborator Author

JJ-8 commented May 11, 2023

@XeR, in the end I have added column security to the lastactive column. Can you check again if you can query it as a non-admin? It should be fixed now. All websocket updates etc. should also be working perfectly fine (which was quite a hassle to get working).

@JJ-8 JJ-8 requested a review from XeR May 11, 2023 09:46
@XeR
Copy link
Contributor

XeR commented May 30, 2023

@XeR, in the end I have added column security to the lastactive column. Can you check again if you can query it as a non-admin? It should be fixed now. All websocket updates etc. should also be working perfectly fine (which was quite a hassle to get working).

I tested again. I'm afraid the problem is still there. :-(
I used the same commands as before :

$ curl -s 'http://localhost:8080/graphql' -H 'content-type: application/json' -H "authorization: Bearer $token" --data-raw '[{"operationName":"getTeam","variables":{},"query":"query getTeam {\n  profiles {\n    nodes {\n      ...ProfileFragment\n      __typename\n    }\n    __typename\n  }\n}\n\nfragment ProfileFragment on Profile {\n  id\n  username\n  lastactive\n  color\n  description\n  role\n  nodeId\n  __typename\n}\n"}]' | jq '.[0].data.profiles.nodes[] | {username, lastactive}'
{
  "username": "a",
  "lastactive": "2023-05-30T00:39:52.420508+00:00"
}
{
  "username": "b",
  "lastactive": "2023-05-30T00:41:01.353308+00:00"
}

$ echo $token | cut -f2 -d. | base64 -d
{"user_id":2,"role":"user_guest","exp":1687999261,"iat":1685407261,"aud":"postgraphile","iss":"postgraphile"}

Related question: do you know any good way of viewing the current permission model of the postgresql database? As far as I know, the only way is to read all migrations and track the permissions yourself to get a clear overview of what is allowed for which role.

Redirecting this question to @B-i-t-K

JJ-8 added 2 commits May 30, 2023 15:48
The websocket implementation has been changed to manual by
following https://www.graphile.org/postgraphile/subscriptions/.
This was required since the default implementation returns a profile, which
is restricted in access. Now a public profile is returned which is
not restricted.

All functionalities should stay the same. You are only able to query your own
last active date but not from other profiles.

Column wise security was not possible to fully complete since relations
between objects will break due to internal postgraphile stuff.
For example, postgraphile likes to use * internally which won't work
with column security.
@JJ-8
Copy link
Collaborator Author

JJ-8 commented May 30, 2023

@XeR, I had to revert the column wise security since postgraphile doesn't like that if you fully enforce the access (I didn't drop the previous privileges, that is why you were able to query it through the old method).

I have now reverted to the row wise security and implemented the websocket part through the manual method of postgraphile. The websocket now broadcasts the public profile while it is still being triggered by a (private) profile update.

I was able to reproduce your query through the JavaScript console and have now verified that you are only able to query the last active information of yourself. Can you also check again?

@XeR
Copy link
Contributor

XeR commented Jul 3, 2023

Hi,

I tested commit 54639c2 with a new environment (docker-compose down -v; docker-compose up).
I can confirm that this information is only visible for your account unless you are admin.

Now the boring part : this is personal data.
I can see this information being misused by administrators ("you said you would play but you did not"). Especially in corporate teams.
Do we really want to store this information, even though it can only be seen by admins ?
Is that even GDPR compliant ? (We do not really need to make CTFNote work, so that might be NOK)

I'd love to have the input of the users about this.

cc @SakiiR @B-i-t-K

@JJ-8
Copy link
Collaborator Author

JJ-8 commented Jul 4, 2023

For me this feature was useful to get more insight in the large amount of user accounts created in the past few years. Now I can easily see that a lot of accounts are not used anymore and can safely delete them. I didn't think of using this as an activity tracker for CTF participation. It is also totally not accurate for that, since it will only update the timestamp on a first load of the page (for example by opening CTFNote in a new tab or reloading the page).

you said you would play but you did not

I think this can be more easily answered by checking the "on it" status of the tasks during the CTF. That is an easier way to blame specific users if they only solve the welcome challenge and nothing else ;)

@JJ-8
Copy link
Collaborator Author

JJ-8 commented Jul 4, 2023

My take on the GDPR (I am not a lawyer btw):

Now the boring part : this is personal data.

The GDPR defines personal data as:

(1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an
identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an
identifier such as a name, an identification number, location data, an online identifier or to one or more factors
specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

From a timestamp, I would not be able to identify a specific user. Instead the usernames the users provide could be personal data, since they can be their real name for example.

As stated in my previous comment, with the timestamp it is easier to identify inactive accounts. This makes it easier to fulfill article 5 1(e):

(e) kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for
which the personal data are processed; personal data may be stored for longer periods insofar as the personal data
will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or
statistical purposes in accordance with Article 89(1) subject to implementation of the appropriate technical and
organisational measures required by this Regulation in order to safeguard the rights and freedoms of the data subject
(‘storage limitation’);

Now I can know when I should start deleting their accounts, including their usernames which can be classified as personal data, in order to ensure that I don't store it for longer than necessary.

Therefore, I think it is fair to say that we can use article 6 1(f) as a lawful base for processing the users last activity timestamp so we can more accurately perform the task described in article 5 1(e).

@XeR
Copy link
Contributor

XeR commented Jul 4, 2023

For me this feature was useful to get more insight in the large amount of user accounts created in the past few years.

Ah, fair point.
Besides, if it was not useful, you would not have spent the time to write this PR :-p

The GDPR defines personal data as:

(1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’) […]

From a timestamp, I would not be able to identify a specific user. […]

A personal data is a data that is related to an data subject.
The timestamp alone is not personal data, but since it is tied to an information that identifies you (your nickname), it becomes a personal data.
Storing this information alone is not personal data (but it is not very useful unless we're trying to detect brute-force attacks)

Instead the usernames the users provide could be personal data, since they can be their real name for example

It does not have to point to somebody in meat space for it to be considered "personal data".
If you can identify somebody because they use the same nickname (which is what most people do) it is good enough.

@peace-maker
Copy link
Contributor

I don't think this applies here in general, since you @XeR are not the one collecting the data in some CTFNote as a Service business, where you host the data for someone else. The GDPR doesn't apply in personal context. The teams hosting the tool for their friends and members are responsible to decide which data they want to store.

Since I assume you're hosting CTFNote yourself though, this feature should be designed in a way, that you're comfortable hosting your own software. So I'd propose a switch in the admin panel to toggle storing the timestamp and showing a notice in the login form when the timestamp is collected. That way everyone can decide for their own if their setup is large enough to disable the feature.

@JJ-8 JJ-8 merged commit 89f5152 into TFNS:main Apr 28, 2024
@JJ-8 JJ-8 deleted the 0-last-active branch April 28, 2024 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants