Anonymize user id's in logs / Matomo #799
Comments
Some solutions that may be usable are listed at https://wiki.hbz-nrw.de/pages/viewpage.action?pageId=765100087. |
All Apache logs containing IPs are written to one file. The NWBib web app doesn't log IPs in its own log. |
These are anonymizers, not pseudonymizers. I wrote a pseudonymizer myself, code resides at |
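To illustrate the difference mentioned above: an anonymizer discards the identifier, while a pseudonymizer maps it to a stable token so that repeated requests from the same client can still be counted. The following is only a minimal sketch of one common approach (a keyed HMAC over the IP address), not the code referred to in the comment. The names `SECRET_KEY`, `pseudonymize_ip` and `pseudonymize_line` are hypothetical, and only IPv4 addresses are handled.

```python
import hashlib
import hmac
import re

# Hypothetical secret; in practice it would be read from a protected file
# and kept out of version control.
SECRET_KEY = b"replace-with-a-long-random-secret"

# Matches dotted-quad IPv4 addresses only (IPv6 is not handled in this sketch).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def pseudonymize_ip(ip: str, key: bytes = SECRET_KEY) -> str:
    """Map an IP to a stable token via HMAC-SHA256, truncated to 16 hex chars."""
    digest = hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return "ip-" + digest[:16]


def pseudonymize_line(line: str, key: bytes = SECRET_KEY) -> str:
    """Replace every IPv4 address in a log line with its pseudonym."""
    return IPV4_RE.sub(lambda m: pseudonymize_ip(m.group(0), key), line)
```

Because the token depends on the key, the same IP always yields the same pseudonym for a given key, and changing the key breaks the link to earlier logs.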
After offline discussion: we will set up Matomo, which is also DSGVO (GDPR) compliant. |
Matomo is set up. It is available under its real name as a subdomain of lobid. Note: only HTTPS is allowed. |
The upload of the logs triggered at the weekend failed after 12 min. Not sure why. I triggered it again and it seems to have been working for 5 h now. A bit weird: no logs, even in DEBUG mode. But |
Errors occur when the webserver is restarted. As |
The "error" messages like Logs import summary
Website import summary
Performance summary
It took 5 days to process these logs: After the import, an 'archiving' process must be run before Matomo can be used. This took 5.5 h. TODO:
"Check if all is there" is crucial. As far as I can see there is no possibility to discriminate lobid.org from subdomains at the moment, nwbib is missing etc. I remember to have changed once years ago the apache-logs syntax to log the subdomains as a column of its own. GoAccess was configured in that way subsequently. Now one would have to do this for matomo also. We have two syntactically different logs and I don't know if these can be merged into one in matomo. |
As discussed offline:
|
Data is imported up to May. Please check. Paths like |
Looks good. I noticed that two widgets only give information for nwbib.de but not for the lobid services:
Why is that? |
Looks better now, but links behind the shown "Pages following a site search" don't work for lobid-gnd and lobid-resources. |
URLs have to be defined one per line, not comma-separated. I have to reindex again. |
Reindexing is finished, data is up to date (i.e., including June). |
Automatically splitting and indexing using crontab |
Sub-issue of hbz/lobid#363. We have to pseudonymise user IDs in logs. I have heard several different figures floating around for how quickly we have to do it.
@dr0i : Does the NWBib have its own logs with user IDs, or are all requests to lobid-resources logged in one file?