[GDPR] Cron jobs for User data export wedged #15663

Closed
reetp opened this issue Oct 24, 2019 · 11 comments
Labels
stat: stale Stale issues will be automatically closed if no activity

Comments


reetp commented Oct 24, 2019

This refers to:
#10777 (comment)

After the update to 2.1.1 I noticed that data exports still did not work.

I checked in Mongo and it appears there were jobs still 'pending'. These should have been cleared by the system.

These are some existing jobs on my main v2.1.1 server. They should have been cleared:

[Image: ExistingJobs]

Here I added in a download dir (a mapped volume) and tried to add some new downloads, but this does not appear to have been picked up:

[Image: WithNewJobs]

On my test server I then tried to see if I could clear the 'wedged' downloads as per #10777.

I now get errors in my logs:

https://pastebin.com/Fq7q3N7t

SyncedCron ➔ info Finished "Generate download files for user data".
SyncedCron ➔ info Not running "Generate download files for user data" again.
SyncedCron ➔ info Not running "Generate download files for user data" again.
SyncedCron ➔ info Not running "Generate download files for user data" again.
SyncedCron ➔ info Not running "Generate download files for user data" again.
Exception on find Error: Meteor code must always run within a Fiber. Try wrapping callbacks that you pass to non-Meteor libraries with Meteor.bindEnvironment.
    at Object.Meteor._nodeCodeMustBeInFiber (packages/meteor.js:1186:11)
    at Meteor.EnvironmentVariable.EVp.get (packages/meteor.js:1199:10)
    at Object.collection.(anonymous function) [as findOne] (packages/matb33_collection-hooks.js:132:37)

We can see on the test box that the correct dir is set:

[Image: Screenshot_2019-10-23_18-20-28]

But the jobs still have the wrong path:

[Image: Screenshot_2019-10-23_18-11-19]

I think there is probably more than one issue here.

Author

reetp commented Oct 24, 2019

Author

reetp commented Oct 24, 2019

Ahh, just seen this. I presume this fixes the 'wedged data'?

#15654

Not sure it is going to fix my DB errors though :-/

Author

reetp commented Oct 31, 2019

Still exists in 2.2.0 because the fix does not appear to have been merged.

Author

reetp commented Oct 31, 2019

(Just so I remember, and for reference, these are previous attempts at fixes)

#14143
Error: ENOENT: no such file or directory, mkdir '/tmp/userData/hYHbjXWQMSRMu2szG/full'
Fixed in
#15294

#10777
Exception while invoking method 'requestDataDownload' Error: ENOENT: no such file or directory, mkdir '/tmp/userData/*********
Fixed in
#14645

This issue (#15663) should be fixed here, but it hasn't been merged yet:
#15654

Rocket.Chat is still not GDPR compliant for anyone suffering from this.

Author

reetp commented Dec 11, 2019

Still present on 2.2.1.

Also there seems to be an associated bug: I can see the error in the Docker log but NOT in the admin View Logs.

Admin Log Viewer

I20191211-10:24:00.050(0) SyncedCron ➔ info Starting "Generate download files for user data". 
I20191211-10:24:00.052(0) SyncedCron ➔ info Finished "Generate download files for user data". 
I20191211-10:24:00.084(0) SyncedCron ➔ info Not running "Generate download files for user data" again. 
I20191211-10:24:00.095(0) SyncedCron ➔ info Not running "Generate download files for user data" again. 
I20191211-10:24:00.099(0) SyncedCron ➔ info Not running "Generate download files for user data" again. 
I20191211-10:24:00.109(0) SyncedCron ➔ info Not running "Generate download files for user data" again. 

Docker log viewer

2019-12-11T10:24:00.050146782Z SyncedCron ➔ info Starting "Generate download files for user data".
2019-12-11T10:24:00.052432864Z SyncedCron ➔ info Finished "Generate download files for user data".
2019-12-11T10:24:00.059002722Z { Error: ENOENT: no such file or directory, open '/tmp/userData/vBvYiyciYHtuCtTY2/user.html'
2019-12-11T10:24:00.059002722Z     at Object.fs.openSync (fs.js:646:18)
2019-12-11T10:24:00.059002722Z     at Object.fs.writeFileSync (fs.js:1299:33)
2019-12-11T10:24:00.059002722Z     at startFile (app/user-data-download/server/cronProcessDownloads.js:28:5)
2019-12-11T10:24:00.059002722Z     at generateUserFile (app/user-data-download/server/cronProcessDownloads.js:444:2)
2019-12-11T10:24:00.059002722Z     at Promise.asyncApply (app/user-data-download/server/cronProcessDownloads.js:499:4)
2019-12-11T10:24:00.059002722Z     at /app/bundle/programs/server/npm/node_modules/meteor/promise/node_modules/meteor-promise/fiber_pool.js:43:40
2019-12-11T10:24:00.059002722Z   errno: -2,
2019-12-11T10:24:00.059002722Z   code: 'ENOENT',
2019-12-11T10:24:00.059002722Z   syscall: 'open',
2019-12-11T10:24:00.059002722Z   path: '/tmp/userData/vBvYiyciYHtuCtTY2/user.html' }
2019-12-11T10:24:00.084758618Z SyncedCron ➔ info Not running "Generate download files for user data" again.
2019-12-11T10:24:00.095355416Z SyncedCron ➔ info Not running "Generate download files for user data" again.
2019-12-11T10:24:00.099593879Z SyncedCron ➔ info Not running "Generate download files for user data" again.
2019-12-11T10:24:00.109422532Z SyncedCron ➔ info Not running "Generate download files for user data" again.
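
The ENOENT in the Docker log comes from fs.writeFileSync being asked to write into a per-user directory under /tmp/userData that no longer exists. As a minimal sketch only (this is not Rocket.Chat's actual startFile; the helper name writeFileEnsuringDir is hypothetical, and the recursive option assumes Node.js 10.12 or later), the parent directory could be recreated before the write:

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper, not part of Rocket.Chat: recreate the parent
// directory (and any missing ancestors) before writing, so that a wiped
// temporary folder does not cause ENOENT on open.
function writeFileEnsuringDir(filePath, contents) {
  // No-op if the directory already exists; requires Node.js >= 10.12.
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, contents);
}
```

This only avoids the crash on write. Whether a half-finished export should instead be restarted from scratch is a separate design question.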


vytasmk commented Jan 8, 2020

I am on 2.3.1, also on Docker. Yesterday I upgraded from a quite old version, 0.7.1, to the latest.

Today I was checking the logs and found the same error message regarding missing files in /tmp/userData/xxxx/user.html

I was trying to find out what it means and whether it is bad or not. It looks like this problem occurs when you recreate the Docker container, as all the files in the /tmp/ folder inside the container are lost.

To get rid of this error you can clear one collection inside MongoDB. I used these steps:

  1. Connected to MongoDB: docker-compose exec mongodb mongo (I am using docker-compose)
  2. Selected the meteor database: use meteor;
  3. Listed the rocketchat_export_operations collection: db.rocketchat_export_operations.find({});
  4. Deleted all documents inside that collection: db.rocketchat_export_operations.deleteMany({});

After that, the error message was gone. I then requested a data export from my profile, stopped Rocket.Chat, deleted the container, and recreated it. The error was back, because the /tmp/ folder was deleted along with the Docker container. So when using Docker, the /tmp/userData folder must be made a volume to keep that data persistent, or the code should be fixed to check whether that folder still exists, since it lives inside a temporary folder. I think this is a bug.
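
Rather than emptying the whole collection, one could first work out which export operations point at a folder that has disappeared. The following is a sketch under assumptions: the helper name findStaleOperations is hypothetical, and it assumes each operation document carries an exportPath string, as in the document excerpt posted further down this thread.

```javascript
const fs = require('fs');

// Hypothetical helper: given export-operation documents (shaped like the
// rocketchat_export_operations entries quoted in this thread), return the
// ones whose exportPath no longer exists on disk. The `exists` parameter
// defaults to fs.existsSync and can be replaced for testing.
function findStaleOperations(operations, exists = fs.existsSync) {
  return operations.filter(
    (op) => typeof op.exportPath === 'string' && !exists(op.exportPath)
  );
}
```

The _id values of the returned documents could then be used in a targeted deleteMany({ _id: { $in: [...] } }) instead of deleteMany({}), leaving healthy operations alone.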

But later I got another error in the logs when trying to "Download My Data (HTML)" from "My Account". Some time after the request, I got a new error message:

rocketchat_1            | Exception on find Error: Meteor code must always run within a Fiber. Try wrapping callbacks that you pass to non-Meteor libraries with Meteor.bindEnvironment.
rocketchat_1            |     at Object.Meteor._nodeCodeMustBeInFiber (packages/meteor.js:1186:11)
rocketchat_1            |     at Meteor.EnvironmentVariable.EVp.get (packages/meteor.js:1199:10)
rocketchat_1            |     at Object.collection.(anonymous function) [as findOne] (packages/matb33_collection-hooks.js:132:37)
rocketchat_1            |     at ns.Collection.findOne (packages/mongo/collection.js:356:29)
rocketchat_1            |     at BaseDb.findOne (app/models/server/models/_BaseDb.js:135:21)
rocketchat_1            |     at BaseDb.findOneById (app/models/server/models/_BaseDb.js:139:15)
rocketchat_1            |     at Uploads.findOneById (app/models/server/models/_Base.js:132:29)
rocketchat_1            |     at getAttachmentData (app/user-data-download/server/cronProcessDownloads.js:105:26)
rocketchat_1            |     at msg.attachments.forEach.attachment (app/user-data-download/server/cronProcessDownloads.js:154:27)
rocketchat_1            |     at Array.forEach (<anonymous>)
rocketchat_1            |     at getMessageData (app/user-data-download/server/cronProcessDownloads.js:153:19)
rocketchat_1            |     at cursor.forEach.msg (app/user-data-download/server/cronProcessDownloads.js:250:25)
rocketchat_1            |     at each (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/cursor.js:765:11)
rocketchat_1            |     at handleCallback (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:204:5)
rocketchat_1            |     at nextFunction (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:585:5)
rocketchat_1            |     at AggregationCursor.Cursor.next (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:763:3)
rocketchat_1            |     at AggregationCursor.Cursor._next (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/cursor.js:211:36)
rocketchat_1            |     at loop (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/operations/cursor_ops.js:150:10)
rocketchat_1            |     at each (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/operations/cursor_ops.js:102:43)
rocketchat_1            |     at cursor.next (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/operations/cursor_ops.js:112:7)
rocketchat_1            |     at result (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/utils.js:414:17)
rocketchat_1            |     at executeCallback (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/utils.js:406:9)
rocketchat_1            |     at handleCallback (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/utils.js:128:55)
rocketchat_1            |     at cursor._next (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/operations/cursor_ops.js:195:5)
rocketchat_1            |     at handleCallback (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:204:5)
rocketchat_1            |     at nextFunction (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:585:5)
rocketchat_1            |     at done (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:651:7)
rocketchat_1            |     at queryCallback (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/cursor.js:699:18)
rocketchat_1            |     at /app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/pool.js:532:18
rocketchat_1            |     at _combinedTickCallback (internal/process/next_tick.js:132:7)
rocketchat_1            |     at process._tickDomainCallback (internal/process/next_tick.js:219:9) nju6g8yxZqXn9unsv

That error message repeats ten or more times and then stops; after some time (a few minutes) it appears again. Disabling "User Data Download" in Administration stops the error.
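
The stack trace shows findOne being called from a callback that the raw MongoDB driver cursor invokes, which is the situation the error text itself describes. A rough sketch of the suggested wrapping follows. It is not the project's actual fix: forEachInEnvironment is a hypothetical name, the cursor is assumed to be any object with a forEach method, and bindEnvironment is passed in as a parameter so the snippet runs outside Meteor (in a Meteor server it would be Meteor.bindEnvironment).

```javascript
// Sketch only. `bindEnvironment` stands in for Meteor.bindEnvironment and is
// injected so this runs outside Meteor; `cursor` is any object exposing
// forEach(callback).
function forEachInEnvironment(cursor, bindEnvironment, handler) {
  // Wrap the handler once, before handing it to the non-Meteor library, so
  // that Meteor code inside it runs in the expected environment.
  const bound = bindEnvironment(handler);
  cursor.forEach(bound);
}
```

In Meteor, the wrapping call itself has to happen while still inside a Fiber (for example at the top of the cron job), not inside the driver callback.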

Author

reetp commented Jan 8, 2020

So 2.3.x is still not GDPR compliant then (and probably 2.4.x as well), at least with docker installs.

:-(

Author

reetp commented Jan 13, 2020

Quoting @vytasmk above: "After requested and some time i got new error message:"

Tested this on 2.3.3 with a mapped volume /opt/uploads and can confirm the same error.

"_updatedAt" : ISODate("2020-01-13T12:28:41.647Z"),
"assetsPath" : "/opt/uploads/userData/cviqCK7M5YpbY3Jtc/assets",
"exportPath" : "/opt/uploads/userData/cviqCK7M5YpbY3Jtc"

Exception on find Error: Meteor code must always run within a Fiber. Try wrapping callbacks that you pass to non-Meteor libraries with Meteor.bindEnvironment.

Note that it does seem to have generated the data in the directory, but looking at the DB keys it seems that it is still 'exporting' some rooms. I wonder if it is timing out or some such?

Here one room is completed but another (the main room) is still 'exporting'. The room2.html file only goes up to Sun, 23 Dec 2018 18:08:57 GMT ????

        {
            "roomId" : "qGy7pdqDfQFjsWsdv",
            "roomName" : "room1",
            "userId" : null,
            "exportedCount" : 343,
            "status" : "completed",
            "targetFile" : "room1.html",
            "type" : "p"
        }, 
        {
            "roomId" : "NpatrG7JMmTZTaHiz",
            "roomName" : "room2",
            "userId" : null,
            "exportedCount" : 2101,
            "status" : "exporting",
            "targetFile" : "room2.html",
            "type" : "p"
        }, 
        {
            "roomId" : "8NaLuuFBdAtW8DtzYhYHbjXWQMSRMu2szG",
            "roomName" : "8NaLuuFBdAtW8DtzYhYHbjXWQMSRMu2szG",
            "userId" : "8NaLuuFBdAtW8DtzY",
            "exportedCount" : 2000,
            "status" : "exporting",
            "targetFile" : "8NaLuuFBdAtW8DtzYhYHbjXWQMSRMu2szG.html",
            "type" : "d"
        }, 
        {
            "roomId" : "eimqgMS8KwzJZP8TchYHbjXWQMSRMu2szG",
            "roomName" : "eimqgMS8KwzJZP8TchYHbjXWQMSRMu2szG",
            "userId" : "eimqgMS8KwzJZP8Tc",
            "exportedCount" : 9,
            "status" : "completed",
            "targetFile" : "eimqgMS8KwzJZP8TchYHbjXWQMSRMu2szG.html",
            "type" : "d"
        }
    ],
    "status" : "exporting",
    "fileList" : [ 
        {

When I get 5 minutes I'll try on 2.4.x but imagine that this will still exist.

@photoninger

On rocket.chat-3.0.2, user data export works for some users, but for another user the export job exports one room in an endless loop. The JSON file keeps growing, the same messages are exported multiple times, and in MongoDB the exportedCount stays at zero:

        {
            "roomId" : "8tSxK5yhcgJFverfF",
            "roomName" : "wlan",
            "userId" : null,
            "exportedCount" : 0,
            "status" : "exporting",
            "targetFile" : "8tSxK5yhcgJFverfF.json",
            "type" : "p"
        }
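
The same messages being written repeatedly while exportedCount stays at zero is what one would expect if the batch offset never advances. That is a guess about the symptom, not a diagnosis of Rocket.Chat's code. As an illustration only, with hypothetical names (exportRoom, fetchBatch, write), a batched export loop that advances its offset and terminates could look like this:

```javascript
// Hypothetical batch exporter, not Rocket.Chat's implementation.
// fetchBatch(skip, limit) returns an array of messages starting at `skip`;
// write(messages) appends them to the target file.
function exportRoom(room, fetchBatch, write, limit = 100) {
  for (;;) {
    const batch = fetchBatch(room.exportedCount, limit);
    if (batch.length === 0) {
      // Nothing left to export: mark the room as done and stop.
      room.status = 'completed';
      return room;
    }
    write(batch);
    // Advance the offset. Without this line the same batch would be
    // fetched and written again on every pass.
    room.exportedCount += batch.length;
  }
}
```

Persisting exportedCount after each batch would also let an interrupted job resume instead of starting over.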


andypost commented Jun 1, 2020

Sometimes the exported data does not fit into the 16 MB storage limit, and the resulting errors start to clutter the logs.

Contributor

github-actions bot commented Oct 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
