How does MegaQC determine the date (x axis) for each report uploaded? #422

Open
chrispyatt opened this issue Oct 27, 2022 · 3 comments

@chrispyatt

Our previous installation of MegaQC seemed to mostly work as expected, with each tranche of data appearing on the x axis approximately on the date of sequencing/analysis (so we could see what changed between runs). However, having now restored from backup on a new system, all data-points are being plotted on the same date (corresponding to the config_creation_date field in multiqc_data.json).

I don't believe I've changed any settings, so what is the mechanism for deciding where on the x axis data should sit? I don't see any other dates within the JSON, and the upload dates are still the same as before (as the DB was restored rather than recreated), so I'm a bit confused about what could have changed.

I also note that recreating the database on a separate system does create correct plots (with plausible x axes) so I don't think the JSONs are wrong. I'd like to avoid doing this on the production version as the original database has a lot of saved filters that I'd have to remake manually.

@multimeric
Collaborator

At the point where MegaQC ingests the reports, it chooses either the config_creation_date, or a manually provided date if one was passed in via the CLI or REST API. Then this date gets saved to the DB.
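
The selection rule described above can be sketched roughly as follows. This is an illustration rather than MegaQC's actual code: the helper name `choose_report_date` is hypothetical, and the date format string is an assumption about how `config_creation_date` is written in `multiqc_data.json`.

```python
import json
from datetime import datetime
from typing import Optional

# Assumption: config_creation_date looks like "2021-10-07, 11:45".
# Check a real multiqc_data.json before relying on this format.
ASSUMED_DATE_FORMAT = "%Y-%m-%d, %H:%M"


def choose_report_date(report: dict, manual_date: Optional[datetime] = None) -> datetime:
    """Hypothetical helper: prefer a manually supplied date, else fall back
    to the config_creation_date stored in the report JSON."""
    if manual_date is not None:
        return manual_date
    return datetime.strptime(report["config_creation_date"], ASSUMED_DATE_FORMAT)


# A stand-in for the relevant part of multiqc_data.json.
report = json.loads('{"config_creation_date": "2021-10-07, 11:45"}')

# No manual date: the report's own creation date is used.
from_config = choose_report_date(report)

# Manual date supplied at upload time: it takes precedence.
overridden = choose_report_date(report, manual_date=datetime(2022, 4, 19, 11, 10))

print(from_config, overridden)
```

Whichever value is chosen is what ends up stored in the DB, so reports uploaded without a manual date can only be as precise as their `config_creation_date`.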

So I'm not sure why this particular issue happened to you. How did you restore the database? Also, can you run an SQL query to look at the data? Something like `SELECT report_hash, created_at FROM report LIMIT 50;` or apply some kind of date filter like `WHERE created_at BETWEEN DATE '2020-02-16' AND DATE '2021-02-16'` if you know the time period affected.

@chrispyatt
Author

Perhaps the long term fix is to force it and supply dates along with the upload then, but I don't understand why I'm seeing different behaviour vs previous deployment.

I guess it's possible that I'm only seeing a subset of the data (for one particular config version), but it's unclear why that would be, given that I know there are multiple 'created_at' values present in the DB.

The DB was backed up using pg_dump and then restored using pg_restore. The megaqc container was then spun up with a config to point it to the restored database (although I do seem to be having connection issues on upload - not sure whether that's due to a container setting or something outside on the new server).

Output of suggested query:

           report_hash            |     created_at
----------------------------------+---------------------
 f4ce4010a9a0007c4a67c3af630bc315 | 2021-10-07 11:45:00
 2d7aaf0ea5ddae123b94719d97e050dd | 2022-04-19 11:10:00
 1deddf31aeaa8246015a234eb1bff550 | 2022-04-12 09:54:00
 285b74798276dc524dbe0f7155ecc21d | 2022-01-19 13:21:00
 3bc94fcedd995b01cd173227b9f5dfae | 2020-05-28 21:18:00
 6c63b8d44b8351a0a0df92cb04e6da8b | 2020-06-01 12:08:00
 5a6aae582000bd4147e870170e908ab5 | 2020-06-28 06:29:00
 8a379e7df88021935f00a3320e78735a | 2020-06-28 06:52:00
 0917e86fb1947590109b02af23067e0e | 2020-06-28 06:57:00
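
As a quick sanity check, the rows above can be grouped by calendar day to confirm that the stored `created_at` values really do differ. This is only a sketch: `created_at_per_day` is a hypothetical helper that parses pasted psql output, and it says nothing about how MegaQC itself reads these rows.

```python
from collections import Counter
from datetime import datetime

# The query output from this comment, pasted verbatim.
PSQL_OUTPUT = """\
           report_hash            |     created_at
----------------------------------+---------------------
 f4ce4010a9a0007c4a67c3af630bc315 | 2021-10-07 11:45:00
 2d7aaf0ea5ddae123b94719d97e050dd | 2022-04-19 11:10:00
 1deddf31aeaa8246015a234eb1bff550 | 2022-04-12 09:54:00
 285b74798276dc524dbe0f7155ecc21d | 2022-01-19 13:21:00
 3bc94fcedd995b01cd173227b9f5dfae | 2020-05-28 21:18:00
 6c63b8d44b8351a0a0df92cb04e6da8b | 2020-06-01 12:08:00
 5a6aae582000bd4147e870170e908ab5 | 2020-06-28 06:29:00
 8a379e7df88021935f00a3320e78735a | 2020-06-28 06:52:00
 0917e86fb1947590109b02af23067e0e | 2020-06-28 06:57:00
"""


def created_at_per_day(psql_output: str) -> Counter:
    """Hypothetical helper: count reports per calendar day of created_at."""
    counts = Counter()
    # Skip the header row and the separator row.
    for line in psql_output.splitlines()[2:]:
        if "|" not in line:
            continue
        _report_hash, created_at = (part.strip() for part in line.split("|"))
        day = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S").date()
        counts[day] += 1
    return counts


per_day = created_at_per_day(PSQL_OUTPUT)
for day, n in sorted(per_day.items()):
    print(day, n)
```

For the nine rows shown, this gives seven distinct days, with three reports on 2020-06-28, so the dates in the DB are not collapsed onto a single day.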

@multimeric
Collaborator

> Perhaps the long term fix is to force it and supply dates along with the upload then

Yes, I would recommend this if you have more precise metadata available to you.

> I don't understand why I'm seeing different behaviour vs previous deployment.

I don't either, but I think you're most likely right that not all data is being loaded correctly. If you investigate further please let me know what you find.
