fix: keep only one (or two?) mongodb export #9946

Merged: 2 commits merged on Mar 21, 2024

stephanegigandet
Contributor

We currently generate 3 MongoDB exports every day, each of which takes more than one hour to generate:

-rw-r--r--   1 off  off   4675987404 Mar 18 08:08 fr.openfoodfacts.org.products.rdf
-rw-r--r--   1 off  off   7311036936 Mar 18 09:36 openfoodfacts-products.jsonl.gz
-rw-r--r--   1 off  off   8859132762 Mar 18 10:28 openfoodfacts-mongodbdump.gz
-rw-r--r--   1 off  off           95 Mar 18 10:30 gz-sha256sum
-rw-r--r--   1 off  off           63 Mar 18 10:30 gz-md5sum
-rw-r--r--   1 off  off   8932229671 Mar 18 11:42 openfoodfacts-mongodbdump.tar.gz

I suggest that we completely remove openfoodfacts-mongodbdump.tar.gz, which serves exactly the same purpose as openfoodfacts-mongodbdump.gz.

I'm not sure if we should keep the jsonl. As I recall, it was added for Robotoff. Do we still need it? Or do we still want to provide it to others, so that it's possible to get JSON data without having MongoDB?
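
For context on the "JSON data without MongoDB" use case, here is a minimal sketch of how a consumer might stream the JSONL export using only the Python standard library. It assumes the file is gzip-compressed with one JSON object per line, as the name openfoodfacts-products.jsonl.gz suggests; the helper names `iter_products` and `count_products_with_field` are hypothetical and are not part of this PR or of the server code.

```python
import gzip
import json
from typing import Iterator


def iter_products(path: str) -> Iterator[dict]:
    """Yield one product dict per non-empty line of a gzip-compressed JSONL file.

    Assumption: each line holds exactly one JSON object.
    """
    # Text mode ("rt") lets us read line by line without loading the
    # whole multi-gigabyte file into memory.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def count_products_with_field(path: str, field: str) -> int:
    """Count products whose given field is present and non-empty."""
    return sum(1 for product in iter_products(path) if product.get(field))
```

A call such as `count_products_with_field("openfoodfacts-products.jsonl.gz", "product_name")` would then scan the export once, with no MongoDB instance involved; the field name here is only an illustration.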

@stephanegigandet stephanegigandet requested a review from a team as a code owner March 18, 2024 14:12
@github-actions github-actions bot added the export, MongoDB, and 📚 Documentation labels Mar 18, 2024

sonarcloud bot commented Mar 18, 2024

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication


@stephanegigandet
Contributor Author

See also #9946

We will need to update the documentation to remove references to the removed dumps.

@hangy
Member

hangy commented Mar 18, 2024

Should we put up a deprecation notice for a month or so before removing this export?

@stephanegigandet
Contributor Author

> Should we put up a deprecation notice for a month or so before removing this export?

We could, but I'm not sure where we could put a deprecation notice so that it has a chance of being read by people who may have automated the download of the export.

Member

@alexgarel alexgarel left a comment

I think it's OK. We had both for about one year, so I think we can move on!

@alexgarel
Member

@stephanegigandet before merging, maybe, amend the data page + be sure to remove the old archive in production (better have a 404 than serving an old archive forever).

@stephanegigandet
Contributor Author

@alexg : "before merging, maybe, amend the data page + be sure to remove the old archive in production (better have a 404 than serving an old archive forever)."

Data page update PR: openfoodfacts/openfoodfacts-web#563

I'll remove the old archive when this is deployed.

@stephanegigandet stephanegigandet merged commit c93bf32 into main Mar 21, 2024
14 checks passed
@stephanegigandet stephanegigandet deleted the mongodb-export branch March 21, 2024 16:08
Payne680 pushed a commit to Payne680/openfoodfacts-server that referenced this pull request Mar 25, 2024
3 participants