This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

[Infrastructure] Expire outdated images in the database #482

Closed
ChariniNana opened this issue Aug 13, 2020 · 0 comments · Fixed by #483

@ChariniNana
Contributor

Current Situation

The existing setup has no mechanism for expiring outdated images. Here, 'outdated' refers to images that are no longer being updated, most likely because they have been removed from the upstream source.

Suggested Improvement

The CC database needs a way to record whether an image is outdated. For every provider API script with a re-ingestion strategy, we know the maximum possible age of the oldest image. For example, since the oldest images from Flickr are re-ingested once every six months, the updated_on value for any Flickr image should never be more than six months old. If the updated_on value of a given image is older than that, it indicates that the image no longer exists on Flickr. To handle this, we propose setting removed_from_source to true whenever updated_on is older than the re-ingestion strategy of the corresponding provider allows. To allow for potential delays in re-ingestion, we extend the expected maximum image age per provider by 10% (e.g. for Flickr, six months is extended to six months and 18 days).
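
The rule above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from the linked pull request: the `MAX_IMAGE_AGE` table, the function names, and the choice of 180 days to stand in for "six months" are all hypothetical; only `updated_on`, `removed_from_source`, and the 10% allowance come from the proposal itself.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table: the longest gap between re-ingestions per provider.
# 180 days is an assumed stand-in for Flickr's "six months".
MAX_IMAGE_AGE = {"flickr": timedelta(days=180)}

# Allowance for re-ingestion delays, as proposed in the issue (10%).
GRACE_PERCENT = 10


def allowed_age(provider):
    """Return the maximum tolerated age of updated_on for a provider."""
    base = MAX_IMAGE_AGE[provider]
    # Integer arithmetic on timedelta avoids float rounding.
    return base + base * GRACE_PERCENT / 100


def is_removed_from_source(provider, updated_on, now):
    """Return True if updated_on is older than the provider allows."""
    return now - updated_on > allowed_age(provider)


if __name__ == "__main__":
    now = datetime(2020, 8, 13, tzinfo=timezone.utc)
    stale = now - timedelta(days=199)
    fresh = now - timedelta(days=30)
    print(allowed_age("flickr").days)                    # 198
    print(is_removed_from_source("flickr", stale, now))  # True
    print(is_removed_from_source("flickr", fresh, now))  # False
```

With the assumed 180-day window, the allowance works out to 18 days, matching the "six months and 18 days" figure in the example. In practice the same comparison would more likely run as a single database update per provider rather than row by row in Python.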

Benefit

Knowing which images are outdated is essential for deciding which images should not be made available via CC Search and the CC Catalog API.

@kgodey kgodey added this to Pending Review in Backlog Aug 13, 2020
@kgodey kgodey moved this from Pending Review to Internships 2020 in Backlog Aug 13, 2020
@kgodey kgodey removed this from Internships 2020 in Backlog Aug 27, 2020
@kgodey kgodey added this to Ready for Development in Active Sprint via automation Aug 27, 2020
@kgodey kgodey moved this from Ready for Development to Done in Active Sprint Aug 27, 2020
@TimidRobot TimidRobot removed this from Done in Active Sprint Jan 12, 2022