This repository has been archived by the owner on Jan 13, 2022. It is now read-only.
Current Situation
The existing setup has no mechanism for expiring outdated images, where 'outdated' refers to images that are no longer being updated, potentially because they have been removed upstream.
Suggested Improvement
The CC database needs a way to record whether an image is outdated. For every provider API script with a re-ingestion strategy, we know the maximum age an image record can reach before it is refreshed. For example, since the oldest images in Flickr are re-ingested once every six months, the updated_on value for any Flickr image that still exists upstream cannot be more than six months old. If the updated_on value for a given image is more than six months old, this indicates that the image no longer exists in Flickr. As a solution, we propose setting the removed_from_source value to true when the updated_on value is older than the re-ingestion strategy of the corresponding provider allows. To allow for potential delays in re-ingestion, we extend the expected maximum image age per provider by 10% (e.g. for Flickr, six months is extended to 6 months and 18 days).
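The expiry rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names MAX_AGE_DAYS, GRACE_PERCENT, expiry_threshold and should_mark_removed are hypothetical, six months is approximated as 180 days (so that the 10% allowance comes to 18 days), and the actual update of removed_from_source in the database is left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-provider maximum image age, in days. Six months is
# approximated as 180 days so that the 10% allowance comes to 18 days.
MAX_AGE_DAYS = {"flickr": 180}

# Allowance for re-ingestion delays, as a percentage of the maximum age.
GRACE_PERCENT = 10


def expiry_threshold(provider: str) -> timedelta:
    """Return the largest tolerated age of updated_on for a provider."""
    base = timedelta(days=MAX_AGE_DAYS[provider])
    return base * (100 + GRACE_PERCENT) / 100


def should_mark_removed(provider: str, updated_on: datetime, now: datetime) -> bool:
    """Return True if removed_from_source should be set to true."""
    return now - updated_on > expiry_threshold(provider)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = now - timedelta(days=250)  # last refreshed well over six months ago
    print(should_mark_removed("flickr", stale, now))
```

Keeping the allowance as a single percentage applied to every provider mirrors the proposal; a provider with a different re-ingestion window would only need its own entry in the (assumed) MAX_AGE_DAYS mapping.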
Benefit
Knowing which images are outdated is essential for deciding which images should not be made available via CC Search and the CC Catalog API.