Add a way to detect that an archive is zimit2 #1200

Merged
Jaifroid merged 1 commit into main from Add-a-way-to-detect-that-an-archive-is-zimit2
Jan 23, 2024

Conversation

Jaifroid
Member

Fixes the first bullet point of #1199.

It works like this:

  1. Assume all archives are 'open' as a baseline
  2. If we find warc-headers, the archive must be Zimit classic, so change zimType to 'zimit'. This is true currently, but might not be true in the future: if headers are re-introduced, the logic will need to be modified, perhaps by testing the location of Wombat in the ZIM, though that is considerably more complicated to do.
  3. If we don't find warc-headers, check whether there is Scraper metadata. If the archive has not been detected as 'zimit' but was scraped by warc2zim, then it must be zimit2.
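
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in the merged commit: the function name `detectZimType`, its argument shape, the `'zimit2'` return value, and the assumption that the Scraper metadata is a plain string containing `warc2zim` are all hypothetical.

```javascript
// Hypothetical helper illustrating the detection order described above.
// `hasWarcHeaders` and `scraper` are assumed inputs, not the project's actual API.
function detectZimType({ hasWarcHeaders, scraper }) {
    // Step 1: assume every archive is 'open' as a baseline
    let zimType = 'open';
    if (hasWarcHeaders) {
        // Step 2: warc-headers currently imply Zimit classic
        zimType = 'zimit';
    } else if (typeof scraper === 'string' && /warc2zim/i.test(scraper)) {
        // Step 3: no warc-headers, but scraped by warc2zim, so treat as zimit2
        zimType = 'zimit2';
    }
    return zimType;
}
```

Checking for warc-headers first means a Zimit classic archive is never reclassified by its Scraper metadata, which matches the caveat in step 2: if headers were re-introduced, this ordering would have to change.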

@Jaifroid added the enhancement, backend, and zimit (Code relating to the support of Zimit-style archives) labels Jan 22, 2024
@Jaifroid added this to the v4.0 milestone Jan 22, 2024
@Jaifroid self-assigned this Jan 22, 2024
@Jaifroid merged commit a0a898a into main Jan 23, 2024
9 checks passed
@Jaifroid deleted the Add-a-way-to-detect-that-an-archive-is-zimit2 branch January 23, 2024 05:47