Fix generic_json parser #1365

Merged
merged 2 commits into from
Mar 1, 2024

jimwins commented Feb 27, 2024

Summary

This fixes the generic_json parser: it no longer assumes the JSON always needs special handling, and it uses a more straightforward workaround in the cases where it might.

Also adds support for a tags field.

Related issues

Fixes #1347.

Changes these areas

  • Bugfixes
  • Feature behavior
  • Command line interface
  • Configuration options
  • Internal architecture
  • Snapshot data layout on disk

Rather than assuming the JSON file we are parsing has junk at the beginning
(which maybe only used to happen?), try parsing it as-is first, and then fall
back to trying again after skipping the first line.

Fixes ArchiveBox#1347
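
For readers skimming the thread, the approach described in the commit message can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the function name `load_json_with_fallback` is hypothetical, and it assumes any leading junk is confined to the first line of the input.

```python
import json
from typing import Any


def load_json_with_fallback(text: str) -> Any:
    """Parse text as JSON; on failure, retry once without the first line.

    Hypothetical helper for illustration, not ArchiveBox's actual function.
    """
    try:
        # First attempt: treat the input as plain, well-formed JSON.
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback (assumption): the junk, if any, is only on the first line,
        # so drop everything up to and including the first newline and retry.
        _, _, rest = text.partition("\n")
        return json.loads(rest)
```

If the second attempt also fails, the `json.JSONDecodeError` propagates to the caller, which seems like the least surprising behavior for input that is not JSON at all.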

pirate commented Feb 29, 2024

Looks good, thanks! Ready to merge @jimwins?


jimwins commented Mar 1, 2024

Yeah, I think this is good to go. I'll open a new issue to track adding JSONL handling.

@pirate pirate merged commit 7b042c8 into ArchiveBox:dev Mar 1, 2024
1 of 2 checks passed