Description
Which package is the feature request for? If unsure which one to select, leave blank
None
Feature
Crawlee supports export formats such as JSON, but it does not appear to support any Web Archive (WARC) formats.
Motivation
Saving crawl output as a web archive lets developers post-process everything the crawler has gathered.
Ideal solution or implementation, and any additional constraints
An optional integration with the fastwarc library, enabled through a feature switch, is preferred. Crawlee should only need to hand the raw file contents to fastwarc, and fastwarc should take care of the rest of the archive writing.
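To make the request concrete, here is a rough, stdlib-only sketch of the data flow I have in mind: raw bytes per URL go in, WARC records come out. It does not use fastwarc or any Crawlee API. The helpers `build_warc_resource_record` and `write_crawl_to_warc` are hypothetical names made up for illustration, and the hand-rolled serialization of a `resource` record is only a stand-in for what a library such as fastwarc would do properly (digests, compression, full HTTP `response` records).

```python
import io
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(url: str, payload: bytes, content_type: str) -> bytes:
    """Serialize one WARC/1.1 'resource' record (hypothetical helper, not a library API)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", url),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    # Version line, named header fields, then a blank line before the block.
    head = "WARC/1.1\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Every record is terminated by two CRLF pairs.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def write_crawl_to_warc(stream, pages) -> int:
    """Write (url, body, content_type) tuples to a binary stream; return the record count."""
    count = 0
    for url, body, content_type in pages:
        stream.write(build_warc_resource_record(url, body, content_type))
        count += 1
    return count


# Example: one crawled page written to an in-memory archive.
buffer = io.BytesIO()
written = write_crawl_to_warc(
    buffer,
    [("https://example.com/", b"<html>hi</html>", "text/html")],
)
archive = buffer.getvalue()
```

In a real integration the crawler would presumably pass the same kind of (URL, raw bytes, content type) data to the WARC library instead of to these helpers, and the switch would decide whether that step runs at all.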
Alternative solutions or implementations
warcio is another library to consider. However, it is slower because it is written in pure Python, whereas fastwarc is implemented mainly in C++ and Cython.
Other context
I was introduced to Crawlee at PyCon SG 2025.