To support multi-gigabyte files and a large number of function outputs per task, we need to change how task reporting is done.
A temporary fix was introduced in #1048.
Approaches brought up so far:
- WebSocket-based approach: implemented in an earlier version of indexify, see https://github.com/tensorlakeai/indexify/blob/v1/src/ingest_extracted_content.rs
- Using pre-signed URLs to upload files
- Two-phase API approach: upload each output file individually, then finalize the task in a separate call
- Support two approaches side by side: keep the existing ingest API as the default so the executor stays simple to configure, and add a more performant path as an option.
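
To make the two-phase option more concrete, here is a rough sketch of what the flow could look like from the executor's side. Everything in it is an assumption for discussion: `InMemoryTaskStore`, `upload_output`, `finalize_task` and `report_task` are hypothetical names, not existing indexify APIs, and the in-memory store only stands in for whatever the server would actually do. The point is that each output is streamed in chunks on its own, and the task is finalized only once every output listed in the manifest has arrived intact.

```python
import hashlib
from dataclasses import dataclass, field


def iter_chunks(data: bytes, chunk_size: int):
    """Yield successive slices of `data`, each at most `chunk_size` bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


@dataclass
class InMemoryTaskStore:
    """Hypothetical stand-in for the server side of a two-phase API."""
    uploads: dict = field(default_factory=dict)    # (task_id, name) -> sha256 hex
    finalized: dict = field(default_factory=dict)  # task_id -> final report

    def upload_output(self, task_id: str, name: str, chunks) -> str:
        # Phase 1: consume one output chunk by chunk, so the receiver
        # never needs the whole payload in a single request body.
        digest = hashlib.sha256()
        for chunk in chunks:
            digest.update(chunk)
        self.uploads[(task_id, name)] = digest.hexdigest()
        return digest.hexdigest()

    def finalize_task(self, task_id: str, manifest: dict, outcome: str) -> None:
        # Phase 2: accept the task only if every output in the manifest
        # was uploaded and its checksum matches.
        missing = [
            name for name, sha in manifest.items()
            if self.uploads.get((task_id, name)) != sha
        ]
        if missing:
            raise ValueError(f"outputs missing or corrupted: {missing}")
        self.finalized[task_id] = {"outcome": outcome, "outputs": dict(manifest)}


def report_task(store, task_id, outputs, outcome="success", chunk_size=1024 * 1024):
    """Upload each output separately, then finalize the task in one call."""
    manifest = {}
    for name, data in outputs.items():
        manifest[name] = store.upload_output(
            task_id, name, iter_chunks(data, chunk_size)
        )
    store.finalize_task(task_id, manifest, outcome)
    return manifest
```

In this shape a failed upload can be retried per file without re-sending the others, and the finalize call stays small no matter how large the outputs are. The same manifest could also be used with pre-signed URLs, with phase 1 going directly to blob storage instead of the server.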