Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: support for file upload #664

Open
skitsanos opened this issue Nov 16, 2022 · 35 comments
Open

Feature request: support for file upload #664

skitsanos opened this issue Nov 16, 2022 · 35 comments
Labels
enhancement New feature or request.

Comments

@skitsanos
Copy link

I have a played a bit here with bun and hono to see how it works and realized that the request body comes there in raw format if I want to upload files.

surprisingly, getting body as a buffer worked out of the box:

 const reader = ctx.req.body.getReader()
 const result = await reader.read()
 const buffer = Buffer.from(result.value ?? [])

So I have my buffer, but the question is -- do we have anything is there any functionality to have a list of files within that buffer?

In other words, - maybe there is an example of implementing file uploading with hono, or at least some timeline on when it will be available.

Thank you.

@yusukebe
Copy link
Member

Hi @skitsanos !

Do you want to upload and get files with multipart/form-data? If so, you can use FormData, but unfortunately, it is not implemented in Bun yet. Below is an example using FormData.

app.post('/form-upload', async (c) => {
  const formData = await c.req.formData()
  const file = formData.get('file')
  const arr = await file.arrayBuffer()

  fs.writeFile('foo.png', Buffer.from(arr), (err) => {
    if (err) throw err
  })

  return c.text('uploaded!')
})

This is working well on Node.js with node-server.

I'll try to figure out how to implement this without FormData. Regardless, I think the best way is that Bun supports FormData.

@skitsanos
Copy link
Author

Yes, Yusuke, I wanted it to run via bun... With node, it would be a simpler task because there are tons of multipart parsers out there already and the task itself is not a complicated one at all, - boundary comes in HTTP header, so you split the request body by line that contains boundary... This would give you form values and files that were sent...

@yusukebe
Copy link
Member

I think we have to use the web-standard API as much as possible. In the case of Bun, it seems to be difficult to import FormData from WebKit now.

oven-sh/bun#621

so you split the request body by line that contains boundary

Hmm... :(

@skitsanos
Copy link
Author

What does parsing the request body has to do with that?:) I just explained the flow of how to make it happen :)

@yusukebe
Copy link
Member

Ah, yes. I know it:)

@skitsanos
Copy link
Author

I checked more, - one of the 'recipes' would be to stream to temp file multipart payload and then parse it in chunks...

As I can see there is multipart parser for Rust, I have played a bit with it:

use multipart::server::Multipart;
use std::io::{BufRead, Cursor, Read};

fn main() {
    let body = "----------------------------605243336009903535936235\r
Content-Disposition: form-data; name=\"id\"\r
\r
123\r
----------------------------605243336009903535936235\r
Content-Disposition: form-data; name=\"name\"\r
\r
TestName\r
----------------------------605243336009903535936235\r
Content-Disposition: form-data; name=\"file\"; filename=\"demo.txt\"\r
\r
demo/demo/demo\r
----------------------------605243336009903535936235\r
Content-Disposition: form-data; name=\"body\"\r
\r
TestBody\r
----------------------------605243336009903535936235--\r";

    fs::write("data/demo.txt".to_string(), Vec::from(body.as_bytes()));

    let body_from_file = fs::read_binary("data/demo.txt".to_string());
    let file = Cursor::new(body_from_file);

    let mut mp = Multipart::with_body(
        file,
        "--------------------------605243336009903535936235",
    );

    //https://stackoverflow.com/questions/73235131/how-to-extract-data-from-form-data-in-rust

    while let Some(mut field) = mp.read_entry().unwrap() {
        let data = field.data.fill_buf().unwrap();

        let s = String::from_utf8_lossy(data);
        //println!("{}", s)
        println!("headers: {:?}, data: {}", field.headers, s);
    }

In hono you could add multipart parsing as wasm done from some rust lib...

@yusukebe
Copy link
Member

@skitsanos

How about doing "polyfill"? I have not used it, but there is a library like this:

https://www.npmjs.com/package/formdata-polyfill

In hono you could add multipart parsing as wasm done from some rust lib...

😱

@skitsanos
Copy link
Author

You don't need to polyfill the FormData; as I told you before, the problem is not with creating from but with parsing incoming data. It comes in the format that I showed you in the Rust example above.

You can get a whole array of incoming data:

const buffer = [];
for await (const data of req.body)
{
  buffer.push(data);
}

console.log(buffer.toString())

But it will 'die', if you will upload some big file because you will run out of RAM, so you need to stream your request body to the disk first and then parse the whole thing...

@yusukebe yusukebe added the enhancement New feature or request. label Dec 14, 2022
@isaac-mcfadyen
Copy link

Hey @yusukebe! 👋

Any update on this? Bun now supports both FormData and Blob, wondering if there's still a blocker or if it's possible to get this added? 🙂

@yusukebe
Copy link
Member

yusukebe commented Mar 6, 2023

Hi @isaac-mcfadyen !

Yes, Bun supported FormData. There is no blocker so, I would like to consider supporting "file upload". By the way do you want it? 🙂

@isaac-mcfadyen
Copy link

Yeah, I'd love it being supported. 😄

@buzinas
Copy link

buzinas commented Mar 11, 2023

Hey, the same applies for @honojs/node. I tried using both await req.formData() and await req.parseBody(), but the former returns an empty FormData and the latter only returns a single file for my multi file upload.

@yusukebe
Copy link
Member

Hi @isaac-mcfadyen !

I have thought about the "file upload" feature, but I think it would be a better way to use the c.req.parseBody() that we currently have.

app.post('/upload', async (c) => {
  const { image } = await c.req.parseBody()

  if (image instanceof File) {
    console.log(image.name) // show the file name
    const buffer = await image.arrayBuffer()
    // do something with the buffer
  }

  return c.text('Uploaded!')
})

What do you think? By the way, it is not well documented and needs to be.

@skitsanos
Copy link
Author

You need to identify first if you have single file or array of files or filer/files with other form data, see my notes from above #664 (comment), I think it is faster to 'inject' formdata parser written in Rust and compiled into WASM, to have universal solution :)

@LebCit
Copy link

LebCit commented Jul 19, 2023

Hello @yusukebe

Every day I use Hono I get more amazed 🤯 !

app.post('/form-upload', async (c) => {
  const formData = await c.req.formData()
  const file = formData.get('file')
  const arr = await file.arrayBuffer()

  fs.writeFile('foo.png', Buffer.from(arr), (err) => {
    if (err) throw err
  })

  return c.text('uploaded!')
})

Please consider a How To section on the website, it will save new users (like me 🙂) a lot of time.

A small note if I may. I use Node V18.16.0 and vanilla JS not TS.
The code above closes the connection to the server and returns an error saying that arrayBuffer() is not a function.

Since I'm using on the front end filepond that makes a fetch request to the server, I was able to do the following without really posting on the route where the image(s) is(are) added:

// THIS IS THE ROUTE THAT FILEPOND LOOKS FOR TO MAKE A FETCH REQUEST.
.post("/add-image", async (c) => {
    const { images } = await c.req.parseBody()

    const arr = [images]

    arr.forEach(async (image) => {
        console.log(image)
        const buffer = await image.arrayBuffer()

        writeFile(`${join(process.cwd())}/static/images/${image.name}`, Buffer.from(buffer), (err) => {
            if (err) throw err
        })
    })

    // THIS WILL NEVER BE EXECUTED, JUST HERE TO AVOID CONTEXT ERROR.
    return c.text("uploaded!")
})

// THIS IS THE ROUTE TO REDIRECT TO THE IMAGES PAGES.
.post("/save-image", (c) => {
    return c.redirect("/admin-gallery")
})

Thank you very much.
I 💗 Hono

@yusukebe
Copy link
Member

Hi @LebCit !

Thank you for using Hono and giving nice advice. I'll consider adding that to the website.

@skitsanos
Copy link
Author

I just checked how file uploading works over bun 1.0.2 and it seems small files going through just fine, but tried 500Mb file, and it got 'frozen' with a simple route handler like this:

export default async (ctx: Context) =>
{
    console.log('got file')
    const {req} = ctx;

    return ctx.json(response.result('hello there'));
};

Has anyone tried to upload big files?

@crivera
Copy link

crivera commented Sep 28, 2023

I just checked how file uploading works over bun 1.0.2 and it seems small files going through just fine, but tried 500Mb file, and it got 'frozen' with a simple route handler like this:

export default async (ctx: Context) =>
{
    console.log('got file')
    const {req} = ctx;

    return ctx.json(response.result('hello there'));
};

Has anyone tried to upload big files?

The problem is that you are loading the whole file into memory which might cause OOM Exceptions. You will need to stream the file. Best way to do this is something like:

const routes = documentApp.post(
  "/api/document",
  async (c) => {
   // this only loads the meta data information of the form data
    const data = await c.req.formData()
    // now you can get the file
    const image = data.get('file')

    if (image instanceof File) {
      // use image.stream(); to stream the file to disk or a cloud provider like GCS, S3 etc
    }

    return c.jsonT({
      message: "uploaded",
    });
  }
);

Now the main problem is saw is that you don't get code completion on the client. When using zValidator it actually will parse the whole formBody which will again break for any larger files.

@yusukebe I was therefore thinking - what if we update the validator.ts form case to this:

case "form": {
        const contentType = c.req.header("Content-Type");
        if (contentType) {
          const formData = await c.req.formData();
          const form: BodyData = {};
          formData.forEach((value, key) => {
            form[key] = value;
          });
          value = form;
          c.req.bodyCache.formData = formData;
        }
        break;
      }

That way it's not reading the whole body. (still will work for strings, numbers, etc - but wont load files completely)

Now we can use this in the route:

 const routes = documentApp.post(
  "/api/document",
  auth(),
  zValidator(
    "form",
    z.object({
      name: z.string(),
      image: z.any(),
    })
  ),
  async (c) => {
    const { name, image } = c.req.valid("form");
    console.log(name);

    if (image instanceof File) {
      // use image.stream(); to stream it to anywhere
    }

    return c.jsonT({
      message: "uploaded",
    });
  }
);

And this still seems to work perfectly.

Any feedback would be appreciated.

@yusukebe
Copy link
Member

Hi @crivera

You are right. Hono's Validator once extracts the contents of the formData. But, hmm, this is a necessary step to validate the form and json. I can't think of a solution. So, it would be better to warn users that they should not handle a big file while using the Validator.

@crivera
Copy link

crivera commented Sep 29, 2023

Hi

json has a different case statement, so I don't think that should be affected.

Also I tested with regular objects using form and the new validator

zValidator(
    "form",
    z.object({
      name: z.string(),
      street: z.string(),
    })
  ),

and doing this in the code after

 const { name, street } = c.req.valid("form");
   console.log(name, street);

works completely fine

@muhaimincs
Copy link

Hi @skitsanos !

Do you want to upload and get files with multipart/form-data? If so, you can use FormData, but unfortunately, it is not implemented in Bun yet. Below is an example using FormData.

app.post('/form-upload', async (c) => {
  const formData = await c.req.formData()
  const file = formData.get('file')
  const arr = await file.arrayBuffer()

  fs.writeFile('foo.png', Buffer.from(arr), (err) => {
    if (err) throw err
  })

  return c.text('uploaded!')
})

This is working well on Node.js with node-server.

I'll try to figure out how to implement this without FormData. Regardless, I think the best way is that Bun supports FormData.

This seems work perfectly but how do you response with Internal Server Error

@skitsanos
Copy link
Author

@muhaimincs That method won't work when you deal with files 100-200Gb each... The safe way of doing things would be to stream upload to the disk

@muhaimincs
Copy link

app.post('/form-upload', async (c) => {
  const formData = await c.req.formData()
  const file = formData.get('file')
  const arr = await file.arrayBuffer()

  fs.writeFile('foo.png', Buffer.from(arr), (err) => {
    if (err) throw err /** I just concern about this line. **/
  })

  return c.text('uploaded!')
})

If it hit the err, what should happen? How do you response with c.json()? In my case, it always halt the apps

@crivera
Copy link

crivera commented Nov 22, 2023

app.post('/form-upload', async (c) => {

  const formData = await c.req.formData()

  const file = formData.get('file')

  const arr = await file.arrayBuffer()



  fs.writeFile('foo.png', Buffer.from(arr), (err) => {

    if (err) throw err /** I just concern about this line. **/

  })



  return c.text('uploaded!')

})

If it hit the err, what should happen? How do you response with c.json()? In my case, it always halt the apps

Its bc you are no awaiting the writing of the file - wrap it in a promise and then catch any error outside and return c.json()

@isaac-mcfadyen
Copy link

isaac-mcfadyen commented Nov 22, 2023

Or you could just use fs/promises which has native support for promises (no need for wrapping in a promise manually).

import { writeFile } from "fs/promises";

app.post('/form-upload', async (c) => {
  const formData = await c.req.formData()
  const file = formData.get('file')
  const arr = await file.arrayBuffer()

  try {
      await writeFile('foo.png', Buffer.from(arr));
  } catch (e) {
      return c.json({ error: e.toString() }, 500);
  }

  return c.text('uploaded!')
})

@muhaimincs
Copy link

Thanks for your help. Here is what I come out with

async function writingFile(path, fileName, data) {
  return new Promise((resolve, reject) => {
    if (!existsSync(path)) {
      fs.mkdir(path, { recursive: true }, (err) => {
        if (err) {
          reject(err)
        }
      })
    }

    fs.writeFile(fileName, data, (err) => {
      if (err) {
        reject(err)
      }
      resolve('Success')
    })
  })
}

@isaac-mcfadyen
Copy link

Thanks for your help. Here is what I come out with

That will definitely work - just as an FYI you don't need the function to be marked as async, you're not doing any awaiting in it.

@muhaimincs
Copy link

aha yes.

@ps73
Copy link

ps73 commented Jan 2, 2024

I have created a simple package based on busboy to handle very large files memory efficient with hono:
https://github.com/ps73/hono-upload

Warning

Keep in mind that this is an early version of the package. Might have some issues and breaking changes.

I have tested the example with 100GB Files and it works really well only consuming around 150-200 MB of memory.

I have also tested it in a project with https://github.com/tweedegolf/storage-abstraction and it works also.

@dEvAshirvad
Copy link

This approach works for me in new version of hono

app.post("/upload", async (c) => {
		const body = await c.req.parseBody();
		const file = body["file"];

		if (!(file instanceof File)) {
			throw new Error("Provided input is not a File instance.");
		}

		const buffer = await fileToBuffer(file);

		writeFile("foo.pdf", buffer, (error) => {
			if (error) {
				console.error("Error writing file:", error); // Log error
			} else {
				console.log("File written successfully"); // Confirm success
			}
		});
	});

@vickyRathee
Copy link

@dEvAshirvad What is fileToBuffer() ? I can't see it utility

@skitsanos
Copy link
Author

The problem with the large files is still standing. Just tried to upload a 14Gb file, and the connection resets, but the app doesn't crash. Am I missing something? There is a limit set somewhere?

@EdamAme-x
Copy link
Contributor

EdamAme-x commented Jun 24, 2024

The problem with the large files is still standing. Just tried to upload a 14Gb file, and the connection resets, but the app doesn't crash. Am I missing something? There is a limit set somewhere?

There should be no such limitation on the Hono side, so it may be a runtime issue.

@skitsanos
Copy link
Author

yes, just verified with "blank" Bun, it is not from Hono. It is something on Bun Error: write EPIPE, app stays alive but drops that http connection

@skitsanos
Copy link
Author

Okay, solved it, it seems:

const server = Bun.serve({
    port: Number(PORT),
    fetch: app.fetch,
    maxRequestBodySize: 200_000_000_000,
    websocket
});

maxRequestBodySize was missing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request.
Projects
None yet
Development

No branches or pull requests