
Reduce API response time for puppeteer service #35

Closed
kartikayasijaa opened this issue May 19, 2024 · 7 comments

Comments

@kartikayasijaa
Collaborator

It looks like the service is taking longer to respond than what's allowed in the free tier of Vercel Functions, resulting in a timeout error. To address this, we'll need to make some changes to the architecture. If anyone is interested in this issue, please reach out to me so we can discuss the approach.

@Sambit-Mondal

@kartikayasijaa I'm interested in working on this issue. Kindly assign it to me.
Thank you!

Sambit-Mondal added a commit to Sambit-Mondal/ScrapQuest that referenced this issue May 20, 2024
@Wahid7852

Wahid7852 commented May 20, 2024

import { NextResponse } from 'next/server';
import puppeteer from 'puppeteer';

export async function POST(request: Request) {
  try {
    const { url } = await request.json();

    // Basic sanity check that the input is an http(s) URL.
    const urlRegex = /^(http|https):\/\/[^ "]+$/;

    if (!urlRegex.test(url)) {
      return NextResponse.json({ urlerror: 'invalid url' }, { status: 400 });
    }

    // If a browser WebSocket endpoint is configured (e.g. a remote browser
    // service), connect to it instead of launching Chromium inside the
    // function, which avoids the cost of a local launch on every request.
    const browser = process.env.API
      ? await puppeteer.connect({ browserWSEndpoint: process.env.API })
      : await puppeteer.launch();

    const page = await browser.newPage();
    await page.goto(url);
    const extractedText = await page.evaluate(() => document.body.innerText);

    await browser.close();

    return NextResponse.json({ extractedText });
  } catch (error) {
    console.error(error);
    // Serialize the error so the client gets a message instead of an empty object.
    return NextResponse.json({ error: String(error) }, { status: 500 });
  }
}

Maybe try this.

@kartikayasijaa
Collaborator Author

Did you test it?

Sambit-Mondal added a commit to Sambit-Mondal/ScrapQuest that referenced this issue May 20, 2024
@Sambit-Mondal Sambit-Mondal mentioned this issue May 20, 2024
@kartikayasijaa
Collaborator Author

@Wahid7852 Would you like to work on this?

@Abidsyed25
Owner

@kartikayasijaa I don't think a solution exists for this problem. The only options are moving off Vercel or upgrading the plan. I'm saying this because I've already done a lot of research on it.

@kartikayasijaa
Collaborator Author

@Abidsyed25 I don't think moving off Vercel is a good idea; we need to reduce the API response time either way. If we deploy to another platform like EC2 it would certainly work, but it would put a heavy load on the server, which would be difficult to scale.

@kartikayasijaa
Collaborator Author

To reduce the response time, we can use an async worker; that will work in this case.

When there's a request, we will push it to a worker queue.

On the frontend, we could do short polling to check the status of the request.

If the status is completed, we will return the response.
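
A rough sketch of this queue-and-poll setup, assuming BullMQ backed by Redis (the queue name, file paths, and env vars below are just illustrative, not the project's actual layout):

// app/api/scrape/route.ts — enqueue the job and return immediately with an id.
import { NextResponse } from 'next/server';
import { Queue } from 'bullmq';

const connection = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 };
const scrapeQueue = new Queue('scrape', { connection });

export async function POST(request: Request) {
  const { url } = await request.json();
  const job = await scrapeQueue.add('scrape-url', { url });
  return NextResponse.json({ jobId: job.id });
}

// app/api/scrape/[id]/route.ts — the frontend short-polls this endpoint.
import { NextResponse } from 'next/server';
import { Queue } from 'bullmq';

const connection = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 };
const scrapeQueue = new Queue('scrape', { connection });

export async function GET(
  _request: Request,
  { params }: { params: { id: string } }
) {
  const job = await scrapeQueue.getJob(params.id);
  if (!job) {
    return NextResponse.json({ error: 'job not found' }, { status: 404 });
  }
  const state = await job.getState();
  // Only return the scraped text once the worker has finished.
  return NextResponse.json({
    state,
    extractedText: state === 'completed' ? job.returnvalue : null,
  });
}

// worker.ts — runs on a long-lived host (not inside a Vercel function),
// so the Puppeteer work is no longer bound by the function timeout.
import { Worker } from 'bullmq';
import puppeteer from 'puppeteer';

const connection = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 };

new Worker(
  'scrape',
  async (job) => {
    const browser = await puppeteer.launch();
    try {
      const page = await browser.newPage();
      await page.goto(job.data.url);
      // The return value is stored on the job and read by the status route.
      return await page.evaluate(() => document.body.innerText);
    } finally {
      await browser.close();
    }
  },
  { connection }
);

This way the POST handler only enqueues and returns a job id, so it finishes well within the function limit, and the heavy Puppeteer work moves to the worker process.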

@Abidsyed25 Abidsyed25 closed this as not planned Jul 3, 2024