playwright-go-server

Background

Search engines typically return only a webpage's URL and a short snippet, but some tasks need the complete page content. playwright-go-server addresses this by using browser automation to fetch a page's full HTML and, optionally, convert it to Markdown, which is more convenient for subsequent processing by large language models.

Features

- Webpage Content Fetching: Uses a browser pool (based on Playwright) to fetch the full HTML content of a given URL.
- Markdown Conversion: Converts the fetched HTML into Markdown for easier text processing and inference by large models.
- Efficient and Stable: Lazily initializes a global session pool and reuses its browser instances, so requests avoid the cost of launching a new browser each time.
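
The lazy, reusable pool described above could look roughly like the sketch below. This is only an illustration of the pattern, not the project's actual code: `Session`, `Pool`, `GetPool`, `Acquire`, and `Release` are hypothetical names, and `Session` is a stand-in for a real Playwright browser instance.

```go
package main

import (
	"fmt"
	"sync"
)

// Session is a hypothetical stand-in for a browser instance.
type Session struct{ ID int }

// Pool hands out sessions through a buffered channel.
type Pool struct {
	sessions chan *Session
}

var (
	globalPool *Pool
	poolOnce   sync.Once
)

// GetPool creates the global pool on first use only (lazy initialization).
// Later calls return the same pool and ignore the size argument.
func GetPool(size int) *Pool {
	poolOnce.Do(func() {
		p := &Pool{sessions: make(chan *Session, size)}
		for i := 0; i < size; i++ {
			p.sessions <- &Session{ID: i}
		}
		globalPool = p
	})
	return globalPool
}

// Acquire blocks until a session is free.
func (p *Pool) Acquire() *Session { return <-p.sessions }

// Release returns a session to the pool so it can be reused.
func (p *Pool) Release(s *Session) { p.sessions <- s }

func main() {
	pool := GetPool(2)
	s := pool.Acquire()
	fmt.Println("using session", s.ID)
	pool.Release(s)
}
```

Using `sync.Once` keeps initialization safe when several requests arrive at the same time, and the buffered channel makes callers wait when every session is busy rather than starting extra browsers.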

Installation & Dependencies

Clone the repository:

```
git clone https://github.com/litongjava/playwright-go-server.git
cd playwright-go-server
```

Install Go dependencies:
```
go mod tidy
```
Build the project:
```
go build
```

Docker

```
docker build -t litongjava/playwright-go-server:1.0.0 .
docker run -dit --name playwright-go-server --net=host litongjava/playwright-go-server:1.0.0
```

Usage

The project provides an HTTP service with an endpoint to fetch webpage content and convert it based on the provided format.

Endpoint: /fetch
Query Parameters:
- url: The URL of the webpage to fetch (required)
- format: The format of the returned content (optional; when set to markdown, returns content in Markdown format; otherwise returns the raw HTML)

Example

Fetching Markdown formatted content:

```
GET /fetch?url=https://example.com&format=markdown
```

```
curl "http://localhost/fetch?url=https://www.kapiolani.hawaii.edu/&format=markdown"
```
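
When the target URL contains its own query string, it needs to be percent-encoded so that its `&` and `=` characters are not read as parameters of `/fetch`. A minimal sketch of building such a request URL from Go, using only the standard library, is shown below. `buildFetchURL` is a hypothetical helper, and the base address `http://localhost` is an assumption taken from the curl example above; adjust it to your deployment.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildFetchURL (hypothetical helper) assembles a /fetch request URL.
// base is the server address, target is the page to fetch, and format
// is optional ("markdown" for Markdown output, empty for raw HTML).
func buildFetchURL(base, target, format string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/fetch"
	q := url.Values{}
	q.Set("url", target) // percent-encoded by Encode below
	if format != "" {
		q.Set("format", format)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildFetchURL("http://localhost", "https://example.com/?a=1&b=2", "markdown")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
	// prints: http://localhost/fetch?format=markdown&url=https%3A%2F%2Fexample.com%2F%3Fa%3D1%26b%3D2
}
```

The resulting string can be passed to `http.Get` or to curl unchanged.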

Running the Server

Start the service using the following command:

```
go run main.go
```

Once the server is running, you can make HTTP requests to the endpoint.

Contributing

Contributions are welcome! Please feel free to open issues or submit pull requests to improve the project.

License

This project is licensed under the MIT License.
