feat: HTTP request tool #9228

michael-radency · 2024-04-26T10:23:38Z

Summary

Tool to visit a website

Related tickets and issues

https://linear.app/n8n/issue/AI-162/tool-to-visit-a-website


netroy · 2024-05-01T10:09:59Z

packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts

+					);
+				}
+				const returnData: string[] = [];
+				const html = cheerio.load(response);


We could use @mozilla/readability and jsdom here to cleanly extract the content that's likely relevant to an end-user.

Something like this perhaps:

    import { JSDOM } from 'jsdom'
    import { Readability } from '@mozilla/readability'

    const dom = await JSDOM.fromURL(url)
    const article = new Readability(dom.window.document, {
      keepClasses: true,
    }).parse()

and then use article.content.
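One caveat with using article.content directly: as far as I know, Readability's parse() returns null when it cannot find article-like content, so a guard is probably needed. Below is a minimal sketch of such a guard. The ParsedArticle interface and the pickArticleHtml helper are hypothetical names introduced for illustration; they are not part of this PR or of the Readability API.

```typescript
// Minimal shape of the fields used here (an assumption for this sketch);
// the real Readability result carries more fields.
interface ParsedArticle {
  content: string; // cleaned article HTML
  textContent: string; // plain text of the article
}

// Hypothetical helper: prefer the extracted article HTML and fall back to
// the raw response when extraction produced nothing usable.
function pickArticleHtml(article: ParsedArticle | null, rawHtml: string): string {
  if (article === null || article.content.trim() === '') {
    return rawHtml;
  }
  return article.content;
}

// Usage with stand-in values (no network access or DOM required).
const extracted = pickArticleHtml(
  { content: '<p>Story</p>', textContent: 'Story' },
  '<html><body>raw</body></html>',
);
const fallback = pickArticleHtml(null, '<html><body>raw</body></html>');
```

Falling back to the raw HTML keeps the tool usable on pages that are not article-shaped, at the cost of passing noisier input downstream.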

We could also consider using turndown to convert the HTML into Markdown, which LLMs tend to handle better than HTML, IMO.

    import Turndown from 'turndown'

    const turndown = new Turndown()
    const markdown = turndown.turndown(article.content)

michael-radency · 2024-05-23

@netroy What would be the advantages over html-to-text + Cheerio, since we already use that setup for the HTML node?

netroy · 2024-05-23 (edited)

cheerio is great for extracting text with CSS selectors, but it leaves the burden of determining the semantics of the markup to the end user.

First download an article via curl https://www.bbc.com/news/articles/cldd6x6gglxo > news.html.

Then try this with cheerio:

    const fs = require('fs');
    const cheerio = require('cheerio');

    const html = fs.readFileSync('news.html', 'utf8');
    const $ = cheerio.load(html);
    console.log($('body').text());

I got this

With just readability:

    (async () => {
      const { JSDOM } = require('jsdom');
      const { Readability } = require('@mozilla/readability');
      const Turndown = require('turndown');

      const dom = await JSDOM.fromFile('news.html');
      const article = new Readability(dom.window.document, {
        keepClasses: true,
      }).parse();
      console.log(article.textContent);
    })();

I got this

With readability + turndown:

    (async () => {
      const { JSDOM } = require('jsdom');
      const { Readability } = require('@mozilla/readability');
      const Turndown = require('turndown');

      const dom = await JSDOM.fromFile('news.html');
      const article = new Readability(dom.window.document, {
        keepClasses: true,
      }).parse();
      const turndown = new Turndown({
        headingStyle: 'atx',
        hr: '---',
        bulletListMarker: '-',
        codeBlockStyle: 'fenced',
      });
      const markdown = turndown.turndown(article.content);
      console.log(markdown);
    })();

I got this

Perhaps we should add an "Extract as Markdown" option to the node, to control whether we use markup to reduce semantic noise in the extracted text?
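One way such an option could be wired up is sketched below. This is only an illustration under stated assumptions: ExtractOptions, Converters, extractForModel, and demoConverters are hypothetical names that do not appear in this PR, and the regex-based stand-in converters are deliberately crude placeholders for what turndown and a real text extractor would do.

```typescript
// Hypothetical option shape; the flag name mirrors the suggestion above.
interface ExtractOptions {
  extractAsMarkdown: boolean;
}

// Converters are injected so the sketch stays independent of any library.
interface Converters {
  toText: (html: string) => string;
  toMarkdown: (html: string) => string;
}

// Dispatch to the Markdown or plain-text converter based on the option.
function extractForModel(
  html: string,
  options: ExtractOptions,
  converters: Converters,
): string {
  return options.extractAsMarkdown
    ? converters.toMarkdown(html)
    : converters.toText(html);
}

// Stand-in converters for illustration only; real ones would wrap
// turndown (Markdown) and a proper text extractor.
const demoConverters: Converters = {
  toText: (html) => html.replace(/<[^>]+>/g, ''),
  toMarkdown: (html) =>
    html.replace(/<h1>(.*?)<\/h1>/g, '# $1').replace(/<[^>]+>/g, ''),
};

const asText = extractForModel('<h1>Title</h1>', { extractAsMarkdown: false }, demoConverters);
const asMarkdown = extractForModel('<h1>Title</h1>', { extractAsMarkdown: true }, demoConverters);
```

Injecting the converters keeps the option logic testable on its own and leaves the choice of extraction libraries open.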


michael-radency added 2 commits April 26, 2024 13:21

⚡ setup

3744248

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

a0b0200

…ol-to-visit-a-website

michael-radency added the node/new (Creation of an entirely new node) and n8n team (Authored by the n8n team) labels Apr 26, 2024

michael-radency added 21 commits April 26, 2024 16:28

⚡ authentication support

ef9a6ae

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

ad957fc

…ol-to-visit-a-website

⚡ oauth2 authentication fixes, toll description prompt update

742dd46

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

4c84098

…ol-to-visit-a-website

⚡ clean up

f335ec7

⚡ fix for tool description prompt

2f86da6

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

fa73fc6

…ol-to-visit-a-website

⚡ ui updates, options

f8ed53b

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

3d7bed7

…ol-to-visit-a-website

⚡ ui for optimize response option

e086460

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

f9a1801

…ol-to-visit-a-website

⚡ optimize response util

f632455

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

ea784a3

…ol-to-visit-a-website

⚡ optimize response update

6e6b211

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

58b1e67

…ol-to-visit-a-website

⚡ ui updates

8147c5b

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

87f2b39

…ol-to-visit-a-website

⚡ clean up

c5beba4

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

0996986

…ol-to-visit-a-website

⚡ descriptions update

448bd7d

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

75967f1

…ol-to-visit-a-website

netroy reviewed May 1, 2024


michael-radency and others added 4 commits May 16, 2024 10:28

Merge branch 'master' into ai-162-tool-to-visit-a-website

6dd4524

combined url with path, placeholder notice, fixes

3f61121

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

efa655a

…ol-to-visit-a-website

⚡ updating UI

e3079f5

michael-radency added 30 commits May 18, 2024 06:42

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

a9e1a59

…ol-to-visit-a-website

⚡ dynamic tool func update

c937f14

placeholders definitions parameter

6d3671f

prompts updates

03666d3

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

7824434

…ol-to-visit-a-website

require comma separated values from LLM instead filled request options

fb817a1

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

fa6ac15

…ol-to-visit-a-website

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

23d5c07

…ol-to-visit-a-website

properties schema and DynamicStructuredTool support

ea4a9d3

construct shcema properties helper

8236d49

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

0cac68d

…ol-to-visit-a-website

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

2362c29

…ol-to-visit-a-website

reverted changes related to dynamic structured tool, updated paramete…

05bce18

…rs processing from model

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

00a8e2a

…ol-to-visit-a-website

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

af07e08

…ol-to-visit-a-website

json with placheholders processing

d671160

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

fbca8e9

…ol-to-visit-a-website

refactoring

71dabba

refactoring

7535658

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

77f1b7d

…ol-to-visit-a-website

refactoring

eadb6de

spelling fixes

1dd28d5

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

40dba41

…ol-to-visit-a-website

include http code if present in error

557f8e8

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

10c0216

…ol-to-visit-a-website

review updates

eb93f7d

tool telemetry

8924ca5

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

3e36a22

…ol-to-visit-a-website

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

13a47bf

…ol-to-visit-a-website

Merge branch 'master' of https://github.com/n8n-io/n8n into ai-162-to…

8bcfddd

…ol-to-visit-a-website


michael-radency commented Apr 26, 2024

netroy May 1, 2024

michael-radency May 23, 2024

netroy May 23, 2024 (edited)
