Skip to content

feat(router): allow inline router definition#1877

Merged
B4nan merged 6 commits intoapify:masterfrom
Strajk:feat/inline-router
May 2, 2023
Merged

feat(router): allow inline router definition#1877
B4nan merged 6 commits intoapify:masterfrom
Strajk:feat/inline-router

Conversation

@Strajk
Copy link
Copy Markdown
Contributor

@Strajk Strajk commented Apr 20, 2023

As mentioned in discussion #1873

Crawlee is so good that router definition often can be straightforward, so it would be nice to have the option to define it "inline".
It also could be great for tutorials and snippets in docs.

const crawler = new CheerioCrawler({
  requestHandler: createCheerioRouter({
    [LABEL.INDEX]: async ({ enqueueLinks }) => {
      await enqueueLinks({
        selector: `#sitemap a`,
        label: LABEL.PRODUCTS,
      })
    },
    [LABEL.PRODUCTS]: async ({ $ }) => {
      await Actor.pushData({ title: $(`title`).text() })
    },
  }),
})
await crawler.run([{ url: `https://example.com`, label: LABEL.INDEX }])

I will add tests after agreeing that the "signature" is ok :)

I've used this wildcard to find&edit all crawlers, so hopefully I've changed all of them
Screen 2023-04-20 at 09 32 51

Comment thread packages/types/src/utility-types.ts Outdated

export type AllowedHttpMethods = 'GET' | 'HEAD' | 'POST' | 'PUT' | 'DELETE' | 'TRACE' | 'OPTIONS' | 'CONNECT' | 'PATCH';

export type RouterRoutes<Context, UserData extends Dictionary> = {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't have strong options about this type name, feel free to suggest better

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those changes should go to the core package, to the router.ts file i guess. the types package is here mainly to define the storage client interface (that must be adhered in apify-client/storage-local/memory-storage clients), and a place for utility types - but this is router specific, no reason to define it elsewhere i guess

i dont mind the name (i cant find any better right now, and its not a big deal i guess)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, moving...
GetUserDataFromRequest is already defined there
image

Copy link
Copy Markdown
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, lets just move the types directly to router.ts

Comment thread packages/types/src/utility-types.ts Outdated

export type AllowedHttpMethods = 'GET' | 'HEAD' | 'POST' | 'PUT' | 'DELETE' | 'TRACE' | 'OPTIONS' | 'CONNECT' | 'PATCH';

export type RouterRoutes<Context, UserData extends Dictionary> = {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those changes should go to the core package, to the router.ts file i guess. the types package is here mainly to define the storage client interface (that must be adhered in apify-client/storage-local/memory-storage clients), and a place for utility types - but this is router specific, no reason to define it elsewhere i guess

i dont mind the name (i cant find any better right now, and its not a big deal i guess)

Comment thread packages/core/src/router.ts Outdated
static create<Context extends CrawlingContext = CrawlingContext>(): RouterHandler<Context> {
static create<
Context extends CrawlingContext = CrawlingContext,
UserData extends Dictionary = GetUserDataFromRequest<Context['request']>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for the other places

Suggested change
UserData extends Dictionary = GetUserDataFromRequest<Context['request']>
UserData extends Dictionary = GetUserDataFromRequest<Context['request']>,

Comment thread packages/core/src/router.ts Outdated
Comment on lines +159 to +161
>(
routes?: RouterRoutes<Context, UserData>,
): RouterHandler<Context> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would keep it on one line here

Suggested change
>(
routes?: RouterRoutes<Context, UserData>,
): RouterHandler<Context> {
>(routes?: RouterRoutes<Context, UserData>): RouterHandler<Context> {

return Router.create<Context>();
export function createHttpRouter<
Context extends HttpCrawlingContext = HttpCrawlingContext,
UserData extends Dictionary = GetUserDataFromRequest<Context['request']> // quick&dirty copy-paste
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:]

Suggested change
UserData extends Dictionary = GetUserDataFromRequest<Context['request']> // quick&dirty copy-paste
UserData extends Dictionary = GetUserDataFromRequest<Context['request']>,

@Strajk
Copy link
Copy Markdown
Contributor Author

Strajk commented Apr 29, 2023

@B4nan comments resolved (in IDE, not it web UI), thx for them!

  • added basic test

not sure if and where to add it to docs ¯_(ツ)_/¯ it could be our little secret :D

@Strajk
Copy link
Copy Markdown
Contributor Author

Strajk commented Apr 29, 2023

i'm glad that I've added the tests cause it taught me about .use middleware and it was exactly what I needed for my project :D

@B4nan
Copy link
Copy Markdown
Member

B4nan commented May 2, 2023

We should have a routing guide I guess, for now, let's add at least something to the Router class jsdoc:

https://crawlee.dev/api/core/class/Router

I see the middleware example is not rendered correctly...

@Strajk
Copy link
Copy Markdown
Contributor Author

Strajk commented May 2, 2023

@B4nan added docs and fixed the middleware code issue

Comment thread packages/core/src/router.ts
@B4nan
Copy link
Copy Markdown
Member

B4nan commented May 2, 2023

Perfect, thanks! Let's just wait for the green CI, looking good to me.

@Strajk
Copy link
Copy Markdown
Contributor Author

Strajk commented May 2, 2023

tenor-57148603

@B4nan B4nan changed the title feat(router): option to define routes when creating router feat(router): allow inline router definition May 2, 2023
@B4nan B4nan merged commit 2d241c9 into apify:master May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants