Dev#1
Conversation
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub. |
|
Warning Review the following alerts detected in dependencies. According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.
|
|
here i have implemented:
|
There was a problem hiding this comment.
There's a lot to review here, so I'll start with the big structural changes.
May I suggest to use the same structure as we do for proxy-frontend and other projects?
In the backend plugin:
- api/
- services/
- stores/
- job store
- project store
- config store
- [move any global variables into stores]
- repositories/
- HTTPQClient code
- models/ (contains base objects like Request, ...)
- parsers/ (this is basically your extractors folder)
Corb3nik
left a comment
There was a problem hiding this comment.
Ok this was a tough PR to review, mostly because the backend code is very messy.
I added some comments, but there's still a lot more work to be done. Maybe we can merge/release this even if it's not perfect.
One more thing:
- After fixing the comments, could you add tests for each model and extractor? There's a lot of complicated code in there.
| jobsStore.updateJob(jobId, { | ||
| crawledUrls: stats.crawledUrls, | ||
| discoveredUrls: stats.discoveredUrls, | ||
| }); |
There was a problem hiding this comment.
Let's move the jobsStore call to the service.
| }); | ||
| }, | ||
| onComplete: (stats, jobId) => { | ||
| jobsStore.updateJob(jobId, { |
| this.recentSuccessRate = | ||
| this.completedTasks / (this.completedTasks + this.failedTasks); | ||
| } |
There was a problem hiding this comment.
I think we should remove recentSuccessRate since we can calculate it later when we need it (like in the frontend)
| this.totalTaskDuration += duration; | ||
| this.recentDurations.push(duration); | ||
|
|
||
| if (this.recentDurations.length > 100) { | ||
| this.recentDurations.shift(); | ||
| } |
There was a problem hiding this comment.
Do we need to track totalTaskDuration?
Also do we need to set a limit on the recent durations?
What we could do is just track all of the durations, and calculate the totalTaskDuration later in the frontend when we need it.
| this.totalTaskDuration += duration; | ||
| this.recentDurations.push(duration); | ||
|
|
||
| if (this.recentDurations.length > 100) { | ||
| this.recentDurations.shift(); | ||
| } |
|
|
||
| private buildUrl( | ||
| request: Parameters< | ||
| Parameters<BackendSDK["events"]["onInterceptResponse"]>[0] |
| private options: Required< | ||
| Omit< | ||
| CrawlerOptions, | ||
| | "requestQueue" | ||
| | "router" | ||
| | "requestHandler" | ||
| | "preNavigationHooks" | ||
| | "postNavigationHooks" | ||
| | "failedRequestHandler" | ||
| | "sessionPoolOptions" | ||
| > | ||
| > & { | ||
| requestQueue: RequestQueue; | ||
| router: Router; | ||
| requestHandler?: (context: CrawlingContext) => Promise<void>; | ||
| preNavigationHooks: PreNavigationHook[]; | ||
| postNavigationHooks: PostNavigationHook[]; | ||
| failedRequestHandler?: FailedRequestHandler; | ||
| sessionPoolOptions: CrawlerOptions["sessionPoolOptions"]; | ||
| }; |
There was a problem hiding this comment.
This should be its own HTTPCrawlerOptions type. This code is very difficult to read/understand/maintain.
| * CrawlerStore - Manages active crawler instances | ||
| */ | ||
|
|
||
| import type { HttpCrawler } from "../services/crawler"; |
There was a problem hiding this comment.
If the HttpCrawler class can be stored in a store, that means that it should we move it to the models folder.
The architecture we're going for here is:
- API imports the services
- Service imports the store/repository
- API/Services/Stores/Repositories import from the models
The stores should never import from the services like we're doing here.
| plugins: [ | ||
| { | ||
| kind: "backend", | ||
| id: "backend", |
There was a problem hiding this comment.
Lets' update the ID to crawler-backend
| }, | ||
| { | ||
| kind: 'frontend', | ||
| id: "frontend", |
|
Please Do Not Review YET! |
No description provided.