fix: update ChromeDriver options on restricted environments and add ChromeDriver options as function parameter #3043
Merged
ZanSara merged 20 commits into deepset-ai:main on Aug 22, 2022
agnieszka-m (Contributor) requested changes on Aug 16, 2022 and left a comment:
Hey, I added some minor language comments to comply with our writing guidelines. Thanks!
sjrl reviewed on Aug 16, 2022
Contributor
@danielbichuetti This is looking good to me! Although I think we should have someone from the core-engineering team for the final approval.
Related Issues
Proposed Changes:
- Add a webdriver_options parameter so users can set the options their specific scenario demands.
- --disable-dev-shm-usage
- --no-sandbox
- --disable-gpu: There is no need to enable GPU acceleration for a text crawler, and GPUs are expensive in the cloud.
- --disable-dev-shm-usage: Container environments usually have no access to shared memory, or it is set to the default size of 64 MB. With this flag, Chrome writes to a temporary directory instead. Shared memory may improve performance only under heavy testing workloads, which is not our use case.
- --single-process: Haystack doesn't support multiple tabs, so there is no point in letting Chrome spawn multiple processes. Doing so would only increase the memory footprint.
How did you test it?
Notes for the reviewer
These changes make the Haystack Crawler node able to run in multiple environments without any extra effort from the user. Users with more specific needs can pass their own options through the webdriver_options parameter.
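For reviewers who want a concrete picture, here is a minimal sketch of how the flags listed under Proposed Changes could be applied to Selenium's Chrome options. It is not the code from this PR: the names `RESTRICTED_ENV_OPTIONS`, `resolve_webdriver_options`, and `build_chrome_options` are hypothetical, and the rule that user-supplied options fully replace the defaults is an assumption made for illustration.

```python
from typing import List, Optional

# Flags named in the PR description for restricted environments.
# The constant name is hypothetical, not taken from the Haystack code base.
RESTRICTED_ENV_OPTIONS = [
    "--disable-dev-shm-usage",  # write to a temp directory instead of /dev/shm
    "--no-sandbox",
    "--disable-gpu",            # a text crawler does not need GPU acceleration
    "--single-process",         # avoid spawning extra Chrome processes
]


def resolve_webdriver_options(webdriver_options: Optional[List[str]] = None) -> List[str]:
    """Return the user's options if given, otherwise a copy of the defaults.

    Assumption: user-supplied options replace the defaults entirely.
    """
    if webdriver_options is not None:
        return list(webdriver_options)
    return list(RESTRICTED_ENV_OPTIONS)


def build_chrome_options(arguments: List[str]):
    """Turn a list of flags into a Selenium Chrome Options object."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in arguments:
        options.add_argument(argument)
    return options


if __name__ == "__main__":
    # Defaults for a restricted environment such as a container.
    print(resolve_webdriver_options())
    # A user with more specific needs passes their own list.
    print(resolve_webdriver_options(["--no-sandbox"]))
```

The resulting object would typically be handed to `selenium.webdriver.Chrome(options=...)`. How the Crawler wires `webdriver_options` through internally is not shown in this description, so treat the override behaviour above as one possible design.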
Checklist