Why does this extension need full blown Chromium.app? #2166

martincerven · 2024-09-02T22:59:43Z

Before submitting your bug report

I believe this is a bug. I'll try to join the Continue Discord for questions
I'm not able to find an open issue that reports the same bug
I've seen the troubleshooting guide on the Continue Docs

Relevant environment info

- OS:
- Continue:
- IDE:
- Model:
- config.json:

Description

There is Chromium.app in ~/.continue/.utils/.chromium-browser-snapshots/chromium/ installed without any user consent at all.

To reproduce

No response

Log output

No response

martincerven · 2024-09-02T23:29:10Z

@Patrick-Erichsen indexing? Can you provide more info?
It seems that Chromium was downloaded with a mere extension update... really?

Patrick-Erichsen · 2024-09-03T01:09:38Z

Hey @martincerven , appreciate the feedback. This is for the documentation service. We just added a note here about why this is needed: https://github.com/continuedev/continue/blob/dev/docs/docs/features/talk-to-your-docs.md#how-it-works

Docs crawling happens entirely on a user's local machine, so to handle sites that rely on JavaScript we decided to pull down Chromium on install. Without this the majority of docs sites can't be crawled.

Our aim with this is to be more privacy preserving by allowing users to perform indexing locally rather than through our own servers, but curious to know if this is still behavior you'd prefer to disable.

otopetrik · 2024-09-04T08:24:52Z

This is terrifying.

An extension should never just silently download and execute some binary files from the internet.

And definitely not without getting permission from the user first.
That is very sneaky behavior, and it raises the question of whether the code does anything else unexpected or unwanted.

> This is for the documentation service. We just added a note here about why this is needed: https://github.com/continuedev/continue/blob/dev/docs/docs/features/talk-to-your-docs.md#how-it-works

As of now, the documentation page still does not mention the Chromium download.

There is no information about the origin of the chromium binary (who built it?).

On a NixOS machine with working "chromium" (and "chrome") accessible in PATH, the extension (JetBrains variant) silently downloaded chromium from somewhere, executed it, and it failed with:

Error: Failed to launch the browser process!
/home/<username>/.continue/.utils/.chromium-browser-snapshots/chromium/linux-1350578/chrome-linux/chrome: error while loading shared libraries: libglib-2.0.so.0: cannot open shared object file: No such file or directory

From the sources it looks like it uses binaries built by Google, and it looks like the download at least uses "https" (no idea if there is any verification of signatures or at least checksums).

Given the sneaky nature of the silent installation, it would make sense to verify whether the installed extension is actually a clean build of the source from GitHub (without any malicious changes). Does it download a clean or a backdoored Chromium binary?

(It looks like the contents of continue-binary file in the installed JetBrains extension matches github code at least at configuring PCR_CONFIG - it configures only downloadPath (no hosts set), and import_puppeteer_chromium_resolver/require_lib13 falls back to https://storage.googleapis.com. Of course that is not a guarantee that there are not any malicious changes in the code further down.)

As there is a funded company behind this plugin (and not just a pseudonymous developer as was the case in xz utils), it is likely not developed as a backdoor distribution mechanism, but the "silently download binary from internet and execute it" behavior looks terrifyingly close to one.

> Docs crawling happens entirely on a users local machine, so to handle sites with Javascript enabled we decided to pull down Chromium on install. Without this the majority of docs sites can't be crawled.

It is possible that some sites cannot be crawled without a Chromium browser.
It is impossible that the majority of sites cannot be crawled without the extension downloading a Chromium browser.

Need a Chrome or Chromium browser? Fine.
If it is possible to use a normal installation of a browser, just check whether one is installed, and if not, ask the user to install it.
If a specific version of Chromium is really required, then download it only after the user has added something like "allowChromiumDownload": true to the config file. If the line is not there, it might be a good idea to explain what is going on and present the user with the URL of the required Chromium binary. Allow them to download it manually and save it in a specific directory as a fallback. That might also be useful for indexing internal documentation in an air-gapped network.
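
The flow proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `allowChromiumDownload` is the config key suggested in this comment, not an existing Continue option, and `planBrowser`, `BrowserPlan`, and the download URL argument are hypothetical names made up for the example.

```typescript
// Hypothetical sketch: none of these names exist in Continue.
interface BrowserConfig {
  // Proposed opt-in flag; absent or false means "never download silently".
  allowChromiumDownload?: boolean;
}

type BrowserPlan =
  | { kind: "use-installed"; path: string }
  | { kind: "download"; url: string }
  | { kind: "explain"; message: string };

// Decide how to obtain a browser without ever downloading one unasked.
function planBrowser(
  config: BrowserConfig,
  installedPath: string | null, // path of an already installed browser, if any
  downloadUrl: string,          // where the required Chromium build would come from
): BrowserPlan {
  // 1. Prefer a browser the user already installed and manages.
  if (installedPath !== null) {
    return { kind: "use-installed", path: installedPath };
  }
  // 2. Download only when the user explicitly opted in.
  if (config.allowChromiumDownload === true) {
    return { kind: "download", url: downloadUrl };
  }
  // 3. Otherwise explain what is needed and let the user fetch it manually.
  return {
    kind: "explain",
    message:
      `No usable browser found. Set "allowChromiumDownload": true in the config, ` +
      `or download ${downloadUrl} manually and place it in the expected directory.`,
  };
}
```

Returning a plan instead of acting on it keeps the consent decision in one place, so it can be checked without touching the network or the filesystem.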

> Our aim with this is to be more privacy preserving by allowing users to perform indexing locally rather than through our own servers, but curious to know if this is still behavior you'd prefer to disable.

Using a local Chrome/Chromium could be a reasonable idea (e.g. it can index internal documentation sites), assuming it does not use the user's actual Chromium profile, Chromium sandboxing is enabled, and the browser is kept updated.

martincerven · 2024-09-04T09:00:22Z

Yeah, it's very similar to xz and also Crowdstrike, where they pushed an update to prod and it crashed millions of Windows machines.

Here it was also just an update. That contrasts sharply with, for example, Llama.cpp, where they reimplement functionality so as not to depend even on other FOSS libraries.

So for me questions are:

- Is it really needed?
- Where does the Chromium come from? Is it built from source, by Google? Downloaded by npm?
- Can you use the user's browser installation?
- How can you prevent some other malicious code from being run in the Chromium? I actually didn't know this was possible at all.

For me, the point of using an open source extension is that anything can be checked by the community. Sneakily downloading some random binary from god knows where runs directly in opposition to this.

@Patrick-Erichsen can you comment on these points?

Right now it seems that instead of Chromium.app, you could just as well download Malware.app without any user consent, or anything really, which is a very dangerous precedent, more so for a free and open source vscode extension.

Huge · 2024-09-04T10:21:33Z

Oh, thank you @martincerven for bringing that up!
It's also very concerning for disk-space-savvy individuals: 541M is taken up by /home/huge/.continue/.utils/.chromium-browser-snapshots, which is about 5% of my workspace backup.

@martincerven : could you please tidy up the OP a bit? Maybe adding what commit or which version was the last safe one.
Edit: This went in most likely with this, which happened 2 weeks ago.
I'll try to look further to check whether the extension version 8.5 is clean of this...

Edit: removing it from the CLI did not break the basic functionality for me, so I'd advise savvy users to do that for now.

KMouratidis · 2024-09-04T10:32:07Z

Skipping the paranoia (which everyone should have), it would be nice if users had the option of managing the chromium installation themselves and simply adding a config with the path to it. This would also let users update (or pin?) their chromium binaries, and possibly using a custom-compiled chromium (or ungoogled-chromium?).

eirnym · 2024-09-04T10:33:58Z

@Patrick-Erichsen this is not an acceptable implementation. User privacy and choice in open source products are not an option or a feature. They are the basics.

I'd consider this feature only if all of the following points are implemented:

- This would be an explicit opt-in feature
- Only the user would be responsible for downloading and installing an engine of some kind
- Only the user would be responsible for the URLs accessed by the tool
- Consider an option to use non-JS documentation fetching, so no browser is used
- The user will be given a choice of which browser to use. There are plenty of them.
- Please also remember Firefox-only users. Firefox is a fully capable browser for downloading the required data

Huge · 2024-09-04T11:12:16Z

Small guidance on avoiding the bloating util for now: Download continue-linux-arm64-0.9.197.vsix or continue-linux-arm64-0.8.46.vsix from GH release page and install it manually:

Props to @sestinj for at least clearly advertising, in the v8.47 release notes, that the headless browser is to be used.

av · 2024-09-04T11:15:50Z

To everyone arguing about explicit opt-in: this is the same level/type of dependency as everything in the continue package.json, and I doubt that you really mean that all of the dependencies have to be opt-in.

It's puzzling to see security/privacy concerns too, as the installation above happens in an extension which was already allowed to do everything it needs on the user's machine, so any malicious intent has already had a chance to be carried out.

With that, it's a completely reasonable ask to allow configuring the type of crawling that is performed (plain/rich), to try reusing already installed browser(s), and to optimise downloads to use lighter Chromium versions when the download is necessary, or to use VS Code's Web Views. I'm sure the maintainers will get there once this feature has enough use. It's not completely reasonable, however, to see such an acute backlash, as all of the concerns (third-party code execution, disk usage bloat) are pretty much a given when installing this or any other kind of extension for VS Code.

eirnym · 2024-09-04T11:57:31Z

@av some dependencies can be opt-in when an external, pre-installed application is used.

Some dependencies like Chromium are OK if you want to do something fast or the only browser you know is Chromium-based. Also, using an existing browser instead of a browser from a dependency provides some important cookies and gives the user more control.

Also, a preinstalled browser is usually managed in companies, which would require way more settings than the author envisioned for this project and more hassle for a user to set them all.

animaldomestico · 2024-09-04T12:24:26Z

Also, they should take care to execute the browser inside a sandboxed environment and make sure it is updated to the latest stable version. There are many exploits out there in the wild.

I'm not a hacker, guys (I'm just a peaceful animal), but as I'm using Ubuntu, I was a little bit concerned about some things:

> If you want to run developer builds of Chromium/Chrome on Ubuntu 23.10+ (or possibly other Linux distros in the future), you'll need to either globally or selectively disable an Ubuntu security feature.

But if you do this, they say:

> For a while, user namespaces have been available to unprivileged (e.g. non-root) users on most Linux distros, but they exposed a lot of extra kernel attack surface.

One explanation found here:

> In a report from Google, 44% of the exploits they saw required unprivileged user namespaces as part of their exploit chain.

I prefer not to turn off the Ubuntu security feature, so I won't use this for now. Forgive me if I said anything wrong, I just tried to help!

sestinj · 2024-09-05T22:48:21Z

Thanks to everyone who shared their feedback in this thread. We heard you loud and clear and have taken steps to address this both immediately and in the future.

As a principle, we will not dynamically download executables without user visibility. PR #2192 makes the change so that we fall in line with this principle for Chromium (it is entirely opt-in):

- If we can successfully index the site requested without a headless browser, we will try this first
- If it fails without a headless browser, then we will ask for user permission to try with Chromium
- At any given time, you can set useChromiumForDocsCrawling in your config.json in order to define the behavior
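
For readers who want to see the shape of that fallback, here is a minimal sketch. It is not the code from PR #2192: apart from the `useChromiumForDocsCrawling` setting named above, every identifier is hypothetical, and the tri-state reading of the setting (true = allow without asking, false = never, unset = ask) is an assumption.

```typescript
// Sketch only: apart from `useChromiumForDocsCrawling`, every name here is hypothetical.
// Assumed meaning of the setting: true = allow, false = never, unset = ask the user.
type ChromiumSetting = boolean | undefined;
type Step = "done" | "use-chromium" | "ask-user" | "give-up";

// Decide what to do once the plain (browserless) crawl has been attempted.
function nextStep(plainCrawlSucceeded: boolean, setting: ChromiumSetting): Step {
  if (plainCrawlSucceeded) return "done";        // no browser needed
  if (setting === true) return "use-chromium";   // user already opted in
  if (setting === false) return "give-up";       // user opted out
  return "ask-user";                             // unset: ask for permission
}

// Injected operations, so the flow itself stays free of I/O details.
interface CrawlDeps {
  crawlPlain: (url: string) => Promise<boolean>;        // true if indexing succeeded
  crawlWithChromium: (url: string) => Promise<boolean>; // true if indexing succeeded
  askUser: (url: string) => Promise<boolean>;           // true if permission was granted
}

// Returns true if the site was indexed.
async function crawlDocs(
  url: string,
  useChromiumForDocsCrawling: ChromiumSetting,
  deps: CrawlDeps,
): Promise<boolean> {
  const step = nextStep(await deps.crawlPlain(url), useChromiumForDocsCrawling);
  switch (step) {
    case "done":
      return true;
    case "use-chromium":
      return deps.crawlWithChromium(url);
    case "ask-user":
      return (await deps.askUser(url)) ? deps.crawlWithChromium(url) : false;
    case "give-up":
      return false;
  }
}
```

The point of the ordering is that the headless browser is only reached after the plain crawl has failed and the user has agreed, either ahead of time through the setting or at the prompt.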

These updates are now available in VS Code pre-release v0.9.207 and will be released later today in a JetBrains EAP. As soon as these pre-releases have undergone the same initial testing we do each time, they will become main releases.

There were also a few points in this thread worth addressing:

- Why can't we use the Chromium that is already installed for Google Chrome or otherwise?
  Puppeteer, the package used to control the headless browser, requires a specific chromium_revision for each version of the library, so we can’t easily allow users to manage the download/installation, or use existing installations
- It's not in the docs!!
  We've added the reference here where we believe it is most likely to be found by folks using the docs feature: https://docs.continue.dev/features/talk-to-your-docs#crawling-dynamically-generated-sites-with-usechromiumfordocscrawling
- Do we actually need a headless browser?
  We've been consistently testing against a large list of very common docs sites, many directly requested by users, to check whether we can successfully crawl them. We'd tried a pretty exhaustive list of non-headless browser tools before coming to the conclusion that one is necessary to get even passable success rates. If anyone proves this wrong, we're open to hearing solutions.

Hopefully it is understood by now that Continue takes great effort to secure your code, to the point of operating as a local-first application. In considering the trade-offs between hosting our own web crawling servers, to which the extension would have to send requests, vs. following the local-first pattern, we took this lens, but more than anything we value feedback. So again, thanks all for being swift to call us out, and thanks @Patrick-Erichsen for being just as swift in taking the necessary action.

I'll hold off on closing the issue for a minute so as not to be discouraging of further discussion!

eirnym · 2024-09-06T07:00:32Z

@sestinj thank you for stepping up and answering our questions. My concern about mandatory settings and add-ons, which are managed by a company for all browsers, is still there.

Additionally, the downloaded browser has no user-managed settings (including cookies) and no plugins, such as ones that block ads or improve privacy in some way. I don't like the idea of being tracked via an application.

Another way around the issue would be to provide a separate downloader program that downloads the necessary raw data, and to document the output format the application uses. The latter would make it possible to create alternative downloader applications, if any of us were willing to take that on.

martincerven · 2024-09-06T08:01:42Z

Thanks @sestinj and @Patrick-Erichsen for quick action, I was being downvoted to hell for bringing this up, but I felt it was a security issue, although I couldn't put my finger on exactly what irked me.

Now, I know there are a few security points, some independent of continue:

- vscode doesn't have a good permission system for extensions
- Is puppeteer run with the sandbox option?
- In the github library you use there is args: ["--no-sandbox"]. Isn't that a huge security no-no? 🚩 Even the puppeteer repo says it's a huge security risk
- Are you sure you are actually downloading headless Chromium? On macOS I got the whole Chromium.app, which I'd say is different from headless. Again, there is a flag for that: headless: false
- Are you sure you are downloading the Chromium binary from trusted sources? The only mention of the source is here.
  I don't know how you would even check that, but I wouldn't put the trust of my company on the line based on some random 40-star resolver library (that's harsh, I know the author probably doesn't mean harm, but still, he could change the URL and bang, you're downloading malicious browsers to millions of your customers)

Lastly,

> Hopefully it is understood by now that Continue takes great effort to secure your code, to the point of operating as a local-first application. In considering the trade-offs between hosting our own web crawling servers, to which the extension would have to send requests, vs. following the local-first pattern, we took this lens, but more than anything we value feedback.

I'm very happy you took the local-first approach, even as we voice our concerns here. I honestly can't see how this crawling works on the inside, but here is one last imaginary scenario:

If someone works on proprietary code for a new rocket at SpaceX, wouldn't it crawl proprietary docs and private repositories, which would then be sent (if they're using ClosedAI or ChineseAI) to the LLM provider as part of the context for a prompt? Of course this falls on the user for not configuring the settings, but still...

Anyway, thanks @sestinj for addressing this issue, it will ultimately make your product better and more secure.

Martin

itpofy2024o · 2024-09-07T08:23:14Z

https://discussions.apple.com/thread/8582300?sortBy=rank to remove the notification, rm -rf .continue, stop using continue extension, report this app until they actually improve

sestinj · 2024-09-17T05:41:33Z

Appreciate the further thoughts here! We've thought about this pretty deeply, trying to take into account all of the feedback received and where we want to go with the product. Without committing to a particular direction, we are tentatively looking into building out an indexing server.

Though things are much better with the headless browser being entirely opt-in, I still wanted to give an update so you know we haven't simply forgotten about this : )

I will make sure to update here as soon as we have more info!

remixer-dec · 2024-10-10T17:16:35Z

Using electron was not enough, now every extension of every electron app will install its own chromium!
Now I have 3 additional chromiums in my system, thanks!

remixer-dec · 2024-10-10T17:43:45Z

> Why can't we use the Chromium that is already installed for Google Chrome or otherwise?
> Puppeteer, the package used to control the headless browser, requires a specific chromium_revision for each version of the library, so we can’t easily allow users to manage the download/installation, or use existing installations

I am pretty sure that is not true, or at least it was the other way a few months ago when I worked with it.

You just need to set PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true to install it without Chromium, and then you can run any Chromium binary you have access to. Some features may be less compatible in different versions, but the core functionality remains the same.
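
As a rough sketch of the second half of that idea, the helper below looks for an existing browser binary on the PATH. The function name, the candidate list, and the injected `exists` check are hypothetical, chosen so the example runs without touching the filesystem; the only Puppeteer-specific part is the `executablePath` launch option mentioned in the comment, which is where such a path would be passed.

```typescript
// Hypothetical helper: finds the first matching browser binary on a PATH-style string.
// `exists` is injected (e.g. a wrapper around a filesystem check) so the sketch has
// no filesystem dependency of its own.
function findBrowser(
  pathEnv: string,
  candidates: string[],
  exists: (file: string) => boolean,
  separator: string = ":", // POSIX PATH separator; Windows would use ";"
): string | null {
  for (const dir of pathEnv.split(separator)) {
    if (dir === "") continue; // skip empty PATH entries
    for (const name of candidates) {
      // Strip trailing slashes so "/usr/bin/" and "/usr/bin" behave the same.
      const full = dir.replace(/\/+$/, "") + "/" + name;
      if (exists(full)) return full;
    }
  }
  return null; // nothing found: the caller decides whether to ask the user
}

// Intended use (not executed here): pass the result to Puppeteer, e.g.
//   puppeteer.launch({ executablePath: found })
// Whether a given Puppeteer version accepts that browser revision is the open
// question in this thread, so treat this as a starting point only.
```

Earlier PATH entries win over later ones, which matches how a shell would resolve the same names.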

Patrick-Erichsen · 2024-10-14T18:11:45Z

@remixer-dec thanks for sharing that screenshot. I believe we gave that a try and ran into issues with Puppeteer complaining about an incompatible Chromium revision. We plan to try it out again when we circle back to some work we have planned around the docs service in the near future 👍

dosubot bot added the area:installation and kind:bug labels Sep 2, 2024

continuedev deleted a comment Sep 3, 2024

Patrick-Erichsen mentioned this issue Sep 5, 2024

feat: make Chromium install configurable #2192

Merged

2 tasks

RomneyDa added the needs-triage label Oct 31, 2024
