http_proxy and https_proxy env vars are ignored when installing extensions. #3836

Open
1 of 2 tasks
JorgeGarciaIrazabal opened this issue Jun 13, 2022 · 25 comments

Comments

@JorgeGarciaIrazabal

What happens?

I am trying to install the httpfs extension by executing INSTALL 'httpfs'; through a proxy, but setting the environment variables does not work.

Note: Downloading the extension with the Python requests library works for me:

 import requests

 r = requests.get(
     "http://extensions.duckdb.org/662041e2b/linux_amd64/httpfs.duckdb_extension.gz",
     headers={"User-Agent": "DuckDB 0.3.4 662041e2b linux_amd64"},
     proxies={"http": "my http proxy", "https": "my https proxy"},
     timeout=timeout,  # timeout value defined elsewhere
 )

Is there any other way to define what proxy to use when downloading extensions?

To Reproduce

  1. Set the environment variables http_proxy and https_proxy
  2. Execute INSTALL 'httpfs'
  3. DuckDB does not use the proxy to download the extension
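
Before running step 2, it may be worth confirming that the variables from step 1 are actually visible inside the Python process. The following is a minimal sketch using only the standard library; the proxy address is a hypothetical placeholder, and the check says nothing about how DuckDB itself resolves proxies.

```python
import os
import urllib.request

# Hypothetical proxy address, for illustration only.
os.environ["http_proxy"] = "http://proxy.example.com:3128"
os.environ["https_proxy"] = "http://proxy.example.com:3128"

# getproxies() reports the proxies that Python's standard library detects,
# which includes the lowercase *_proxy environment variables set above.
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))
```

If both values are printed as expected, the environment is set correctly and the remaining question is whether the extension downloader reads it.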

Environment (please complete the following information):

  • OS: Linux
  • DuckDB Version: 0.3.4
  • DuckDB Client: Python

Before Submitting

  • Have you tried this on the latest master branch?
  • Python: pip install duckdb --upgrade --pre
  • R: install.packages("https://github.com/duckdb/duckdb/releases/download/master-builds/duckdb_r_src.tar.gz", repos = NULL)
  • Other Platforms: You can find binaries here or compile from source.
  • Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

@anovember

I'm also seeing this issue. Perhaps there is a way to install from a downloaded httpfs.duckdb_extension.gz? My attempts to do this in python master have resulted in Segmentation fault.

@Mause
Member

Mause commented Sep 1, 2022

> I'm also seeing this issue. Perhaps there is a way to install from a downloaded httpfs.duckdb_extension.gz? My attempts to do this in python master have resulted in Segmentation fault.

If you decompress the file first (assuming you downloaded the correct build for your build of duckdb) you should be able to load it
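
The decompression step can also be done from Python. This is a minimal sketch, assuming the .gz file has already been downloaded by some means that works through the proxy; `decompress_extension` is a hypothetical helper name, not part of DuckDB.

```python
import gzip
import shutil
from pathlib import Path


def decompress_extension(gz_path: str) -> Path:
    """Write the gunzipped copy of `gz_path` next to it and return its path."""
    src = Path(gz_path)
    dst = src.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The returned path points at the uncompressed .duckdb_extension file, which is the form the suggestion above refers to.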

@anovember

When I try to remote install, I see the following error:

IO Error: Failed to download extension "httpfs" at URL "http://extensions.duckdb.org/0d2d7930d/linux_amd64_gcc4/httpfs.duckdb_extension.gz"

so I run:

wget http://extensions.duckdb.org/0d2d7930d/linux_amd64_gcc4/httpfs.duckdb_extension.gz
gzip -d httpfs.duckdb_extension.gz 

then in python:

In [1]: import duckdb                                                                                                                                                                                 

In [2]: duckdb.__version__                                                                                                                                                                            
Out[2]: '0.4.1-dev2374'

In [3]: con = duckdb.connect()                                                                                                                                                                        

In [4]: con.install_extension("./httpfs.duckdb_extension")                                                                                                                                            

In [5]: con.execute("select * from duckdb_extensions()").df() 
   ...:                                                                                                                                                                                               
Out[5]: 
      extension_name  loaded installed                                       install_path                                        description
0              excel    True       NaN                                                                                                      
1                fts    True      True                                         (BUILT-IN)          Adds support for Full-Text Search Indexes
2             httpfs   False      True  /home/ec2-user/.duckdb/extensions/0d2d7930d/li...  Adds support for reading and writing files ove...
3                icu    True      True                                         (BUILT-IN)  Adds support for time zones and collations usi...
4               json    True      True                                         (BUILT-IN)                   Adds support for JSON operations
5            parquet    True      True                                         (BUILT-IN)  Adds support for reading and writing parquet f...
6   postgres_scanner   False     False                                                     Adds support for reading from a Postgres database
7     sqlite_scanner   False     False                                                        Adds support for reading SQLite database files
8              tpcds    True      True                                         (BUILT-IN)      Adds TPC-DS data generation and query support
9               tpch    True      True                                         (BUILT-IN)       Adds TPC-H data generation and query support
10        visualizer    True       NaN                                                                                                      

In [6]: con.load_extension("httpfs")
Segmentation fault

FYI this is on an AWS EC2 machine, x86_64 architecture.

!uname -m                                                                                                                                                                                     
x86_64

Thanks!

@Mause
Member

Mause commented Sep 2, 2022

Please make sure the git hash of the extension you download matches duckdb.__git_revision__ as well
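
One way to make that comparison less error-prone is to pull the build identifier out of the download URL. The sketch below assumes the URL layout seen in this thread, where the first path segment is the revision; `revision_from_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def revision_from_url(url: str) -> str:
    """Return the first path segment of an extension URL (the build identifier)."""
    return urlparse(url).path.strip("/").split("/")[0]


url = "http://extensions.duckdb.org/0d2d7930d/linux_amd64_gcc4/httpfs.duckdb_extension.gz"
print(revision_from_url(url))  # prints 0d2d7930d
```

The result can then be compared against duckdb.__git_revision__ in the same interpreter that will load the extension.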

@anovember

anovember commented Sep 2, 2022

Confirmed the same - 0d2d7930d. Thanks for your help in debugging.

@Mause
Member

Mause commented Sep 2, 2022

> Confirmed the same - 0d2d7930d. Thanks for your help in debugging.

Hmm okay, can you please raise a separate issue with all that information so we can track it separately from this one?

@JorgeGarciaIrazabal
Author

As a workaround, is there an easy way to get the full URL for downloading the extension from Python?

@Alex-Monahan
Contributor

The URL appears in the error message! Then you can just un-gzip the downloaded file and use it!
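
Extracting the URL from the message can be scripted. This is a minimal sketch that assumes the error text keeps the `at URL "..."` wording shown earlier in this thread; `extension_url_from_error` is a hypothetical helper, and in practice `message` would be the text of the exception raised by the failed INSTALL.

```python
import re
from typing import Optional

# Matches the quoted URL that follows 'at URL' in the error text.
URL_PATTERN = re.compile(r'at URL "([^"]+)"')


def extension_url_from_error(message: str) -> Optional[str]:
    """Return the download URL embedded in the error message, or None."""
    match = URL_PATTERN.search(message)
    return match.group(1) if match else None


message = (
    'IO Error: Failed to download extension "httpfs" at URL '
    '"http://extensions.duckdb.org/0d2d7930d/linux_amd64_gcc4/httpfs.duckdb_extension.gz"'
)
print(extension_url_from_error(message))
```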

@JorgeGarciaIrazabal
Author

@Mause Thank you very much for implementing a fix for this issue. I see the PR hasn't had activity in the last 19 days, are you planning on discarding it? I'll be happy to contribute if that helps.

@Mause
Member

Mause commented Oct 19, 2022

> @Mause Thank you very much for implementing a fix for this issue. I see the PR hasn't had activity in the last 19 days, are you planning on discarding it? I'll be happy to contribute if that helps.

Hi, I still plan on finishing and merging it, I've just been having issues with some of the tests

@JorgeGarciaIrazabal
Author

Thank you for the update Mause.

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale label Jul 30, 2023

@github-actions

This issue was closed because it has been stale for 30 days with no activity.

@github-actions github-actions bot closed this as not planned Aug 30, 2023

@include

include commented Sep 19, 2023

Hi there,

I think this wasn't fixed. I still get errors when I try to INSTALL from behind a proxy.

D install spatial;
Error: IO Error: Failed to download extension "spatial" at URL "http://extensions.duckdb.org/v0.8.1/linux_amd64/spatial.duckdb_extension.gz"
Extension "spatial" is an existing extension.

Are you using a development build? In this case, extensions might not (yet) be uploaded. (ERROR Connection)

Help. kthxbye,
F

@benoitdr

benoitdr commented Oct 5, 2023

I confirm this bug is still not fixed

duckdb.IOException: IO Error: Failed to download extension "json" at URL "http://extensions.duckdb.org/v0.8.1/linux_amd64_gcc4/json.duckdb_extension.gz"

@Mause Mause reopened this Oct 5, 2023
@github-actions github-actions bot removed the stale label Oct 6, 2023

@wrb2

wrb2 commented Oct 30, 2023

We hit the same issue. It makes it basically impossible to use DuckDB while connected to a company network.

@ilyanoskov

Bumping this issue, it's still very much needed

@gabrielcnr

I'm also having this problem. 👍

@davidia

davidia commented Nov 21, 2023

Seeing this too

@alexparunov

I am having the same issue.

@e-kotov

e-kotov commented Jan 10, 2024

This is not just about the extensions; it is also impossible to query S3 connections :(

@MarcusWenzel-Bayer

I'm encountering the same issue. I was able to install httpfs locally but I still can't query any online datasources (like S3), which is a pity...

@pvaezi
Contributor

pvaezi commented Feb 10, 2024

Same here: behind a proxy, httpfs fails with an SSL error. The query performance is impressive, but we can't use it in our production setup due to the same issue reported above.

@MarcusWenzel-Bayer

@hannes & @Mause : is there a timeline for implementing this feature? I saw that @Mause already did some work on this and I even tried compiling the httpfs extension from here but it does not seem to work yet (I'm no expert on CPP unfortunately, so I can't really contribute myself). I just want to stress how crucial this is for DuckDB to permeate business use cases in the future. I really love it and I would have already implemented it in some use cases but currently I can only do off-line analyses with it since connections to web-resources are not possible without proxy support. I hope there's some resolution to this on the horizon. I'm happy to test things for you!

@szarnyasg
Collaborator

Hi @MarcusWenzel-Bayer, thanks for chiming in. I agree that this feature is an important one. That said, we do not have a publicly available roadmap and cannot promise an ETA on any specific feature request.

To allow customers to facilitate the development of certain features, DuckDB Labs offers flexible support and consulting options, which can cover feature requests.
